Every time I read a blog post or open a magazine article about virtualization and disaster recovery, I see the same thing: VMware has a more robust DR solution than Microsoft. Well, I’d like to challenge that assumption. From where I sit, this is actually one of the areas where Microsoft has a major competitive advantage at the moment. Here is how I see it.
VMware Site Recovery Manager
This is an optional add-on that rides on the back of array-based replication solutions. While the recovery point objective (RPO) is good thanks to the array-based replication, the recovery time objective (RTO) is measured in hours, not minutes. Add in the fact that moving back to the primary data center is a very manual procedure, which basically requires that you re-create your jobs in the opposite direction, and the complete end-to-end recovery operation of failover and failback could take the better part of a day or longer.
Microsoft Multi-Site Cluster
Virtual machine HA clustering is included with the free version of Hyper-V Server 2008 R2, as well as with Windows Server 2008 Enterprise and Datacenter editions. Multi-site clusters require array-based replication or host-based replication solutions that integrate with Windows Server Failover Clustering. With a multi-site cluster, failover is measured in minutes (just about the time it takes to start a VM), and the cluster can be used with array-based replication solutions such as EMC SRDF CE or HP MSA CLX, or with the much less expensive host-based replication solutions such as SteelEye DataKeeper Cluster Edition.
Not only is failover quick with Hyper-V multi-site clusters, measured in just a few minutes; failback is quick and seamless as well. Add in support for Live Migration or Quick Migration across data centers, and I think this is one area where Microsoft actually has a much more robust solution than VMware. Maybe it does not include automated DR tests, but when you consider that you can fail over and fail back all in under 10 minutes, maybe an actual DR test performed monthly would give you a much better indication of what to expect in an actual disaster?
If you want a Hyper-V solution more like SRM, there is an option for that as well: Citrix Essentials for Hyper-V. But much like SRM, it is an optional add-on and really doesn’t even match the RPO and RTO that you can achieve with basic multi-site clusters for Hyper-V.
What do you think? Am I wrong or is there something I just don’t get? From my view, Hyper-V is head and shoulders above vSphere in terms of disaster recovery features.
16 thoughts on “Are VMware’s vSphere Disaster Recovery Options Really Better than Microsoft’s options for Hyper-V?”
What does Citrix Essentials for Hyper-V provide that you can’t do with Hyper-V WSFC? More importantly, does it fill the gap of having to cluster your (live migration enabled) VMs at the application level to overcome a BSOD, etc.? Does it do what SRM can in theory, i.e. fail over to a snapshot?
What you say here is, from personal experience, entirely true.
I am, or rather was, a VMware consultant for a bunch of years, and I have to say that with Hyper-V R2 I am beginning to run out of arguments. Sure, you can’t really overcommit, but in my opinion the only time you should do that is with VDI. In all other cases I count overcommitted VMware hosts as poorly designed.
Anyway, the article was about clustering, and yes, Hyper-V R2 is perfectly capable of live-migrating an entire cluster between datacenters, and there were no lightweight guests running on top of that cluster at the time either. The service desk didn’t notice that their entire management environment had switched to another datacenter.
And unless I’ve missed something with vSphere 4.1, VMware still hasn’t released any support for CLX, right?
Hasn’t the overcommit been addressed with the dynamic memory management released with SP1 (currently in beta) for 2008 R2 for Hyper-V?
Yes Rick, you are correct, memory overcommit is included with 2008 R2 SP1 currently in beta.
It’s my understanding that you cannot use multi-site clusters with Cluster Shared Volumes. Is that correct?
You have to check with your storage vendor. I’ve seen various vendors claim support, but I don’t know for sure which ones do a true “multi-site cluster”. I’ve seen NetApp’s demo with CSV, but from what I could tell they have their own failover mechanism and basically reprovision the VM in the DR site instead of doing a true failover with WSFC. I think I have seen Hitachi claim support, and various storage virtualization vendors claim support too. I do not know of any host-based replication solutions supporting CSV at the moment.
The next release of Site Recovery Manager caters for protocol-neutral host-based replication. This is a big step forward and removes the requirement for SAN-based replication altogether. Also, if it’s true fault tolerance you’re after, have a look at VMware’s Fault Tolerant mode, where cross-site failover is measured literally in milliseconds.
Yes, HBR looks exciting! First integrated with SRM and then, I hear, vMotion, VMware HA, etc. However, I don’t think you will see VMware FT failing over across data centers, as shared storage is still a requirement.
From firsthand experience, I still think VMware is ahead, especially for the SMB market (and let’s face it, that’s the first market MS targets with Hyper-V).
It all comes down to the ability to be able to reimport the VM directly from the VM files.
In VMware it is perfectly possible, whereas in HyperV you can’t because you need to export a VM in order to be able to reimport it.
Although I don’t understand this weird limitation, it exists.
1- you can’t use array replication alone with Hyper-V because the VM configuration is lost
2- multi-site clusters can’t be used with CSV, which makes them useless in all but the smallest environments
3- hence you need either array support, which is not available for entry-level and mid-market arrays, or
4- host replication software (SteelEye, Double-Take), which adds cost AND complexity. As this software installs itself deeply in Hyper-V (filter drivers, correct me if I’m wrong), it creates dependencies: updating the hosts or troubleshooting issues will be more difficult. Besides, is it compatible with backup solutions such as DPM 2010?
On the other side, with VMware you just need to attach the replicated volume and then register the VMs. It is even possible with free ESXi.
If you have paid licences, you can use software like Veeam that adds application consistency but is still less complex and intrusive than the Double-Take or SteelEye solutions: Veeam doesn’t install itself on the ESX hosts; it runs entirely outside the hosts thanks to the remote API (vStorage API) VMware provides.
Now don’t get me wrong: I believe valid solutions exist with Hyper-V, but they are ill-suited for SMBs, which may lack the money or skills to maintain them.
And aren’t cost and simplicity the main reasons Microsoft pushes against VMware?
Edit to my previous posts : it seems SteelEye and Double Take can’t replicate CSV.
The SteelEye one is from 2009 so it may have changed, but DoubleTake definitely doesn’t support it at the moment.
You are correct, there is no host-based replication that works with CSV at the moment; you have to use regular LUNs.
I think you make some valid points, specifically about the lack of CSV support. Early adopters of CSV found themselves without support from many different software vendors, particularly the backup vendors. As the technology has matured, it is certainly being adopted more often than not. Those who want to do multi-site clusters simply must make the choice not to use CSV. CSV is good and certainly makes storage management easier, but there is a misperception that it is required for Live Migration. Live Migration works just fine on a standard physical disk or even a replicated volume.
The cost of host-based replication is a small fraction of what an array-based replication solution will cost, and the complexity is minimal, but of course that is all a matter of opinion.
The point of my article is that a multi-site cluster with Hyper-V is a better DR solution than SRM when you consider price, RTO and RPO. When VMware releases HBR I may have a different story to tell 🙂
Nice discussion –
From Double-Take’s point of view, I’d like to add some information about the possibility of replicating Hyper-V for DR with or without a CSV implementation.
A * Without CSV – we have two possibilities:
A1. a stretch cluster implemented with DT GeoCluster as an extension of the MSFT Multi-Site Cluster
A2. host-level replication for Hyper-V hosts (like Veeam for VMware). DT Virtual Host for Hyper-V is a specific product built to replicate the active VM from one Hyper-V host to another. A cluster without CSV is supported, and you can also use Hyper-V Server or Windows Server Standard in stand-alone mode.
B * with CSV * Double-Take has guest-to-host real-time replication: from a CSV it is always possible to reconstruct the virtual machines in DR on a Hyper-V infrastructure. The technology is similar to the Double-Take Virtual Recovery Assistant (VRA) used in VMware environments for continuous DR replication http://www.doubletake.com/english/resources/videos/pages/default.aspx?ResourceID=97&SiteType=Global&w=550&h=450
No need to change your infrastructure, RPO & RTO at the minimum, no active vm in the target without failover, integration with SCVMM. It’s always a real-time replication for WAN and for heterogeneous hardware.
Right Antonio. Any solution that runs within the VM will be unaware of the CSV and will function just fine. So clustering within the guest or bare metal recovery solutions like DoubleTake, PlateSpin, AppAssure and others are all certainly options that can be used when CSV eliminates the possibility of replicating at the Hypervisor level.
I’m catching up on the latest comments. I agree that CSV is not a requirement, but when using specific technology it can become mandatory quickly.
For instance, if I want to use a cheaper version of SAN storage I might need to keep iSCSI connections down to a certain level, easily reached when using standard LUNs.
@Antonio: about the solution B you’re talking about:
Can you clarify how the solution works?
Does it create a VM in the DR site out of the data it can replicate from within the guest?
If yes, does it eliminate the need to reconfigure virtual network interfaces (recreating a VM with the standard Hyper-V tools requires this)?
Also, what is the solution’s name? The link you provide points to a general section of your website.
@dave: I’m trying to reach you by mail, but with limited success. We’re currently building cloud offerings based on Hyper-V and we are kind of stuck on the DR side at the moment (our biggest offerings require CSV). I would love to have some insights on future CSV support from DataKeeper.