High Availability Options for Microsoft SQL Server in the Google Cloud

I was recently interviewed by VMblog about high availability options for SQL Server. You can check out the interview here http://vmblog.com/

For the step by step guide I previously published, check it out here https://clusteringformeremortals.com/2018/01/10/how-to-build-a-sanless-sql-server-failover-cluster-instance-in-google-cloud-platform/



Windows Server 2016 Datacenter Edition introduced a new feature called Storage Spaces Direct (S2D). At a very high level, this solution allows you to pool together locally attached storage and present it to the cluster as a CSV for use in a Scale-Out File Server, which can then be accessed over SMB 3 and used to hold cluster data such as Hyper-V VHDX files. This can also be configured in a hyper-converged (HCI) fashion such that the application and data all run on the same set of servers. This is a grossly over-simplified description; for details, see Microsoft's Storage Spaces Direct overview.


Storage Spaces Direct stack (image taken from https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/storage-spaces-direct-overview)

The main use case targeted is hyper-converged infrastructure for Hyper-V deployments. However, there are other use cases, including leveraging this SMB storage to store SQL Server data to be used in a SQL Server Failover Cluster Instance.

Why would anyone want to do that? Well, for starters you can now build a highly available 2-node SQL Server Failover Cluster Instance (FCI) with SQL Server Standard Edition, without the need for shared storage. Previously, if you wanted HA without a SAN you pretty much were driven to buy SQL Server Enterprise Edition and make use of Always On Availability Groups or purchase SIOS DataKeeper and leverage the 3rd party solution which lets you build SANless clusters with any version of Windows or SQL Server. SQL Server Enterprise Edition can really drive up the cost of your project, especially if you were only buying it for the Availability Groups feature.

In addition to the cost associated with Availability Groups, there are a number of other technical reasons why you might prefer a Failover Cluster over an AG. Application compatibility, instance vs. database level protection, large number of databases, DTC support, trained staff, etc., are just some of the technical reasons why you may want to stick with a Failover Cluster Instance.

Microsoft lists both the SIOS DataKeeper solution and the S2D solution as two of the supported solutions for SQL Server FCI in their documentation here.



When comparing the two solutions, you have to take into account that SIOS has been allowing you to build SANless clusters since 1999, while the S2D solution is still in its infancy. Given that, there are bound to be areas where S2D has some catching up to do, as well as features it will never support due to limitations of the technology.

Have a look at the following table for an overview of some of the things you should consider before you choose your SANless cluster solution.


If we go through this chart, we see that SIOS DataKeeper clearly has some significant advantages. For one, DataKeeper supports a much wider range of platforms, going all the way back to Windows Server 2008 R2 and SQL Server 2008 R2. The S2D solution only supports the latest releases of Windows and SQL Server 2016/2017. S2D also requires the Datacenter Edition of Windows, which can add significantly to the cost of your deployment. In addition, SIOS delivers the ONLY HA/DR solution for SQL Server on Linux that works both on-prem and in the cloud.

But beyond the cost and platform limitations, I think the most glaring gap comes when we start to consider disaster recovery options for your SANless cluster. Allan Hirt, SQL Server Cluster guru and fellow Microsoft Cloud and Datacenter Management MVP, recently posted about this S2D limitation. In his article Revisiting Storage Spaces Direct and SQL Server FCIs, Allan points out that because S2D clusters cannot be stretched across sites or included as a leg in an Always On Availability Group, the best option for DR in the S2D scenario is log shipping!

Don’t get me wrong, log shipping has been around forever and will probably be around long after I’m gone, but that is taking a HUGE step backwards when we think about all the disaster recovery solutions we have become accustomed to, like multi-site clusters, Availability Groups, etc.

In contrast, the SIOS DataKeeper solution fully supports Always On Availability Groups, and better yet, it allows you to stretch your FCI across sites to give you the best HA/DR solution you could hope to achieve in terms of RTO/RPO. In an Azure environment, DataKeeper also supports Azure Site Recovery (ASR), giving you even more options for disaster recovery.

The rest of this chart is pretty self-explanatory. It basically consists of a list of hardware, storage and networking requirements that must be met before you can deploy an S2D cluster. An exhaustive list of S2D requirements is maintained at https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/storage-spaces-direct-hardware-requirements

The SIOS DataKeeper solution is much more lenient. It supports any locally attached storage and as long as the hardware passes cluster validation, it is a supported cluster configuration. The block level replication solution has been working great ever since 1 Gbps was considered a fast LAN and a T1 WAN connection was considered a luxury.

SANless clustering is particularly interesting for cloud deployments. The cloud does not offer traditional shared storage options for clusters. So users in the middle of a “lift and shift” to the cloud who want to take their clusters with them must look at alternate storage solutions. For cloud deployments, SIOS is certified for Azure, AWS and Google and is available in the relevant cloud marketplace. While there doesn’t appear to be anything blocking deployment of S2D based clusters in Azure or Google, there is a conspicuous lack of documentation or supportability statements from Microsoft for those platforms.

SIOS DataKeeper has been doing this since 1999. SIOS has heard all the feature requests, uncovered all the bugs, and has a rock solid solution for SANless clusters that is time tested and proven. While Microsoft S2D is a promising technology, as a 1st generation product I would wait until the dust settles and some of the feature gap closes before I would consider it for my business critical applications.


How to Build a SANless SQL Server Failover Cluster Instance in Google Cloud Platform

If you are going to host SQL Server on the Google Cloud Platform (GCP) you will want to make sure it is highly available. One of the best and most economical ways to do that is to build a SQL Server Failover Cluster Instance (FCI). Since SQL Server Standard Edition supports Failover Clustering, we can avoid the cost associated with SQL Server Enterprise Edition which is required for Always On Availability Groups. In addition, SQL Server Failover Clustering is a much more robust solution as it protects the entire instance of SQL Server, has no limitations in terms of DTC (Distributed Transaction Coordinator) support and is easier to manage. Plus, it supports earlier versions of SQL Server that you may still have, such as SQL 2012 through the latest SQL 2017. Unfortunately, SQL 2008 R2 is not supported due to the lack of support for cross-subnet failover.

Traditionally, SQL Server FCI requires that you have a SAN or some type of shared storage device. In the cloud, there is no cluster-aware shared storage. In place of a SAN, we will build a SANless cluster using SIOS DataKeeper Cluster Edition (DKCE). DKCE uses block-level replication to ensure that the locally attached storage on each instance remains in sync with one another. It also integrates with Windows Server Failover Clustering through its own storage class resource called a DataKeeper Volume, which takes the place of the physical disk resource. As far as the cluster is concerned, the SIOS DataKeeper volume looks like a physical disk, but instead of controlling SCSI reservations, it controls the mirror direction, ensuring that only the active server writes to the disk and that the passive server(s) receive all the changes either synchronously or asynchronously.
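To make the idea of "controlling the mirror direction" a little more concrete, here is a toy Python model. It is only a sketch of the concept: the class, method and node names are made up for illustration and have nothing to do with DataKeeper's actual implementation or API.

```python
class MirroredVolume:
    """Toy model of a two-node mirrored volume (hypothetical, not DataKeeper's API)."""

    def __init__(self, source, target, synchronous=True):
        self.source = source            # node currently allowed to write
        self.target = target            # node receiving replicated blocks
        self.synchronous = synchronous
        self.blocks = {source: {}, target: {}}
        self.pending = []               # writes queued for an asynchronous target

    def write(self, node, block_no, data):
        # Only the mirror source may write; the target stays write-protected.
        if node != self.source:
            raise PermissionError(f"{node} is the mirror target and is write-protected")
        self.blocks[self.source][block_no] = data
        if self.synchronous:
            # Synchronous: the target holds the block before the write completes.
            self.blocks[self.target][block_no] = data
        else:
            # Asynchronous: acknowledge now, ship the block later.
            self.pending.append((block_no, data))

    def drain(self):
        # Deliver any queued asynchronous writes to the target.
        for block_no, data in self.pending:
            self.blocks[self.target][block_no] = data
        self.pending.clear()

    def switchover(self):
        # Failover: flush outstanding writes, then reverse the mirror direction.
        self.drain()
        self.source, self.target = self.target, self.source


mirror = MirroredVolume("node1", "node2", synchronous=True)
mirror.write("node1", 0, b"page-0")
mirror.switchover()                     # node2 becomes the mirror source
mirror.write("node2", 1, b"page-1")
print(mirror.source, sorted(mirror.blocks["node1"]))
```

The point of the model is that a failover does not move a disk between servers. It only swaps which side is the source, which is the job the DataKeeper Volume resource performs in place of a SCSI reservation.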

In this guide, we will walk through the steps to build a two-node failover cluster between two instances in the same region, but in different Zones, within the GCP as shown in Figure 1.

Figure 1 – Google Cloud diagram

Download the entire white paper at https://us.sios.com/san-sanless-clusters-resources/white-paper-build-sql-server-failover-cluster-gcp/


Deploying a Highly Available File Server in Azure IaaS (ARM) with SIOS DataKeeper

In this post we will detail the specific steps required to deploy a 2-node File Server Failover Cluster in a single region of Azure using Azure Resource Manager. I will assume you are familiar with basic Azure concepts as well as basic Failover Cluster concepts and will focus this article on what is unique about deploying a File Server Failover Cluster in Azure.

With DataKeeper Cluster Edition you are able to take the locally attached storage, whether it is Premium or Standard Disks, and replicate those disks either synchronously, asynchronously or a mix of both, between two or more cluster nodes. In addition, a DataKeeper Volume resource is registered in Windows Server Failover Clustering which takes the place of a Physical Disk resource. Instead of controlling SCSI-3 reservations like a Physical Disk Resource, the DataKeeper Volume controls the mirror direction, ensuring the active node is always the source of the mirror. As far as Failover Clustering is concerned, it looks, feels and smells like a Physical Disk and is used the same way a Physical Disk Resource would be used.


  • You have used the Azure Portal before and are comfortable deploying virtual machines in Azure IaaS.
  • You have obtained a license or eval license of SIOS DataKeeper.

Deploying a File Server Failover Cluster Instance using the Azure Portal

To build a 2-node File Server Failover Cluster Instance in Azure, we are going to assume you have a basic Virtual Network based on Azure Resource Manager and you have at least one virtual machine up and running and configured as a Domain Controller. Once you have a Virtual Network and a Domain configured, you are going to provision two new virtual machines which will act as the two nodes in our cluster.

Our environment will look like this:

DC1 – Our Domain Controller and File Share Witness
SQL1 and SQL2 – The two nodes of our File Server Cluster

Provisioning the two cluster nodes (SQL1 and SQL2)

Using the Azure Portal, we will provision both SQL1 and SQL2 exactly the same way. There are numerous options to choose from including instance size, storage options, etc. This guide is not meant to be an exhaustive guide to deploying Servers in Azure as there are some really good resources out there and more published every day. However, there are a few key things to keep in mind when creating your instances, especially in a clustered environment.

Availability Set – It is important that SQL1, SQL2 AND DC1 all reside in the same Availability Set. By putting them in the same Availability Set we are ensuring that each cluster node and the file share witness reside in a different Fault Domain and Update Domain. This helps guarantee that during both planned and unplanned maintenance the cluster will continue to maintain quorum and avoid downtime.

Figure 3 – Be sure to add both cluster nodes and the file share witness to the same Availability Set

Static IP Address

Once each VM is provisioned, you will want to go into its settings and change the IP address to Static. We do not want the IP address of our cluster nodes to change.

Figure 4 – Make sure each cluster node uses a static IP


As far as Storage is concerned, you will want to consult Performance best practices for SQL Server in Azure Virtual Machines. In any case, you will need to add at least one additional disk to each of your cluster nodes. DataKeeper can use Basic Disk, Premium Storage or even Storage Pools consisting of multiple disks. Just be sure to add the same amount of storage to each cluster node and configure it identically. Also, be sure to use a different storage account for each virtual machine to ensure that a problem with one Storage Account does not impact both virtual machines at the same time.

Figure 5 – make sure to add additional storage to each cluster node

Create the Cluster

Assuming both cluster nodes (SQL1 and SQL2) have been provisioned as described above and added to your existing domain, we are ready to create the cluster. Before we create the cluster, there are a few Features that need to be enabled. These features are .Net Framework 3.5 and Failover Clustering, and they need to be enabled on both cluster nodes. You will also need to enable the File Server Role.

Figure 6 – enable both .Net Framework 3.5 and Failover Clustering features and the File Server on both cluster nodes

Once that role and those features have been enabled, you are ready to build your cluster. Most of the steps I’m about to show you can be performed both via PowerShell and the GUI. However, I’m going to recommend that for this very first step you use PowerShell to create your cluster. If you choose to use the Failover Cluster Manager GUI to create the cluster you will find that you wind up with the cluster being issued a duplicate IP address.

Without going into great detail, what you will find is that Azure VMs have to use DHCP. By specifying a “Static IP” when we create the VM in the Azure portal all we did was create sort of a DHCP reservation. It is not exactly a DHCP reservation because a true DHCP reservation would remove that IP address from the DHCP pool. Instead, specifying a Static IP in the Azure portal simply means that if that IP address is still available when the VM requests it, Azure will issue that IP to it. However, if your VM is offline and another host comes online in that same subnet it very well could be issued that same IP address.

There is another strange side effect to the way Azure has implemented DHCP. When creating a cluster with the Windows Server Failover Cluster GUI when hosts use DHCP (which they have to), there is no option to specify a cluster IP address. Instead it relies on DHCP to obtain an address. The strange thing is, DHCP will issue a duplicate IP address, usually the same IP address as the host requesting a new IP address. Cluster creation will usually complete, but you may have some strange errors and you may need to run the Windows Server Failover Cluster GUI from a different node in order to get it to run. Once you get it to run you will want to change the cluster IP address to an address that is not currently in use on the network.

You can avoid that whole mess by simply creating the cluster via PowerShell and specifying the cluster IP address as part of the PowerShell command to create the cluster.

You can create the cluster using the New-Cluster command as follows:

New-Cluster -Name cluster1 -Node sql1,sql2 -StaticAddress <unused IP address> -NoStorage
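The address you pass to -StaticAddress has to be one that nothing else on the subnet is using. As a rough planning aid, here is a small Python sketch that picks the first candidate from a subnet given the addresses you already know about. The subnet and node addresses are hypothetical, and skipping the first three host addresses reflects the addresses Azure reserves at the start of every subnet. A list like this only tells you what you know is taken, so still confirm in the Azure portal that the address is really free.

```python
import ipaddress


def first_free_address(subnet, in_use, skip_first=3):
    """Return the first host address in `subnet` that is not listed in `in_use`.

    skip_first: number of leading host addresses to skip (Azure reserves
    .1 through .3 in each subnet for the gateway and DNS mapping).
    """
    network = ipaddress.ip_network(subnet)
    taken = {ipaddress.ip_address(addr) for addr in in_use}
    for position, host in enumerate(network.hosts()):
        if position < skip_first:
            continue
        if host not in taken:
            return str(host)
    raise ValueError(f"no free address left in {subnet}")


# Hypothetical addresses for DC1, SQL1 and SQL2
node_ips = ["10.0.0.4", "10.0.0.5", "10.0.0.6"]
cluster_ip = first_free_address("10.0.0.0/24", node_ips)
print(cluster_ip)
```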

After the cluster creation completes, you will also want to run cluster validation with the Test-Cluster cmdlet.


Figure 7 – The output of the cluster creation and the cluster validation commands

Create File Share Witness

Because there is no shared storage, you will need to create a file share witness on another server in the same Availability Set as the two cluster nodes. By putting it in the same availability set you can be sure that you only lose one vote from your quorum at any given time. If you are unsure how to create a File Share Witness you can review this article http://www.howtonetworking.com/server/cluster12.htm. In my demo I put the file share witness on the domain controller. I have published an exhaustive explanation of cluster quorums at https://blogs.msdn.microsoft.com/microsoft_press/2014/04/28/from-the-mvps-understanding-the-windows-server-failover-cluster-quorum-in-windows-server-2012-r2/

Install DataKeeper

After the cluster is created it is time to install DataKeeper. It is important to install DataKeeper after the initial cluster is created so the custom cluster resource type can be registered with the cluster. If you installed DataKeeper before the cluster was created, you will simply need to run the install again and do a repair installation.

Figure 8 – Install DataKeeper after the cluster is created

During the installation you can take all of the default options.  The service account you use must be a domain account and be in the local administrators group on each node in the cluster.

Figure 9 – the service account must be a domain account that is in the Local Admins group on each node

Once DataKeeper is installed and licensed on each node you will need to reboot the servers.

Create the DataKeeper Volume Resource

To create the DataKeeper Volume Resource you will need to start the DataKeeper UI and connect to both of the servers.
Connect to SQL1

Connect to SQL2

Once you are connected to each server, you are ready to create your DataKeeper Volume. Right click on Jobs and choose “Create Job”

Give the Job a name and description.

Choose your source server, IP and volume. The IP address determines which network the replication traffic will travel on.

Choose your target server.

Choose your options. For our purposes where the two VMs are in the same geographic region we will choose synchronous replication. For longer distance replication you will want to use asynchronous and enable some compression.

By clicking yes at the last pop-up you will register a new DataKeeper Volume Resource in Available Storage in Failover Clustering.

You will see the new DataKeeper Volume Resource in Available Storage.

Create the File Server Cluster Resource

To create the File Server Cluster Resource we will use PowerShell once again rather than the Failover Cluster interface. Once again, because the virtual machines are configured to use DHCP, the GUI-based wizard will not prompt us to enter a cluster IP address and instead will issue a duplicate IP address. To avoid this we will use a simple PowerShell command to create the File Server Cluster Resource and specify the IP address.

Add-ClusterFileServerRole -Storage "DataKeeper Volume E" -Name FS2 -StaticAddress <unused IP address>

Make note of the IP address you specify here. It must be a unique IP address on your network. We will use this same IP address later when we create our Internal Load Balancer.

Create the Internal Load Balancer

Here is where failover clustering in Azure differs from traditional infrastructures. The Azure network stack does not support gratuitous ARPs, so clients cannot connect directly to the cluster IP address. Instead, clients connect to an internal load balancer and are redirected to the active cluster node. What we need to do is create an internal load balancer. This can all be done through the Azure Portal as shown below.

First, create a new Load Balancer

You can use a Public Load Balancer if your clients connect over the public internet, but assuming your clients reside in the same vNet, we will create an Internal Load Balancer. The important thing to take note of here is that the Virtual Network is the same as the network where your cluster nodes reside. Also, the Private IP address that you specify will be EXACTLY the same as the address you used to create the File Server Cluster Resource.

After the Internal Load Balancer (ILB) is created, you will need to edit it. The first thing we will do is to add a backend pool. Through this process you will choose the Availability Set where your cluster VMs reside. However, when you choose the actual VMs to add to the Backend Pool, be sure you do not choose your file share witness. We do not want to redirect SMB traffic to your file share witness.

The next thing we will do is add a Probe. The probe we add will probe Port 59999. This probe determines which node is active in our cluster.
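If you want to see for yourself which node is answering the probe, you can mimic it with a plain TCP connection attempt. The Python below is only a sketch under that assumption. The helper names are mine, and it assumes (as I understand the ProbePort setting) that only the node currently hosting the cluster IP resource listens on the probe port, so a successful connect indicates the active node.

```python
import socket


def probe(host, port=59999, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def active_nodes(nodes, port=59999, timeout=1.0):
    """Return the subset of `nodes` that currently answer on the probe port."""
    return [node for node in nodes if probe(node, port, timeout)]
```

For example, calling active_nodes(["sql1", "sql2"]) from another VM in the same vNet should list just one node once the cluster IP resource has been configured with the probe port. If it lists none, the ILB has nowhere to send traffic.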

And then finally, we need a load balancing rule to redirect the SMB traffic on TCP port 445. The important thing to notice in the screenshot below is that Direct Server Return is Enabled. Make sure you make that change.


Fix the File Server IP Resource

The final step in the configuration is to run the following PowerShell script on one of your cluster nodes. This will allow the Cluster IP Address to respond to the ILB probes and ensure that there is no IP address conflict between the Cluster IP Address and the ILB. Please take note: you will need to edit this script to fit your environment. The subnet mask is set to 255.255.255.255. This is not a mistake, so leave it as is. It creates a host-specific route to avoid IP address conflicts with the ILB.

# Define variables
$ClusterNetworkName = ""
# the cluster network name (use Get-ClusterNetwork on Windows Server 2012 or higher to find the name)
$IPResourceName = ""
# the IP Address resource name
$ILBIP = ""
# the IP Address of the Internal Load Balancer (ILB)
Import-Module FailoverClusters
# If you are using Windows Server 2012 or higher:
Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{Address=$ILBIP;ProbePort=59999;SubnetMask="255.255.255.255";Network=$ClusterNetworkName;EnableDhcp=0}
# If you are using Windows Server 2008 R2 use this:
#cluster res $IPResourceName /priv enabledhcp=0 address=$ILBIP probeport=59999 subnetmask=255.255.255.255
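If the all-ones subnet mask looks odd, a quick check with Python's ipaddress module shows why it is called host specific: a 255.255.255.255 mask covers exactly one address, where a normal subnet mask covers the whole subnet. The addresses here are hypothetical placeholders, not values from my environment.

```python
import ipaddress

ilb_ip = "10.0.0.100"  # hypothetical ILB / cluster IP address

# A 255.255.255.255 mask is a /32: a "network" containing exactly one host.
host_route = ipaddress.ip_network(f"{ilb_ip}/255.255.255.255")
# A typical subnet mask for comparison.
subnet = ipaddress.ip_network("10.0.0.0/255.255.255.0")

print(host_route, host_route.num_addresses)
print(subnet, subnet.num_addresses)
print(ipaddress.ip_address("10.0.0.101") in host_route)
```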

Creating File Shares

You will find that using the File Share Wizard in Failover Cluster Manager does not work. Instead, you will simply create the file shares in Windows Explorer on the active node. Failover clustering automatically picks up those shares and puts them in the cluster.

Note that the “Continuous Availability” option of a file share is not supported in this configuration.


You should now have a functioning File Server Failover Cluster. If you have ANY problems, please reach out to me on Twitter @daveberm and I will be glad to assist. If you need a DataKeeper evaluation key fill out the form at http://us.sios.com/clustersyourway/cta/14-day-trial and SIOS will send an evaluation key out to you.


Replicating a 2-node SQL Server 2012/2014 Standard Edition Cluster to a 3rd Server for Disaster Recovery

Many people have found themselves settling for SQL Server Standard Edition due to the cost of SQL Server Enterprise Edition. SQL Server Standard Edition has many of the same features, but has a few limitations. One limitation is that it does not support AlwaysOn Availability Groups. Also, it only supports two nodes in a cluster. With Database Mirroring being deprecated and only supporting synchronous replication in Standard Edition, you really have limited disaster recovery options.

One of those options is SIOS DataKeeper Cluster Edition. DataKeeper will work with your existing shared storage cluster and allow you to extend it to a 3rd node using either synchronous or asynchronous replication. If you are using SQL Server Enterprise you can simply add that 3rd node as another cluster member and you have a true multisite cluster. However, since we are talking about SQL Server Standard Edition you can’t add a 3rd node directly to the cluster. The good news is that DataKeeper will allow you to replicate data to a 3rd node so your data is protected.

Recovery in the event of a disaster simply means you are going to use DataKeeper to bring that 3rd node online as the source of the mirror and then use SQL Server Management Studio to mount the databases that are on the replicated volumes. Your clients will also need to be redirected to this 3rd node, but it is a very cost effective solution with an excellent RPO and reasonable RTO.

The SIOS documentation talks about how to do this, but I have summarized the steps recently for one of my clients.


  • Stop the SQL Resource
  • Remove the Physical Disk Resource From The SQL Cluster Resource
  • Remove the Physical Disk from Available Storage
  • Online Physical Disk on SECONDARY server, add the drive letter (if not there)
  • Run emcmd . setconfiguration <drive letter> 256
    and Reboot Secondary Server. This will cause the SECONDARY server to block access to the E drive which is important because you don’t want two servers having access to the E drive at the same time if you can avoid it.
  • Online the disk on PRIMARY server
  • Add the Drive letter if needed
  • Create a DataKeeper Mirror from Primary to DR
    You may have to wait a minute for the E drive to appear available in the DataKeeper Server Overview Report on all the servers before you can create the mirror properly. If done properly you will create a mirror from PRIMARY to DR and as part of that process DataKeeper will ask you about the SECONDARY server which shares the volume you are replicating.

In the event of a disaster….

On DR Node

  • Run EMCMD . switchovervolume <drive letter>
  • The first time make sure the SQL Service account has read/write access to all data and log files. You WILL have to explicitly grant this access the very first time you try to mount the databases.
  • Use SQL Management Studio to mount the databases
  • Redirect all clients to the server in the DR site, or better yet have the applications that reside in the DR site pre-configured to point to the SQL Server instance in the DR site.

After disaster is over

  • Power the servers (PRIMARY, SECONDARY) in the main site back on
  • Wait for mirror to reach mirroring state
  • Determine which node was previous source (run PowerShell as an administrator)
    get-clusterresource -Name "<DataKeeper Volume Resource name>" | get-clusterparameter
  • Make sure no DataKeeper Volume Resources are online in the cluster
  • Start the DataKeeper GUI on one cluster node. Resolve any split brain conditions (most likely there are none) ensuring the DR node is selected as the source during any split-brain recovery procedures
  • On the node that was reported as the previous source run EMCMD . switchovervolume <drive letter>
  • Bring SQL Server online in Failover Cluster Manager

The above steps assume you have SIOS DataKeeper Cluster Edition installed on all three servers (PRIMARY, SECONDARY, DR). PRIMARY and SECONDARY are a two-node shared storage cluster, and you are replicating data to DR, which is just a standalone SQL Server instance (not part of the cluster) with locally attached storage. The DR Server will have a volume(s) that is the same size and drive letter as the shared cluster volume(s). This works rather well and will even let you replicate to a target that is in the cloud if you don’t have your own DR site configured.

You can also build the same configuration using all replicated storage if you want to eliminate the SAN completely.

Here is a nice short video that illustrates some of the possible configurations. http://videos.us.sios.com/medias/aula05u2fl


Highly Available SQL Server Storage Options in #Azure: SMB 3.0 File Service or Premium Storage, a look at performance differences

When looking at storage options for SQL Server deployments in Azure you have a few options as described in the article Windows Server Failover Cluster on Azure IAAS VM – Part 1 (Storage). The article also references the newly released Azure File Service which can be used to host your SQL Server cluster data over SMB 3.0. As of today the Azure File Service does not support Premium Storage, so you are limited to about 1000 IOPS or 60 MB/s per file share. With those limits in place I see Azure File Service really only being an option for databases that have minimal IO demands. We will see why that holds true based on my test results below.


I wanted to test a few configurations, so I provisioned a DS4 VM and attached some premium storage to it. I also attached a SMB 3.0 File share using Azure File Service. The storage was configured as follows:

F:\ – Three 1 TB P30 Premium Storage Disks added to a single 3TB pool

G:\ – One 1 TB P30 Premium Storage Disk (no Storage Pool)

Z:\ – SMB 3.0 File share on Azure File Services

To configure the Storage Pool for use in a cluster you have to be careful how you proceed. You either have to create the Storage Pool before you create the cluster or you have to use the PowerShell script described in Sql Alwayson with Windows 2012 R2 Storage Spaces if the cluster is already created. The pool I created used the Simple layout (striped, RAID 0) for increased performance. I’m not concerned about redundancy since the Azure storage on the backend has triple redundancy.

With three disks in the Storage Pool in a RAID 0 I expect I should get up to three times the performance of a single disk. Adding even more disks to the pool should give me even more performance. A single P30 disk gives me 5000 IOPS and 200 MB/s, so for my pool I should expect up to 15000 IOPS and 600 MB/s throughput.
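That expectation is just multiplication, but a tiny Python helper makes it easy to play with other pool sizes. Treat it as a sketch of the theoretical ceiling only. The function name is mine, the per-disk numbers are the P30 figures quoted above, and in practice the VM size has its own IOPS and throughput caps that can come into play before the pool's limit does.

```python
def striped_pool_limits(disk_count, iops_per_disk, mbps_per_disk):
    """Theoretical ceiling for a striped (RAID 0) pool: per-disk limits times disk count."""
    return disk_count * iops_per_disk, disk_count * mbps_per_disk


P30_IOPS = 5000   # per-disk IOPS quoted in the article
P30_MBPS = 200    # per-disk MB/s quoted in the article

for disks in (1, 3, 4):
    iops, mbps = striped_pool_limits(disks, P30_IOPS, P30_MBPS)
    print(f"{disks} x P30 -> up to {iops} IOPS, {mbps} MB/s")
```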

Now that I have the storage configured, I configured Diskspd to run the same test on each of the different volumes. The parameters I used with Diskspd are as follows:

Diskspd.exe -b8K -d60 -h -L -o8 -t16 -r -w30 -c50M F:\io.dat

Diskspd.exe -b8K -d60 -h -L -o8 -t16 -r -w30 -c50M G:\io.dat

Diskspd.exe -b8K -d60 -h -L -o8 -t16 -r -w30 -c50M Z:\io.dat

The results were pretty predictable and summarized below

As you can see, while this particular job did not push the upper limits of the theoretical maximum of any of these storage solutions, the latency had a significant impact on the overall performance of this particular test. The test used 8k blocks in a mix of 30% writes and 70% reads to simulate a typical SQL Server OLTP workload.
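Why does latency matter so much when none of the volumes hit their rated limits? With a fixed number of I/Os in flight (16 threads times 8 outstanding I/Os each in the Diskspd command above), Little's Law says the achievable IOPS is roughly the number of in-flight I/Os divided by the average latency. The Python sketch below works through that arithmetic. The latency values are hypothetical round numbers for illustration, not my measured results.

```python
def estimated_iops(threads, outstanding_per_thread, avg_latency_ms):
    """Little's Law estimate: IOPS ~= I/Os in flight / average latency in seconds."""
    in_flight = threads * outstanding_per_thread
    return in_flight / (avg_latency_ms / 1000.0)


def throughput_mib_s(iops, block_kib=8):
    """Convert IOPS at a given block size (KiB) into MiB/s."""
    return iops * block_kib / 1024.0


# -t16 and -o8 from the Diskspd command line; latencies are made-up examples
for latency_ms in (2, 10, 50):
    iops = estimated_iops(threads=16, outstanding_per_thread=8, avg_latency_ms=latency_ms)
    print(f"{latency_ms:>3} ms -> ~{iops:,.0f} IOPS, ~{throughput_mib_s(iops):.0f} MiB/s")
```

The estimate is an upper bound for a given latency. Once it exceeds what the storage is rated for, the storage limit takes over instead.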

Of course the more money you want to spend the more performance you can expect to achieve. As of November 24, 2015 the best solution shown here (F:\) would cost you $1,216/month and give you full access to 3 TB of storage with unlimited reads/writes. The second best solution (G:\) would give you 1 TB of storage at 1/3 the price, $405/month. The Azure File Share is priced at $0.10/GB plus additional charges for read/write operations. You are only charged for the actual usage so estimating the actual cost will be very dependent on your usage, but before the additional charges for read/write operations you are at about 25% of the cost of Premium Storage.
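Here is the arithmetic behind that "about 25%" figure as a short Python sketch. It uses only the November 2015 prices quoted above, assumes a full 1 TB (1024 GB) of data on the file share, and deliberately leaves out the per-operation charges, since those depend entirely on your workload.

```python
PREMIUM_P30_MONTHLY = 405.00   # USD per month for one 1 TB P30 disk (from the article)
FILE_SHARE_PER_GB = 0.10       # USD per GB per month (from the article)


def file_share_monthly(gb_stored, per_gb=FILE_SHARE_PER_GB):
    """Monthly capacity cost of an Azure File share, excluding read/write operation charges."""
    return gb_stored * per_gb


share_1tb = file_share_monthly(1024)
ratio = share_1tb / PREMIUM_P30_MONTHLY
print(f"File share: ${share_1tb:.2f}/month, Premium P30: ${PREMIUM_P30_MONTHLY:.2f}/month")
print(f"File share capacity cost is about {ratio:.0%} of Premium Storage")
```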

Prices, like everything else in the Cloud, tend to change rapidly to address market demands. Have a look at https://azure.microsoft.com/en-us/pricing/details/storage/ for the latest price information.

In summary, while Azure File Services looks enticing from a price perspective, the latency at this point does not make it a viable option for any serious SQL Server workload. Instead, have a look at utilizing premium storage and either leveraging host-based replication solutions such as SIOS DataKeeper to build SQL Server Failover Cluster Instances (SQL Standard or Enterprise) or using SQL Server Enterprise Edition and AlwaysOn AG.


Configuring the SQL Server AlwaysOn ILB for the Client Listener in Azure Resource Manager (ARM) deployment model #SQLPASS

In preparation for my talk at PASS Summit this Friday, I figured I’d document something that I finally got working and that I want to talk about during my session on highly available SQL Server in Azure. Now I just need to update my slide deck and my demo environment.

In case you didn’t know, Azure has two deployment models: Resource Manager (ARM) and Classic Deployment. Classic deployment is the “old” way of doing things and ARM is the new way of doing things. There are numerous benefits to using ARM as described in the Azure article Understanding Resource Manager deployment and classic deployment. However, one of my favorite new features of ARM is the ability to have three Fault Domains per Availability Set rather than just the two Fault Domains you get with the Classic deployment model. This is a critical feature for SQL High Availability.

With three fault domains you can ensure that each cluster node in a two node cluster and the file share witness all reside in different fault domains. This eliminates the possibility that the failure of a single Fault Domain would impact more than one quorum vote in your cluster. In the Classic deployment model with two fault domains you could only put two cluster nodes in an availability set. For maximum availability you really needed to put your file share witness in a different geographic location. If you kept it in the same geographic location, there was no guarantee that it wouldn’t wind up in the same fault domain as one of your cluster nodes, meaning that the failure of a single fault domain could impact 2 out of your 3 quorum votes and bring down your entire cluster. ARM’s three Fault Domains eliminate that possibility.
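The quorum math is easy to see in a few lines of PowerShell. This is only a toy model: Test-QuorumSurvives is a hypothetical helper I wrote for illustration, it counts simple majority of votes, and it ignores anything the cluster might do dynamically to adjust votes.

```powershell
# Returns $true if a strict majority of quorum votes survives
# the failure of the given fault domain.
function Test-QuorumSurvives {
    param(
        [hashtable]$VotePlacement,   # vote name -> fault domain number
        [int]$FailedDomain
    )
    $total     = $VotePlacement.Count
    $surviving = @($VotePlacement.Values | Where-Object { $_ -ne $FailedDomain }).Count
    return ($surviving -gt ($total / 2))
}

# ARM: three fault domains, one vote in each
$arm = @{ SQL1 = 0; SQL2 = 1; Witness = 2 }

# Classic worst case: the witness lands in the same fault domain as SQL1
$classic = @{ SQL1 = 0; SQL2 = 1; Witness = 0 }

# Any single fault domain can fail in the ARM layout and 2 of 3 votes remain
0..2 | ForEach-Object { Test-QuorumSurvives -VotePlacement $arm -FailedDomain $_ }

# Losing fault domain 0 in the Classic layout leaves only 1 of 3 votes
Test-QuorumSurvives -VotePlacement $classic -FailedDomain 0
```

With one vote per fault domain, no single failure can take out more than one vote, which is exactly what the third fault domain buys you.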

ARM is definitely the way to go as new Azure features are only being introduced in ARM. However, the documentation is light and some features are not quite there yet, including such things as documented support for ExpressRoute. Both of these problems get better almost daily, but early adopters really have to work extra hard until Azure catches up. One other issue is that you can’t mix Classic and ARM deployments, so if you started down the road with Classic deployments you are basically going to have to start from the ground up with Resource Manager when you make the switch. If you can manage it, a little pain now will help you avoid a larger headache next year when you find that you want some new feature only available in ARM.

I hope this article helps you with at least one aspect of your ARM deployment: getting highly available SQL Server deployed. As I have documented in earlier articles, deploying both AlwaysOn Availability Groups and AlwaysOn Failover Cluster Instances in Azure “Classic” requires the use of an Azure Load Balancer (internal or external) for client redirection. Getting that configured in Classic Azure is not exactly straightforward, but it is documented well enough that any administrator reasonably familiar with Azure, Failover Clustering, SQL Server and PowerShell can get it to work.

AlwaysOn Availability Groups and AlwaysOn Failover Cluster Instances using the ARM deployment model still require the use of an Azure Load Balancer for client redirection. However, the steps for creating and configuring that load balancer are completely different and, as of today, not documented very well. In this article I’m going to highlight the steps required to configure the ILB and update the SQL cluster IP Resource. In the next article I will walk you through the entire process step by step from the ground up, from creating the vNet to installing SQL and creating the cluster.

Before we begin, I’m making the following assumptions:

  • You have created a vNet using ARM
  • You have provisioned 3 ARM based VMs (DC, SQL1, SQL2)
  • You put DC, SQL1 and SQL2 in the same Availability Set and Resource Group
  • You have created a cluster with SQL1 and SQL2 and used the DC for the file share witness
  • You have either created an AlwaysOn Availability Group or an AlwaysOn Failover Cluster Instance with SIOS DataKeeper Cluster Edition. In either case you will wind up with a client listener, consisting of a name resource and an IP resource. The AlwaysOn AG and FCI configuration up to the point of creating the load balancer is exactly the same as it is in the Azure Classic deployment model and is documented on the web in many places, including my own blog post.

Now that you have a fully configured AlwaysOn AG or FCI, you will probably notice that you can’t connect to the cluster name from any server other than the node that currently hosts the SQL cluster name resource. I’ve been told that this is because Azure networking does not support gratuitous ARPs, so clients can’t communicate directly with the cluster IP address. Instead, the clients need to communicate with the ILB and the ILB will redirect traffic to the active node. So step 1 is to create the ILB. As of now this can’t be done through the Azure Portal, so we will use the following Azure PowerShell commands.

[1/6/2016 Update – The directions below assume you are using Azure PowerShell pre-version 1. The script if you are using Azure PowerShell Version 1 or later is detailed in my blog post here.]

Switch-AzureMode -Name AzureResourceManager

Select-AzureSubscription -SubscriptionName "MSDN Azure"
# name whichever subscription you used to create your vNet and VMs

#Declare your variables using values relevant to your deployment

$ResourceGroupName ='SIOS-EAST-RG'
# Resource Group Name in which the SQL nodes are deployed

$FrontEndConfigurationName ='FE'
#Call it whatever you like

$BackendConfiguratioName ='BE'
#Call it whatever you like

$LoadBalancerName ='ILB'
#Provide a name for the Internal Load Balancer object

$Location ='eastus2'
# Input the data center location of your SQL VMs

$subname ='PUBLIC'
# Provide the Subnet name in which the SQL Nodes are placed

$ILBIP = ''
# Provide the IP address for the Listener or Load Balancer

$subnet = Get-AzureVirtualNetwork -ResourceGroupName $ResourceGroupName | Get-AzureVirtualNetworkSubnetConfig -Name $subname

$FEConfig = New-AzureLoadBalancerFrontendIpConfig -Name $FrontEndConfigurationName -PrivateIpAddress $ILBIP -SubnetId $subnet.Id

$BackendConfig = New-AzureLoadBalancerBackendAddressPoolConfig -Name $BackendConfiguratioName

#create the ILB
New-AzureLoadBalancer -Name $LoadBalancerName -ResourceGroupName $ResourceGroupName -Location $Location -FrontendIpConfiguration $FEConfig -BackendAddressPool $BackendConfig


Now that the ILB is created, we should see it in the Azure Portal if we list all the objects in our Resource Group as shown below.

The rest of the configuration I’m sure can also be done through PowerShell, but I’m going to use the GUI in my example. If you want to use PowerShell you could probably piece together the script by looking at the article Get started configuring internal load balancer using Azure Resource Manager but honestly that article gives me a headache. I’ll figure it out some day and try to document it in a user friendly format, but for now I think the GUI is fine for the next steps.

Follow along with the screen shots below. If you get lost, follow the navigation hints at the top of the Azure Portal to figure out where we are.

Click the Backend Pool settings tab and select the backend pool, then update it with the Availability Set and Virtual Machines. Save your changes.

Configure Load Balancer’s Probe by clicking Add on the Probe tab. Give the probe a name and configure it to use TCP Port 59999. I have left the probe interval and the unhealthy threshold set to the default settings, which means it will take 10 seconds before the ILB removes the passive node from the list of active nodes after a failover, meaning your clients may take up to 10 seconds to be redirected to the new active node. Be sure to save your changes.
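The 10 second figure is just the probe interval multiplied by the unhealthy threshold. The sketch below assumes the defaults are a 5 second interval and a threshold of 2 consecutive failures, which is consistent with the 10 seconds above, but check the values shown on your own probe blade.

```powershell
# Assumed portal defaults; verify against your own probe settings
$probeIntervalSeconds = 5   # how often the ILB probes TCP 59999 on each node
$unhealthyThreshold   = 2   # consecutive failed probes before a node is removed

# Approximate worst-case time before the ILB stops sending traffic to the old node
$detectionSeconds = $probeIntervalSeconds * $unhealthyThreshold
"Clients may wait up to about $detectionSeconds seconds to be redirected"
```

Lowering either value shortens the redirect time after a failover, at the cost of more probe traffic and a greater chance of a transient blip marking a healthy node as down.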

Navigate to the Load Balancing Rule Tab and add a new rule. Give the rule a sensible name (SQL1433 or something) and choose TCP protocol port 1433 (assuming you are using the default instance of SQL Server). Choose 1433 for the Backend port as well. For the Backend Pool we will choose the Backend Pool we created earlier (BE) and for the Probe we will also choose the Probe we created earlier. We do not want to enable Session persistence but we do want to enable Floating IP (Direct Server Return). I have left the idle timeout set to the default setting, but you might want to consider increasing that to the maximum value as I have seen some applications such as SAP log error messages each time the connection is dropped and needs to be re-established.

At this point the ILB is configured and there is only one final step that needs to take place. We need to update the SQL IP Cluster Resource exactly the same way we had to in the Classic deployment model. To do that you will need to run the following PowerShell script on just one of the cluster nodes. And make note, SubnetMask=“255.255.255.255” is not a mistake; use the 32 bit mask regardless of what your actual subnet mask is.

# This script should be run on the primary cluster node after the internal load balancer is created
# Define variables

$ClusterNetworkName = "Cluster Network 1"
# the cluster network name

$IPResourceName = "SQL IP Address 1 (SQLCluster1)"
# the IP Address resource name

$CloudServiceIP = ""
# IP address of your Internal Load Balancer

Import-Module FailoverClusters

# If you are using Windows 2012 or higher, use the Get-ClusterResource command. If you are using Windows 2008 R2, use the cluster res command which is commented out.

Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{"Address"="$CloudServiceIP";"ProbePort"="59999";"SubnetMask"="255.255.255.255";"Network"="$ClusterNetworkName";"OverrideAddressMatch"=1;"EnableDhcp"=0}

# cluster res $IPResourceName /priv enabledhcp=0 overrideaddressmatch=1 address=$CloudServiceIP probeport=59999 subnetmask=255.255.255.255
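After running the script it is worth confirming that the settings stuck and that the probe port is answering. The following is a rough sketch of how I would check, not part of the original procedure. It assumes the same resource name as above and the SQL1 and SQL2 node names from my assumptions list. Keep in mind the new parameters do not take effect until the IP resource has been taken offline and brought back online, which briefly takes SQL Server offline as well.

```powershell
Import-Module FailoverClusters

# Same resource name used in the script above
$IPResourceName = "SQL IP Address 1 (SQLCluster1)"

# List the private properties to confirm Address, ProbePort and SubnetMask
Get-ClusterResource $IPResourceName | Get-ClusterParameter

# Cycle the IP resource so the new values take effect.
# Dependent resources (network name, SQL Server) go offline too,
# so bring the whole group back online afterwards.
$group = (Get-ClusterResource $IPResourceName).OwnerGroup
Stop-ClusterResource $IPResourceName
Start-ClusterGroup -Name $group.Name

# From another machine on the vNet: the probe port should only
# accept connections on the node that currently owns the IP resource.
Test-NetConnection -ComputerName SQL1 -Port 59999
Test-NetConnection -ComputerName SQL2 -Port 59999
```

If TcpTestSucceeded comes back False for both nodes, look at the Windows Firewall and the ProbePort value before blaming the ILB.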

I have just one final note. In my initial test I still was not able to connect to the SQL Resource name even after I completed all of the above steps. After banging my head against the wall for a few hours I discovered that for some reason the SQL Cluster Name Resource was not registered in DNS. I'm not sure how that happened or whether it will happen consistently, but if you are having trouble connecting I would definitely check DNS and add the SQL cluster name and IP address as a new A record if it is not already in there.
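If you want to check DNS from PowerShell rather than the DNS console, something like the following should do it. Treat it as a sketch: the zone name and IP address below are placeholders I made up, and the listener name is taken from the resource name used earlier. Add-DnsServerResourceRecordA needs to run on the DNS server (the DC in this setup) where the DnsServer module is available.

```powershell
# Hypothetical values: replace with your listener name, AD DNS zone and ILB address
$ListenerName = 'SQLCluster1'
$ZoneName     = 'contoso.local'
$ListenerIP   = '10.0.0.200'

# Returns nothing (instead of throwing) if the name cannot be resolved
$record = Resolve-DnsName -Name "$ListenerName.$ZoneName" -Type A -ErrorAction SilentlyContinue

if (-not $record) {
    # No A record found, so create one that points at the ILB address
    Add-DnsServerResourceRecordA -Name $ListenerName -ZoneName $ZoneName -IPv4Address $ListenerIP
}
```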

And of course don't forget the good ole Windows Firewall. You will have to make exceptions for 1433 and 59999 or just turn it off until you get everything configured properly like I did. You probably want to leverage Azure Network Security Groups anyway instead of the local Windows Firewall for a more unified experience across all your Azure resources.
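If you would rather leave the Windows Firewall on, a minimal sketch of the two exceptions looks like this. The display names are arbitrary labels of my choosing, and the rules need to be created on both SQL1 and SQL2.

```powershell
# Allow client connections to the default SQL Server instance
New-NetFirewallRule -DisplayName 'SQL Server 1433' -Direction Inbound -Protocol TCP -LocalPort 1433 -Action Allow

# Allow the ILB health probe to reach the cluster IP resource's probe port
New-NetFirewallRule -DisplayName 'ILB Probe 59999' -Direction Inbound -Protocol TCP -LocalPort 59999 -Action Allow
```

If you are using a named instance or a non-default port, adjust the first rule to match whatever port you used in the load balancing rule.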

Good luck and let me know how you make out.
