Step-by-Step: iSCSI Target Server Cluster in Azure

I recently helped someone build an iSCSI target server cluster in Azure and realized that I never wrote a step-by-step guide for that particular configuration. So to remedy that, here are the step-by-step instructions in case you need to do this yourself.


I’m going to assume you are fairly familiar with Azure and Windows Server, so I’m going to spare you some of the details. Let’s assume you have at least done the following as prerequisites:

  • Provisioned two servers (SQL1, SQL2), each in a different Availability Zone (an Availability Set is also possible, but Availability Zones have a better SLA)
  • Assigned static IP addresses to them through the Azure portal
  • Joined the servers to an existing domain
  • Enabled the Failover Clustering feature and the iSCSI Target Server feature on both nodes
  • Added three Azure Premium Disks to each node.
    NOTE: this is optional; one disk is the minimum required. For increased IOPS we are going to stripe three Azure Premium Disks together in a storage pool and create a simple (RAID 0) virtual disk
  • Obtained SIOS DataKeeper, which is going to be used to provide the replicated storage used in the cluster. If you need DataKeeper you can request a trial here.
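
If the two Windows features from the list above are not enabled yet, they can also be installed from PowerShell. This is a minimal sketch, assuming the node names SQL1 and SQL2 used in this guide, that PowerShell remoting is enabled, and that you run it with an account that is a local administrator on both nodes.

```powershell
# Install the Failover Clustering and iSCSI Target Server features on both nodes.
# Assumes PowerShell remoting is enabled and the node names match your environment.
Invoke-Command -ComputerName sql1, sql2 -ScriptBlock {
    Install-WindowsFeature -Name Failover-Clustering, FS-iSCSITarget-Server -IncludeManagementTools
}
```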

Create local Storage Pool

Once again, this step is completely optional, but for increased IOPS we are going to stripe together three Azure Premium Disks into a single Storage Pool. You might be tempted to use Dynamic Disk and a spanned volume instead, but don’t do that! If you use dynamic disks you will find out that there is some general incompatibility that will prevent you from creating iSCSI targets later.

Don’t worry, creating a local Storage Pool is pretty straightforward if you are aware of the pitfalls you might encounter, as described below. The official documentation can be found here.

Pitfall #1 – although the documentation says the minimum size for a volume to be used in a storage pool is 4 GB, I found that the P1 Premium Disk (4GB) was NOT recognized. So in my lab I used 16GB P3 Premium Disks.

Pitfall #2 – you HAVE to have at least three disks to create a Storage Pool.

Pitfall #3 – create your Storage Pool before you create your cluster. If you try to do it after you create your cluster you are going to wind up with a big mess as Microsoft tries to create a clustered storage pool for you. We are NOT going to create a clustered storage pool, so avoid that mess by creating your Storage Pool before you create the cluster. If you have to add a Storage Pool after the cluster is created you will first have to evict the node from the cluster, then create the Storage Pool.

Based on the documentation found here, below are the screenshots that represent what you should see when you build your local Storage Pool on each of the two cluster nodes. Complete these steps on both servers BEFORE you build the cluster.

You should see the Primordial pool on both servers.
Right-click and choose New Storage Pool…
Choose Create a virtual disk when this wizard closes
Notice here you could create storage tiers if you decided to use a combination of Standard, Premium and Ultra SSD
For best performance use Simple storage layout (RAID 0). Don’t be concerned about reliability since Azure Managed Disks have triple redundancy on the backend. Simple is required for optimal performance.
For performance purposes use Fixed provisioning. You are already paying for the full Premium disk anyway, so there is no reason not to use it all.
Now you will have a 45 GB X drive on your first node. Repeat this entire process for the second node.
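
The wizard steps above can also be approximated in PowerShell. This is a rough sketch rather than the exact procedure shown in the screenshots: the pool name, virtual disk name and volume label are hypothetical, and it assumes the only poolable disks on the node are the Premium Disks you added.

```powershell
# Run on each node BEFORE creating the cluster.
# "LocalPool", "StripedDisk" and the "Data" label are hypothetical names.
$disks = Get-PhysicalDisk -CanPool $true    # disks still in the Primordial pool

New-StoragePool -FriendlyName "LocalPool" `
    -StorageSubSystemFriendlyName "Windows Storage*" `
    -PhysicalDisks $disks

# Simple layout = RAID 0 striping; Fixed provisioning allocates all capacity up front.
New-VirtualDisk -StoragePoolFriendlyName "LocalPool" -FriendlyName "StripedDisk" `
    -ResiliencySettingName Simple -ProvisioningType Fixed `
    -NumberOfColumns @($disks).Count -UseMaximumSize

# Initialize, partition and format the new virtual disk as drive X.
Get-VirtualDisk -FriendlyName "StripedDisk" | Get-Disk |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -DriveLetter X -UseMaximumSize |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "Data" -Confirm:$false
```

Setting the number of columns to the number of disks stripes each write across all of them, which is where the extra IOPS come from.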

Create your Cluster

Now that each server has its own 45 GB X drive, we are going to create the basic cluster. Creating a cluster in Azure is best done via PowerShell so that we can specify a static IP address. If you do it through the GUI you will soon realize that Azure assigns your cluster IP a duplicate IP address that you will have to clean up, so don’t do that!

Here is example PowerShell code to create a new cluster. Replace the placeholder with an unused IP address from your subnet.

 New-Cluster -Name mycluster -NoStorage -StaticAddress <unused static IP address> -Node sql1, sql2

The output will look something like this.

PS C:\Users\dave.DATAKEEPER> New-Cluster -Name mycluster -NoStorage -StaticAddress -Node sql1, sql2
WARNING: There were issues while creating the clustered role that may prevent it from starting. For more information view the report file below.
WARNING: Report file location: C:\windows\cluster\Reports\Create Cluster Wizard mycluster on 2020.05.20 At 16.54.45.htm


The warning in the report will tell you that there is no witness. Because there is no shared storage in this cluster you will have to create either a Cloud Witness or a File Share Witness. I’m not going to walk you through that process as it is pretty well documented at those links.

Don’t put this off, go ahead and create the witness now before you move to the next step!
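
Either witness type can be configured with Set-ClusterQuorum. The sketch below shows both options; the storage account name, access key and file share path are placeholders you would need to supply yourself.

```powershell
# Option 1: Cloud Witness (Windows Server 2016 or later).
# The account name and access key are placeholders for your own Azure storage account.
Set-ClusterQuorum -CloudWitness -AccountName "<StorageAccountName>" -AccessKey "<StorageAccountAccessKey>"

# Option 2: File Share Witness. \\fileserver\witness is a hypothetical share on a
# server outside the cluster that the cluster name object can write to.
# Set-ClusterQuorum -FileShareWitness "\\fileserver\witness"
```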

You now should have a basic 2-node cluster that looks something like this.

Configure a Load Balancer for the Cluster Core IP Address

Clusters in Azure are unique in that the Azure virtual network does not support gratuitous ARP. Don’t worry if you don’t know what that means, all you have to really know is that cluster IP addresses can’t be reached directly. Instead, you have to use an Azure Load Balancer, which redirects the client connection to the active cluster node.

There are two steps to getting a load balancer configured for a cluster in Azure. The first step is to create the load balancer. The second step is to update the cluster IP address so that it listens for the load balancer’s health probe and uses a subnet mask which enables you to avoid IP address conflicts with the ILB.

We will first create a load balancer for the cluster core IP address. Later we will edit the load balancer to also address the iSCSI cluster resource IP address that we will create at the end of this document.

Step 1 – Create a Standard Load Balancer

Notice that the static IP address we are using is the same address that we used to create the core cluster IP resource.

Once the load balancer is created you will edit the load balancer as shown below

Add the two cluster nodes to the backend pool
Add a health probe. In this example we use 59999 as the port. Remember that port, we will need it in the next step.
Create a new rule to redirect all HA ports. Make sure Floating IP is enabled.
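
If you prefer scripting to the portal, the same load balancer can be sketched with the Az PowerShell module. Treat this as an outline under assumptions, not the configuration from the screenshots: the resource group, region, virtual network, subnet and all resource names are hypothetical, and adding the two nodes’ network interfaces to the backend pool is not shown.

```powershell
# Requires the Az PowerShell module and an authenticated session (Connect-AzAccount).
# Every name and address below is a hypothetical placeholder.
$rg       = "my-resource-group"
$location = "eastus2"
$vnet     = Get-AzVirtualNetwork -ResourceGroupName $rg -Name "my-vnet"
$subnet   = Get-AzVirtualNetworkSubnetConfig -VirtualNetwork $vnet -Name "my-subnet"

# The frontend uses the same static address as the core cluster IP resource.
$frontend = New-AzLoadBalancerFrontendIpConfig -Name "cluster-core-fe" `
    -PrivateIpAddress "<cluster core IP address>" -SubnetId $subnet.Id
$pool  = New-AzLoadBalancerBackendAddressPoolConfig -Name "cluster-nodes"
$probe = New-AzLoadBalancerProbeConfig -Name "probe-59999" -Protocol Tcp -Port 59999 `
    -IntervalInSeconds 5 -ProbeCount 2

# HA ports rule: protocol All with frontend and backend port 0, Floating IP enabled.
$rule = New-AzLoadBalancerRuleConfig -Name "ha-ports" -FrontendIpConfiguration $frontend `
    -BackendAddressPool $pool -Probe $probe -Protocol All -FrontendPort 0 -BackendPort 0 `
    -EnableFloatingIP

New-AzLoadBalancer -ResourceGroupName $rg -Name "cluster-ilb" -Location $location -Sku Standard `
    -FrontendIpConfiguration $frontend -BackendAddressPool $pool -Probe $probe -LoadBalancingRule $rule
```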

Step 2 – Edit the cluster core IP address to work with the load balancer

As I mentioned earlier, there are two steps to getting the load balancer configured to work properly. Now that we have a load balancer, we have to run a PowerShell script on one of the cluster nodes. The following is an example script.

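$ClusterNetworkName = "Cluster Network 1"   # name of the cluster network (see Get-ClusterNetwork)
$IPResourceName = "Cluster IP Address"      # name of the core cluster IP resource
$ILBIP = "<load balancer frontend IP>"      # placeholder: same address as the core cluster IP resource
Import-Module FailoverClusters
# ProbePort must match the health probe port defined on the load balancer (59999 in this example)
Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{Address=$ILBIP;ProbePort=59999;SubnetMask="255.255.255.255";Network=$ClusterNetworkName;EnableDhcp=0}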

The important thing about the script above, besides getting all the variables correct for your environment, is making sure the ProbePort is set to the same port you defined in your load balancer settings for this particular IP address. You will see later that we will create a second health probe for the iSCSI cluster IP resource that will use a different port. The other important thing is making sure you leave the subnet mask set to 255.255.255.255. It may look wrong, but that is what it needs to be set to.

After you run it the output should look like this.

 PS C:\Users\dave.DATAKEEPER> $ClusterNetworkName = “Cluster Network 1” 
$IPResourceName = “Cluster IP Address” 
$ILBIP = “” 
Import-Module FailoverClusters
Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{Address=$ILBIP;ProbePort=59999;SubnetMask="";Network=$ClusterNetworkName;EnableDhcp=0}
WARNING: The properties were stored, but not all changes will take effect until Cluster IP Address is taken offline and then online again.

You will need to take the core cluster IP resource offline and bring it back online again before it will function properly with the load balancer.
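
The offline/online cycle can be done in Failover Cluster Manager or from PowerShell. This is a small sketch assuming the default resource names "Cluster IP Address" and "Cluster Name"; adjust them if yours differ.

```powershell
# Taking the core IP address offline also takes the dependent cluster name offline,
# so bring both back online afterwards. Resource names assume the defaults.
Stop-ClusterResource -Name "Cluster IP Address"
Start-ClusterResource -Name "Cluster IP Address"
Start-ClusterResource -Name "Cluster Name"

# Confirm the values that were stored by the script.
Get-ClusterResource -Name "Cluster IP Address" |
    Get-ClusterParameter -Name Address, ProbePort, SubnetMask
```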

Assuming you did everything right in creating your load balancer, your Server Manager on both servers should list your cluster as Online as shown below.

Check Server Manager on both cluster nodes. Your cluster should show as “Online” under Manageability.

Install DataKeeper

I won’t go through all the steps here, but basically at this point you are ready to install SIOS DataKeeper on both cluster nodes. It’s a pretty simple setup, just run the setup and choose all the defaults. If you run into any problems with DataKeeper it is usually one of two things. The first issue is the service account. You need to make sure the account you are using to run the DataKeeper service is in the Local Administrators Group on each node.

The second issue is in regards to firewalls. Although the DataKeeper install will update the local Windows Firewall automatically, if your network is locked down you will need to make sure the cluster nodes can communicate with each other across the required DataKeeper ports. In addition, you need to make sure the ILB health probe can reach your servers.
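
If the local Windows Firewall turns out to be what is blocking the health probe, a rule along these lines may help. It is only a sketch: the rule name is arbitrary and the ports are the two probe ports used in this guide.

```powershell
# Allow inbound TCP on the health probe ports used in this guide (59999 and 59998).
# The rule name is arbitrary. Run on both cluster nodes.
New-NetFirewallRule -DisplayName "Azure ILB health probes" -Direction Inbound `
    -Protocol TCP -LocalPort 59999, 59998 -Action Allow
```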

Once DataKeeper is installed you are ready to create your first DataKeeper job. Using the DataKeeper interface, complete the following steps for each volume you want to replicate:

Use the DataKeeper interface to connect to both servers
Click on create new job and give it a name
Click Yes to register the DataKeeper volume in the cluster
Once the volume is registered it will appear in Available Storage in Failover Cluster Manager

Create the iSCSI target server cluster

In this next step we will create the iSCSI target server role in our cluster. In an ideal world I would have a PowerShell script that does all this for you, but for the sake of time I’m just going to show you how to do it through the GUI. If you happen to write the PowerShell code please feel free to share it with the rest of us!

There is one problem with the GUI method. You will wind up with a duplicate IP address when the IP resource is created, which will cause your cluster resource to fail until we fix it. I’ll walk through that process as well.

Go to the Properties of the failed IP Address resource and choose Static IP and select an IP address that is not in use on your network. Remember this address, we will use it in our next step when we update the load balancer.

You should now be able to bring the iSCSI cluster resource online.
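
For readers who would rather script this step, the FailoverClusters module includes an Add-ClusteriSCSITargetServerRole cmdlet that accepts a static address. The following is an untested sketch and not the method used in this guide: the role name and the DataKeeper volume resource name are assumptions, and the address is a placeholder.

```powershell
# Untested sketch: create the clustered iSCSI Target Server role with a static address,
# which should avoid the duplicate IP address that the GUI wizard produces.
# "iscsitarget" and "DataKeeper Volume X" are assumed names; run Get-ClusterResource
# to find the actual name of your DataKeeper volume resource.
Add-ClusteriSCSITargetServerRole -Name "iscsitarget" `
    -Storage "DataKeeper Volume X" `
    -StaticAddress "<unused static IP address>"
```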

Update load balancer for iSCSI target server cluster resource

Like I mentioned earlier, clients can’t connect directly to the cluster IP address we just created for the iSCSI target server cluster. We will have to update the load balancer we created earlier as shown below.

Start by adding a new frontend IP address that uses the same IP address that the iSCSI Target cluster IP resource uses.
Add a second health probe on a different port. Remember this port number, we will use it again in the PowerShell script we run next.
We add one more load balancing rule. Make sure to change the Frontend IP address and Health probe to use the ones we just created. Also make sure direct server return is enabled.
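
The three portal changes above could also be scripted with the Az PowerShell module. This sketch rests on several assumptions: the resource group and load balancer names are hypothetical, the new frontend sits in the same subnet as the existing one, and the rule forwards TCP 3260 (the default iSCSI port), whereas the rule in the screenshots may have been configured differently.

```powershell
# Requires the Az module. All names are hypothetical; reuse the values from your existing load balancer.
$lb = Get-AzLoadBalancer -ResourceGroupName "my-resource-group" -Name "cluster-ilb"
$subnetId = $lb.FrontendIpConfigurations[0].Subnet.Id    # same subnet as the existing frontend

# Second frontend: the static address chosen for the iSCSI IP resource.
$lb | Add-AzLoadBalancerFrontendIpConfig -Name "iscsi-fe" `
    -PrivateIpAddress "<iSCSI cluster IP address>" -SubnetId $subnetId | Out-Null

# Second probe on a different port than the core cluster probe.
$lb | Add-AzLoadBalancerProbeConfig -Name "probe-59998" -Protocol Tcp -Port 59998 `
    -IntervalInSeconds 5 -ProbeCount 2 | Out-Null

$fe    = Get-AzLoadBalancerFrontendIpConfig -LoadBalancer $lb -Name "iscsi-fe"
$probe = Get-AzLoadBalancerProbeConfig -LoadBalancer $lb -Name "probe-59998"

# Rule for the iSCSI frontend with Floating IP (direct server return) enabled.
# TCP 3260 is the default iSCSI port; this port choice is an assumption.
$lb | Add-AzLoadBalancerRuleConfig -Name "iscsi-rule" -FrontendIpConfiguration $fe `
    -BackendAddressPool $lb.BackendAddressPools[0] -Probe $probe `
    -Protocol Tcp -FrontendPort 3260 -BackendPort 3260 -EnableFloatingIP | Out-Null

$lb | Set-AzLoadBalancer    # push the changes to Azure
```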

The final step to allow the load balancer to work is to run the following PowerShell script on one of the cluster nodes. Make sure you use the new health probe port, IP address and IP resource name.

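$ClusterNetworkName = "Cluster Network 1"   # name of the cluster network (see Get-ClusterNetwork)
$IPResourceName = "IP Address"              # name of the iSCSI role's IP resource; confirm with Get-ClusterResource
$ILBIP = "<iSCSI cluster IP address>"       # placeholder: the static address chosen for the iSCSI IP resource
Import-Module FailoverClusters
# ProbePort must match the second health probe added to the load balancer
Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{Address=$ILBIP;ProbePort=59998;SubnetMask="255.255.255.255";Network=$ClusterNetworkName;EnableDhcp=0}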

Your output should look like this.

 PS C:\Users\dave.DATAKEEPER> $ClusterNetworkName = “Cluster Network 1” 
$IPResourceName = “IP Address” 
$ILBIP = “” 
Import-Module FailoverClusters
Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{Address=$ILBIP;ProbePort=59998;SubnetMask="";Network=$ClusterNetworkName;EnableDhcp=0}
WARNING: The properties were stored, but not all changes will take effect until IP Address is taken offline and then online again.

Make sure to take the resource offline and online for the settings to take effect.

Create your clustered iSCSI targets

Before you begin, it is best to check to make sure Server Manager from BOTH servers can see the two cluster nodes, plus the two cluster name resources, and they both appear “Online” under manageability as shown below.

If either server has an issue querying either of those cluster names then the next steps will fail. If there is a problem I would double check all the steps you took to create the load balancer and the Powershell scripts you ran.

We are now ready to create our first clustered iSCSI targets. From either of the cluster nodes, follow the steps illustrated below as an example of how to create iSCSI targets.

Of course, assign this to whichever server or servers will be connecting to this iSCSI target.
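
The same result can probably be reached with the iSCSI Target cmdlets instead of the wizard. This is a sketch under assumptions: the role name, virtual disk path, size, target name and initiator IQN are all hypothetical and need to be replaced with your own values.

```powershell
# All names, paths, sizes and the initiator IQN below are hypothetical.
$role = "iscsitarget"    # network name of the clustered iSCSI Target Server role

# Create a virtual disk on the replicated volume, create a target, then map the disk to it.
New-IscsiVirtualDisk -ComputerName $role -Path "X:\iSCSIVirtualDisks\lun1.vhdx" -SizeBytes 20GB
New-IscsiServerTarget -ComputerName $role -TargetName "target1" `
    -InitiatorIds "IQN:iqn.1991-05.com.microsoft:client1.contoso.local"
Add-IscsiVirtualDiskTargetMapping -ComputerName $role -TargetName "target1" `
    -Path "X:\iSCSIVirtualDisks\lun1.vhdx"
```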

And there you have it, you now have a functioning iSCSI target server in Azure.

If you build this, leave a comment and let me know how you plan to use it!


Storage Considerations for Running SQL Server in Azure

If you are deploying SQL Server in Azure, or any cloud platform for that matter, don’t just provision storage the way you did for your on-premises deployments for many years. Storage in Azure isn’t exactly like the storage you may have had access to on-premises. Some traditional “best practices” may wind up costing you additional money and give you less than optimal performance, all while not providing you any of the intended benefits. Much of what I am about to discuss is also described in Performance Guidelines for SQL Server in Azure Virtual Machines.

Disk Types

I’m not here to tell you that you must use UltraSSD, Premium Storage, or any other disk type. You just need to be aware that you have options, and what each disk type brings to the table. Of course, like anything else in the cloud, the more money you spend, the more power, speed, throughput, etc., you will achieve. The trick is finding the optimal configuration so that you spend just enough to achieve the desired results.

Size DOES Matter

Like many things in the cloud, certain specs are tied together. For servers if you want more RAM you often get more CPU, even if you didn’t NEED more CPU. For storage, IOPS, throughput and size are all tied together. If you want more IOPS, you need a bigger disk. If you need more space, you also get more IOPS. Of course you can jump between storage classes to circumvent that to some extent, but it still holds true that if you need more IOPS, you also get more space on any of the different storage types.

The size of your virtual machine instance also matters. Regardless of what storage configuration you eventually go with, the overall throughput will be capped at whatever the instance size allows. So once again, you may need to pay for more RAM and CPU than you need, just to achieve your desired storage performance. Make sure you understand what your instance size can support in terms of max IOPS and MBps throughput. Many times the instance size will turn out to be the bottleneck in a perceived storage performance problem in Azure.

Use RAID 0

RAID 0 is traditionally the 3rd rail of storage configuration options. Although it provides the best combination of performance and storage utilization of any RAID option, it does so at the risk of a catastrophic failure. If just a single disk in a RAID 0 stripe set should fail, the entire stripe set fails. So traditionally RAID 0 is only used in scenarios where data loss is acceptable and high performance is desirable.

However, in Azure software RAID 0 is desirable and even recommended in many situations. How can we get away with RAID 0 in Azure? The answer is easy. Each disk you present to an Azure virtual machine instance already has triple redundancy on the backend, meaning you would need to have multiple failures before you would lose your stripe set. By using RAID 0, you can combine multiple disks, and the IOPS and throughput of the stripe set scale roughly linearly: each additional disk adds about one more disk’s worth of performance, up to whatever the instance size allows.

So for example, if you had a requirement of 10,000 IOPS, you might think that you need UltraSSD since Premium Storage maxes out at 7,500 IOPS with a P50. However, if you put two P50s in a RAID 0, you now have the potential to achieve up to 15,000 IOPS, assuming you are running a Standard_F16s_v2 or similarly large instance size that supports that many IOPS.
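
The arithmetic behind that example is simple enough to sketch. The helper below is hypothetical (Get-StripeIops is not a built-in cmdlet) and the VM caps passed to it are assumed numbers for illustration; look up the documented uncached IOPS limit for your actual VM size.

```powershell
# Hypothetical helper: a RAID 0 stripe delivers the sum of its disks' IOPS,
# but never more than the VM size allows.
function Get-StripeIops {
    param(
        [int]$DiskCount,
        [int]$IopsPerDisk,
        [int]$VmIopsCap
    )
    [Math]::Min($DiskCount * $IopsPerDisk, $VmIopsCap)
}

# Two P50 disks at 7,500 IOPS each on a VM with an assumed cap of 25,600 IOPS.
Get-StripeIops -DiskCount 2 -IopsPerDisk 7500 -VmIopsCap 25600   # returns 15000

# The same stripe on a VM with an assumed cap of 12,800 IOPS is limited by the VM.
Get-StripeIops -DiskCount 2 -IopsPerDisk 7500 -VmIopsCap 12800   # returns 12800
```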

In Windows Server 2012 and later, RAID 0 is achieved by creating a Simple Storage Space. In Windows Server 2008 R2 you can use Dynamic Disks to create a RAID 0 Striped Volume. Just a word of caution: if you are going to use a local Storage Space and also configure Availability Groups or a SANless Failover Cluster Instance with DataKeeper, it is best to configure your storage BEFORE you create a cluster.

Just a reminder, you only have about two more months to move your SQL Server 2008 R2 instances to Azure. Check out my post on how to deploy a SQL Server 2008 R2 FCI on Azure to ensure high availability.

Don’t bother separating log and data files

Traditionally log and data files would reside on different physical disks. Log files tend to have a lot of write activity and data files tend to have more read activity, so sometimes storage would be optimized based on those characteristics. It was also desirable to keep log and data files on different disks for recovery purposes. If you should lose one or the other, with a proper backup strategy in place you could recover your database with no data loss.

With cloud-based storage, the likelihood of losing just a single volume is EXTREMELY low. If by chance you lose storage, it is likely your entire storage cluster, along with the triple redundancy, went out to lunch. So while it may feel right to put logs in E:\logs and data in F:\data, you really are doing yourself a disservice. For example, if you provision a P20 for logs and a P20 for data, each volume will be 512 GiB in size and capped at 2,300 IOPS. You may not need all that space for log files, while the data volume may not have much room to grow, which will eventually require moving to a more expensive P30 just for the extra space.

Wouldn’t it be much nicer to simply stripe those two volumes together into a nice large 1 TB volume that supports 4,600 IOPS? By doing that both the log and data files can take advantage of the increased IOPS and you have also just optimized your storage utilization and decreased your cloud storage cost by putting off the move to a P30 disk for your data file.

The same holds true for files and filegroups. Really think hard about what you are doing and whether it still makes sense once you move to the cloud. What makes sense might be counterintuitive compared to what you have done in the past. When in doubt, follow the KISS rule: Keep It Simple, Stupid! The beauty of the cloud is you can always add more storage, increase instance size, or do whatever it takes to optimize performance vs. cost.

What to do about TempDB

Use the local SSD, aka the D: drive. The D: drive is going to be the best location for your tempdb. Because it is a local drive, the data on it is considered “temporary”, meaning it can be lost if a server is moved, rebooted, etc. That’s okay; tempdb is recreated each time SQL Server starts anyway. The local SSD is going to be fast and have low latency, and because it is local, the reads and writes to it do not count against the overall storage IOPS limit of the instance size. Effectively it is FREE IOPS, so why not take advantage? If you are building a SANless SQL Server FCI with SIOS DataKeeper, be sure to create a non-mirrored volume resource of the D: drive so you don’t needlessly replicate tempdb.

Mount Points Become Obsolete

Mount Points are commonly used in SQL Server FCI configurations when multiple instances of SQL Server are installed on the same Windows Cluster. This reduces the overall cost of SQL Server licenses and can help save cost by driving higher server utilization. As we discussed in the past, typically there might be five or more drives associated with each SQL Server instance. If each of those drives had to consume a drive letter you would run out of letters in just about three to four instances. So instead of giving each drive a letter, mount points were used so that each instance could just be serviced by a single drive letter, the root drive. The root drive has mount points that map to separate physical disks that don’t have drive letters.

However, as we discussed above, the concept of using a bunch of individual disks really doesn’t make a lot of sense in the cloud, hence mount points become obsolete in the cloud. Instead, create a RAID 0 stripe as we described, and each clustered instance of SQL Server will simply have its own individual volume that is optimised for space, performance and cost. This solves the problem of running out of drive letters and gives you much better storage utilization and performance while also reducing the cost of your cloud storage.


This post is meant as a jumping off point, not a definitive guide. The main point of the post is to get you thinking differently about cloud and storage as it pertains to running SQL Server. Don’t simply take what you did on-premises and recreate it in the cloud; that will almost always result in less than optimal performance and a much larger storage bill than necessary.


Step-by-Step: Configuring a File Server Cluster in Azure that Spans Availability Zones

In this post we will detail the specific steps required to deploy a 2-node File Server Failover Cluster that spans the new Availability Zones in a single region of Azure. I will assume you are familiar with basic Azure concepts as well as basic Failover Cluster concepts and will focus this article on what is unique about deploying a File Server Failover Cluster in Azure across Availability Zones. If your Azure region doesn’t support Availability Zones yet you will have to use Fault Domains instead, as described in an earlier post.

With DataKeeper Cluster Edition you are able to take the locally attached Managed Disks, whether they are Premium or Standard Disks, and replicate those disks either synchronously, asynchronously or a mix of both, between two or more cluster nodes. In addition, a DataKeeper Volume resource is registered in Windows Server Failover Clustering, which takes the place of a Physical Disk resource. Instead of controlling SCSI-3 reservations like a Physical Disk resource, the DataKeeper Volume controls the mirror direction, ensuring the active node is always the source of the mirror. As far as Failover Clustering is concerned, it looks, feels and smells like a Physical Disk and is used the same way a Physical Disk resource would be used.


This guide assumes the following:

  • You have used the Azure Portal before and are comfortable deploying virtual machines in Azure IaaS.
  • You have obtained a license or evaluation license for SIOS DataKeeper.

Deploying a File Server Failover Cluster Instance using the Azure Portal

To build a 2-node File Server Failover Cluster Instance in Azure, we are going to assume you have a basic Virtual Network based on Azure Resource Manager and you have at least one virtual machine up and running and configured as a Domain Controller. Once you have a Virtual Network and a Domain configured, you are going to provision two new virtual machines which will act as the two nodes in our cluster.

Our environment will look like this:

DC1 – Our Domain Controller and File Share Witness
SQL1 and SQL2 – The two nodes of our File Server Cluster. Don’t let the names confuse you, we are building a File Server Cluster in this guide. In my next post I will demonstrate a SQL Server cluster configuration.

Provisioning the two cluster nodes

Using the Azure Portal, we will provision both SQL1 and SQL2 exactly the same way.  There are numerous options to choose from including instance size, storage options, etc. This guide is not meant to be an exhaustive guide to deploying Servers in Azure as there are some really good resources out there and more published every day. However, there are a few key things to keep in mind when creating your instances, especially in a clustered environment.

Availability Zones – It is important that SQL1 and SQL2 reside in different Availability Zones. For the sake of this guide we will assume you are using Windows Server 2016 and will use a Cloud Witness for the cluster quorum. If you use Windows Server 2012 R2 or Windows Server 2008 R2 instead, you will need to configure a File Share Witness in the 3rd Availability Zone, as Cloud Witness was not introduced until Windows Server 2016.

By putting the cluster nodes in different Availability Zones we are ensuring that each cluster node resides in a different Azure datacenter in the same region. Leveraging Availability Zones rather than the older Fault Domains isolates you from the types of outages that occurred just a few weeks ago that brought down the entire South Central region for multiple days.

Availability Zones
Be sure to add each cluster node to a different Availability Zone. If you leverage a File Share Witness it should reside in the 3rd Availability Zone.

Static IP Address

Once each VM is provisioned, go into its network settings and change the IP address assignment to Static. We do not want the IP address of our cluster nodes to change.

Static IP
Make sure each cluster node uses a static IP
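
The same change can be made with the Az PowerShell module instead of the portal. This is a minimal sketch assuming a hypothetical resource group and network interface name; repeat it for each node’s network interface.

```powershell
# Requires the Az module. "my-resource-group" and "sql1-nic" are hypothetical names.
$nic = Get-AzNetworkInterface -ResourceGroupName "my-resource-group" -Name "sql1-nic"

# Switch the primary IP configuration from Dynamic to Static (keeps the current address).
$nic.IpConfigurations[0].PrivateIpAllocationMethod = "Static"
Set-AzNetworkInterface -NetworkInterface $nic
```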


As far as Storage is concerned, you will want to consult Performance best practices for SQL Server in Azure Virtual Machines. In any case, you will minimally need to add at least one additional Managed Disk to each of your cluster nodes. DataKeeper can use Basic Disk, Premium Storage or even multiple disks striped together in a local Storage Space. If you do want to use a local Storage Space just be aware that you should create the Storage Space BEFORE you do any cluster configuration due to a known issue with Failover Clustering and local Storage Spaces. All disks should be formatted NTFS.

Create the Cluster

Assuming both cluster nodes (SQL1 and SQL2) have been provisioned as described above and added to your existing domain, we are ready to create the cluster. Before we create the cluster, there are a few features that need to be enabled on both cluster nodes: .NET Framework 3.5 and Failover Clustering. You will also need to enable the File Server role.

Enable both .Net Framework 3.5 and Failover Clustering features and the File Server on both cluster nodes
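
As an alternative to Server Manager, the role and features can be installed in one command. This is a sketch to run on each of the two nodes; note that .NET Framework 3.5 sometimes needs an installation source, which is not covered here.

```powershell
# Run on both SQL1 and SQL2.
# NET-Framework-Core = .NET Framework 3.5, FS-FileServer = File Server role.
Install-WindowsFeature -Name NET-Framework-Core, Failover-Clustering, FS-FileServer -IncludeManagementTools
```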

Once that role and those features have been enabled, you are ready to build your cluster. Most of the steps I’m about to show you can be performed both via PowerShell and the GUI. However, I’m going to recommend that for this very first step you use PowerShell to create your cluster. If you choose to use the Failover Cluster Manager GUI to create the cluster you will find that you wind up with the cluster being issued a duplicate IP address.

Without going into great detail, what you will find is that Azure VMs have to use DHCP. By specifying a “Static IP” when we create the VM in the Azure portal, all we did was create a sort of DHCP reservation. It is not exactly a DHCP reservation because a true DHCP reservation would remove that IP address from the DHCP pool. Instead, specifying a Static IP in the Azure portal simply means that if that IP address is still available when the VM requests it, Azure will issue that IP to it. However, if your VM is offline and another host comes online in that same subnet it very well could be issued that same IP address.

There is another strange side effect to the way Azure has implemented DHCP. When creating a cluster with the Windows Server Failover Cluster GUI, there is no option to specify a cluster IP address. Instead it relies on DHCP to obtain an address. The strange thing is, DHCP will issue a duplicate IP address, usually the same IP address as the host requesting a new IP address. The cluster install will complete, but you may have some strange errors and you may need to run the Windows Server Failover Cluster GUI from a different node in order to get it to run. Once you get it to run you will need to change the core cluster IP address to an address that is not currently in use on the network.

You can avoid that whole mess by simply creating the cluster via Powershell and specifying the cluster IP address as part of the PowerShell command to create the cluster.

You can create the cluster using the New-Cluster command as follows:

New-Cluster -Name cluster1 -Node sql1,sql2 -StaticAddress <unused static IP address> -NoStorage

After the cluster creation completes, you will also want to run the cluster validation by running the following command. You should expect to see some warnings about storage and network, but that is expected in Azure and you can ignore those warnings. If any errors are reported you will need to address those before you move on.
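
The validation command itself does not appear in the text above. The standard cmdlet for cluster validation in the FailoverClusters module is Test-Cluster, so presumably something like the following is meant, run from one of the cluster nodes:

```powershell
# Validate the local cluster; the resulting report lists warnings and errors by category.
Test-Cluster
```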


Create a Quorum Witness

If you are running Windows Server 2016 or 2019 you will need to create a Cloud Witness for the cluster quorum. If you are running Windows Server 2012 R2 or 2008 R2 you will need to create a File Share Witness. The detailed instructions on witness creation can be found here.

Install DataKeeper

After the cluster is created it is time to install DataKeeper. It is important to install DataKeeper after the initial cluster is created so the custom cluster resource type can be registered with the cluster. If you installed DataKeeper before the cluster is created you will simply need to run the install again and do a repair installation.

Install DataKeeper after the cluster is created

During the installation you can take all of the default options.  The service account you use must be a domain account and be in the local administrators group on each node in the cluster.

The service account must be a domain account that is in the Local Admins group on each node

Once DataKeeper is installed and licensed on each node you will need to reboot the servers.

Create the DataKeeper Volume Resource

To create the DataKeeper Volume Resource you will need to start the DataKeeper UI and connect to both of the servers.
Connect to SQL1

Connect to SQL2

Once you are connected to each server, you are ready to create your DataKeeper Volume. Right click on Jobs and choose “Create Job”

Give the Job a name and description.

Choose your source server, IP and volume. The IP address determines the network over which the replication traffic will travel.

Choose your target server.

Choose your options. For our purposes where the two VMs are in the same geographic region we will choose synchronous replication. For longer distance replication you will want to use asynchronous and enable some compression.

By clicking yes at the last pop-up you will register a new DataKeeper Volume Resource in Available Storage in Failover Clustering.

You will see the new DataKeeper Volume Resource in Available Storage.
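
If you would rather confirm this from PowerShell than from Failover Cluster Manager, listing the resources in the Available Storage group should show the new DataKeeper Volume. A minimal sketch:

```powershell
# List every resource currently in the cluster's "Available Storage" group.
Get-ClusterGroup -Name "Available Storage" | Get-ClusterResource
```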

Create the File Server Cluster Resource

To create the File Server Cluster Resource we will use PowerShell once again rather than the Failover Cluster interface. The reason is that, because the virtual machines are configured to use DHCP, the GUI-based wizard will not prompt us to enter a cluster IP address and instead will issue a duplicate IP address. To avoid this we will use a simple PowerShell command to create the File Server Cluster Resource and specify the IP address:

Add-ClusterFileServerRole -Storage "DataKeeper Volume E" -Name FS2 -StaticAddress <unused static IP address>

Make note of the IP address you specify here. It must be a unique IP address on your network. We will use this same IP address later when we create our Internal Load Balancer.
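
For illustration only, here is the same command with the address supplied through a variable. The address shown is purely hypothetical; substitute an unused address from your own cluster subnet. The storage and role names are the ones used in the command above.

```powershell
Import-Module FailoverClusters

# Hypothetical, unused address in the cluster subnet (an assumption for illustration)
$FileServerIP = "10.0.0.200"

# Create the clustered file server role on the DataKeeper volume with a fixed IP address
Add-ClusterFileServerRole -Storage "DataKeeper Volume E" -Name FS2 -StaticAddress $FileServerIP
```

Keeping the address in a variable makes it easy to reuse the exact same value when you create the load balancer.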

Create the Internal Load Balancer

Here is where failover clustering in Azure is different from traditional infrastructures. The Azure network stack does not support gratuitous ARPs, so clients cannot connect directly to the cluster IP address. Instead, clients connect to an internal load balancer and are redirected to the active cluster node. What we need to do is create an internal load balancer. This can all be done through the Azure Portal as shown below.

You can use a Public Load Balancer if your clients connect over the public internet, but assuming your clients reside in the same vNet, we will create an Internal Load Balancer. The important thing to take note of here is that the Virtual Network is the same as the network where your cluster nodes reside. Also, the Private IP address that you specify must be EXACTLY the same as the address you used to create the File Server Cluster Resource. Finally, because we are using Availability Zones we will be creating a Zone Redundant Standard Load Balancer as shown in the picture below.

Load Balancer

After the Internal Load Balancer (ILB) is created, you will need to edit it. The first thing we will do is to add a backend pool. Through this process you will choose the two cluster nodes.

Backend Pools

The next thing we will do is add a Probe. The probe we add will probe Port 59999. This probe determines which node is active in our cluster.

And then finally, we need a load balancing rule to redirect the SMB traffic on TCP port 445. The important thing to notice in the screenshot below is that Direct Server Return is Enabled. Make sure you make that change.
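
If you would rather script the load balancer than click through the portal, the steps above can be sketched roughly as follows with the Az PowerShell module. Treat this as a starting point under stated assumptions, not a tested deployment: the resource group, location, virtual network, subnet, resource names and IP address are all hypothetical, and the sketch does not add the two cluster nodes' network interfaces to the backend pool or set the zone options, so you would still do those separately.

```powershell
# Sketch only: requires the Az.Network module and an authenticated session (Connect-AzAccount).
# Every name, the location and the IP address below are hypothetical placeholders.
$rg       = "cluster-rg"
$location = "eastus"
$ilbIP    = "10.0.0.200"   # must match the File Server cluster IP address

# Look up the subnet where the cluster nodes live
$vnet   = Get-AzVirtualNetwork -Name "cluster-vnet" -ResourceGroupName $rg
$subnet = Get-AzVirtualNetworkSubnetConfig -Name "default" -VirtualNetwork $vnet

# Frontend IP, backend pool and the probe on port 59999
$frontend = New-AzLoadBalancerFrontendIpConfig -Name "fs-frontend" -PrivateIpAddress $ilbIP -SubnetId $subnet.Id
$backend  = New-AzLoadBalancerBackendAddressPoolConfig -Name "fs-backend"
$probe    = New-AzLoadBalancerProbeConfig -Name "fs-probe" -Protocol Tcp -Port 59999 -IntervalInSeconds 5 -ProbeCount 2

# -EnableFloatingIP is the scripted counterpart of "Direct Server Return: Enabled" in the portal
$rule = New-AzLoadBalancerRuleConfig -Name "smb-445" -FrontendIpConfiguration $frontend `
    -BackendAddressPool $backend -Probe $probe -Protocol Tcp `
    -FrontendPort 445 -BackendPort 445 -EnableFloatingIP

# Create the Standard SKU internal load balancer from the pieces above
New-AzLoadBalancer -ResourceGroupName $rg -Name "fs-ilb" -Location $location -Sku Standard `
    -FrontendIpConfiguration $frontend -BackendAddressPool $backend `
    -Probe $probe -LoadBalancingRule $rule
```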


Fix the File Server IP Resource

The final step in the configuration is to run the following PowerShell script on one of your cluster nodes. This will allow the Cluster IP Address to respond to the ILB probes and ensure that there is no IP address conflict between the Cluster IP Address and the ILB. Please take note; you will need to edit this script to fit your environment. The subnet mask is set to, this is not a mistake, leave it as is. This creates a host specific route to avoid IP address conflicts with the ILB.

# Define variables
$ClusterNetworkName = ""
# the cluster network name (use Get-ClusterNetwork on Windows Server 2012 or higher to find the name)
$IPResourceName = ""
# the IP Address resource name
$ILBIP = ""
# the IP address of the Internal Load Balancer (ILB)
Import-Module FailoverClusters
# If you are using Windows Server 2012 or higher:
Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{Address=$ILBIP;ProbePort=59999;SubnetMask="";Network=$ClusterNetworkName;EnableDhcp=0}
# If you are using Windows Server 2008 R2 use this:
#cluster res $IPResourceName /priv enabledhcp=0 address=$ILBIP probeport=59999  subnetmask=
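
After running the script it is worth reading the values back to confirm they were stored. A minimal check, assuming a hypothetical resource name (use the same IP Address resource name you put in the script above):

```powershell
Import-Module FailoverClusters

# Hypothetical resource name for illustration; use your own IP Address resource name
$IPResourceName = "IP Address 10.0.0.200"

# Show the private properties (Address, ProbePort, SubnetMask, Network, EnableDhcp, ...)
Get-ClusterResource $IPResourceName | Get-ClusterParameter
```

Set-ClusterParameter normally warns that stored changes do not take effect until the resource has been taken offline and brought back online, so plan to cycle the file server role once before testing client access.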

Creating File Shares

You will find that using the File Share Wizard in Failover Cluster Manager does not work. Instead, you will simply create the file shares in Windows Explorer on the active node. Failover clustering automatically picks up those shares and puts them in the cluster.

Note that the “Continuous Availability” option of a file share is not supported in this configuration.


You should now have a functioning File Server Failover Cluster in Azure that spans Availability Zones. If you have ANY problems, please reach out to me on Twitter @daveberm and I will be glad to assist. If you need a DataKeeper evaluation key, fill out the request form and SIOS will send one out to you.


Windows Azure High Availability Options for SQL Server #Azure #Cloud #IaaS

Protecting against downtime associated with cloud outages is something that anyone deploying on ANY cloud service needs to address. While it could be easy to simply deploy your app in “the cloud” and assume that it is someone else’s problem to manage now, the reality is that while cloud providers probably have more resources and expertise to ensure your servers stay up, the ultimate responsibility to ensure that your critical application is available rests squarely on your shoulders.

Believe it or not, simply deploying your SQL Server in Windows Azure does not make it “highly available”. To make it highly available you must use traditional tools and techniques that you might use in your own datacenter. While there is some varying of opinion on this topic, I believe that SQL Server 2012/2014 high availability options are as follows:

Regardless of which option you choose, you are going to want to become familiar with the Windows Azure Fault Domain as described below:

“Nonetheless, in Windows Azure a rack of computers is indeed identified as a fault domain. And the allocation of a fault domain is determined by Windows Azure at deployment time. A service owner cannot control the allocation of a fault domain, however can programmatically find out which fault domain a service is running within. Windows Azure Compute service SLA guarantees the level of connectivity uptime for a deployed service only if two or more instances of each role of a service are deployed”

So before you get started, when you deploy your Windows Azure VMs, you must make sure that each SQL Server and any “witness” servers reside in different Fault Domains. You do this by putting all of the VMs in the same “Availability Set”. This ensures that each server in the same Availability Set resides in a different Fault Domain, hopefully eliminating all single points of failure.

Figure 1 – VMs in the same Availability Set will be provisioned in different Fault Domains

By putting all of your VMs in different Fault Domains and configuring a SQL Server Failover Cluster or Availability Group, you are protecting against the usual types of outages that might be localized to a single rack of servers, AKA, Fault Domain. I’ve written a step-by-step article entitled Creating a SQL Server 2014 AlwaysOn Failover Cluster (FCI) Instance in Windows Azure IaaS which should help in your endeavor to build resiliency within the Azure cloud for your SQL Server.

But what happens if Windows Azure has a major outage that takes out a whole region? Natural disaster or human error would likely be the cause of such an outage. Unfortunately, at this point there is no way to stretch an Azure Virtual Private Network between two different Azure Regions, which include West US, West Europe, Southeast Asia, South Central US, North Europe, North Central US, East Asia, and East US. However, the Azure Virtual Private Network can support a site-to-site VPN connection with a limited number of VPN devices from Cisco, Juniper and even Microsoft RRAS.

That leads us to thinking about alternate locations outside of Azure, even our own private data center. I recently wrote a step-by-step article that explains how to extend your on premise datacenter to the Azure Cloud. Once you have your datacenter connected to Windows Azure, you can either configure AlwaysOn Availability Groups or AlwaysOn Failover Clustering (multisite) for protection from a catastrophic Azure failure. I’ve written previously about the Advantages of Multisite Clustering vs. Availability Groups, so in my lab I decided to create a 2-node SQL Failover Cluster Instance up in Azure and then add a 3rd node in my primary data center. I’ve written the detailed configuration steps in my blog post entitled Creating a Multisite Cluster in Windows Azure for Disaster Recovery.

If you would rather use AlwaysOn Availability Groups, you probably want to visit the tutorials called AlwaysOn Availability Groups in Windows Azure (GUI) and Listener Configuration for AlwaysOn Availability Groups in Windows Azure. If you are using SQL 2008 R2 or earlier I’m sure you could configure database mirroring, but at this point if you are moving to Azure I’m assuming you are probably deploying SQL Server 2012 or 2014. Other technologies like Log Shipping and Replication are options for moving data, but I don’t consider them high availability solutions.

If you are deploying highly available SQL Server in Windows Azure IaaS please leave me a comment; I’d love to know what you are doing. If you have any questions please leave a comment as well and I will be sure to get back to you.




Creating a multi-site cluster in Windows Azure for Disaster Recovery #Azure #Cloud

This is the 4th post in my series on High Availability and Disaster Recovery for Windows Azure. This is a step-by-step post, or a “how to” post that will build upon the Azure configuration that we built during my first three articles…

  1. How to Create a Site-to-Site VPN Tunnel to the Windows Azure Cloud Using a Window Server 2012 R2 Routing and Remote Access (RRAS) Server
  2. Extending Your Datacenter to the Azure Cloud #Azure
  3. Creating a SQL Server 2014 AlwaysOn Failover Cluster (FCI) Instance in Windows Azure IaaS #Azure #Cloud

We are now going to extend the existing cluster (SQL1 and SQL2) to your local data center, SQL3. This configuration will give you both high availability for your application within the Azure Cloud, as well as a disaster recovery solution should Azure suffer a major outage. You could configure this in reverse as well with your on premise datacenter as your primary site and use Windows Azure as your disaster recovery site. And of course this solution illustrates SQL Server as the application, but any cluster aware application can be protected in the same fashion.

At this point, if you have been following along your network should look like the illustration below.

Add SQL3 to the cluster

To add SQL3 to the cluster the first thing we need to do is make sure SQL3 is up and running, fully patched and added to the domain. We also need to make sure that it has an F:\ drive attached that is the same size as the F:\ drives in use in Azure. And finally, if you relocated tempdb on the SQL cluster, make sure the directory structure where tempdb is located is pre-configured on SQL3 as well.

Next we will add the Failover Cluster feature to SQL3.

With failover clustering installed on SQL3, we will open Failover Cluster Manager on SQL1 and click Add Node

Select SQL3 and click Next

Run all the validation tests on SQL3

Let’s take a look at some of the warnings in the validation report. The RegisterAllProvidersIP property is set to 1, which can be good in a multisite cluster. You can read more about this setting here:

This next warning talks about only having a single network between the cluster nodes. At this time Azure only supports a single network interface between VMs, so there is nothing you can do about this warning. However, this network interface is fully redundant behind the scenes, so you can safely ignore this message.

Of course you are going to see a lot of warnings around storage. That’s because this cluster has no shared storage. Instead it relies on replicated storage by SIOS DataKeeper Cluster Edition. As stated below, this is perfectly fine as the database will be kept in sync with the replication software.

We are now ready to add SQL3 to the cluster.

Once you click Finish, SQL3 will be added to the cluster as shown below.
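
The validation and Add Node steps above can also be sketched in PowerShell, run from SQL1. This assumes the Failover Clustering feature is already installed on SQL3 and that the cluster is named SQLCLUSTER, as it is in this series.

```powershell
Import-Module FailoverClusters

# Run the validation tests across all three nodes and review the report it produces
Test-Cluster -Node SQL1, SQL2, SQL3

# Join SQL3 to the existing cluster
Add-ClusterNode -Cluster SQLCLUSTER -Name SQL3

# Confirm that all three nodes are listed and Up
Get-ClusterNode -Cluster SQLCLUSTER
```

Expect the same storage and single-network warnings described above in the validation report.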

However, there are a few things we need to do to complete this installation. Next we will work through the following steps:

  • Add an additional IP address to the Cluster Name Object
  • Tune the heartbeat settings
  • Extend the DataKeeper mirror to SQL3
  • Install SQL 2014 on SQL3

Add an additional IP address to the Cluster Name Object

When we added SQL3 to the cluster it went from a single site cluster to a multi-subnet cluster. If the cluster was originally created as a single site cluster and you later add a node that resides in a different subnet, you have to manually add a second IP address to the Cluster Name Object and create an OR dependency. For more information on this topic, view the following article.

To add a second IP address to the Cluster Name Object (CNO), we must use the PowerShell commands described in the article mentioned above.
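
As a rough sketch of what that looks like, the command is presumably along these lines. The resource name “NewIP” matches the name used in the rest of this section; “Cluster Group” is the default name of the core cluster group, which I am assuming has not been renamed in your environment.

```powershell
Import-Module FailoverClusters

# Create a new, still unconfigured IP Address resource in the core cluster group
Add-ClusterResource -Name "NewIP" -ResourceType "IP Address" -Group "Cluster Group"

# Confirm the resource exists even if the GUI does not display it yet
Get-ClusterResource -Name "NewIP"
```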

Now if you are following along with the MSDN article I referenced, you would expect to see this “NewIP” resource somewhere in the GUI. However, at least with Windows 2012 R2, I am not currently seeing this resource in the GUI.

However, if I right click on the SQLCLUSTER name and choose properties and try to add NewIP as a dependency, I see it is listed as a possible resource.

Choose “NewIP” and also make the dependency type “OR” as shown below.

Once you click OK, it now appears in the GUI as an IP Address that needs to be configured.

We can now choose the properties of this IP Address and configure the address to use an IP address that is not currently in use in the subnet, which is the same subnet where SQL3 resides.

Tune the Heartbeat Settings

We now are ready to tune the heartbeat settings. Essentially, we are going to be a little more tolerant with network communication, since SQL3 is located across a VPN connection with some latency on the line and we only have the single network interface on the cluster nodes. I highly recommend you read this article by Elden Christensen to help you decide what the right settings for your requirements are:

For our environment, we are going to use what he calls the “Relaxed” setting by setting the SameSubnetThreshold to 10 heartbeats and the CrossSubnetThreshold to 20 heartbeats.

The commands are:

(get-cluster).SameSubnetThreshold = 10

(get-cluster).CrossSubnetThreshold = 20

What this means is that heartbeats will continue to be sent every 1 second, but SQL1 and SQL2 will only be considered dead after 10 missed heartbeats. SQL3 will be considered dead after 20 missed heartbeats. This will increase your Recovery Time Objective slightly (5-10 seconds), but it will also eliminate potential false failovers.
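
To double-check what the cluster ended up with, a small sketch like the following reads the heartbeat settings back and converts them into approximate detection times. The delay properties are reported in milliseconds, so with the 1 second heartbeat mentioned above the two results work out to roughly 10 and 20 seconds.

```powershell
Import-Module FailoverClusters
$cluster = Get-Cluster

# Delays are in milliseconds; thresholds are counts of missed heartbeats
$sameSubnetSeconds  = ($cluster.SameSubnetDelay  * $cluster.SameSubnetThreshold)  / 1000
$crossSubnetSeconds = ($cluster.CrossSubnetDelay * $cluster.CrossSubnetThreshold) / 1000

"Same-subnet node considered dead after about $sameSubnetSeconds seconds"
"Cross-subnet node considered dead after about $crossSubnetSeconds seconds"
```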

Extend the DataKeeper mirror to SQL3

Before we can install SQL 2014 on SQL3 we must extend the DataKeeper mirror so that it includes SQL3 as a replication target. Of course you must install DataKeeper Cluster Edition on SQL3 first, and make sure that it has an F:\ drive at least as big as the source of the mirror. Once DataKeeper is installed

Install SQL 2014 on SQL3

Now it is time to install SQL 2014 onto the 3rd node. The process is exactly the same as it was to install it on SQL2. Start by launching SQL Setup on SQL3.

Run through all the steps…

At this point in the installation you have to pick an IP address that is valid for SQL3’s subnet. The cluster will add this IP address with an “OR” dependency to the client access point.

Enter the passwords for your service accounts

After you complete the installation let the fun begin. You now have a multisite SQL Server cluster that should look something like this.


Creating a SQL Server 2014 AlwaysOn Failover Cluster (FCI) Instance in Windows Azure IaaS #Azure #Cloud

1/27/2015 UPDATE – Due to new features introduced, I have updated my guidance on deploying SQL Server clusters on Azure. The latest article can be found here:



This is the 3rd post in the series on High Availability and Disaster Recovery in Windows Azure. This post contains step-by-step instructions for implementing a Windows Server Failover Cluster in the Windows Azure IaaS Cloud between two cluster nodes in different Fault Domains. While this post focuses on building a SQL Server 2014 Failover Cluster Instance, you could protect any cluster aware application with just making some minor adjustments to the steps below. In my next post I will show you how to extend this cluster to a third node in a different datacenter for a very robust disaster recovery plan. Because Azure does not have a clustered storage option, we will use the 3rd party solution called DataKeeper Cluster Edition for our cluster storage.

This post assumes you have created a Virtual Network in Azure and you have your first DC already provisioned in Azure. If you haven’t done that yet, you will want to go ahead and have a look at the first two posts on this topic.

The high-level steps which we will illustrate in this post are as follows:

  • Provision two Windows Server 2012 R2 Servers
  • Add the servers to the domain
  • Enable the Failover Clustering feature
  • Create the cluster
  • Create a replicated volume cluster resource with DataKeeper Cluster Edition
  • Install SQL 2014 Failover Cluster Instance

Provision two Windows Server 2012 R2 Servers

Click on the Virtual Machine tab in the left column and then click the New button in the bottom left corner.

Choose New Virtual Machine From Gallery

For our cluster we are going to choose Windows 2012 R2 Datacenter

Choose the latest Version Release Date, Name the VM and Size. The user name and password will be the local administrator account that you will use to log in to the VM to complete the configuration.

On this next page you will choose the following:

Cloud Service: I choose the same Cloud Service that I created when I provisioned my first VM. Cloud Service documentation says that it is used for load balancing, but I see no harm in putting all of the cluster VMs and DCs in the same Cloud Service for easier management. By choosing an existing Cloud Service my Virtual Network and Subnets are automatically selected for me.

Storage Account: I choose an existing Storage Account

Availability Set: This is EXTREMELY important. You want to make sure all of your VMs reside in the same Availability Set. By putting all of your VMs in the same Availability Set you guarantee that each VM runs in a different Fault Domain.

The last page shows the ports where this VM can be reached.

Once the VM is created you will see it as a new VM in the Azure Portal

The next step is to add additional storage to the VM. Azure best practices would have you put your databases and log files on the same volume, otherwise you must disable the Geo-replication feature that is enabled by default. The following article describes this issue in more detail:

To add additional storage to your VM, click on the VM and then Dashboard to get to the VMs dashboard. Once there, click on Attach.

There are lots of things to consider when considering storage options for SQL Server. The safest and easiest method is the one we will use in this post. We will use a single volume for our data and log files and have caching disabled. You will want to read this article for the latest information on SQL Server Performance Considerations and best practices for Azure.

After you add this additional volume, you will need to open each VM and use Disk Management to initialize and format the volumes. For the purpose of this demo we will format this volume as the “F:\” drive.
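
If you would rather script this than click through Disk Management, a minimal sketch for Windows Server 2012 R2 follows. It assumes the newly attached data disk is the only uninitialized (RAW) disk on the VM, and the volume label is just a hypothetical example.

```powershell
# Find the uninitialized disk, initialize it, create one partition as F: and format it.
# Assumes exactly one RAW disk is present; "SQLData" is a hypothetical label.
Get-Disk |
    Where-Object PartitionStyle -eq 'RAW' |
    Initialize-Disk -PartitionStyle GPT -PassThru |
    New-Partition -DriveLetter F -UseMaximumSize |
    Format-Volume -FileSystem NTFS -NewFileSystemLabel "SQLData" -Confirm:$false
```

Run it on each VM after attaching the disk so that both nodes end up with an identically sized F:\ volume.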

You now have one VM called SQL1. You will want to complete the same process as described above to provision another VM and call it SQL2, making sure you put it in the same Cloud Service, Availability Set and Storage Account. Also make sure to attach another volume to SQL2 just as you have done for SQL1 and format it as the F:\ drive.

When you have finished provisioning both VMs we will move forward to the next step, adding them to the domain.

Add them to domain

Adding SQL1 and SQL2 to the domain is a simple process. Assuming you have been following along with my previous posts, you have already created your domain and have a DC called DC2 provisioned in the same Cloud Service as SQL1 and SQL2. Adding them to the domain is as simple as connecting to the VMs and joining them to the domain just as you would in a regular on-premise network. If you configured the Virtual Network properly, the new VMs should boot with an IP address assigned by DHCP, which also specifies the local DC2 as the DNS server.

Click Connect to open an RDP session to SQL1 and SQL2

IPconfig /all shows the current IP configuration. Windows Azure requires that you leave the addresses set to use the DHCP server; however, the IP address will not change for the life of the VM. You should notice that your DNS server is set to the local DNS server that you created in the previous article.

Add SQL1 and SQL2 to the domain and continue with the next steps.
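
From an elevated PowerShell prompt on each VM, the domain join can be sketched as a single command. The domain name below is hypothetical; use your own, and supply a domain account that is allowed to join computers when prompted.

```powershell
# "contoso.local" is a hypothetical domain name used for illustration.
# The VM restarts automatically once the join succeeds.
Add-Computer -DomainName "contoso.local" -Credential (Get-Credential) -Restart
```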

Enable Failover Clustering feature

On both SQL1 and SQL2 you will enable the Failover Clustering feature
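
One way to do this without opening Server Manager on each machine is sketched below. It assumes you run it from a domain-joined machine with administrative rights on both nodes.

```powershell
# Install the Failover Clustering feature plus its management tools on each node
foreach ($node in "SQL1", "SQL2") {
    Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools -ComputerName $node
}
```

The management tools include the FailoverClusters PowerShell module, which is handy for the cluster steps that follow.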

Create Cluster

If you are familiar with clustering then the following steps should be very familiar to you with just a few exceptions, so pay close attention to avoid problems that are specific to deploying clusters in Windows Azure.

We will start by creating a single node cluster. This will allow us to make the necessary adjustment to the cluster name resource before we add the second node to the cluster. Use Failover Cluster Manager and start by choosing Create Cluster. Add SQL1 to the selected servers and click Next.

In order to install SQL Server 2014 into the cluster in later steps, we will need to complete cluster validation.

Step through the rest of the cluster creation process as shown below. We will call this cluster SQLCLUSTER, which is simply the name we use to manage the cluster. This is NOT the name that your client applications will eventually connect to.

Once the cluster creation process completes, you will notice that the cluster name resource fails to come online. This is expected.

The name resource failed to come online because the IP resource failed to come online. The IP address failed to come online because the address that the DHCP server handed out is the same as the physical address of the server, in this case, so there is a duplicate IP address conflict.

In order to fix this, we will need to go into the properties of the IP Address resource and change the address to another address in the same subnet that is not currently in use. I would select an address at the higher end of the subnet range in order to reduce the possibility that you later deploy a new VM and Azure hands out that cluster IP address, causing an IP address conflict. In order to eliminate this possibility, Microsoft will have to allow us more control over the DHCP address pool. For now, the only way to completely eliminate that possibility is to create a new subnet in the Virtual Private Network for any new VMs that you might deploy later, so only this cluster resides in this subnet. If you DO plan to deploy more VMs in this subnet, you might as well deploy them all at the same time so you know which IP addresses they will use; that way you can use whatever IP addresses are left over for the cluster(s).

To change the IP address, choose the Properties of the IP Address cluster resource and specify the new address.

Once the address is changed, right click on the Cluster Name resource and tell it to come online.
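
For reference, the same sequence can be sketched in PowerShell on SQL1. This is an illustration under stated assumptions, not the exact procedure shown in the screenshots: the IP address and subnet mask are hypothetical, and “Cluster IP Address” and “Cluster Name” are the default resource names that a new cluster normally gets.

```powershell
Import-Module FailoverClusters

# Validate, then create a single-node cluster without claiming any disks
Test-Cluster -Node SQL1
New-Cluster -Name SQLCLUSTER -Node SQL1 -NoStorage

# Hypothetical unused address near the top of the subnet range, with a hypothetical mask
$ClusterIP = "10.0.0.200"
$Mask      = "255.255.255.0"

# Replace the conflicting DHCP-assigned address with the static one
Get-ClusterResource "Cluster IP Address" |
    Set-ClusterParameter -Multiple @{Address=$ClusterIP; SubnetMask=$Mask; EnableDhcp=0}

# Bring the cluster name (and the IP address it depends on) online
Start-ClusterResource "Cluster Name"
```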

We are now ready to add the second node to the cluster. In the Failover Cluster Manager, select Add Node.

Browse out to your second node and click Add.

Run all the validation tests once again.

When you click finish, you will see that the node was added successfully, but because there is no shared storage in Azure, no disk witness for the quorum could be created. We will fix that next.

We now need to add a File Share Witness to our cluster to ensure the quorum requirements for a two-node cluster are satisfied. The file share witness will be configured on the DC2 server, the domain controller that is also running in the Azure Cloud.

Open an RDP session to the domain controller in your Azure Private Cloud.

Connect to your domain controller and create a file share called “Quorum”. You will need to give the Cluster Computer Name Object (which we called SQLCluster in this example) read/write permissions at both the Share level and Security (NTFS) level. If you are not familiar with creating a file share witness, you may want to review my previous post for more detail.

Once the file share witness folder is created on the domain controller, we need to add the witness in the cluster configuration using the Failover Cluster Manager on SQL1

The File Share Witness should now be configured as shown below.
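
The witness steps above can be sketched in PowerShell as well. The folder path and the CONTOSO domain name are hypothetical; the share name, the DC2 server and the SQLCLUSTER computer object come from this walkthrough. The first part runs on DC2, the last command on SQL1.

```powershell
# --- On DC2: create the folder and share, granting the cluster computer object read/write ---
# "C:\Quorum" and the CONTOSO domain are hypothetical placeholders.
New-Item -Path "C:\Quorum" -ItemType Directory
New-SmbShare -Name "Quorum" -Path "C:\Quorum" -ChangeAccess 'CONTOSO\SQLCLUSTER$'

# Matching NTFS permission (Modify, inherited by files and subfolders)
icacls "C:\Quorum" /grant 'CONTOSO\SQLCLUSTER$:(OI)(CI)M'

# --- On SQL1: point the cluster quorum at the new file share witness ---
Set-ClusterQuorum -NodeAndFileShareMajority "\\DC2\Quorum"
```

The single quotes matter here: they stop PowerShell from treating the trailing $ of the computer account name as the start of a variable.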

Create Replicated Volume Cluster Resource with DataKeeper Cluster Edition

A traditional failover cluster requires a shared storage device, like a SAN. The Azure IaaS cloud does not offer a storage solution that is capable of being used as a cluster disk, so we will use the 3rd party data replication solution called DataKeeper Cluster Edition, which will allow us to create a replicated volume resource which can be used in place of a shared disk. A 14-day trial license is generally available for testing upon request.

Once you download DataKeeper, install it and license it on both SQL1 and SQL2 and reboot the servers. Once the servers reboot, connect to SQL1, launch the DataKeeper UI and complete the steps below.

“Connect” to both SQL1 and SQL2

Now click on “Create Job” and follow the steps illustrated below to create the mirror and DataKeeper Volume cluster resource.

Choose the source of the mirror. When you choose the IP Address for the source and target, be sure to choose the IP address of the server itself. DO NOT CHOOSE THE CLUSTER IP ADDRESS!

For this implementation where both nodes are in the Azure Cloud, choose synchronous replication with no compression, as shown below.

Click Done and you will be asked if you want to register this mirror in Windows Server Failover Clustering. Click Yes.

You will now see there is a DataKeeper Volume Resource in Available Storage when you open the Windows Server Failover Cluster GUI

You are now ready to install SQL Server into the cluster.

Install SQL 2014 Failover Cluster Instance

To start the SQL Server 2014 cluster installation, you must download the SQL 2014 ISO to SQL1 and SQL2. You can use SQL Server 2014 Standard Edition for a simple two node cluster. If you want to extend this cluster to a 3rd site for disaster recovery as we will discuss in the next post, then you will need the Enterprise Edition because the Standard Edition only supports a 2-node cluster. If you are only looking for a simple two node solution then SQL Server Standard Edition can be a much more economical solution.

Once SQL Server 2014 is downloaded to the servers, mount the ISO and run the setup. The option that we want is in the Advanced tab. Open the Advanced tab and run the “Advanced cluster preparation”. My good friend and fellow Cluster MVP, Robert Smit, told me about using the Advanced option. Basically, the Advanced option lets you split the install into two different processes, preparation and completion. Many things can go wrong with cluster installations, usually related to Active Directory and privileges. If you use the standard install method you may wait 20 minutes or longer for the installation to complete, only to find out that at the last minute the cluster was unable to register the CNO in Active Directory and the whole installation fails. Not only did the whole installation fail, now you may have a partially installed SQL Server cluster and you have a mess to clean up. By using the Advanced method you are able to minimize the risk by putting the risky section just at the end during cluster completion. If cluster completion fails, you simply need to diagnose the problem and re-run just the cluster completion process once again.

If you really want to save some time, check out Robert’s article on installing SQL Cluster with a configuration file, it is pretty easy to do and saves a bunch of time if you are doing multiple installations. However, for our purposes we will walk through the SQL install with the GUI as shown below.

For demo purposes, I just used the administrator account for each of the services. In production you will want to create separate accounts for each service as a best practice.

Once the install completes it looks like this.

Now we are ready to move forward with part two of the installation, Advanced Cluster Completion.

Give the SQL instance a name. This is the name the clients will connect to. In this case I called it SQLINSTANCE1.

This is where the magic happens. If you configured the mirror in DataKeeper as described earlier, you will see the DataKeeper Volume listed here as an Available Shared Disk, when actually it is simply a replicated volume pair.

On the Cluster Network Configuration page, it is important to choose IPv4 and to specify an address that is not in use in your subnet. As stated before, this address should be at the higher end of the DHCP range to help minimize the risk that Azure will assign that address to another VM in the future. I highly suggest that you have a subnet that is dedicated to your cluster to avoid possible conflicts until Windows Azure offers us greater control over the IP addresses and DHCP ranges. Later, after the cluster is created, you will need to delete this client access point and add the client access point as described in I will publish a blog post in the future that describes this process in detail.

On this page make sure you Click Add Current User, or specify the accounts you wish to use to administer SQL Server.

Starting with SQL Server 2012, tempdb no longer needs to be part of the SQL Server Cluster. If you move the tempdb to a non-replicated volume, you will need to make sure that directory structure exists on each node. To change the location of the tempdb, click on the Data Directories tab and change the location where the tempdb is located.
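
If you do move tempdb to a local, non-replicated path, a small sketch like this creates the same directory on every node in one pass. The path is a hypothetical example, and the sketch assumes PowerShell remoting is enabled on the nodes (the default on Windows Server 2012 R2).

```powershell
# Hypothetical local path for tempdb; it must be identical on every node
$tempdbPath = "C:\SQLTemp\tempdb"

# Create the directory on each cluster node (-Force makes this safe to re-run)
Invoke-Command -ComputerName SQL1, SQL2 -ScriptBlock {
    New-Item -Path $using:tempdbPath -ItemType Directory -Force
}
```

Remember that the SQL Server service account also needs permission to write to that folder on each node.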

When the installation completes on SQL1, it is time to run the SQL installer on SQL2 and add the second node to the cluster. Run the Setup on SQL2 and choose Add node to a SQL Server failover cluster.

After the installation completes, you now have a fully functional SQL Server AlwaysOn Failover Cluster Instance (FCI) running on the Azure Cloud. Each instance is in a different Fault Domain providing a high level of resiliency. Be sure to replace the client access point with a client access point as described in my post…

In the next post in this series I will show you how to extend this two node cluster to a third node for a multi-site cluster. This third-node will be located in my on-premise data center, which will give us the ultimate in both high availability and disaster recovery.


Extending Your Datacenter to the Azure Cloud #Azure

In part 1 of my series on using Windows Azure as a disaster recovery site, I explained how to create a site-to-site VPN using Windows Server 2012 R2 Routing and Remote Access (RRAS). Now that the two sites are connected, I’m going to walk you through the steps required to deploy your first VM in the Windows Azure IaaS Cloud and add it to your on-premise network as a Domain Controller. I will assume you have already done the following:

  • Have a functioning on-premise Active Directory
  • Have completed the steps to create a site-to-site VPN connecting your on-premise datacenter to the Azure Cloud and the VPN is connected.
  • Have created an Azure account and are familiar with logging in and basic management features

At this point we are ready to start. Open the Windows Azure Portal. At a minimum, you should see the Virtual Network that we previously created listed when you select the “All Items” category on the left.

To provision your first VM, click on the “Virtual Machines” in the left hand navigation pane and click “+New” in the bottom left hand corner.

For our purposes, we are going to create a new virtual machine from the gallery.

We will use the Windows Server 2012 R2 Datacenter Image.

Pick your machine size, username and password.

The next step has you create a “Cloud Service”, “Storage Account” and Availability Set. It also asks you where to place the VM. We will choose the Virtual Network that you previously created when you created your site-to-site VPN. We will create a new Cloud Service and Storage Account. The rest of the VMs we will create later will make use of the different accounts we create this first time through.

The final page lists the ports where you can administer this VM, but I’ll show you an easy way to RDP to it in just a moment.

Once the VM is provisioned it should look something like this.

If you click on the VM’s name you will be taken to the VM’s welcome screen, where you can learn more about managing the VM.

Click on Dashboard to see detailed information about your VM. From here you can click the Connect button to launch an RDP session to the running VM.

Using the username and password you specified when you provisioned the VM, use the RDP session that opens when you click Connect to log in to the provisioned VM. Once connected, you will notice that the VM has a single NIC and it is configured to use DHCP. This is expected and DHCP is required. The VM will maintain the same internal IP address throughout the life of the VM through a DHCP reservation. Static IP addresses are NOT supported, even though a static IP may appear to work for a while if you set one.

Also notice that if you configured your Virtual Network as I described in my first post, the DNS server should point to the DC/DNS Server that resides in your onsite network. This will ensure that we are able to add this server to the on-premise domain in the next step.

Assuming your VPN is connected to the Gateway as shown below, you should be able to ping the DNS server on the other side of the VPN.

Ping the DNS server to verify network communication between the Azure Cloud and your on-premise network.
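If ICMP is blocked somewhere along the path, a TCP connection attempt to the DNS port is a reasonable complement to ping. The following is only a minimal sketch in Python, not part of the original procedure; the function name `can_reach` and the address `10.10.10.1` are assumptions chosen for illustration, so substitute the address of your own on-premise DC/DNS server.

```python
import socket


def can_reach(host: str, port: int = 53, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    DNS servers normally listen on TCP port 53 as well as UDP port 53,
    so a successful connection suggests the VPN path is working.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


# Example call (hypothetical address of the on-premise DC/DNS server):
#   can_reach("10.10.10.1")
```

A `False` result does not tell you where the problem is; check that the VPN shows as connected in the portal before digging into DNS itself.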

At this point you are able to add this server as a second Domain Controller to your domain, just as you would any other typical domain controller. I’m going to assume you know how to add a Domain Controller to an existing domain and are familiar with other best practices when it comes to AD design and deployment.

As the last step, I recommend you update your Azure Virtual Private Network to specify this new DC as the primary DNS server and use the other on-premise DC as your secondary DNS server.

Click on Networks, then the name of the Virtual Private Network you want to edit.

Add the new DNS server to the list and click Save

From this point on, when you configure servers in this Virtual Private Network, the VMs will be automatically configured with two DNS servers.

In Part 3 of my series on configuring Windows Azure for High Availability and Disaster Recovery we will look at deploying a highly available SQL Server Failover Cluster Instance in the Windows Azure Cloud using the host-based replication solution called DataKeeper Cluster Edition.

How to Create a Site-to-Site VPN Tunnel to the Windows Azure Cloud Using a Windows Server 2012 R2 Routing and Remote Access (RRAS) Server

Not long ago I set out to build a multisite SQL Server cluster where one of my nodes resides in my local data center and the other node resides in Microsoft’s Infrastructure as a Service (IaaS) offering, the Windows Azure Cloud. The Azure Cloud has an offering where you can deploy VMs and pay for just the resources you utilize, much like Amazon’s EC2. My goal was to create a proof of concept where I would use the Azure Cloud as an inexpensive disaster recovery site. My configuration is shown in Figure 1.

Figure 1. An example of the simple DR configuration I used in my POC

My on-premise VMs are used as follows:

  • VM1-internal – Routing and Remote Access Server for NAT and VPN connectivity to the Azure Cloud
  • VM2-internal – The primary node in my cluster
  • VM3-internal – My domain controller

For this POC I only deployed one server in the Azure cloud, Azure-DR. Azure-DR is the secondary node in my cluster. If this were an actual production site, I certainly would also want to deploy another domain controller in the Azure cloud to ensure that my Active Directory was available in the DR site. Your actual DR configuration will vary greatly depending upon your needs. I will use the server names depicted in my illustration as I describe the configuration steps below.

The scope of this post

For the purpose of this post, I am going to focus on what you need to do to get to the point where you have configured your virtual network in Azure and created a site-to-site VPN connection to your primary data center. My next article will discuss the steps required to actually create a multisite cluster for disaster recovery. As with most cloud related services, the interfaces and options tend to change rapidly; the screen shots and directions you see below are relevant as of January 2nd, 2014. Your experience may vary, but these directions should get you pretty darn close. If you encounter differences, please send me a comment describing what you did to make it work so other users can benefit from your experience.

Create your Local Network

I’m not going to walk you through this step-by-step, but essentially you should have a Windows Server 2012 R2 DC configured (VM3-internal) and two additional Windows Server 2012 R2 servers in the domain (VM1-internal and VM2-internal). Each server should use the DC as its primary DNS server, and on VM2-internal and VM3-internal the gateway should be configured to point to VM1-internal, which will eventually be configured with Routing and Remote Access (RRAS). The RRAS server (VM1-internal) should be dual homed, with one NIC connected to the internal network and one NIC connected directly to the public network. Generally this will be the biggest obstacle in deploying this in your lab, as you must have a spare public IP address that you can use for your RRAS server. This configuration will not work if your RRAS server sits behind a NAT firewall; it must be directly connected to the internet. The RRAS server should be configured with just the IP address, subnet mask and DNS server; no gateway should be defined. DO NOT enable Routing and Remote Access; this will be done automatically via a script at a later step.
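The rules in the paragraph above are easy to get wrong on one NIC, so it can help to write the intended settings down and check them. The following Python sketch only models the settings as data; it does not read anything from real servers. All IP addresses are hypothetical (the public address comes from a documentation-only range), and `Nic`, `lab` and `check_lab` are names made up for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical addresses used only for this illustration.
DC_IP = "10.10.10.3"             # VM3-internal, the domain controller / DNS server
RRAS_INTERNAL_IP = "10.10.10.1"  # internal NIC of VM1-internal


@dataclass
class Nic:
    ip: str
    dns: Optional[str] = None
    gateway: Optional[str] = None


lab: Dict[str, List[Nic]] = {
    # RRAS server: dual homed, DNS set, no gateway on either NIC.
    "VM1-internal": [
        Nic(ip=RRAS_INTERNAL_IP, dns=DC_IP),
        Nic(ip="203.0.113.10", dns=DC_IP),  # public NIC (documentation range)
    ],
    "VM2-internal": [Nic(ip="10.10.10.2", dns=DC_IP, gateway=RRAS_INTERNAL_IP)],
    "VM3-internal": [Nic(ip=DC_IP, dns=DC_IP, gateway=RRAS_INTERNAL_IP)],
}


def check_lab(servers: Dict[str, List[Nic]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    for name, nics in servers.items():
        for nic in nics:
            if nic.dns != DC_IP:
                problems.append(f"{name} ({nic.ip}): DNS should be {DC_IP}")
            if name == "VM1-internal":
                if nic.gateway is not None:
                    problems.append(f"{name} ({nic.ip}): no gateway should be defined")
            elif nic.gateway != RRAS_INTERNAL_IP:
                problems.append(f"{name} ({nic.ip}): gateway should be {RRAS_INTERNAL_IP}")
    if len(servers.get("VM1-internal", [])) != 2:
        problems.append("VM1-internal: should be dual homed (two NICs)")
    return problems
```

Calling `check_lab(lab)` on the sample data returns an empty list; changing any gateway or DNS value produces one message per violated rule.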

Create a Virtual Network

Log in to the Windows Azure Management Portal and create a new Virtual Network following the steps illustrated below.

When you click the check box you should now see the new virtual network you just created.
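One thing worth checking when you choose address spaces is that the Azure virtual network does not overlap with your on-premise network, since overlapping ranges cannot be routed across a site-to-site VPN. Below is a small Python sketch of that check using the standard `ipaddress` module. The prefixes in the usage note are assumptions for illustration, not the values from my lab, and the function name `overlapping` is hypothetical.

```python
import ipaddress
from typing import List, Tuple


def overlapping(local_prefixes: List[str],
                azure_prefixes: List[str]) -> List[Tuple[str, str]]:
    """Return every (local, azure) pair of CIDR prefixes that overlap."""
    pairs: List[Tuple[str, str]] = []
    for local in local_prefixes:
        for azure in azure_prefixes:
            if ipaddress.ip_network(local).overlaps(ipaddress.ip_network(azure)):
                pairs.append((local, azure))
    return pairs
```

For example, `overlapping(["10.10.10.0/24"], ["10.10.11.0/24"])` returns an empty list, while a local prefix of `10.10.0.0/16` would be reported as overlapping `10.10.11.0/24`.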

Create the Gateway

Once the virtual network is created, you will need to create the Gateway. From the Dashboard of the newly created virtual network, you will be able to create a Gateway as shown below. As of April 25th, 2013, Static Routing with RRAS is not supported in the Azure VPN connection, so be sure to choose Dynamic Routing.

It could take 30 minutes or longer for the gateway to be created, so be patient…

Once the gateway is finished creating, you will see your Gateway IP Address and the amount of Data In and Data Out as shown below.

Configure your local RRAS Server

At this point you are ready to configure your on-premise RRAS Server (VM1-internal) to create a site-to-site VPN to the Gateway that you just created. Microsoft has made this very easy, so don’t worry if networking and configuring VPNs are not your specialty. You will just need to click on “Download VPN Device Script” and run it on your RRAS server. Microsoft also supports a number of Juniper and Cisco VPN routers, so if you want to move to a hardware based VPN device in the future you can always come back and download the configuration script specific to your device.

Choose Microsoft Corporation as the Vendor, RRAS as the Platform and Windows Server 2012 as the Operating System and click the checkbox to download the PowerShell script. In my case, this same script worked just fine when run on Windows Server 2012 R2.

As of the date of this writing, it seems as if Microsoft has made the script creation process even more intelligent than it was just last month. The script that it just created for me was pre-populated with all the information required; I did not have to edit anything at all.

At this point, all you need to do is copy the script file onto your RRAS Server (VM1-internal), save it with a .ps1 extension, and run it in PowerShell. This script will install Routing and Remote Access and configure the Site-to-Site VPN to connect to the Windows Azure Virtual Network you just created. Once you have finished with the RRAS installation, go back to the Azure Portal and click Connect to complete the VPN site-to-site connection.

When connected, the Azure Portal should look something like the following.

Enable NAT on the RRAS Server

The final step I had to take to have a usable network was to enable NAT on my RRAS Server. Without having NAT enabled none of my servers could reach the internet. The basic steps for enabling NAT on RRAS are as follows:

  • Open the Routing and Remote Access MMC
  • Expand IPv4, right-click General, and then click New Routing Protocol.
  • In Routing protocols, click NAT, and then click OK.
  • Right-click NAT, and then click New Interface.
  • Select the interface that connects to your private intranet, and then click OK.
  • Select Private interface connected to private network, and then click OK.
  • Right-click NAT, and then click New Interface again.
  • Select the interface that connects to the public Internet, and then click OK.
  • Select both Public interface connected to the Internet and Enable NAT on this interface, and then click OK.
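If you are wondering why the internal servers cannot reach the internet until NAT is enabled: their private addresses are not routable on the public network, so the RRAS server has to rewrite each outbound connection to use its own public address and remember the mapping for the replies. The toy Python model below illustrates that bookkeeping only. It is not how RRAS is implemented, and the class name `ToyNat`, the addresses and the starting port are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (ip address, port)


@dataclass
class ToyNat:
    """A toy port-address-translation table, for illustration only."""
    public_ip: str
    next_port: int = 40000
    outbound: Dict[Endpoint, int] = field(default_factory=dict)
    inbound: Dict[int, Endpoint] = field(default_factory=dict)

    def translate_out(self, src: Endpoint) -> Endpoint:
        """Map a private source endpoint to (public_ip, allocated port)."""
        if src not in self.outbound:
            # First packet of this flow: allocate a public port and
            # record the mapping in both directions.
            self.outbound[src] = self.next_port
            self.inbound[self.next_port] = src
            self.next_port += 1
        return (self.public_ip, self.outbound[src])

    def translate_in(self, public_port: int) -> Endpoint:
        """Map a reply arriving on a public port back to the private endpoint."""
        return self.inbound[public_port]
```

In this model, two internal servers that both use source port 51000 are given different public ports, which is how replies find their way back to the right server behind a single public IP address.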

Now What?

The fun can now begin. In my next post I will walk you through the process of provisioning a Windows VM in Azure and joining it to your on-premise domain.
