Highly Available SQL Server Storage Options in #Azure: SMB 3.0 File Service or Premium Storage, a look at performance differences

When looking at storage options for SQL Server deployments in Azure you have a few options, as described in the article Windows Server Failover Cluster on Azure IAAS VM – Part 1 (Storage). The article also references the newly released Azure File Service, which can be used to host your SQL Server cluster data over SMB 3.0. As of today the Azure File Service does not support Premium Storage, so you are limited to about 1000 IOPS or 60 MB/s per file share. With those limits in place, I see the Azure File Service really only being an option for databases that have minimal IO demands. We will see why that holds true based on my test results below.


I wanted to test a few configurations, so I provisioned a DS4 VM and attached some premium storage to it. I also attached an SMB 3.0 file share using the Azure File Service. The storage was configured as follows:

F:\ – Three 1 TB P30 Premium Storage Disks added to a single 3TB pool

G:\ – One 1 TB P30 Premium Storage Disk (no Storage Pool)

Z:\ – SMB 3.0 File share on Azure File Services

To configure the Storage Pool for use in a cluster you have to be careful how you proceed. You either have to create the Storage Pool before you create the cluster, or, if the cluster is already created, you have to use the PowerShell script described in SQL AlwaysOn with Windows 2012 R2 Storage Spaces. The pool I created uses the Simple layout (striping, equivalent to RAID 0) for increased performance. I’m not concerned about redundancy since the Azure storage on the backend has triple redundancy.
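
For reference, a pool like the F:\ volume can also be scripted. The sketch below assumes the cluster has not been created yet and that the three attached data disks are the only poolable disks on the server; the pool, virtual disk and volume names are just examples.

# Sketch only - pool/disk names, drive letter and allocation unit are examples, adjust for your environment
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "SQLPool" -StorageSubSystemFriendlyName (Get-StorageSubSystem).FriendlyName -PhysicalDisks $disks
# Simple = striped across all disks (no redundancy), matching the RAID 0 layout described above
New-VirtualDisk -StoragePoolFriendlyName "SQLPool" -FriendlyName "SQLData" -ResiliencySettingName Simple -NumberOfColumns $disks.Count -UseMaximumSize
# Initialize, partition and format the new disk as F:\ with a 64 KB allocation unit
Get-Disk | Where-Object PartitionStyle -Eq 'RAW' | Initialize-Disk -PartitionStyle GPT -PassThru | New-Partition -DriveLetter F -UseMaximumSize | Format-Volume -FileSystem NTFS -AllocationUnitSize 65536 -Confirm:$false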

With three disks in the Storage Pool in a RAID 0 layout, I expect to get up to three times the performance of a single disk. Adding even more disks to the pool should give even more performance. A single P30 disk gives me 5000 IOPS and 200 MB/s, so for my pool I should expect up to 15,000 IOPS and 600 MB/s of throughput.

Now that the storage is configured, I used Diskspd to run the same test on each of the different volumes. The parameters I used with Diskspd are as follows:

Diskspd.exe -b8K -d60 -h -L -o8 -t16 -r -w30 -c50M F:\io.dat

Diskspd.exe -b8K -d60 -h -L -o8 -t16 -r -w30 -c50M G:\io.dat

Diskspd.exe -b8K -d60 -h -L -o8 -t16 -r -w30 -c50M Z:\io.dat
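
For reference, here is what those Diskspd switches do (the test file path is the only per-run difference):

# Diskspd.exe -b8K -d60 -h -L -o8 -t16 -r -w30 -c50M <path>\io.dat
#   -b8K   8 KB block size
#   -d60   run for 60 seconds
#   -h     disable software caching and hardware write caching
#   -L     capture latency statistics
#   -o8    8 outstanding I/Os per thread
#   -t16   16 threads
#   -r     random I/O
#   -w30   30% writes (therefore 70% reads)
#   -c50M  create a 50 MB test file at the specified path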

The results were pretty predictable and are summarized below.

Diskspd results summary

As you can see, while this particular job did not push the theoretical maximum of any of these storage solutions, latency had a significant impact on the overall performance of this particular test. The test used 8K blocks in a mix of 30% writes and 70% reads to simulate a typical SQL Server OLTP workload.

Of course, the more money you want to spend, the more performance you can expect to achieve. As of November 24, 2015, the best solution shown here (F:\) would cost you $1,216/month and give you full access to 3 TB of storage with unlimited reads/writes. The second-best solution (G:\) would give you 1 TB of storage at about one-third the price, $405/month. The Azure File Share is priced at $0.10/GB plus additional charges for read/write operations. You are only charged for actual usage, so the actual cost will depend heavily on your workload, but before the additional read/write charges you are at roughly 25% of the cost of Premium Storage.

Prices, like everything else in the cloud, tend to change rapidly to address market demands. Check https://azure.microsoft.com/en-us/pricing/details/storage/ for the latest price information.

In summary, while the Azure File Service looks enticing from a price perspective, its latency at this point does not make it a viable option for any serious SQL Server workload. Instead, have a look at Premium Storage and leverage either a host-based replication solution such as SIOS DataKeeper to build SQL Server Failover Cluster Instances (SQL Standard or Enterprise), or look at SQL Server Enterprise Edition and AlwaysOn Availability Groups.


Configuring the SQL Server AlwaysOn ILB for the Client Listener in Azure Resource Manager (ARM) deployment model #SQLPASS

In preparation for my talk at PASS Summit this Friday, I figured I'd document something I finally got working that I want to talk about during my session on highly available SQL Server in Azure. Now I just need to update my slide deck and my demo environment.

In case you didn't know, Azure has two deployment models: Resource Manager (ARM) and Classic Deployment. Classic deployment is the "old" way of doing things; ARM is the new way. There are numerous benefits to using ARM, as described in the Azure article Understanding Resource Manager deployment and classic deployment. However, one of my favorite new features of ARM is the ability to have three Fault Domains per Availability Set rather than just the two Fault Domains you get with the Classic deployment model. This is a critical feature for SQL high availability.

With three fault domains you can ensure that each cluster node in a two-node cluster and the file share witness all reside in different fault domains. This eliminates the possibility that the failure of a single Fault Domain would impact more than one quorum vote in your cluster. In the Classic deployment model, with only two fault domains, you could put just two cluster nodes in an availability set. For maximum availability you really needed to put your file share witness in a different geographic location, since there was no guarantee it wouldn't wind up in the same fault domain as one of your cluster nodes if you kept it in the same region. That meant the failure of a single fault domain could impact two of your three quorum votes and bring down your entire cluster. ARM's three Fault Domains eliminate that possibility.

ARM is definitely the way to go, as new Azure features are only being introduced in ARM. However, the documentation is light and some features are not quite there yet, including documented support for ExpressRoute. Both of these problems get better almost daily, but early adopters really have to work extra hard until Azure catches up. One other issue is that you can't mix Classic and ARM deployments, so if you started down the road with Classic deployments you will basically have to start from the ground up with Resource Manager when you make the switch. If you can manage it, a little pain now will help you avoid a larger headache next year when you find that you want some new feature only available in ARM.

I hope this article helps you with at least one aspect of your ARM deployment – getting highly available SQL Server deployed. As I have documented in earlier articles, deploying both AlwaysOn Availability Groups and AlwaysOn Failover Cluster Instances in Azure "Classic" requires the use of an Azure Load Balancer (internal or external) for client redirection. Getting that configured in Classic Azure is not exactly straightforward, but it is documented well enough that any administrator reasonably familiar with Azure, Failover Clustering, SQL Server and PowerShell can get it to work.

AlwaysOn Availability Groups and AlwaysOn Failover Cluster Instances using the ARM deployment model still require the use of an Azure Load Balancer for client redirection; however, the steps for creating and configuring that load balancer are completely different and, as of today, not well documented. In this article I'm going to highlight the steps required to configure the ILB and update the SQL cluster IP resource. In the next article I will walk you through the entire process step by step, from creating the vNet to installing SQL and creating the cluster.

Before we begin, I’m making the following assumptions:

  • You have created a vNet using ARM
  • You have provisioned 3 ARM based VMs (DC, SQL1, SQL2)
  • You put DC, SQL1 and SQL2 in the same Availability Set and Resource Group
  • You have created a cluster with SQL1 and SQL2 and used the DC for the file share witness
  • You have either created an AlwaysOn Availability Group or an AlwaysOn Failover Cluster Instance with SIOS DataKeeper Cluster Edition. In either case you will wind up with a client listener, consisting of a name resource and an IP resource. The AlwaysOn AG and FCI configuration up to the point of creating the load balancer is exactly the same as it is in the Azure Classic deployment model and is documented on the web in many places, including my own blog posts.

Now that you have a fully configured AlwaysOn AG or FCI, you probably notice that you can't connect to the cluster name from any server other than the node that currently hosts the SQL cluster name resource. I've been told that this is because Azure networking does not support gratuitous ARPs, so clients can't communicate directly with the cluster IP address. Instead, the clients need to communicate with the ILB, and the ILB will redirect traffic to the active node. So step 1 is to create the ILB. As of now this can't be done through the Azure Portal, so we will use the following Azure PowerShell commands.

[1/6/2016 Update – The directions below assume you are using Azure PowerShell pre-version 1. The script if you are using Azure PowerShell Version 1 or later is detailed in my blog post here.]

Switch-AzureMode -Name AzureResourceManager

Select-AzureSubscription -SubscriptionName "MSDN Azure"
# name whichever subscription you used to create your vNet and VMs

#Declare your variables using values relevant to your deployment

$ResourceGroupName ='SIOS-EAST-RG'
# Resource Group Name in which the SQL nodes are deployed

$FrontEndConfigurationName ='FE'
#Call it whatever you like

$BackendConfigurationName ='BE'
#Call it whatever you like

$LoadBalancerName ='ILB'
#Provide a name for the Internal Load Balancer object

$Location ='eastus2'
# Input the data center location of your SQL VMs

$subname ='PUBLIC'
# Provide the Subnet name in which the SQL Nodes are placed

$ILBIP = '10.0.0.201'
# Provide the IP address for the Listener or Load Balancer

$subnet = Get-AzureVirtualNetwork -ResourceGroupName $ResourceGroupName | Get-AzureVirtualNetworkSubnetConfig -Name $subname

$FEConfig = New-AzureLoadBalancerFrontendIpConfig -Name $FrontEndConfigurationName -PrivateIpAddress $ILBIP -SubnetId $subnet.Id

$BackendConfig = New-AzureLoadBalancerBackendAddressPoolConfig -Name $BackendConfigurationName

#create the ILB
New-AzureLoadBalancer -Name $LoadBalancerName -ResourceGroupName $ResourceGroupName -Location $Location -FrontendIpConfiguration $FEConfig -BackendAddressPool $BackendConfig

 

Now that the ILB is created, we should see it in the Azure Portal if we list all the objects in our Resource Group as shown below.

The rest of the configuration I'm sure can also be done through PowerShell, but I'm going to use the GUI in my example. If you want to use PowerShell you could probably piece together the script by looking at the article Get started configuring internal load balancer using Azure Resource Manager, but honestly that article gives me a headache. I'll figure it out some day and try to document it in a user-friendly format, but for now I think the GUI is fine for the next steps.

Follow along with the screen shots below. If you get lost, follow the navigation hints at the top of the Azure Portal to figure out where we are.

Click the Backend Pool settings tab and select the backend pool to update it with the Availability Set and Virtual Machines. Save your changes.


Configure the Load Balancer's probe by clicking Add on the Probes tab. Give the probe a name and configure it to use TCP port 59999. I have left the probe interval and the unhealthy threshold at the default settings, which means it can take up to 10 seconds after a failover before the ILB stops sending traffic to the old node, so your clients may take up to 10 seconds to be redirected to the new active node. Be sure to save your changes.

Navigate to the Load Balancing Rules tab and add a new rule. Give the rule a sensible name (SQL1433 or something) and choose the TCP protocol, port 1433 (assuming you are using the default instance of SQL Server). Choose 1433 for the Backend port as well. For the Backend Pool choose the pool we created earlier (BE), and for the Probe choose the probe we created earlier. We do not want to enable Session Persistence, but we do want to enable Floating IP (Direct Server Return). I have left the idle timeout at the default setting, but you might want to consider increasing it to the maximum value, as I have seen some applications such as SAP log error messages each time the connection is dropped and needs to be re-established.
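
If you would rather script the probe and load balancing rule as well, the sketch below uses what I believe are the matching pre-1.0 AzureResourceManager cmdlet names (they gained an "Rm" prefix in Azure PowerShell 1.0 and later), so treat it as an untested starting point and verify the cmdlets against your module version.

# Sketch only - cmdlet names assume the pre-1.0 AzureResourceManager module used earlier in this post
$ilb = Get-AzureLoadBalancer -Name $LoadBalancerName -ResourceGroupName $ResourceGroupName
# Health probe on TCP 59999, the same port the cluster IP resource will answer on
$ilb | Add-AzureLoadBalancerProbeConfig -Name 'SQLProbe' -Protocol Tcp -Port 59999 -IntervalInSeconds 5 -ProbeCount 2 | Set-AzureLoadBalancer
# Load balancing rule for 1433 with Floating IP (Direct Server Return) enabled
$ilb = Get-AzureLoadBalancer -Name $LoadBalancerName -ResourceGroupName $ResourceGroupName
$ilb | Add-AzureLoadBalancerRuleConfig -Name 'SQL1433' -Protocol Tcp -FrontendPort 1433 -BackendPort 1433 -EnableFloatingIP -FrontendIpConfiguration $ilb.FrontendIpConfigurations[0] -BackendAddressPool $ilb.BackendAddressPools[0] -Probe $ilb.Probes[0] | Set-AzureLoadBalancer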

At this point the ILB is configured and there is only one final step that needs to take place. We need to update the SQL IP cluster resource, just the same way we had to in the Classic deployment model. To do that you will need to run the following PowerShell script on just one of the cluster nodes. And take note: SubnetMask="255.255.255.255" is not a mistake; use the 32-bit mask regardless of what your actual subnet mask is.

# This script should be run on the primary cluster node after the internal load balancer is created
# Define variables

$ClusterNetworkName = "Cluster Network 1"
# the cluster network name

$IPResourceName = "SQL IP Address 1 (SQLCluster1)"
# the IP Address resource name

$CloudServiceIP = "10.0.0.201"
# IP address of your Internal Load Balancer

Import-Module FailoverClusters

# If you are using Windows 2012 or higher, use the Get-Cluster Resource command. If you are using Windows 2008 R2, use the cluster res command which is commented out.

Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{"Address"="$CloudServiceIP";"ProbePort"="59999";"SubnetMask"="255.255.255.255";"Network"="$ClusterNetworkName";"OverrideAddressMatch"=1;"EnableDhcp"=0}

# cluster res $IPResourceName /priv enabledhcp=0 overrideaddressmatch=1 address=$CloudServiceIP probeport=59999 subnetmask=255.255.255.255

I have just one final note. In my initial test I still was not able to connect to the SQL Resource name even after I completed all of the above steps. After banging my head against the wall for a few hours I discovered that for some reason the SQL Cluster Name Resource was not registered in DNS. I'm not sure how that happened or whether it will happen consistently, but if you are having trouble connecting I would definitely check DNS and add the SQL cluster name and IP address as a new A record if it is not already in there.
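
If you do need to add the record yourself, one line on the domain controller does it; the zone name below is a placeholder for your own AD-integrated DNS zone:

# Zone name is a placeholder - substitute your own AD-integrated zone
Add-DnsServerResourceRecordA -ZoneName "yourdomain.local" -Name "SQLCluster1" -IPv4Address "10.0.0.201"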

And of course, don't forget the good ole Windows Firewall. You will have to make exceptions for ports 1433 and 59999, or just turn it off until you get everything configured properly, like I did. You probably want to leverage Azure Network Security Groups instead of the local Windows Firewall anyway, for a more unified experience across all your Azure resources.

Good luck and let me know how you make out.


Three Fault Domains in #Azure now default when using Resource Manager deployment model

After being away from Azure for a month or two this summer, I decided to fire up the Azure Portal to see what changes had been implemented recently as I prepare for my PASS presentation on Azure SQL Server high availability. I was extremely happy to see that they have finally started offering three Fault Domains per Availability Set as the default setting if you choose "Resource Manager" as your deployment model instead of "Classic".

If you have been following along, up until now when you created an Availability Set the default option was to create two Fault Domains per Availability Set. When deploying a cluster, it is important to have a minimum of three Fault Domains, one for each cluster node and one for your File Share Witness. This ensures that a failure of a single fault domain never impacts more than one of your quorum votes at any given time. Before this feature was implemented in the GUI there was a way to do it through an ARM Template, but putting it in the GUI makes it easy for those administrators not quite up to speed on ARM templates.

This feature now completes the steps I documented earlier on how to create a SQL Server FCI in Azure.


Clustering SAP ASCS instance on Azure

Microsoft published a blog post and white paper on clustering an SAP ASCS instance using Windows Server Failover Clustering on the Microsoft Azure public cloud. It describes how to install and configure a high-availability (HA) SAP central services instance (ASCS) in a Windows Server Failover Cluster (WSFC) on Microsoft Azure.

Download the white paper here http://go.microsoft.com/fwlink/?LinkId=613056

 

    


Protecting Business Critical Apps By @DaveBerm | @CloudExpo [#Cloud]

Protecting Business Critical Apps By @DaveBerm | @CloudExpo [#Cloud] — Gartner predicts that the bulk of new IT spending by 2016 will be for cloud platforms and applications and that nearly half of large enterprises will have cloud deployments by the end of 2017. The benefits of the cloud may be clear for applications that can tolerate brief periods of downtime, but for critical applications like SQL Server, Oracle and SAP, companies need a strategy for HA and DR protection. While traditional SAN-based clusters are not possible in these environments, SANless clusters can provide an easy, cost-efficient alternative.

Read more at http://www.sys-con.com/node/3334102/blog#

Attend my session at Cloud-Expo on June 11th at the Javits Center in NYC – http://www.cloudcomputingexpo.com/event/session/2854

 


Why would you want to build a #SQLServer failover cluster instance in the #Azure cloud?

There was an interesting discussion happening today in the Twitterverse. Basically, someone asked the question "Has anyone set up a SQL Server AlwaysOn Failover Cluster Instance in Azure?" The ensuing conversation involved some well-respected SQL Server experts, which led to the following question: "Why would you want to build a SQL Server AlwaysOn Failover Cluster instance in the cloud?"

That question could be interpreted in two ways: “Why do you need High Availability in the Cloud” or “Why wouldn’t you use AlwaysOn Availability Groups instead of Failover Cluster Instances?”

Let’s address each question one at a time.

Question 1 – Why do you need High Availability in the Azure Cloud?

  • You might think that just because you host your SQL Server instance in Azure, you are covered by their 99.95% uptime SLA. If you think that, you would be wrong. In order to take advantage of the 99.95% SLA you have to have at least two instances of SQL running in an Availability Set. With a single instance of SQL running, you can expect that, at a minimum, there will be downtime during maintenance periods, and you are also susceptible to unplanned failures.
  • Two instances of SQL Server cannot generally be load balanced, so you have to implement some sort of mechanism to keep the servers in sync and to ensure that if there is a problem with one of the servers, the other server will be able to continue to service the requests. High Availability solutions like AlwaysOn Availability Groups, AlwaysOn Failover Cluster Instances and even the deprecated Database Mirroring can provide high availability for SQL Server in that scenario. Other solutions like log shipping and transactional replication may be able to help keep data synchronized between servers, but they are not typically considered high availability solutions and will not ensure the availability of your SQL Server.
  • Microsoft does occasionally need to perform maintenance on Azure that could bring down an entire Upgrade Domain and all the instances running in that Upgrade Domain. You don't have any say on when this will happen, so you need to have a mechanism in place to ensure that if they do have to bring down your primary SQL Server instance, your secondary SQL Server instance will take over the workload without missing a beat. All of the high availability solutions mentioned above can ensure that you will continue to run in the event that Microsoft is doing maintenance on the Upgrade Domain of your primary server. Microsoft will only do maintenance on a single Upgrade Domain at a time, ensuring that your secondary server will still be online, assuming you put them both in the same Availability Set.
  • What do you do if YOU want to perform maintenance on your production SQL Server? Maybe you want to install a Service Pack or other hotfix? Without a secondary server to fail over to, you will have to schedule planned downtime. One of the primary benefits of any high availability solution is the ability to do rolling upgrades, minimizing the impact of planned downtime.

Question 2 – Why wouldn’t you use AlwaysOn Availability Groups instead of Failover Cluster Instances?

  • Save Money! SQL Server AlwaysOn Availability Groups requires Enterprise Edition of SQL Server. Why not save money and deploy SQL Server Standard Edition and build a simple 2-node Failover Cluster Instance? Unless you need Enterprise Edition for some other reason, this is a no brainer.
  • Protect the ENTIRE SQL Server instance. AlwaysOn Availability Groups only protects user defined databases; you cannot protect the System and MSDB databases. If you build a Failover Cluster Instance instead, you are protecting the ENTIRE instance, including the System and MSDB databases.
  • Ease of Administration. In Azure, you are limited to just one client listener. This limits you to just one Availability Group. In contrast, with a Failover Cluster Instance one client listener is all you need, so there is no limitation.
  • Worker Thread Exhaustion. With AlwaysOn AG you have to keep an eye on the available worker threads. The available worker threads limit the number of databases you can protect with AlwaysOn AG. In contrast, AlwaysOn Failover Clustering with DataKeeper block level replication does not consume more resources for each database you add, meaning you can scale to protect hundreds of databases without the additional overhead associated with AlwaysOn AG.
  • Distributed Transaction Support. AlwaysOn AG does not support distributed transactions (DTC), so if your application requires DTC support you are going to have to look at an AlwaysOn Failover Cluster Instance instead.
  • Support of Other Replication Technologies. If you plan on setting up Peer to Peer replication between two databases protected by AlwaysOn AG you can forget about it. In fact, there are many restrictions you have to be aware of once you deploy AlwaysOn Availability Groups. AlwaysOn FCI’s do not have any of those restrictions.

Knowing what you know above, shouldn’t the question really be “Why would I want to implement AlwaysOn AG in the Cloud when I can have a much more robust and inexpensive solution building an AlwaysOn Failover Cluster instance?”

If you are interested in building an AlwaysOn Failover Cluster Instance in Azure, check out my blog post Step-by-Step: How to configure a SQL Server Failover Cluster Instance (FCI) in Microsoft Azure IaaS #SQLServer #Azure #SANLess

You can also check out the only Azure Certified HA solution in the Azure Marketplace at http://azure.microsoft.com/en-us/marketplace/partners/sios-datakeeper/sios-datakeeper-8-bring-your-own-license/


New Azure Announcements: Azure for Disaster Recovery and Very Large VMs #azure

Azure continues to improve their services. Just in the past week they announced their new "G-Series" VMs, some of the most powerful VMs offered by any cloud provider. Check out the details here…

http://azure.microsoft.com/blog/2015/01/08/largest-vm-in-the-cloud/

Also, if you have been considering using Azure as a DR site as I describe in my blog post, https://clusteringformeremortals.com/2014/01/14/creating-a-multi-site-cluster-in-windows-azure-for-disaster-recovery-azure-cloud/ you will be happy to know that you can get a 6 month trial of ExpressRoute for FREE! ExpressRoute will help you connect your on-premise datacenter into Azure with a fast secure connection. Check out the announcement here…

http://azure.microsoft.com/blog/2015/01/12/azure-data-center-migration-just-got-easier/


Step-by-Step: How to configure a SQL Server Failover Cluster Instance (FCI) in Microsoft Azure IaaS #SQLServer #Azure #SANLess

7/19/2016 Update – The steps below describe a deployment in Azure "Classic". If you are deploying a SQL cluster in Azure Resource Manager (ARM) then you should see my article here. https://clusteringformeremortals.com/2016/04/23/deploying-microsoft-sql-server-2014-failover-clusters-in-azure-resource-manager-arm/

Before we begin, we are going to assume that you are at least slightly familiar with failover clustering and Microsoft Azure and have already signed up for an Azure account. Throughout this step-by-step guide we will refer to additional resources for further reading. Included in this guide are screen shots and code examples. Azure is a rapidly developing product, so your experience may be different than that described, but you should be able to adapt and adjust as needed. I will attempt to keep this article up to date by adding additional comments as time progresses. The new Azure Portal is still in the Preview stage as of the writing of this article, so we will use the currently supported portal along with PowerShell in all of our examples.

At a high level, these are the steps that need to be taken in order to create a highly available SQL Server deployment on Azure IaaS. If you already have a functioning domain in Azure IaaS you can skip items 1-3.

We will take a closer look at each of these steps below.

Overview

These instructions assume you want to create a highly available SQL Server deployment entirely within one Azure region. It is entirely possible to configure SQL Server clusters that span different geographic regions within Azure, or even hybrid cloud configurations that span from on-premises to the Azure Cloud or vice versa. It is not my intent to cover those types of configurations in this document. Instead, the configuration I will focus on is illustrated in Figure 1.

Figure 1 – SQL Server Failover Cluster in Azure

This article will describe how to create a cluster that spans two different Fault Domains and Update Domains within an Azure region. Spanning different Fault Domains minimizes the impact of unplanned downtime, while spanning different Update Domains minimizes the impact of planned downtime.

For additional overview information, you may want to watch the webinar I did with MSSQLTips that discusses this topic in detail. It can be viewed at http://www.mssqltips.com/sql-server-video/360/highly-available-sql-server-cluster-deployments-in-azure-iaas/

Create your Virtual Network

In order for this to work, you will need to have all of your VMs within a Virtual Network. Creating a Virtual Network is pretty straight forward. The screen shots below should help guide you through the process.

At this point I like to add the Google DNS Server address of 8.8.8.8. I have experienced weird connectivity issues when trying to download updates from Microsoft when using their default DNS servers. After we have downloaded all of the updates that these servers need we will come back and replace the DNS server IP address with the IP address of our AD controller. But for now, add 8.8.8.8 and all of your VMs provisioned in this Virtual Network will receive this as a DNS server via the DHCP service. This forum post describes the problem I have experienced without adding this DNS server entry. Before adding all of your servers to the domain I have found that you need to delete this 8.8.8.8 address and replace it with the IP address of the first domain controller that you create.

You will see that I created one subnet in this Virtual Network and labeled it Public. Later on when we create our VMs we will use the Public network. While Azure recently added support for multiple NICs per VM, I have found that adding multiple subnets and NICs to an Azure VM can be problematic. The main problem is that each NIC is automatically assigned a Gateway address, which can cause routing problems due to multiple Gateways being defined on the same server.

It will take a few minutes for your Virtual Network to be created.

Create a Cloud Service

Your VMs will all reside in the same "Cloud Service". Good luck finding the definition of an Azure "Cloud Service", since Azure overall is a "cloud service". However, this is a very specific thing in Azure IaaS that you need to create before you start deploying VMs. The screen shots below will walk you through the process.

Make sure you put the Cloud Service in the same Region as your Virtual Network.

Create a Storage Account

Before you begin provisioning VMs you must create a Storage Account. Follow the steps below to create a storage account.

Make sure you create a Storage Account in the same Location as your Virtual Network

Create your Azure VMs and Storage

If you have not downloaded and installed Azure PowerShell yet, do that now. Also, make sure you set your default subscription and CurrentStorageAccountName.
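
If you have not set those defaults yet, something like this will do it (substitute your own subscription name and storage account name):

# Substitute your own subscription name and storage account name
Select-AzureSubscription -SubscriptionName "MSDN Azure"
Set-AzureSubscription -SubscriptionName "MSDN Azure" -CurrentStorageAccountName "mystorageaccount"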

We will start with provisioning the first VM which will become the Domain Controller (DC). In our example, we will also use the DC as a file share witness, so we will create an Availability Set that will include the Domain Controller and the two nodes in the cluster. The following is an example script which will create the VM and assign it a “Static Address”.

$AVSet="SQLHA"

$InstanceSize="Large"

$VMName="DC1"

$AdminName="myadminaccount"

$AdminPassword="mypassword"

$PrimarySubnet="Public"

$PrimaryIP="10.0.0.100"

$CloudService="SQLFailover"

$VirtualNetwork="Azure-East"

$ImageName="a699494373c04fc0bc8f2bb1389d6106__Windows-Server-2012-R2-201412.01-en.us-127GB.vhd"

$image = Get-AzureVMImage -ImageName $ImageName

$vm = New-AzureVMConfig -Name $VMName -InstanceSize $InstanceSize -Image $image.ImageName -AvailabilitySetName $AVSet

Add-AzureProvisioningConfig -VM $vm -Windows -AdminUserName $AdminName -Password $AdminPassword

Set-AzureSubnet -SubnetNames $PrimarySubnet -VM $vm

Set-AzureStaticVNetIP -IPAddress $PrimaryIP -VM $vm

New-AzureVM -ServiceName $CloudService -VNetName $VirtualNetwork -VM $vm

Tech Note – I say “Static IP Address”, but it really just creates a DHCP “Request”. I call it a DHCP “Request” and not “Reservation” because it really is only a best effort request. If this server is offline and someone starts a new server, the DHCP server could hand out this address to someone else, making it unavailable when this server is turned on.
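
If you want to check whether an address is still free before you request it, the classic Azure PowerShell module has a cmdlet for that:

# Returns IsAvailable, plus suggested alternatives if the address is already taken
Test-AzureStaticVNetIP -VNetName "Azure-East" -IPAddress "10.0.0.100"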

Once you create your 1st VM you are ready to create the two SQL VMs used in the cluster. You will see that I tried to make the script easy to use by allowing you to specify the different variables. I highlighted the variables you need to change for each VM.

$AVSet="SQLHA"

$InstanceSize="Large"

$VMName="SQL1"

$AdminName="myadminaccount"

$AdminPassword="P@55w0rd"

$PrimarySubnet="Public"

$PrimaryIP="10.0.0.101"

$CloudService="SQLFailover"

$VirtualNetwork="Azure-East"

$ImageName="a699494373c04fc0bc8f2bb1389d6106__Windows-Server-2012-R2-201412.01-en.us-127GB.vhd"

$image = Get-AzureVMImage -ImageName $ImageName

$vm = New-AzureVMConfig -Name $VMName -InstanceSize $InstanceSize -Image $image.ImageName -AvailabilitySetName $AVSet

Add-AzureProvisioningConfig -VM $vm -Windows -AdminUserName $AdminName -Password $AdminPassword

Set-AzureSubnet -SubnetNames $PrimarySubnet -VM $vm

Set-AzureStaticVNetIP -IPAddress $PrimaryIP -VM $vm

New-AzureVM -ServiceName $CloudService -VNetName $VirtualNetwork -VM $vm

Run the script once again to provision the 2nd cluster node

$AVSet="SQLHA"

$InstanceSize="Large"

$VMName="SQL2"

$AdminName="myadminaccount"

$AdminPassword="mypassword"

$PrimarySubnet="Public"

$PrimaryIP="10.0.0.102"

$CloudService="SQLFailover"

$VirtualNetwork="Azure-East"

$ImageName="a699494373c04fc0bc8f2bb1389d6106__Windows-Server-2012-R2-201412.01-en.us-127GB.vhd"

$image = Get-AzureVMImage -ImageName $ImageName

$vm = New-AzureVMConfig -Name $VMName -InstanceSize $InstanceSize -Image $image.ImageName -AvailabilitySetName $AVSet

Add-AzureProvisioningConfig -VM $vm -Windows -AdminUserName $AdminName -Password $AdminPassword

Set-AzureSubnet -SubnetNames $PrimarySubnet -VM $vm

Set-AzureStaticVNetIP -IPAddress $PrimaryIP -VM $vm

New-AzureVM -ServiceName $CloudService -VNetName $VirtualNetwork -VM $vm

You see that each of these VMs are all placed in the same Availability Set, which I called “SQLHA”. By placing the VMs in the same Availability Set you take advantage of Fault Domains and Update Domains as described here. http://blogs.technet.com/b/yungchou/archive/2011/05/16/window-azure-fault-domain-and-update-domain-explained-for-it-pros.aspx

Once you have created your VMs your Azure Portal should look like this.

Words of Wisdom about Fault Domains

Fault Domains are a great concept; however, Microsoft dropped the ball by not guaranteeing (as of January 2015) that you will always get three fault domains per Availability Set. In fact, most of the time I only get two Fault Domains. If you wind up with just two Fault Domains you will want to consider putting your File Share Witness in a different region just to be 100% sure that you don't have a majority of your cluster votes sitting in the same rack. Once Windows Server 10 is GA this will no longer be a problem, as you will be able to use a Cloud Witness instead of a File Share Witness. If you would like to see three Fault Domains become the standard, follow this link and VOTE for that idea on the Azure ideas website.

Configure Active Directory

First we will connect to DC1 via RDP and enable Active Directory. Use the "Connect" button to download the RDP connection to DC1. Use the username and password that you specified when you created your Azure VM. Promote DC1 to a Domain Controller.

Insider Tip – I have also found that DNS resolution works best if you remove all DNS forwarders on the DNS server and just use root hints. Azure can sometimes have problems resolving Microsoft web properties if you use their DNS servers as forwarders.

Figure 2 – Remove all forwarders for reliable name resolution
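
If you prefer to clear the forwarders from PowerShell instead of the DNS console, a quick sketch (run on the domain controller, assuming the Windows Server 2012 R2 DnsServer module):

# Remove any configured forwarders so the DNS server uses root hints instead
(Get-DnsServerForwarder).IPAddress | ForEach-Object { Remove-DnsServerForwarder -IPAddress $_ -Force }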

Create a Cluster

Once you have configured DC1 as a Domain Controller, you will connect to SQL1 and SQL2 and add them to the domain. However, before you do that, you will need to change the DNS Server of the Virtual Network to that of the DC1 Server (10.0.0.100) and reboot both SQL1 and SQL2. Once SQL1 and SQL2 have 10.0.0.100 as their DNS Server you will be able to join the domain.

Once you are joined to the domain you will have to complete the steps illustrated below to create a SQL Server Failover Cluster Instance (FCI).

First, enable .Net 3.5 Framework on each node.

If you find that .Net Framework cannot be installed, refer to my tip about DNS.

Enable Failover Cluster

Now that .Net 3.5 is enabled, you will then need to enable the Failover Cluster Feature on both SQL1 and SQL2.
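
Both features (.NET 3.5 and Failover Clustering) can also be added from an elevated PowerShell prompt on each node:

# Run on both SQL1 and SQL2; add -Source <path to sources\sxs> if the .NET 3.5 payload cannot be downloaded
Install-WindowsFeature NET-Framework-Core, Failover-Clustering -IncludeManagementTools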

Validation

Once the cluster feature is enabled, you will need to create the cluster. The first step is to run cluster Validation. I am assuming you are familiar with clustering, so I will just post a few of the screen shots and make note of things you need to consider.

The Validation will complete, but there will be some warnings. Most of the warnings will be around storage. You can ignore those as we will be using replicated storage rather than a shared disk. Also, you may get a warning about the network. You can also ignore that warning as we know that Azure has network redundancy built in at the physical layer.

Create Cluster Access Point

11/24/2015 UPDATE – I have found that creating the cluster via PowerShell avoids all the issues described in the GUI steps shown below, because you can specify the IP address of the cluster as part of the creation process. The two PowerShell commands below replace all the steps shown in the GUI screen shots that follow in this section. Make sure the StaticAddress parameter specifies an IP address in your subnet that is not already in use.

Test-Cluster -Node Server1, Server2

New-Cluster -Name MyCluster -Node Server1, Server2 -StaticAddress 10.0.0.200

If you ran the PowerShell script above then you can skip the rest of this section and jump right to the next section on creating the file share witness.

To create the cluster through the GUI instead, first choose a name for the cluster, then click Finish to start the cluster creation process.

You will see that there are some warnings if you click View Report. You can ignore the warning as we will be creating a File Share Witness.

You may get the following message after the cluster creates: "The action 'Validate Configuration…' did not complete."

Fix Cluster Access Point IP Address

The underlying issue here is that the cluster is not resolving the cluster name properly due to an IP address conflict. What you will find is that Azure DHCP actually gives out a duplicate IP address to the cluster computer object that you just created. This is just one of the weird Azure idiosyncrasies that you will have to deal with as shown below.

You may need to open the Failover Cluster GUI on SQL2 in order to connect. Once you are able to connect to the cluster you will see that the cluster grabbed the same IP address as one of the cluster nodes. This of course causes IP address conflicts.

What we need to do is change the 10.0.0.102 IP address to another IP address not used in this subnet.

You will see I picked 10.0.0.200 as my address. This address is NOT reserved in the DHCP scope as there is currently no way to control the DHCP scope or add reservations. I just pick an address at the upper end of the DHCP scope and make sure that I don’t provision enough VMs within this subnet to ever reach that IP address.

Now that the Cluster IP address is fixed you will be able to connect to the cluster using Failover Cluster Manager from either node.

Create File Share Witness

Next we will create a File Share Witness for the cluster quorum. For a complete description of cluster quorums read my blog post on MSDN press, http://blogs.msdn.com/b/microsoft_press/archive/2014/04/28/from-the-mvps-understanding-the-windows-server-failover-cluster-quorum-in-windows-server-2012-r2.aspx

The file share witness will be created on the Domain Controller. Essentially you need to create a file share on DC1 and give read/write permissions to the cluster computer account “sioscluster”. Make sure to make these changes to both the Share and Security permissions as shown below.

The following steps are done on DC1.

Create a new folder.

Make sure you search for Computer objects and pick the cluster computer object name, in our case, SIOSCLUSTER

Make sure you give it Change permissions.

You also need to change the Security to allow the cluster computer object Modify permissions on the folder.

Once you create the shared folder, you will add the File Share Witness using the Windows Server Failover Cluster interface on either of the nodes as shown below.
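
If you would rather script the share, a rough equivalent is below. The folder and share names are examples, and note that the cluster computer account needs the trailing $ when you grant it permissions this way; the quorum change at the end is run from either cluster node.

# On DC1 - create the folder and share, granting the cluster computer account the required rights
New-Item -Path C:\FSW -ItemType Directory
New-SmbShare -Name "FSW" -Path "C:\FSW" -ChangeAccess "YOURDOMAIN\sioscluster$"
icacls C:\FSW /grant "YOURDOMAIN\sioscluster$:(OI)(CI)M"
# On SQL1 or SQL2 - point the cluster quorum at the new file share witness
Set-ClusterQuorum -NodeAndFileShareMajority "\\DC1\FSW"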

Install DataKeeper

DataKeeper Cluster Edition from SIOS Technology is needed in order to provide the replication and cluster integration that will allow you to build a failover cluster instance without shared storage. First, you will install DataKeeper Cluster Edition on both of the nodes of your cluster. Run through the setup as shown below.

For demonstration purposes I used the domain administrator account. The only requirement is that the user account used is in the local administrator group in each server.

Create a DataKeeper Volume Resource

After you install the software on each cluster node (SQL1 and SQL2) you are ready to create your first replicated volume resource. Launch the DataKeeper GUI on either node and follow the steps below to create a DataKeeper Volume Resource.

After you connect to both servers, click on the Server Overview Report. It should look like the following.

You’ll notice that you are connected to both servers, but there are no volumes listed. Next we will need to add additional storage to each cluster node. Do this through the Azure portal as shown below.

After you have added the additional volume to each VM and created a formatted partition, your DataKeeper GUI should look like this.

You are now ready to launch the Create a Job Wizard and create the DataKeeper Volume resource as shown below.

Create the job and give it a name and optional description.

Install SQL into the Cluster

Now that you have the cluster configured and a DataKeeper Volume in Available Storage, you are ready to begin the SQL Server Cluster Installation. This process is exactly the same as if you were installing a SQL Server Failover Cluster Instance using shared storage. Since the DataKeeper Replicated Volume resource is a Storage Class resource, failover clustering treats it like a Physical Disk resource. Follow the steps pictured below to install SQL Server into the cluster.

You can use SQL Server 2014 Standard Edition to build a 2-node Failover Cluster. In this scenario DataKeeper can also replicate Data to a 3rd node, but that node cannot be part of the cluster. If you want to create a 3+ node cluster you will need to use SQL Server 2014 Enterprise Edition. Earlier versions of SQL work perfectly fine as well. I have tested SQL 2008 through SQL 2014.

Before clicking Next, click on the Data Directories tab.

Once SQL is installed on the first node, you will then need to run the installation on the second node.

Create an Internal Load Balancer

Once the cluster is configured, you will need to create the internal load balancer (ILB) which will be used for all client access. Clients that connect to SQL Server will need to connect to the ILB instead of connecting directly to the cluster IP address. If you were to try to connect to the cluster directly at this point you would see that you cannot connect to the cluster from any remote system. Even SQL Server Management Studio will not be able to connect to the cluster directly at this point.

Run this PowerShell command from your local desktop to create your Internal Load Balancer (ILB).

# Define variables

$IP = "10.0.0.201" # IP address you want your Internal Load Balancer to use; this should be the same address as your SQL Server Cluster IP Address

$svc="SQLFailover" # The name of your cloud service

$vmname1="sql1" #The name of the VM that is your first cluster node

$epname1="sql1" #This is the name you want to assign to the endpoint associated with the first cluster node, use anything you like

$vmname2="sql2" #The name of the VM that is your second cluster node

$epname2="sql2" #This is the name you want to assign to the endpoint associated with the second cluster node, use anything you like

$lbsetname="ilbsetsqlha" #use whatever name you like, this name is insignificant

$prot="tcp"

$locport=1433

$pubport=1433

$probeport=59999

$ilbname="sqlcluster" #this is the name your clients connect to, it should coincide with your SQL cluster Name Resource

$subnetname="Public" #the name of the Azure subnet where you want the internal load balancer to live

# Add Internal Load Balancer to the service

Add-AzureInternalLoadBalancer -InternalLoadBalancerName $ilbname -SubnetName $subnetname -ServiceName $svc -StaticVNetIPAddress $IP

# Add load balanced endpoint to the primary cluster node

Get-AzureVM -ServiceName $svc -Name $vmname1 | Add-AzureEndpoint -Name $epname1 -LBSetName $lbsetname -Protocol $prot -LocalPort $locport -PublicPort $pubport -ProbePort $probeport -ProbeProtocol tcp -ProbeIntervalInSeconds 10 -DirectServerReturn $true -InternalLoadBalancerName $ilbname | Update-AzureVM

# Add load balanced endpoint to the secondary cluster node

Get-AzureVM -ServiceName $svc -Name $vmname2 | Add-AzureEndpoint -Name $epname2 -LBSetName $lbsetname -Protocol $prot -LocalPort $locport -PublicPort $pubport -ProbePort $probeport -ProbeProtocol tcp -ProbeIntervalInSeconds 10 -DirectServerReturn $true -InternalLoadBalancerName $ilbname | Update-AzureVM

Assuming the script ran as planned, you should see the following output.

Update the Client Listener

Once the internal load balancer is created we will need to run a PowerShell script on SQL1 to update the SQL Server cluster IP address. The script references the Cluster Network name and the IP Resource Name. The pictures below show you where to find both of these names in Failover Cluster Manager.

The script below should be run on one of the cluster nodes. Make sure to launch Powershell ISE using Run as Administrator.

# This script should be run on the primary cluster node after the internal load balancer is created

# Define variables

$ClusterNetworkName = "Cluster Network 1" # the cluster network name

$IPResourceName = "SQL IP Address 1 (sqlcluster)" # the IP Address resource name

$CloudServiceIP = "10.0.0.201" # IP address of your Internal Load Balancer

Import-Module FailoverClusters

# If you are using Windows 2012 or higher, use the Get-Cluster Resource command. If you are using Windows 2008 R2, use the cluster res command which is commented out.

Get-ClusterResource $IPResourceName | Set-ClusterParameter -Multiple @{"Address"="$CloudServiceIP";"ProbePort"="59999";"SubnetMask"="255.255.255.255";"Network"="$ClusterNetworkName";"OverrideAddressMatch"=1;"EnableDhcp"=0}

# cluster res $IPResourceName /priv enabledhcp=0 overrideaddressmatch=1 address=$CloudServiceIP probeport=59999 subnetmask=255.255.255.255

Assuming your script ran as expected, the output should look like the example below. Note that in order for the changes to be applied, you will need to take the cluster IP resource offline once and then bring it back online.
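
You can do that from Failover Cluster Manager, or with a couple of cluster cmdlets in the same PowerShell session:

# Cycle the IP resource so the new probe settings take effect, then bring the SQL role back online
Stop-ClusterResource $IPResourceName
Start-ClusterResource $IPResourceName
Start-ClusterResource "SQL Server (MSSQLSERVER)"   # adjust to match your SQL Server cluster resource name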

Firewall

Make sure TCP ports 1433, 1434 and 59999 are open on the Windows Firewall of each server.
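
One rule per node covers all three ports if you want to script it:

# Run on each cluster node
New-NetFirewallRule -DisplayName "SQL Server cluster ports" -Direction Inbound -Protocol TCP -LocalPort 1433,1434,59999 -Action Allow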

Summary

Now that the cluster is created you can connect to the SQL Failover Cluster Instance via the Internal Load Balancer, using the name sqlcluster or directly to 10.0.0.201.
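
A quick way to confirm that client redirection is working is to query the listener name from another VM in the vNet, for example with sqlcmd and Windows authentication:

# Should return the name of the node currently hosting the SQL Server cluster instance
sqlcmd -S sqlcluster -E -Q "SELECT @@SERVERNAME"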

If you have any questions at all, I will be glad to give you a helping hand. Tweet me @daveberm and I’ll be sure to get in touch with you.

For more information visit https://us.sios.com/solutions/cloud-high-availability/azure/


#Azure as a Cloud Backup Service for your Windows Clients

Just announced this week, Azure can now be used as a backup target for Windows 7/8 client backups. The first 5 GB are free, then it is $0.20 per GB. You can back up about 105 GB (compressed) per month for about $20. That is half the price of MozyHome. What do you all use for your offsite backups for your home PC?

http://azure.microsoft.com/blog/2014/12/16/azure-backup-announcing-support-for-windows-client-operating-system/

 
