January 2013 – Clustering For Mere Mortals

Have you been thinking about moving to the cloud? The potential cost savings make it nearly impossible not to consider. The cost justification is usually easy to work out, and the cloud almost always comes out looking like a good investment. However, once you stop counting the money you are going to save, you start thinking about things like security and availability and wonder whether the cloud is for you.

In a traditional data center you are in control and can deploy whatever security and high availability solution you like. However, once you decide to move your servers to the cloud your choices can become much more limited. Whether you’re with Amazon, Google or Microsoft, outages in the cloud can and do occur, and you need to do whatever you can to mitigate such risks.

Let’s take a closer look at Amazon Web Services (AWS) for instance. What are the options you have to ensure that your SQL Server database can survive an unexpected outage? While some applications can be deployed in a load balanced configuration across multiple availability zones, SQL Server is generally not deployed in a load balanced configuration. What this means is that SQL Server itself resides in a single availability zone and if that zone should become unavailable, your whole application stack can come to a grinding halt.

If you read this article by Miles Ward, you will see that with SQL Server 2008 R2 your availability options are pretty limited. The chart on page 11 of that article lays out your HA options. As you will see, they are severely limited and mostly fall outside of what would be described as HA. Log shipping, mirroring and transactional replication are pretty much the only options you have, and they are data protection options more than HA options. If you want Microsoft failover clustering you will find yourself out of luck, both because of network limitations in AWS (clients can’t connect to a clustered IP address) and because of the lack of the shared disk resource required for traditional SQL clusters.

If you are looking to deploy SQL Server 2012, your options get a little better. As described by Jeremiah Peschka, with a little manual intervention you can deploy AlwaysOn Availability Groups in AWS to do asynchronous replication from your data center to AWS, or even between AWS availability zones. Of course this assumes you have the SQL 2012 Enterprise license required for AlwaysOn Availability Groups. The only “issue” is that AWS really doesn’t support moving a cluster IP address from one server to another, so after a switchover client redirection has to be done manually using the ec2-unassign-private-ip-addresses and ec2-assign-private-ip-addresses commands that Peschka describes in his article. All in all this is a very manual process, which again does not really fit the description of a highly available system.
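
The manual redirection step can be sketched in a few lines. This is only an illustration, not the script from Peschka's article: it assumes a boto3-style EC2 client, whose `unassign_private_ip_addresses` and `assign_private_ip_addresses` calls correspond to the two commands named above. The function name, the network interface IDs and the IP address are all hypothetical.

```python
def move_private_ip(ec2, ip_address, from_eni, to_eni):
    """Move a secondary private IP from one network interface to another.

    `ec2` is assumed to be a boto3-style EC2 client; all IDs are placeholders.
    """
    # Release the address from the old primary's network interface.
    ec2.unassign_private_ip_addresses(
        NetworkInterfaceId=from_eni,
        PrivateIpAddresses=[ip_address],
    )
    # Attach it to the new primary's interface so clients follow the switchover.
    ec2.assign_private_ip_addresses(
        NetworkInterfaceId=to_eni,
        PrivateIpAddresses=[ip_address],
        AllowReassignment=True,
    )
```

With boto3 installed, this might be called as `move_private_ip(boto3.client("ec2"), "10.0.0.50", "eni-11111111", "eni-22222222")` once the availability group has been switched over. Somebody still has to notice the outage and run it, which is exactly the manual step described above.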

If you can live without automated recovery and with the limitations of AlwaysOn Availability Groups that I described in a previous blog post, then you might just want to go ahead and try the AlwaysOn Availability Group deployment in AWS. However, if you are looking for an easier, more affordable, more robust HA solution, I have some really good news. SIOS Technology Corp has been looking at this problem and has developed a solution that overcomes all of the limitations previously described and will be available as an AMI for easy deployment. This solution is currently in private beta, but will be widely available later this year.

The SIOS solution is based on SQL Server in a Microsoft failover cluster using DataKeeper Cluster Edition host-based replication. By using host-based replication they have overcome the first obstacle to clustering in EC2: the lack of shared storage. The second obstacle that SIOS had to overcome was the issue of client redirection described by Peschka; the client access point needs to be manipulated from within EC2, not from within failover clustering. SIOS has built intelligence into their AMI solution such that the reassignment of the IP address is automated as part of the cluster failover process, effectively simulating the behavior you would normally expect from a cluster.

And because all of this is built on top of failover clustering, it can be deployed using SQL 2008/2008 R2 or 2012. Even the Standard Edition of SQL Server supports a 2-node cluster, so the cost savings vs. deploying SQL 2012 AlwaysOn Availability Groups could be substantial.

Let me know what you think. Does this solution sound interesting? What are you doing today to ensure the availability of your SQL Server EC2 instances?

In my previous post I walked through the process of building a 2-node cluster up to the point where we are ready to start configuring the cluster resources. If you have completed those steps you are ready to move on and actually create your clustered application. First up, we have SQL Server 2012. SQL Server 2012 cluster installation is pretty much identical to SQL 2008/2008 R2 cluster installations, so most of this will apply even if you are using SQL 2008/2008 R2. The terminology around SQL Server 2012 Clustering gets a little convoluted. You will hear mention of SQL Server AlwaysOn, which essentially could mean one of two different things: AlwaysOn Availability Groups or AlwaysOn Failover Cluster Instance. The confusion arises because both solutions require some level of integration with Windows Server Failover Clustering and it is even further confused by the fact that you can deploy a combination of AlwaysOn Availability Groups and AlwaysOn Failover Clustering, but that is a topic for another day!

I’ll break it down in easy to understand terms. AlwaysOn Availability Groups is essentially the evolution of what was called Database Mirroring in SQL 2008 R2 and earlier. It has some new bells and whistles that overcome some of the limitations of earlier versions of database mirroring, so it is certainly worth checking out. AlwaysOn Failover Cluster Instance is simply what used to be called a SQL Server Failover Cluster. This is the latest version of the same clustering technology that has been available since early versions of SQL Server. One of the best new features of SQL Server 2012 AlwaysOn Failover Cluster Instance is the ability to have nodes in different subnets, which was a major limitation in earlier versions of SQL Server. In a previous blog entry I discussed some of the limitations of AlwaysOn Availability Groups; you should check that out before you make any decisions on which technology to deploy.

With that said, this article is going to focus on step-by-step instructions for deploying a SQL Server 2012 AlwaysOn Failover Cluster Instance.

Step 1 is to make sure your cluster storage is ready. If you followed the instructions in my previous post, you will know that instead of a shared disk resource, we are going to use a replicated disk resource using the 3rd-party software DataKeeper Cluster Edition. If you are using shared storage and have already added the storage, then you can skip right to Step 2 where we begin the SQL install. Otherwise, follow the steps below to configure DataKeeper Cluster Edition to replicate the local disks for use in a SQL cluster.

Install and configure DataKeeper Cluster Edition
1. Run DK Setup
2. Go through the entire installation process selecting all of the default values.
3. Restart the computer after the installation completes as prompted and repeat the process on the SECONDARY server
4. Launch the DataKeeper UI on PRIMARY and click Connect to Server. Connect to PRIMARY and then connect to SECONDARY
5. Click on Create Job and walk through the Create Job wizard to create a mirror of the E drive
  
  Choose the source volume of the mirror and the IP address of the NIC that will carry the replication traffic.
  
  Choose the target of the mirror and click Next
  
  Here you will choose your mirror options:
  Compression – only enable for replication across a WAN
  Asynchronous – choose this for all WAN replication
  Synchronous – this is ideal for LAN replication
  Maximum bandwidth – used in WAN replication to cap the amount of bandwidth replication is allowed to use. Generally it should be left at 0. However, for initial mirror creation you may want to limit the bandwidth so that the initial synchronization does not use all available bandwidth
  
  Once you click Done the mirror will be created.
  
  Once the mirror is created you will be prompted to register the volume in Windows Server Failover Clustering (WSFC). Click Yes and a new DataKeeper Volume Resource will be registered in Available Storage (see picture in Step 2).
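
The rules of thumb in the mirror options list above can be written down as a small helper. This is a sketch for illustration only and does not call any DataKeeper API; the class, the function and the parameter names are hypothetical, and the bandwidth value is in whatever unit the Create Job wizard uses.

```python
from dataclasses import dataclass


@dataclass
class MirrorOptions:
    synchronous: bool
    compression: bool
    max_bandwidth: int  # 0 means no cap


def recommend_mirror_options(over_wan, initial_sync_cap=0):
    """Apply the guidance from the mirror options list to a LAN or WAN mirror."""
    if over_wan:
        # WAN: asynchronous with compression, optionally capped for the initial sync.
        return MirrorOptions(
            synchronous=False, compression=True, max_bandwidth=initial_sync_cap
        )
    # LAN: synchronous, no compression, no bandwidth cap.
    return MirrorOptions(synchronous=True, compression=False, max_bandwidth=0)
```

For the lab in this walkthrough, `recommend_mirror_options(over_wan=False)` gives a synchronous, uncompressed, uncapped mirror.
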
In Step 2 we are going to begin the installation of SQL Server 2012 on the first cluster node.
1. Before we begin, make sure your storage appears in Failover Cluster Manager and is assigned to the Available Storage group as shown below
2. At this point we are going to launch the SQL Server 2012 setup and go to the Installation Tab and click New SQL Server failover cluster installation
3. Step through the installation as shown in the following screen shots.
  
  The following error is expected if your servers are not connected to the internet. If you are connected to the internet you should go ahead and accept the updates it finds.
  
  For Service Account best practices read the following: http://msdn.microsoft.com/en-us/library/ms143504.aspx
  For our lab purposes I am just using the Administrator account
  
  Before you click Next, click on the Data Directories tab and change the location of tempdb. With SQL Server 2012, tempdb no longer has to reside on the cluster storage. In our example we are moving tempdb to the C drive to avoid replicating unnecessary data.
  
  At this point you will need to make sure to create the same tempdb directory on the SECONDARY server as advised by the warning.
  
  Congratulations, the first cluster node has been installed.
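
The tempdb warning in step 3 is easy to forget, so here is a minimal sketch of creating the matching directory on the other node. It assumes the administrative share (for example `C$`) on SECONDARY is reachable from the machine running the script and that your account has rights to it. The directory name `C:\TempDB` is a placeholder, since the actual path depends on what you chose in the wizard, and both function names are hypothetical.

```python
import os
from pathlib import PureWindowsPath


def admin_share_path(node, local_path):
    # Translate a local Windows path into the same location seen through
    # the node's administrative share (drive letter followed by "$").
    path = PureWindowsPath(local_path)
    drive_letter = path.drive.rstrip(":")
    rest = "\\".join(path.parts[1:])
    return f"\\\\{node}\\{drive_letter}$\\{rest}"


def ensure_tempdb_dir(node, local_path):
    # Create the directory on the remote node if it does not exist yet.
    # Intended to be run from a Windows host that can reach the share.
    os.makedirs(admin_share_path(node, local_path), exist_ok=True)
```

Calling `ensure_tempdb_dir("SECONDARY", r"C:\TempDB")` before adding the second node keeps the instance from failing to start there because the tempdb path is missing.
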
We are now ready to install SQL on the second node of the cluster.
1. Go to the SECONDARY server and launch the SQL Server 2012 Setup and follow the wizard as shown in the following screen shots, starting with clicking on Add node to a SQL Server failover cluster.
  
  The following error is expected if your servers are not connected to the internet. If you are connected to the internet you should go ahead and accept the updates it finds.
Congratulations – you have built a 2-node SQL Server 2012 AlwaysOn Failover Cluster Instance. Open up Failover Cluster Manager and you should see something that looks like this.
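
If you would rather check the result from a script than from Failover Cluster Manager, the instance itself can report its cluster nodes through the `sys.dm_os_cluster_nodes` view. The sketch below assumes any Python DB-API cursor connected to the clustered instance (pyodbc is one option); the function name is hypothetical.

```python
CLUSTER_NODES_QUERY = (
    "SELECT NodeName, status_description, is_current_owner "
    "FROM sys.dm_os_cluster_nodes"
)


def describe_cluster(cursor):
    """Return (current_owner, all_nodes) for the failover cluster instance."""
    cursor.execute(CLUSTER_NODES_QUERY)
    rows = cursor.fetchall()
    nodes = [row[0] for row in rows]
    # is_current_owner is 1/True for the node currently hosting the instance.
    owners = [row[0] for row in rows if row[2]]
    return (owners[0] if owners else None), nodes
```

With pyodbc this might look like `describe_cluster(pyodbc.connect("DRIVER={SQL Server};SERVER=sqlcluster;Trusted_Connection=yes").cursor())`, where `sqlcluster` stands in for the network name you gave the instance during setup. On the cluster built here you would expect both PRIMARY and SECONDARY in the node list.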

This article was meant to be just a quick run through on how to install SQL 2012 in a Windows Server 2012 cluster. For additional reading start here and let Google be your friend!