How to Create a Site-to-Site VPN Tunnel to the Windows Azure Cloud Using a Window Server 2012 R2 Routing and Remote Access (RRAS) Server
Not long ago I set out to build a multisite SQL Server cluster where one my nodes resides in my local data center and the other node resides in Microsoft’s Infrastructure as a Service (IaaS) offering, the Windows Azure Cloud. The Azure Cloud has an offering where you can deploy VMs and pay for just the resources you utilize, much like Amazon’s EC2. My goal was to create a proof of concept where I would use the Azure Cloud as an inexpensive disaster recovery site. My configuration is shown in Figure 1.
1. An example of the simple DR configuration I used in my POC
My on premise VMs are used as follows:
- VM1-internal – Routing and Remote Access Server for NAT and VPN connectivity to the Azure Cloud
- VM2-internal – The primary node in my cluster
- VM3-internal – My domain controller
For this POC I only deployed on server in the Azure cloud, Azure-DR. Azure-DR is the secondary node in my cluster. If this were an actual production site, I certainly would also want to deploy another domain controller in the Azure cloud to ensure that my Active Directory was available in the DR site. Your actual DR configuration will vary greatly depending upon your needs. I will use the server name depicted in my illustration as I describe the configuration steps below.
The scope of this post
For the purpose of this post, I am going to focus on what you need to do to get to the point where you have configured your virtual network in Azure and you create a site to site VPN connection to your primary data center. My next article will discuss the steps required to actually create a multisite cluster for disaster recovery. As with most cloud related services, the interfaces and options tend to change rapidly; the screen shots and directions you see below are relevant as of January 2nd, 2014. Your experience may vary, but these directions should get you pretty darn close. If you encounter difference, please send me a comment and what you did to make it work so other users can benefit from your experience.
Create your Local Network
I’m not going to walk you through this step-by-step, but essentially you should have a Windows Server 2012 R2 DC configured (VM3-internal) and two additional Windows Server 2012 R2 servers in the domain (VM1-internal and VM2-internal). Each server should use the DC server as their primary DNS server and on VM2-internal and VM3-internal the gateway should be configured to point to VM1-internal, which will eventual be configure with Routing and Remote Access (RRAS). The RRAS (VM1-internal) should be dual homed, with one NIC connected to the internal network and one NIC connected directly to the Public network. Generally this will be the biggest obstacle in deploying this in your lab, as you must have a spare public IP address that you can use for your RRAS server. This configuration will not work if your RRAS server sits behind a NAT’s firewall, it must be directly connected to the internet. The RRAS Server should be configured with just the IP address, subnet mask and DNS server, no gateway should be defined. DO NOT enable Routing and Remote Access, this will be done automatically via a script at a later step.
Create a Virtual Network
Log in to the Windows Azure Management Portal and create a new Virtual Network following the steps illustrated below.
When you click the check box you should now see the new virtual network you just created.
Create the Gateway
Once the virtual network is created, you will need to create the Gateway. From the Dashboard of the newly created virtual network, you will be able to create a Gateway as shown below. As of April 25th, 2013, Static Routing with RRAS is not supported in the Azure VPN connection, so be sure to choose Dynamic Routing.
It could take 30 minutes or longer before your gateway is finished being created, be patient…
Once the gateway is finished creating, you will see your Gateway IP Address and the amount of Data In and Data Out as shown below.
Configure your local RRAS Server
At this point you are ready to configure your on-premise RRAS Server (VM1-internal) to create a site-to-site VPN to the Gateway that you just created. Microsoft has made this very easy, so don’t worry if networking and configuring VPNs are not your specialty. You will just need click on “Download VPN Device Script” and run it on your RRAS server. Microsoft also supports a bunch of Juniper and Cisco VPN routers as well, so if you want to move to a hardware based VPN device in the future you can always come back and download the configuration script specific to your device.
Choose Microsoft Corporation as the Vendor, RRAS as the Platform and Windows Server 2012 as the Operating System and click the checkbox to download the Powershell script. In my case, this same script worked just fine when run on Windows Server 2012 R2.
As of the date of this writing, it seems as if Microsoft has made the script creation process even more intelligent than it was just last month. The script that it just created for me was pre-populated with all the information required; I did not have to edit anything at all.
At this point, all you need to do is copy the script file on to your RRAS Server (VM1-internal) and save it as a .ps1 and run the PowerShell script. This script will install Routing and Remote Access and configure the Site-to-Site VPN to connect to the Windows Azure Virtual Network you just created. Once you have finished with the RRAS installation go back to the Azure Portal and click Connect to complete the VPN site-to-site connection.
When connected, the Azure Portal should look something like the following.
Enable NAT on the RRAS Server
The final step I had to take to have a usable network was to enable NAT on my RRAS Server. Without having NAT enabled none of my servers could reach the internet. The basic steps for enabling NAT on RRAS are as follows:
- Open the Routing and Remote Access MMC
- Expand IPv4, right-click General, and then click New Routing Protocol.
- In Routing protocols, click NAT, and then click OK.
- Right-click NAT, and then click New Interface.
- Select the interface that connects to your private intranet, and then click OK.
- Select Private interface connected to private network, and then click OK.
- Right-click NAT, and then click New Interface again.
- Select the interface that connects to the public Internet, and then click OK.
- Select both Public interface connected to the Internet and Enable NAT on this interface, and then click OK.
The fun can now begin. In my next post I will walk you through the process of provisioning a Windows VM in Azure and joining it to your on-premise domain.
When you launch a new instance you only have two options for the OS storage: Standard or Provisioned IOPS. Both are EBS volumes persistent across reboots. Many instances come with a bunch of extra ephemeral drives attached, which are NOT persistent. I usually delete these ephemeral drives so I am not tempted to store data on them. You will have to add additional EBS volumes for additional persistent storage.
This article seems to indicate that you can launch AMI’s based on the “EC2 Instance Store”, which is NOT persistent, but I’ve never seen that option. All of my instances have always had root devices that are EBS based; I have not seen one that is not EBS based. I’m assuming they mean some of the instances in the Amazon Market Place may use non-persistent volumes. http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/RootDeviceStorage.html
You’ll see the root device when you launch the instance, like I highlighted below. As long as EBS is the root device you are good to go and can be sure your changes will persist across reboots.
As far as instance size, it will depend on the needs of the application. The good thing about EC2 is that if you provision an AMI that is under powered, you can go back and increase the instance size, though it does require a reboot. If IOPS are important, you will want to make sure you choose an instance that is EBS optimized. See this page for the instance details. http://aws.amazon.com/ec2/instance-types/#instance-details . You’ll see the first instance type which is EBS optimized is M1.large.
Read this guide for additional tips for optimal storage configuration. http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSPerformance.html . One of the best tips for increased IOPS is to use multiple smaller EBS volumes and put them together in a RAID 0 on the Windows server. Because the EBS volumes are RAID1 on the backend, you are essentially deploying RAID 1+0 in your VM for optimal performance and availability.
In the old days if you wanted a guaranteed 4000 IOPS on EBS, you had to provision a minimum of a 400 GB vo0lume. Considering you pay per the GB, and provisioned IOPS are not cheap, if you only needed 100 GB of fast storage you were stuck paying for 300 GB of unused storage.
With this recent announcement Amazon has made it easier to get fast storage in smaller increments. Now if you want 4000 IOPS you can get that in EBS volumes as small as 133 GB up to 1 TB in size. Read the following press release for more information.
Check out this great article by SQL Server guru @MrDenny
Title: Massive Speed and HA
Subtitle: Two Things That Usually Don’t Go Together
First Paragraph: Question: William asks “I have a SQL Server which needs both high availability and high-speed storage. Even SAN storage with SSD based hard drives isn’t providing the performance levels that we are looking for, while still getting the failover cluster HA solution that we need for our SQL Server 2008 R2 database?”
Article Link: http://sqlmag.com/blog/massive-speed-and-ha
Author: Denny Cheery (@MrDenny)
Source: SQL Server Pro / SQLMag.com
Source Link: http://sqlmag.com/
Denny Cherry is the owner and principal consultant for Denny Cherry & Associates Consulting and has over a decade of experience working with platforms such as Microsoft SQL Server, Hyper-V, vSphere and Enterprise Storage solutions. Denny’s areas of technical expertise include system architecture, performance tuning, security, replication and troubleshooting. Denny currently holds several of the Microsoft Certifications related to SQL Server for versions 2000 through 2008 including the Microsoft Certified Master as well as being a Microsoft MVP for several years. Denny has written several books and dozens of technical articles on SQL Server management and how SQL Server integrates with various other technologies.
Webinar Invite: How to Deploy SQL Server AlwaysOn Failover Clusters in Amazon EC2 with @awscloud #amazonaws
Deploying Your Business Critical SQL Server Apps on Amazon EC2
Amazon Web Services (AWS) and SIOS Technology Corp, an AWS Partner Network (APN) Technology Partner, invite you to attend this live webinar to learn how to optimize mission critical SQL Server deployments on Amazon EC2.
Learn how to take advantage of the cost benefits and flexibility of Amazon EC2 while maintaining protection with native Microsoft Windows Server Failover Clustering – all without shared storage.
Who should attend:
Solution Architects, Developer, Development Leads and other SQL Professionals
Miles Ward, Solutions Architect, Amazon Web Services
Tony Tomarchio, Director of Field Engineering, SIOS Technology Corp
Date / Time:
Wednesday, June 5, 2013 – 10AM PT / 1PM ET
Click here to register
If you want to install ANY version of SQL Server in a Windows Server 2012 environment I highly recommend you read the following KB article.
In particular, I ran into this error while trying to install SQL Server 2008 R2 on Windows Server 2012 and was running into the following error (among others).
Figure 1 – Rule “Cluster service verification” failed
The fix is simple, as described in the KB article, simply enable the Failover Cluster Automation Server in the Add Roles and Features wizard or via the following Powershell command:
That fix will resolve the other Setup Support Rules errors including the cluster validation error and any errors about cluster storage. You should be able to re-run the SQL installation and it will pass all the Setup Support Rules and allow you to continue with the cluster install.
I just got the word that this is official and we are ready to ship software…
SIOS Technology is pleased to offer the Microsoft MVP Community a fully functional two NFR copies of SteelEye DataKeeper Cluster Edition (DKCE). DKCE enables SANless failover clustering solutions using any local attached storage and enables high speed and highly available SMB 3.0 storage solutions for SQL Server and Hyper-V.
Common use cases of DKCE include SAN-less SQL Server Failover Cluster Instances using local high speed storage solutions like Fusion-io or even building clusters in the public cloud. Another exciting possibility is highly available file servers on Windows Server 2012 for robust SMB 3.0 storage solutions for Hyper-V or SQL Server without having to purchase a shared SAS Array or SAN. Once again you can take advantage of the blazing speeds possible with SSD or local flash based storage without sacrificing any availability and be the first kid on your block to migrate your Hyper-V and SQL Server to SMB 3.0 and take advantage of faster failovers and easier storage management.
Simply email firstname.lastname@example.org to learn how to get started with DataKeeper Cluster Edition!
MVPs grab your copy today and let me know what you think. I will be monitoring the forum you will be invited to join and look forward to your feedback.
If you have followed the history of clustering as closely as I have for the past 10 years as a Microsoft Cluster MVP, you will notice that Microsoft has been steadily been moving away from single copy clusters. It started with Windows Server 2003 with the elimination of a shared disk quorum and the introduction of majority node set quorums and the file share witness. The complaint with clusters based on shared disk quorums was that if the quorum became unavailable or corrupt, the entire cluster would fail. This was a major complaint and it primarily what gave clustering a bad name in the early days of clustering.
Once the shared disk quorum was eliminated, people were still left with their application data residing on the cluster which was also a problem as the SAN was still a single point of failure in a cluster, a performance impediment and a management headache. Microsoft has begun to address those concerns with the introduction of Exchange 2007 CCR and Exchange 2010 DAGs as well as SQL Server 2008 R2 Database Mirroring. Microsoft has eliminated Exchange 2010 single copy clusters entirely and SQL Server single copy clusters are only still around because they haven’t perfected SQL Server replication yet.
Hyper-V being the most recent cluster resource supported by Microsoft clustering does not yet have a native cluster integrated replication solution. This is where SIOS DataKeeper fits in. We first demonstrated our DataKeeper Hyper-V replication solution at the Microsoft Virtualization launch in September of 2008 and have been providing HA and DR solutions for Hyper-V since Hyper-V was first introduced. Our solution is logo certified for Windows Server 2008 R2 as well as Hyper-V.
DataKeeper fills the gap left by single copy clusters as show in the table below and subsequent paragraphs. The following customer story also highlights some of the reasons why people are adopting DataKeeper in lieu of SAN based solutions.
Eliminates single point of failure
A SAN is a single entity made up of redundant pieces. To have a truly redundant SAN you need redundant controllers, power supplies, CPU’s, switches, UPS, RAM and the clients connecting to it need to have redundant NICs or HBAs and multi-path solutions configured. Even once you have eliminated hardware as a single point of failure, the SAN is still controlled by firmware which itself is a single point of failure. And then because the SAN resides in a single location, any physical disasters (think water, fire, etc.) also represents a risk.
Given same disk specs, disk installed locally will perform better than disks stored on a SAN accessed via iSCSI. Also, using local storage opens up the possibility of using even higher speed storage solutions such as flash based PCIe storage which outperforms SANs that costs hundreds of thousands of dollars at a fraction of the costs.
Not only do you have to facture in the initial investment, which the DataKeeper solution wins by a significant percentage, you have to factor in the ongoing expense involved with maintenance, power and cooling required for any enterprise class SAN.
Supports future expansion for Disaster Recovery
Should disaster recovery solutions become a requirement in the future the DataKeeper solution can easily accommodate adding an addition Hyper-V node in a remote location in a multisite cluster configuration for a robust disaster recovery solution that includes the best RTO and RPO available. The SAN solution would require the purchase of an additional SAN, replication software and might not even include cluster integration as there are only a few solutions that actually integrate with failover clustering as well as DataKeeper does.
Eliminates Planned Downtime
With SAN based cluster solutions, any maintenance on the SAN requires planned downtime. The DataKeeper solution allows for rolling upgrades, meaning planned downtime for hardware maintenance is eliminated.
SAN administration usually involves a SAN administrator who is familiar with the features and functionality of a SAN. The DataKeeper solution on the other hand is a simple software solution that is managed by the Windows Server administrator and features complete integration with Windows Server Failover Clustering, meaning the management is controlled through failover cluster, a tool which should be familiar with most Windows Administrators.
In summary, DataKeeper is able to provide a much more resilient cluster solution at a fraction of the cost of SAN based solutions.
Here is an excellent white paper on SQL Server Multisite Clusters, however they forget to mention that you can also do this with host based replication. Instead, they assume you have “two EMC Symmetrix VMAX enterprise storage arrays, one at each site. The arrays were both configured with two VMAX storage engines and 240 disk drives”. If you have a million plus dollars in your budget for storage, go ahead and knock yourself out. If not, you may want to look into some Fusion-io PCIe Flash storage and host based replication with DataKeeper cluster edition, faster than a SAN at a fraction of the cost with all the availability. Check out how Polaris Industries did just this http://www.fusionio.com/blog/polaris-sios/
Have you been thinking about moving to the cloud? The potential cost savings makes it nearly impossible not to consider. The cost justification is usually easy to figure out and the cloud almost always comes out looking like a good investment. However, after you stop counting the money you are going to save you start thinking about things like security and availability and wonder whether the cloud is for you.
In a traditional data center you have the control and can deploy whatever security and high availability solution you like. However, once you decide to move your servers to the cloud your choices can become much more limited. It doesn’t matter whether you’re with Amazon, Google or Microsoft, outages in the cloud can and do occur and you need to do whatever you can to mitigate such risks.
Let’s take a closer look at Amazon Web Services (AWS) for instance. What are the options you have to ensure that your SQL Server database can survive an unexpected outage? While some applications can be deployed in a load balanced configuration across multiple availability zones, SQL Server is generally not deployed in a load balanced configuration. What this means is that SQL Server itself resides in a single availability zone and if that zone should become unavailable, your whole application stack can come to a grinding halt.
If you read this article by Miles Ward, you will see that with SQL Server 2008 R2 your availability options are pretty limited. In that article on page 11 there is a nice chart that lays out your HA options. As you will see, the options are severely limited and mostly fall outside of the category which would be described as HA. Log shipping, mirroring and transactional replication are pretty much the only options you have, and they are more of a data protection options rather than HA options. If you want Microsoft failover clustering you will find yourself out of luck due to some network limitations (clients can’t connect to a clustered IP address) in AWS and the lack of a shared disk resource required for traditional SQL clusters.
If you are looking to deploy SQL Server 2012, your options get a little better. As described by Jeremy Peschka, with a little manual intervention you can deploy AlwaysOn Availability Groups in AWS to do asynchronous replication from your data center to AWS, or even between AWS availability groups. Of course this assumes you have the SQL 2012 Enterprise license required for AlwaysOn Availability Groups. The only “issue” is that AWS really doesn’t support moving cluster IP address from one server to another, so client redirection has to be done manually using the ec2-unassign-private-ip-addresses and ec2-assign-private-ip-addresses commands after switchover that Peschka describes in his article. All-in-all this is a very manual process, which again does not really fit the description of a highly available system.
If you can live without automated recovery and with the limitations of AlwaysOn Availability Groups that I described in a previous blog post, then you might just want to go ahead and try the AlwaysOn Availability Group deployment in AWS. However, if you are looking for an easier, more affordable, more robust HA solution, I have some really good news. SIOS Technology Corp has been looking at this problem and has developed a solution that overcomes all of the limitations previously described and will be available as an AMI for easy deployment. This solution is currently in private beta, but will be widely available later this year.
The SIOS solution is based on SQL server in a Microsoft Failover Clustering using DataKeeper Cluster Edition host based replication. By using hosted based replication they have overcome the first obstacle of clustering in EC2 – lack of shared storage. The second obstacle that SIOS had to overcome was the issue of client redirection described by Peschka; the client access point needs to be manipulated from within EC2, not failover clustering. SIOS has built intelligence into their AMI solution such that the reassigning of the IP address is automated as part of the cluster failover process, effectively simulating the behavior you would normally expect from a cluster.
And because all of this is built on top of failover clustering, this can be deployed using SQL 2008/2008 R2 or 2012. Even the Standard Edition of SQL Server will support a 2-node cluster so the cost savings vs. deploying SQL 2012 AlwaysOn Availability groups could be substantial.
Let me know what you think. Does this solution sound interesting? What are you doing today to ensure the availability of your SQL Server EC2 instances?