In a previous blog post, I talked about how to take advantage of the cloud for disaster recovery (DR) and the key areas on which to focus. Taking advantage of the cloud for DR solutions can reduce business continuity expenses and provide a way to ensure that key services are operational in the event an unplanned adverse event occurs. In this article, I will talk about a specific cloud for this solution: Microsoft Azure.
Azure’s primary DR solution is Azure Site Recovery (ASR). With ASR, you can replicate Hyper-V VMs, VMware virtual machines (VMs), or even a physical machine to Azure Storage. When a disaster occurs, you can launch these replicated VMs automatically in a specific order and begin using them in production while the disaster is being mitigated. When you are ready to go back to regular operations, you can fail back to the on-premises environment by replicating back. A lot of what I am going to describe is documented very well by Microsoft, including how to set up and deploy ASR. My goal is to provide some insight into how ASR works and some thoughts on what to consider when using it for DR.
Figure 1 describes key focus areas for using Azure as a DR tool. Rather than repeat the Architecture Overview section, I will focus on some of its key components (items 1 to 5). Although items 6 to 11 are important, I want to focus on areas that are often missed when discussing the use of ASR as part of your DR strategy. For instance, two areas that people tend to overlook are SQL Server (or database) replication and Active Directory (AD). These are important because, depending on your needs, you most likely will require active servers running in Azure. Except for the very smallest environment, it is difficult to get away with only VM replication.
1. DNS: Part of the process needs to include domain name system (DNS) entries so that Active Directory can be accessed by users and applications. Global failover can be automated with Azure Traffic Manager. This helps if users access applications externally via the web (e.g., an e-commerce site). If they are internal users of an internal app, DNS needs to be adjusted accordingly.
2. Networking: SQL and AD must reside on the same virtual network (VNet) that the VMs will fail over to. That VNet still needs to include all the components necessary to make the environment operational. Does the entire network have to be duplicated? That depends upon your needs during DR. If the idea is just to get functioning, the environment could run in a degraded mode. You can always adjust afterward if the disaster lasts longer than foreseen. That is the power of the cloud!
3. Active Directory Replication: AD is also affected by recovery time objectives (RTOs) and recovery point objectives (RPOs). Microsoft recommends replicating (protecting) an AD domain controller, especially if DNS is installed on it. In more complex environments, it is necessary to add an active domain controller in the Azure virtual network that replicates with the on-premises domain. Of course, this will add to your monthly costs.
4. VM Replication: While planning your DR solutions and configurations, you need to be aware of your RPOs and RTOs. Microsoft guarantees 99.9 percent availability for each protected instance. A protected instance is a VM marked for replication as part of the Site Recovery Service. Its RTO is 4 or 6 hours, depending on whether or not it is encrypted. RPO is a setting you configure under Create Protection Group > Replication Settings. Your needs and requirements will determine the costs of replication. For example, if your RPO and RTO have short timeframes, then higher bandwidth will be required to ensure that proper replication and continuity are achieved.
5. Database Replication: If you need to replicate a database, you will most likely need another active database server in the Azure VNet to receive replication data. Traditionally, Log Shipping was used to handle off-site replication, but with this approach it takes some time to bring the database up live during a disaster. For SQL Server, Microsoft recommends using Availability Groups, which integrate well with ASR. When a disaster occurs, it is very easy to activate the Azure SQL instance. Note that if you are using SQL Standard, you must use Log Shipping (a good reason to upgrade to SQL Enterprise). If it is another vendor’s database, check with the vendor to see which would be the best way to achieve replication for DR.
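The components above depend on one another: AD and DNS generally have to be up before the database server, and the database before the applications that use it. The following is a minimal Python sketch of how such a startup order could be worked out before you encode it in a recovery plan. It does not call any ASR API; the component names and the dependency map are hypothetical and should be replaced with your own.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map (not taken from ASR): each component lists
# the components that must already be running before it is started.
DEPENDENCIES = {
    "domain-controller": set(),
    "dns": {"domain-controller"},
    "file-server": {"domain-controller"},
    "sql-server": {"domain-controller", "dns"},
    "app-vm": {"sql-server", "dns"},
    "web-vm": {"app-vm"},
}


def boot_groups(dependencies):
    """Return lists of components; each list can start once the previous one is up."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    groups = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose prerequisites are done
        groups.append(ready)
        sorter.done(*ready)
    return groups


if __name__ == "__main__":
    for number, group in enumerate(boot_groups(DEPENDENCIES), start=1):
        print(f"Group {number}: {', '.join(group)}")
```

Writing the dependencies down this way also makes it easier to decide which components can be left out if you plan to run in a degraded mode.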
Configuring for DR is not a trivial effort: you don’t just point and click. DNS needs to be set up, bandwidth needs to be calculated, initial replication time needs to be estimated and runbooks need to be created. Some of the tools available to help figure all this out include:
- Hyper-V Capacity Planning Tool to get the change rate
- vSphere Replication Capacity Planning Appliance to model the impact of VM replication
- Azure Site Recovery Capacity Planner to figure out bandwidth and servers and storage in source and target locations
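Before running these tools, a rough back-of-the-envelope estimate can help frame the conversation. The Python sketch below is only an approximation under stated assumptions: it uses decimal units (1 GB = 8,000 megabits), spreads the daily change evenly over a replication window, and ignores compression. The function names, the headroom factor and the link efficiency are illustrative choices of mine, not values from Microsoft's planners.

```python
def required_bandwidth_mbps(changed_gb_per_day, window_hours=24.0, headroom=1.3):
    """Average bandwidth (Mbit/s) needed to ship one day's changes within the window.

    headroom is an assumed safety factor for bursts and protocol overhead.
    """
    if changed_gb_per_day < 0 or window_hours <= 0 or headroom <= 0:
        raise ValueError("inputs must be positive")
    megabits = changed_gb_per_day * 8 * 1000  # GB -> megabits (decimal units)
    return megabits / (window_hours * 3600) * headroom


def initial_replication_hours(total_gb, bandwidth_mbps, efficiency=0.8):
    """Hours needed to copy the full data set once over the given link.

    efficiency is the assumed share of the link that replication can actually use.
    """
    if total_gb < 0 or bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("inputs out of range")
    megabits = total_gb * 8 * 1000
    return megabits / (bandwidth_mbps * efficiency) / 3600


if __name__ == "__main__":
    # Example inputs are made up; take the change rate from your capacity planning tool.
    print(f"Bandwidth needed: {required_bandwidth_mbps(200):.1f} Mbit/s")
    print(f"Initial replication: {initial_replication_hours(5000, 100):.1f} hours")
```

Treat the results as a sanity check on the planners' output rather than a substitute for it; real change rates are rarely spread evenly across the day.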
The cost for ASR varies depending upon your configuration. The first 31 days are free for each VM, and then it is $54 per instance protected. Additional charges are incurred for storage, storage transactions, outbound data transfers (i.e., replication during failback) and active running servers (e.g., SQL Server, Active Directory DCs). If you are protecting vSphere instances, there are additional charges to set up Master Target and Config Servers. Check the Microsoft Azure Pricing Tool for more detailed estimates that match your environment.
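To get a feel for how the per-instance charge adds up, here is a small Python sketch. It is an approximation, not a pricing calculator: it assumes the $54 charge is billed per instance per month, treats the 31 free days as one free month, and lumps storage, transactions, data transfer and running servers into a single hypothetical `other_monthly_costs` figure that you would have to estimate yourself.

```python
PRICE_PER_INSTANCE = 54.0  # USD, figure quoted above; monthly billing is an assumption
FREE_MONTHS = 1            # approximates the 31 free days per VM


def estimated_asr_cost(instances, months, other_monthly_costs=0.0):
    """Rough total cost in USD for protecting `instances` VMs for `months` months."""
    if instances < 0 or months < 0 or other_monthly_costs < 0:
        raise ValueError("inputs must not be negative")
    billable_months = max(0, months - FREE_MONTHS)
    protection = instances * billable_months * PRICE_PER_INSTANCE
    # Storage, transfer and active servers are charged from the first month.
    return protection + other_monthly_costs * months


if __name__ == "__main__":
    # Made-up example: 10 protected VMs for a year, $300 per month of other charges.
    print(f"Estimated cost: ${estimated_asr_cost(10, 12, 300.0):,.2f}")
```

Current prices may differ from the figure used here, so check the pricing tool before relying on any estimate.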
These tools, a good understanding of what the expectation is for failover and failback, and a willingness to understand what needs to function and how it needs to perform during a disaster are key to a successful DR strategy. Hopefully, it will only be for a short period, but as Hurricane Katrina showed us in 2005, sometimes it lasts longer than you think.
Lastly, feel free to leave a comment below with any questions.