In the days of physical servers, if one server lost network connectivity, that one particular service would of course be down for the organization. But chances are it would not cause widespread interruption to the business. It was just one server out of possibly dozens or hundreds.
Now, in the world of the virtualized datacenter, we may have anywhere from 10 to 30, or possibly even more, virtual servers running on a single physical server at any point in time. A physical loss of connectivity on a cluster node could affect a large number of servers. It’s the proverbial “all your eggs in one basket” syndrome. Fortunately, we can reduce this risk by combining multiple virtualization hosts into a Highly Available (HA) Failover Cluster, which can quickly recover services if one of those “baskets” breaks down.
While Microsoft’s Failover Clustering provides terrific protection from system failures, it is unfortunately one of the topics that many IT admins seem to have difficulty understanding and appreciating. There are probably several reasons for this, but rather than dwelling on why it’s confusing, let’s spend a bit of time demystifying the topic.
First and foremost, don’t try to over-complicate anything about it. The really complicated stuff was already coded and compiled by Microsoft and is included in Windows Server. Let’s just use this great technology.
Just the facts, Ma’am…
It’s important to understand that a Hyper-V cluster is self-aware and is constantly making decisions about how things happen inside of it. At the heart of a cluster is a continual stream of communications among the nodes, and these communications depend on reliable networking. Because of this, proper network design and provisioning are key factors when creating a Hyper-V cluster. Let’s first look at the basic networking needs of a Hyper-V host, and then break each one down to better understand it and its configuration.
- Host network – A Hyper-V host is a functioning server on a network. It needs a network connection for management and administration just like any server would.
- Live Migration Network – Transferring a running guest VM from one host to another host requires a fair amount of bandwidth. This network should be separate from other network traffic and has no need for external routing. A dedicated VLAN or switch is recommended.
- CSV Network – A network for Cluster Shared Volumes (CSV) communications between hosts should be provisioned to isolate this traffic from other network traffic.
- VM Network – Virtual Machines need a path to the network that provides the bandwidth necessary to support the needs of all guest VM communications.
- Storage Network – If VM storage access is based on iSCSI or SMB, adequate networking should be supplied to support the IO demands of the guest VMs.
Bandwidth, Bandwidth, Bandwidth
As you can see, a Hyper-V cluster carries many different traffic types. Each can create great demands for bandwidth and therefore should be provisioned as either:
- Dedicated NICs on separate switches or VLANs
- vNICs with specific VLAN access
However this is defined, an adequate number of network connections must be available to support each particular network need. This is very important. Customers who try to combine uses, such as iSCSI alongside other traffic, on too little bandwidth do not get good performance. This quickly leads to dissatisfaction or even instability.
Do not cut corners on networking in a clustered virtualization environment. Networking for Hyper-V clustering can be configured in many different ways. Essentially, we need a way to provide the right amount of network throughput for each type of network communication. If your environment consists of only 1GbE switches, it may be advisable to consider a minimum of six (6) 1GbE NIC ports per server:
- 2 – iSCSI or SMB storage networking
- 2 – Host Communications and Cluster Communications (Live Migration and CSV)
- 2 – VM Networking
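To make the arithmetic behind that six-port layout explicit, here is a minimal sketch. The role labels and the `summarize` helper are my own illustrative names, not anything Hyper-V exposes, and the per-role figure is just the raw sum of link speeds rather than the throughput any single flow will see.

```python
# Hypothetical model of the six-port 1GbE layout listed above.
PORT_SPEED_GBPS = 1  # 1GbE ports, as in the example

# Role names are illustrative labels, not Hyper-V identifiers.
layout = {
    "storage (iSCSI or SMB)": 2,
    "host and cluster (Live Migration, CSV)": 2,
    "VM networking": 2,
}


def summarize(layout, port_speed_gbps):
    """Return the total port count and the raw bandwidth per role in Gb/s."""
    total_ports = sum(layout.values())
    per_role = {role: ports * port_speed_gbps for role, ports in layout.items()}
    return total_ports, per_role


total_ports, per_role = summarize(layout, PORT_SPEED_GBPS)
print(f"ports per server: {total_ports}")
for role, gbps in per_role.items():
    # Raw sum of link speeds; a single flow may not be able to use all of it.
    print(f"{role}: {gbps} Gb/s raw across its pair")
```

Two ports per role also leaves room to cable each pair to separate switches.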
HP’s FlexFabric approach is interesting because a physical FlexFabric adapter in a server can be divided into 2 to 8 logical adapters (4 per port). Each of those logical adapters can then be assigned anywhere from 100Mb to 10Gb of bandwidth, but the aggregate total cannot exceed the 10Gb limit on each port. So it may be possible to create the following:
- 2 ports at 4Gb each for iSCSI or SMB networking
- 2 ports at 500Mb each for Host Communications
- 2 ports at 4Gb each for VM Networking
- 2 ports at 1500Mb each for Cluster Communications (Live Migration and CSV)
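It is worth checking that a split like this really fits under the 10Gb per-port cap. The sketch below assumes each "2 ports" entry means one logical adapter on each of the two physical ports, so a single physical port carries one adapter of each role. The function and role names are hypothetical, and the limits are only the ones quoted above.

```python
PORT_LIMIT_MB = 10_000        # 10Gb aggregate cap per physical port (from the text)
MIN_MB, MAX_MB = 100, 10_000  # per-adapter range quoted in the text
MAX_ADAPTERS_PER_PORT = 4     # 4 logical adapters per port (from the text)

# One logical adapter of each role on a physical port; names are illustrative.
per_port_plan_mb = {
    "storage": 4000,
    "host": 500,
    "vm": 4000,
    "cluster": 1500,
}


def validate_port_plan(plan_mb):
    """Return a list of problems; an empty list means the plan fits the limits."""
    problems = []
    if len(plan_mb) > MAX_ADAPTERS_PER_PORT:
        problems.append(f"{len(plan_mb)} adapters exceeds {MAX_ADAPTERS_PER_PORT} per port")
    for name, mb in plan_mb.items():
        if not MIN_MB <= mb <= MAX_MB:
            problems.append(f"{name}: {mb}Mb is outside {MIN_MB}-{MAX_MB}Mb")
    total = sum(plan_mb.values())
    if total > PORT_LIMIT_MB:
        problems.append(f"aggregate {total}Mb exceeds {PORT_LIMIT_MB}Mb")
    return problems


print(sum(per_port_plan_mb.values()), validate_port_plan(per_port_plan_mb))
```

The example split adds up to exactly 10,000Mb per port, so it uses the whole allowance with nothing to spare.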
As VM densities rise, so do the I/O demands on the network connections. It is essential to provision network connections that exceed those demands so that networking does not become a bottleneck on your Hyper-V host systems. The exact number of network connections will of course vary quite a bit depending on many factors. Having around 8 to 12 physical 1GbE NICs or two 10GbE NICs is usually a pretty safe place to start in terms of aggregate available bandwidth.
The question I always get is, “What exactly is the best practice?” Unfortunately, I have to respond with the answer that customers hate to hear: “It depends.” To be honest, there really is no single “Best Practice.” Virtualization depends completely on 3 pillars: Server, Storage and Networking. Each of those impacts the others.
Chances are that you already have a sizeable network infrastructure in place with many established VLANs, etc. Changing that whole design may not be feasible, so we need to consider how a new Hyper-V infrastructure can fit into it. Or perhaps you already have sizeable SAN investments. Are you going to suddenly switch that up? Many times we have to work within the confines of various pieces of existing infrastructure. Ultimately, “Best Practice” means creating a design that best meets your end goals, but fits within the realm of your own reality.
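As a rough way to compare those starting points, the sketch below totals the raw bandwidth of each option and subtracts an estimated peak demand. The demand figure is a made-up placeholder for illustration, not a measurement or a recommendation, and `headroom` is a hypothetical helper.

```python
# Aggregate raw bandwidth for the starting points named above, in Gb/s.
options_gbps = {
    "8 x 1GbE": 8 * 1,
    "12 x 1GbE": 12 * 1,
    "2 x 10GbE": 2 * 10,
}

# Hypothetical figure for illustration only; measure your own hosts.
estimated_peak_demand_gbps = 9.5


def headroom(options, demand_gbps):
    """Return spare raw capacity per option (negative means a bottleneck)."""
    return {name: capacity - demand_gbps for name, capacity in options.items()}


for name, spare in headroom(options_gbps, estimated_peak_demand_gbps).items():
    status = "ok" if spare > 0 else "bottleneck"
    print(f"{name}: {spare:+.1f} Gb/s spare ({status})")
```

Raw totals overstate what is usable once ports are dedicated to specific traffic types, so treat any headroom figure as an upper bound.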
We previously said that we must not limit the bandwidth available for the various networking needs. To that I will add: every physical network connection on a Hyper-V cluster node must be redundant. That also goes for the other end of the network cable. Do not plug all of the NICs of your server into one network switch. There must be network redundancy throughout the environment. However, getting into the on-premises network infrastructure may be too much for this post, so let’s get back to the point I was trying to make.
Prior to Windows Server 2012, network teaming inside of Windows Server depended on the various network card manufacturers to supply teaming software for their own adapters. Because the teaming software was supplied by a 3rd party, NIC teaming wasn’t directly supported by Microsoft. Fast forward to today. Hooray! NIC teaming is now an included function of Server 2012 and works very well, even across different NIC manufacturers. So leverage the built-in NIC Teaming in Windows Server, unless (and this is where that “it depends” response comes into play) you are deploying on something like Cisco UCS and its fabric management functionality.
With Cisco UCS we can create Service profiles (through Cisco HW configuration), in which we can define network access functionality that leverages hardware virtualization functions. Wow, that was crazy to say! But basically, we can create “virtual” network adapters in the fabric management of Cisco UCS that are made up of connectivity from two separate physical network connections. In other words, we create NIC teams in the hardware of the server. All that we’ll see inside of Hyper-V (when it’s loaded) are network adapters that are already teamed in UCS, so we don’t need to mess with teaming in Windows or Hyper-V networking. Just use the redundant vNICs presented to Windows.
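The redundancy rule above lends itself to a simple check against a cabling inventory. This is only a sketch: the inventory format, the team names, and the `single_points_of_failure` helper are all hypothetical, and in practice the data would come from your own documentation or switch tooling.

```python
# Hypothetical cabling inventory: team name -> list of (NIC, switch) links.
cabling = {
    "storage": [("NIC1", "switch-A"), ("NIC2", "switch-B")],
    "vm": [("NIC3", "switch-A"), ("NIC4", "switch-A")],  # both on one switch
}


def single_points_of_failure(cabling):
    """Return the teams whose links do not span at least two NICs and two switches."""
    risky = []
    for team, links in cabling.items():
        nics = {nic for nic, _ in links}
        switches = {switch for _, switch in links}
        if len(nics) < 2 or len(switches) < 2:
            risky.append(team)
    return risky


print(single_points_of_failure(cabling))
```

Here the "vm" team is flagged because both of its NICs land on the same switch, so losing that one switch would take the whole team down.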
Summing it up
Whether you use the built-in NIC teaming functionality of Windows Server or another solution is ultimately up to whoever will be supporting the solution long term. Evaluate the options, but be sure your plans include redundancy in the physical network connectivity layer. Additionally, be sure to provide an adequate amount of bandwidth for all the needs of your VMs as well as the HA cluster itself. Lastly, be sure to provide a means to segregate your various traffic types and assign that bandwidth accordingly. Addressing these areas of your network configuration will give you the best chance of successfully deploying a highly available Hyper-V cluster.