In the first part of this series, I focused on the dramatic shift that virtualization has caused across the network and storage industries and how partitioning-based solutions have been utilized to address some of the key challenges introduced by virtualization.
I closed out Part 1 with a comment suggesting that these same partitioning solutions can also be used to benefit the deployment of both hypervisors and native operating systems.
In this part of the article, I am going to explore the value that these partitioning technologies can bring to bare metal environments. In order to appreciate that value, it is important to understand the typical architecture of today’s servers and some of the inherent challenges that come with it.
The most significant set of challenges is related to the stateful nature of traditional servers. A stateful server is any server that contains data persistently stored on the server itself or has its identity information known by systems or devices external to the server. What makes a server stateful?
- A local disk that contains the operating system used to boot the server and run applications
- A network interface whose unique name may be configured in external systems (like a DHCP server), discovered and stored by application software, or discovered and stored by external Ethernet switches
- A host bus adapter that is used to connect the server to shared storage in order to access application data. A host bus adapter has a unique name that is provided to the storage administrator to grant the server access to the assigned disks within the shared storage environment.
- A CD drive with media inserted
You might look at this list and wonder what’s the downside to any of those items – every server needs an operating system, so why not put it on the local disk? Many servers need to be connected to a shared storage environment – what’s wrong with connecting using the identity of the host bus adapter that is on the server?
The short answer is lock-in. Any one or combination of the items above fundamentally limits the flexibility of the compute resources of that server. Let’s take a simple example of a server that has a Windows operating system (along with a set of applications) installed on its local disk and has a host bus adapter, which provides connectivity to external storage that contains the application data for the server's applications.
Now let’s look at what happens if there is a major failure of that server (e.g. CPU) or if you want to move that application onto a next-generation server. With a stateful design, the move can be complex and require non-trivial time and energy from IT staff.
The first problem is that the new server needs to get the operating system and applications from the failed server. Since the failed server has the root filesystem embedded in it, you have to make a choice – you can either re-install the OS and applications by hand on the new server, or you can try to move the disk to the new server (if it is hot swappable) or copy the contents. If you move the disk or copy it, you have some serious potholes to avoid.
The largest issue is the operating system. Windows, for example, is particular about the underlying hardware and the devices that it learned about during the installation of that OS. Changing the hardware will cause Windows to complain about the new devices, including the boot device. Issues with the boot device will actually prevent the server from booting. And, even if you hack the registry enough to convince Windows to boot, your next hurdle will be the changes to the network and storage adapters. New adapters can provoke all sorts of problems.
On the network front, you may have connectivity problems – perhaps because your DHCP server configuration is based on the old network adapter’s identity and now you cannot get an IP address.
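The DHCP failure described above can be sketched in a few lines of Python. This is only an illustration, assuming the DHCP server uses static reservations keyed by MAC address; the reservation table, the addresses, and the `lookup_reservation` helper are hypothetical and do not represent any real DHCP server's API.

```python
from typing import Optional

# Hypothetical static reservations: MAC address -> reserved IP address.
RESERVATIONS = {
    "00:1a:2b:3c:4d:5e": "10.0.0.25",  # NIC in the original server
}


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and use ':' separators so lookups are consistent."""
    return mac.strip().lower().replace("-", ":")


def lookup_reservation(mac: str) -> Optional[str]:
    """Return the reserved IP for this MAC, or None if the MAC is unknown."""
    return RESERVATIONS.get(normalize_mac(mac))


old_nic = "00:1A:2B:3C:4D:5E"  # adapter in the failed server (assumed value)
new_nic = "00:1A:2B:99:88:77"  # adapter in the replacement server (assumed value)

print(lookup_reservation(old_nic))  # the old identity has a reservation
print(lookup_reservation(new_nic))  # the new identity is unknown to the server
```

Nothing about the operating system or the application changed, yet the replacement server is a stranger to the network simply because its adapter carries a different identity.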
To further complicate matters, you might end up having application startup problems. Some applications actually tie their licenses to the unique name of the Ethernet adapter. When that name changes, the application will stop working because it considers itself not properly licensed. Additionally, when you connect your new server to the shared storage environment to access the same disks, you are going to have to get your storage administrator involved because the new host bus adapter has its own unique name, and that name has to be programmed into the storage array and the switching fabric to make the appropriate disks visible to this new device. If you have ever had to manage a shared storage device, you know all the things that can go wrong in that process. It’s a real headache. In the end, the only sensible solution is to re-install and reconfigure the OS and applications. Not the way you want to spend your IT hours.
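The storage re-programming step mentioned above can be sketched the same way. The example assumes the host bus adapter's "unique name" is a Fibre Channel world wide port name (WWPN) and models only the array's access table, not the switching fabric; the table, the WWPN values, and both helper functions are hypothetical.

```python
# Hypothetical access table on a storage array:
# initiator WWPN -> set of LUN ids that initiator is allowed to see.
lun_masking = {
    "10:00:00:00:c9:aa:bb:01": {0, 1},  # HBA in the original server
}


def visible_luns(wwpn: str) -> set:
    """Return the LUNs the array exposes to this initiator (empty if unknown)."""
    return set(lun_masking.get(wwpn.lower(), set()))


def grant_access(old_wwpn: str, new_wwpn: str) -> None:
    """The manual step: copy the old initiator's grants to the new WWPN."""
    lun_masking[new_wwpn.lower()] = set(lun_masking.get(old_wwpn.lower(), set()))


old_hba = "10:00:00:00:c9:aa:bb:01"  # assumed value
new_hba = "10:00:00:00:c9:cc:dd:02"  # assumed value

before = visible_luns(new_hba)   # the new HBA sees no disks yet
grant_access(old_hba, new_hba)   # storage administrator re-programs the array
after = visible_luns(new_hba)    # now the same LUNs as the original HBA

print(before, after)
```

In a real environment the equivalent change also has to be made in the fabric, which is where much of the coordination effort comes from.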
Now you might be asking: doesn’t virtualization already solve this problem? It allows my OS and applications to be independent of the underlying hardware, so I can avoid these lock-in issues.
The answer is yes AND no. The “yes” part is simple in that virtualized environments leverage hypervisor technology to present the operating system with a consistent view of the devices regardless of the underlying physical hardware, and this provides the flexibility and mobility to prevent lock-in. However, a virtualized solution does not cover all aspects of an enterprise datacenter and therein lies the “no” part.
The first consideration is the hypervisor itself. We tend to forget about the hypervisor because it is invisible to most parts of an organization. However, the hypervisor is a key responsibility of IT: they own installing and maintaining every hypervisor in the datacenter. The reality is that the hypervisor is just another bare metal operating system, which has all of the same stateful server issues that any bare metal operating system might have. So, that lock-in problem has simply moved from the OS and application down a level to the hypervisor.
The second consideration is the little-talked-about reality of virtualization in the enterprise. That is, not all applications can run in a virtualized environment. While most enterprise organizations have already moved 70-80% of their applications into a virtualized environment, they have not made progress in moving the remaining 20-30% (Figure 1).
Figure 1: While the majority of network installations have moved applications into a virtualized environment, 20-30 percent remains to be converted.
These remaining applications are locked into a bare metal environment for reasons that include licensing, support, security, compliance, and/or performance considerations. And, because these applications cannot participate in the virtualized solution, they suffer from the lock-in that is associated with a stateful architecture.
As it turns out, all lock-in problems can be resolved with the storage and network partitioning technologies we talked about in the first part of this article. It’s interesting that partitioning technology, driven by the needs of virtualization, has actually ended up helping bare metal environments as well. Let’s examine how it actually helps, starting on the network side of things.
Bare metal platforms
The multi-channel feature we discussed can be used to define the bare metal server network interfaces via software. That means you can create virtual network interfaces on a bare metal platform – defining the number of network interfaces and the identity of those interfaces. This technology provides the same type of abstraction that the hypervisor provides for the guest operating systems, but for bare metal operating systems. In the Windows server example above, the new server would be programmed to have the same network identity as the original server, and therefore the Windows operating system would actually believe it was running on the same set of network interfaces, thus avoiding the network transition pitfalls.
On the storage front, the HBA partitioning capability offers the analogous solution. The server’s new HBA can be programmed to have the same storage identity that the original server used to access its disks in the shared storage environment. This means the server administrator does not need to contact the storage administrator and re-program the storage environment.
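One way to picture the network and storage abstraction described above is as an identity profile that sits on top of the hardware. The following Python sketch is a conceptual model only: the `IdentityProfile` and `PhysicalServer` classes, the `move_identity` helper, and all MAC and WWPN values are hypothetical, and real adapters are programmed through vendor-specific tools rather than through code like this.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class IdentityProfile:
    """Hypothetical software-defined identity for one server."""
    macs: Tuple[str, ...]   # network identities seen by DHCP, switches, licenses
    wwpns: Tuple[str, ...]  # storage identities seen by the array and fabric


@dataclass
class PhysicalServer:
    """A piece of hardware with factory identities and an optional overlay."""
    name: str
    factory_macs: Tuple[str, ...]
    factory_wwpns: Tuple[str, ...]
    profile: Optional[IdentityProfile] = None

    def presented_macs(self) -> Tuple[str, ...]:
        """MACs the OS and network see: the profile's if one is applied."""
        return self.profile.macs if self.profile else self.factory_macs

    def presented_wwpns(self) -> Tuple[str, ...]:
        """WWPNs the storage side sees: the profile's if one is applied."""
        return self.profile.wwpns if self.profile else self.factory_wwpns


def move_identity(src: PhysicalServer, dst: PhysicalServer) -> None:
    """Detach the profile from src and apply it to dst."""
    if src.profile is None:
        raise ValueError(f"{src.name} has no profile to move")
    dst.profile, src.profile = src.profile, None


app_identity = IdentityProfile(
    macs=("00:1a:2b:3c:4d:5e",),
    wwpns=("10:00:00:00:c9:aa:bb:01",),
)
old_server = PhysicalServer(
    "old", ("00:1a:2b:00:00:01",), ("10:00:00:00:c9:00:00:01",), app_identity
)
new_server = PhysicalServer(
    "new", ("00:1a:2b:00:00:02",), ("10:00:00:00:c9:00:00:02",)
)

move_identity(old_server, new_server)
print(new_server.presented_macs())   # same MAC the original server presented
print(new_server.presented_wwpns())  # same WWPN the storage side already knows
```

Because external systems only ever see the profile's identities, nothing outside the server has to be reconfigured when the hardware underneath changes.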
If you adopt the policy of booting your operating system from the shared storage environment, you can complete the picture and achieve the ultimate flexibility of a stateless server architecture that allows you to move your OS and applications between servers with the ease of a virtualized solution.
This can make managing your remaining bare metal applications and the infrastructure for your virtualized environments a dream compared to what you have to do today. Additionally, you can attain more sophisticated capabilities using this same technology to achieve dramatically simplified N+1 high availability and disaster recovery.
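To illustrate why a stateless design simplifies N+1 high availability, here is a minimal Python sketch in which N active servers share a single spare. It assumes each workload is fully described by a profile (network identity, storage identity, and a boot LUN on shared storage); the `ServerProfile` and `Pool` classes, the server names, and all identifiers are hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ServerProfile:
    """Hypothetical stateless-server profile; nothing here lives on the hardware."""
    name: str
    mac: str
    wwpn: str
    boot_lun: int  # the OS image sits on shared storage, not on a local disk


class Pool:
    """N active servers plus one spare; assignments map hardware name -> profile."""

    def __init__(self, assignments: Dict[str, ServerProfile], spare: str):
        self.assignments = dict(assignments)
        self.spare: Optional[str] = spare

    def fail_over(self, failed: str) -> str:
        """Move the failed server's profile to the spare and return the spare's name."""
        if self.spare is None:
            raise RuntimeError("no spare server available")
        profile = self.assignments.pop(failed)  # KeyError if 'failed' is unknown
        target, self.spare = self.spare, None
        self.assignments[target] = profile
        return target


web = ServerProfile("web", "00:1a:2b:3c:4d:01", "10:00:00:00:c9:aa:bb:01", boot_lun=0)
db = ServerProfile("db", "00:1a:2b:3c:4d:02", "10:00:00:00:c9:aa:bb:02", boot_lun=1)

pool = Pool({"blade-1": web, "blade-2": db}, spare="blade-3")
target = pool.fail_over("blade-2")  # blade-2 fails; its profile moves to the spare
print(target, pool.assignments[target].name)
```

The failover is reduced to reassigning a profile: the spare boots the same OS image from shared storage and presents the same network and storage identities, so one spare can stand in for any of the N active servers.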
If you are intrigued with the infrastructure management value of these solutions, I encourage you to check out some of the new products on the market today – look for those that are agnostic to the underlying infrastructure, can manage heterogeneous environments, and streamline management through automation and an easy-to-use UI.
Scott Geng is CTO and Executive Vice President of Engineering at Egenera, and has been instrumental in the design and development of the company’s Processing Area Network (PAN). Prior to joining Egenera, Geng managed the development of leading-edge operating systems and middleware products for Hitachi Computer Products and served as consulting engineer for the OSF/1 1.3 micro-kernel release by Open Software Foundation. He holds Bachelor of Arts and Master of Science degrees in computer science from Boston University.