Using PCI Express I/O Virtualization to pool network switch/server blade resources



Data centers in recent years have been turning to a variety of virtualization technologies to ensure that their capital assets are used efficiently.

Virtualization allows the physical network I/O switch and server blade hardware to be pooled and shared across multiple applications, increasing its utilization and capital efficiency while maintaining the standard access and execution models for operating systems and applications.

Generically, virtualization consists of three distinct steps:

* Separation of resources – providing management independence
* Consolidation into pools – increasing the utilization, saving cost, power and space
* Virtualization – emulating the original functions as “virtual” functions to minimise software and process disruption

Storage virtualization has been around for a long time. It separates the delivery of persistent storage – whether as logical disks (SANs) or files (NAS) – from the physical reality of spinning disks. With the exponential growth of data, this has improved the efficiency, security and manageability of data storage to a level that would not have been feasible with non-virtualised storage.

More recently, server virtualization has separated the execution platform for applications and services from the underlying server hardware. Applications and operating systems execute within Virtual Machines (VMs), many of which can be hosted on a single physical server.

Initially focused on managing server sprawl through physical hardware consolidation, server virtualization is now becoming a critical technology supporting business agility and continuity, and is a key component of the “cloud,” which aims to deliver flexible opex-based application execution.

I/O Virtualization (IOV) follows the same concept. Servers – virtual or physical – require connectivity and bandwidth to clients, other servers, networked storage and to their local Direct-Attached Storage (DAS).

Instead of providing each server with dedicated adaptors, cables, network ports and disks, IOV separates the physical I/O from the servers (Figure 1, below), leaving them as highly compact and space-efficient pure compute resources such as 1U servers or server blades. The I/O from multiple servers can now be consolidated into an “IOV Switch”.

Figure 1: Separation of I/O from servers

Because the I/O components are now shared across many servers, they can be better utilized and the number of components is significantly reduced when compared to a non-virtualized system.

The system becomes more cost-, space- and power-efficient, more reliable – due to fewer components and architectural advantages such as RAID6 on the disks – and easier to manage.

The final step is to create “virtual” I/O devices in the servers, which look to the server software exactly the same as the original physical I/O devices (Figure 2, below). This functional transparency preserves the end-users' huge investment in software: applications, OSs, drivers and management tools.

Figure 2: Synthesis of virtual I/O devices in servers

The server hardware appears to the software to be exactly the same as today – the same I/O architecture, the same I/O devices, the same drivers, all managed with the same tools – but with all the cost, space, power, dynamic configuration and manageability advantages that come with I/O consolidation (Figure 3, below).

Figure 3: Consolidation of I/O into IOV switch

Data Centers and Commodity Servers
Commodity blade-based servers today trace their ancestry – and unfortunately their architecture – back to the humble Personal Computer of the early 1980s, which was designed to provide low cost, commodity home computing. Applications were simple word processors, spreadsheets and games.

I/O beyond the keyboard and display was optional and rare, and “networking” was irrelevant. In contrast, business computers – mainframes – were designed with a completely different aim: the storage and manipulation of large amounts of business-critical data. Their architectures were much more I/O-centric and had surprisingly little “processing” capability.

A quarter of a century on, the ubiquity of the PC has changed the shape of enterprise computing, and volume servers today are effectively PCs – albeit with far more powerful CPUs, memory and I/O devices.

We now have the acquisition cost and scalability advantages that have come with the high volumes of the PC market, but the business demands on enterprise servers remain much the same as they were in terms of reliability, storage capacity and bandwidth, networking and connectivity – demands that a PC was never intended to address.

The PC has become so cheap that over the last decade these demands have been answered by simply providing more and more hardware, but now this trend is proving to be no longer sustainable.

In particular, power, floor space and management have become the dominant costs of the data center. More hardware is no longer a solution to the need for growth.

Mainframe-like capabilities have been making their way into both data centers and CPUs for some years – networked storage, optical networks and server virtualization have all sought to address the growth and manageability requirements – but the architecture of the PC server has resolutely remained the same as it was 20 years ago: CPU and memory with some dedicated, optional I/O.

Server I/O can be defined as all the components and capabilities which provide the CPU – and ultimately the business application – with data from the outside world and allow it to communicate with other computers, storage and clients.

The I/O in a typical server consists of the Ethernet network adaptors (NICs) – which allow it to communicate with clients and other computers – networked storage adaptors (aka Host Bus Adaptors, HBAs) – which provide connectivity into shared storage pools based on Fibre Channel (FC) or iSCSI – and local disk storage (DAS) for non-volatile storage of local data, the Operating System (OS) and any other persistent server “state”.

I/O also includes all the cables and networking connectivity (switch ports) required to interconnect the many servers in a typical data center networking infrastructure. Each server has its own private set of I/O components.

Back in 1965, Gordon Moore, who went on to co-found Intel, observed that transistor densities – and by implication, CPU capabilities – were growing exponentially, a trend now usually summarized as a doubling approximately every two years.

This trend has continued to this day, but unfortunately for enterprise servers this has been more relevant to CPUs than to I/O. I/O performance is determined by more than just transistor density and has lagged behind, particularly on a measure of cost-performance. I/O today can account for more than half the cost of the server hardware.

When it comes to resource requirements, applications are not all equal. A web server is typically quite different from a database server. The former requires access to the client network and other back-end servers, but not directly to the corporate SAN.

Conversely, the latter requires high bandwidth SAN access, but for security needs to be well isolated from the client network. Traditionally, the solution to this has been to build and configure specific servers for specific applications, resulting in the common multi-tier data center architecture seen today.

However, as IT tries to shift to a more flexible and scalable OPEX-focussed model, this static architecture no longer fits the demands of business. Virtualization of storage, servers, desktops, services and now I/O is being seen as the necessary means to completely decouple the execution of applications and the delivery of IT services from the underlying hardware platforms.

I/O Virtualization Approaches
The benefits of I/O virtualization have been available to high-end systems for some time, either as proprietary implementations in mainframes or based on Infiniband. Until now there has not been a solution – from the point of view of cost or transparency – suitable for the commodity volume server market.

The default I/O interconnect in volume servers today is PCI Express (PCIe). It is estimated that there will be in excess of 400 million PCIe devices in use in 2010, making it the highest aggregate bandwidth interconnect – higher than Ethernet and completely dwarfing Infiniband.

PCI was originally designed as a short-reach (a few inches) parallel chip-to-chip interconnect, but the introduction of PCI Express in 2002 has extended its bandwidth to tens of Gb/s and its range to several metres, enabling it to reach beyond the confines of the PC and server chassis.

Seeing the increasing need for virtualization, the PCI-SIG has more recently defined a number of extensions to PCI Express to support I/O virtualization capabilities both within a single server (Single Root IOV, or SR-IOV) and across multiple servers (Multi-Root IOV, or MR-IOV).

SR-IOV provides a much-needed mechanism to allow VMs running high-performance, I/O-intensive applications (e.g., databases) to access I/O directly rather than through the software hypervisor. In 2010 we are starting to see the deployment of SR-IOV capable I/O devices and drivers, but it is not clear that the more complex MR-IOV capabilities will be available for some time yet.

Another approach is to virtualize standard high-volume PCI Express I/O devices and drivers available today by adding the virtualization capability into the PCI Express fabric rather than to the devices.

Because the virtualization capability is contained in the PCI Express switch fabric, neither the I/O devices nor any of the software, firmware or hardware of the servers needs to change.

This approach, which Virtensys has termed “Native PCIe I/O Virtualization”, has the advantage of exploiting existing standard high volume I/O hardware and software and therefore is both very easy to deploy and is low cost.

Native PCIe I/O virtualization is based on adding the virtualization capabilities into the switch fabric and relies on the fact that the process of virtualization can be split into a complex control path (low bandwidth) function and a relatively simple data path (high bandwidth) function.

The high-speed data path function is a device-independent IOMMU which isolates the address space of each I/O adaptor from the address spaces of the multiple servers. This IOMMU provides the PCIe address and ID translation functionality to allow all the servers and I/O adaptors to operate as though they are part of a traditional single root system.
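The address-translation idea can be illustrated with a minimal sketch. It assumes a simple scheme in which each server's addresses are mapped into one non-overlapping window of a single adaptor-visible address space; the names `Window` and `AddressTranslator` are hypothetical, ID translation is left out, and nothing here is taken from an actual IOV switch design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    server_id: int    # hypothetical index of the server that owns this window
    fabric_base: int  # start of the window in the adaptor-visible address space
    size: int         # window length in bytes


class AddressTranslator:
    """Toy window-based translator (an illustration, not a real IOMMU)."""

    def __init__(self, windows):
        self.windows = sorted(windows, key=lambda w: w.fabric_base)
        # Overlapping windows would let one server reach another's memory.
        for lower, upper in zip(self.windows, self.windows[1:]):
            if lower.fabric_base + lower.size > upper.fabric_base:
                raise ValueError("windows overlap")
        self.by_server = {w.server_id: w for w in self.windows}
        # Assumption for this sketch: exactly one window per server.
        if len(self.by_server) != len(self.windows):
            raise ValueError("each server may own only one window")

    def to_fabric(self, server_id, local_addr):
        """Map a server-local address into the adaptor-visible space."""
        window = self.by_server[server_id]  # KeyError for an unknown server
        if not 0 <= local_addr < window.size:
            raise ValueError("address outside this server's window")
        return window.fabric_base + local_addr

    def to_server(self, fabric_addr):
        """Map an adaptor-visible address back to (server, local address)."""
        for window in self.windows:
            if window.fabric_base <= fabric_addr < window.fabric_base + window.size:
                return window.server_id, fabric_addr - window.fabric_base
        raise ValueError("address is not mapped to any server")


translator = AddressTranslator([
    Window(server_id=0, fabric_base=0x0000, size=0x1000),
    Window(server_id=1, fabric_base=0x1000, size=0x1000),
])
fabric_addr = translator.to_fabric(1, 0x10)   # server 1, local offset 0x10
owner = translator.to_server(fabric_addr)     # back to (server, local address)
```

Because the lookup is a bounds check plus an addition, it is independent of the type of I/O device behind it, which is the property that lets this part of the work sit in the high-bandwidth data path.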

The more complex but lower-bandwidth device-dependent control path functions provide the virtualization of the configuration spaces, register maps, interrupts and global control functions (initialization, reset, power management, etc.). This functional split allows an IOV switch to support hundreds of Gb/s and millions of IOPS of I/O throughput in a low-cost and low-power hardware implementation.

PCIe I/O virtualization switches can be deployed with both rack and blade servers. In a rack system, the IOV switch is implemented as a 2U unit which connects with standard PCIe cables to a stack of industry-standard 1U or sub-1U servers.

A 2U IOV switch will typically support 16 servers and provide eight ports of either 10Gb Ethernet, 4Gb Fibre Channel or 6Gb SAS/SATA. In a blade implementation, the IOV switch takes the place of the Ethernet or Fibre Channel switch in the blade chassis, with the PCIe connections routed over the midplane directly from the server blades to the switch.
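As a rough worked example of what this sharing means for bandwidth, the sketch below takes the 2U figures above (16 servers, eight ports) and assumes, purely for illustration, that all eight ports are 10Gb Ethernet, that they are shared evenly, and that any one server could otherwise burst to a full 10 Gb/s. The function names are hypothetical.

```python
def fair_share_gbps(ports, port_gbps, servers):
    """Bandwidth per server if every server is busy and ports are shared evenly."""
    return ports * port_gbps / servers


def oversubscription(ports, port_gbps, servers, per_server_peak_gbps):
    """Ratio of the servers' combined peak demand to the pooled port bandwidth."""
    return servers * per_server_peak_gbps / (ports * port_gbps)


# Figures from the text: 16 servers, eight ports. Assumed: all ports are 10GbE.
share = fair_share_gbps(ports=8, port_gbps=10.0, servers=16)
ratio = oversubscription(ports=8, port_gbps=10.0, servers=16,
                         per_server_peak_gbps=10.0)
print(f"even share per server: {share} Gb/s, oversubscription: {ratio}:1")
```

Under these assumptions each server is guaranteed 5 Gb/s when all are busy, and the pool is oversubscribed 2:1 against the servers' combined peak. That is acceptable only when servers rarely peak at the same time, which is the premise of consolidation.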

Key Features and Benefits of Native PCIe IOV
Cost reduction through consolidation. IOV reduces system hardware cost by improving on the poor utilization of I/O in most servers today. Native PCIe Virtualization contributes to this cost saving by reusing the existing high-volume, low-cost PCIe components and by adding very little in the way of new components.

Power reduction. As with acquisition cost, increasing the I/O utilization through consolidation minimizes the amount of I/O hardware required and hence the power dissipation of the data center.

Management simplification. I/O virtualization changes server configuration from a hands-on, lights-on manual operation involving installation of adaptors, cables and switches, to a software operation suitable for remote or automated management.

By removing humans from the data center and providing automated validation of configuration changes, data center availability is enhanced. It is estimated that 40% of data center outages are due to “human error.”
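To make "a software operation" with "automated validation" concrete, here is a minimal sketch of a configuration model that checks each change before accepting it. The class `IOVSwitchConfig`, the port and device names, and the per-port sharing limit are all hypothetical; no real management API is implied.

```python
class IOVConfigError(Exception):
    """Raised when a requested configuration change fails validation."""


class IOVSwitchConfig:
    """Toy model: virtual devices in servers are backed by shared physical ports."""

    def __init__(self, ports, max_vdevs_per_port):
        self.ports = set(ports)
        self.max_vdevs_per_port = max_vdevs_per_port  # assumed sharing limit
        self.assignments = {}  # (server, vdev_name) -> physical port

    def assign(self, server, vdev_name, port):
        # Validate before changing anything, so a bad request has no effect.
        if port not in self.ports:
            raise IOVConfigError(f"unknown port {port!r}")
        if (server, vdev_name) in self.assignments:
            raise IOVConfigError(f"{vdev_name!r} already exists on {server!r}")
        in_use = sum(1 for p in self.assignments.values() if p == port)
        if in_use >= self.max_vdevs_per_port:
            raise IOVConfigError(f"port {port!r} has no free virtual devices")
        self.assignments[(server, vdev_name)] = port

    def release(self, server, vdev_name):
        self.assignments.pop((server, vdev_name), None)

    def devices_for(self, server):
        return sorted((name, port)
                      for (srv, name), port in self.assignments.items()
                      if srv == server)


cfg = IOVSwitchConfig(ports=["eth0", "fc0"], max_vdevs_per_port=2)
cfg.assign("blade1", "vnic0", "eth0")  # give blade1 a virtual NIC
cfg.assign("blade1", "vhba0", "fc0")   # and a virtual HBA
blade1_devices = cfg.devices_for("blade1")
```

A request that names a non-existent port, or one that would exceed the sharing limit, is rejected with `IOVConfigError` before any state changes. A manual recabling mistake, by contrast, is usually discovered only after it has caused a fault.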

Dynamic configuration – agility. Businesses today need to adapt quickly to change if they wish to prosper. Their IT infrastructure also needs to be agile to support rapidly changing workloads and new applications. I/O virtualization allows servers to be dynamically configured to meet the processing, storage and I/O requirements of new applications in seconds rather than days.

Ease of deployment and non-disruptive integration. Native PCIe IOV technology has been designed specifically to avoid any disruption of existing software, hardware or operational models in data centers.

Native PCIe IOV works with – and is invisible to – existing volume servers, I/O adaptors, management tools, OSs and drivers, making its deployment in the data center extremely straightforward.

Rapid/cost-effective technology adoption
CPU and I/O technologies have been evolving at different rates. New, more powerful and cost/power effective CPUs typically appear every 9 months while new I/O technology generations come only every 3-5 years.

In particular, the “performance-per-watt” of new CPUs is significantly higher than that of CPUs from a few years ago. The separation of I/O from the compute resources in servers (CPU and memory) allows new power-efficient CPUs to be introduced quickly without disrupting the I/O subsystems.

Similarly, new I/O technologies such as FCoE or 40Gb Ethernet can be introduced as soon as they are available. Since these new high-cost and high-performance I/O adaptors are shared across multiple servers, their introduction cost can be significantly smoothed when compared with today's deployment model.

I/O virtualization is an innovation that allows I/O to be separated, consolidated and virtualized away from the physical confines of a server enclosure.

Of the various approaches described, Infiniband-based IOV is most suitable for those installations that already have an Infiniband infrastructure and whose servers already use Infiniband software.

For the majority of data centers without Infiniband, IOV based on the standard I/O interconnect – PCI Express – provides a lower-power, lower-cost solution. In particular, Native PCIe Virtualization provides today all the benefits of IOV without requiring new I/O devices, drivers, server hardware or software.

Marek Piekarski is the co-founder and CTO of Virtensys, where he is responsible for the company's technical strategy and for defining its technologies and products. He has more than 20 years' experience in a broad range of industry segments.

Prior to co-founding Virtensys, he was responsible for defining products and applications for switching technology at Xyratex, and was the technical lead on a consortium project developing novel storage and interconnect solutions for video servers. Before Xyratex, Marek served with Transitive, developing dynamic binary translation software and with Power X, where he led the technical development of high-performance telecom switch fabrics.
