Faster cloud computing


IBM Distinguished Engineer Michael Factor

By reducing the overhead associated with managing virtual machines, IBM researchers make cloud entry more practical for businesses with data-intensive workloads

Cloud computing has introduced a new world of possibilities when it comes to increasing capacity and adding capabilities – without businesses needing to purchase more powerful computers or software. But despite many benefits, the cloud isn’t perfect.

The biggest drawback? Data-intensive workloads might run as much as 50 percent slower in the cloud than on a dedicated system, because the cloud's architecture has many virtual machines sharing a single physical system.

To speed up I/O-intensive workloads running in virtual machines, IBM Research – Haifa, in collaboration with the Technion – Israel Institute of Technology, developed ELI, short for ExitLess Interrupts. Building on many previous efforts to reduce virtualization overhead, ELI makes it possible for untrusted and unmodified virtual machines to reach nearly bare-metal performance, even for the most demanding I/O-intensive and interrupt-heavy workloads.

The IBM research is part of a joint European Union initiative known as IOLanes*, whose purpose is to advance the scalability and performance of I/O subsystems in multicore platforms. The three-year project began in January 2010 with the goal of enabling data centers to handle 10 times their current workload with the same storage resources.

What is virtualization and why is it slower?

The cloud’s virtual machines act independently, run their own operating environment, and essentially behave as separate resources – but share physical resources. Hypervisor software makes virtualization possible by managing the virtual machines and their interaction with the underlying hardware.

The hardware generates signals, called interrupts, to notify the CPU asynchronously that I/O operations have completed. The hypervisor intercepts all of these interrupts so that it can signal each virtual machine when its I/O requests have completed. It handles hundreds of thousands of interrupts per second, on top of managing the virtual machines' connection to the physical hardware. Until now, handling these interrupts created enough overhead to significantly slow down virtualized I/O.

Speeding up hypervisors by cutting down on interrupts

IBM's ExitLess Interrupts (ELI) takes a software-only approach that removes nearly all of the overhead involved in handling interrupts: the hardware delivers them directly to the guest virtual machines instead of routing them through the hypervisor.
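To see why the per-interrupt cost matters so much, here is a rough back-of-the-envelope sketch. It is a toy model, not ELI itself: the interrupt rate and the two per-interrupt costs are hypothetical values chosen for illustration, and the function names are our own.

```python
def overhead_fraction(interrupts_per_sec: float, cost_us: float) -> float:
    """Fraction of one CPU core's time spent on per-interrupt overhead (capped at 1)."""
    if interrupts_per_sec < 0 or cost_us < 0:
        raise ValueError("rate and cost must be non-negative")
    # Microseconds of overhead per second, converted to a fraction of that second.
    return min(interrupts_per_sec * cost_us / 1_000_000, 1.0)


def useful_fraction(interrupts_per_sec: float, cost_us: float) -> float:
    """Fraction of the core left over for the workload itself."""
    return 1.0 - overhead_fraction(interrupts_per_sec, cost_us)


# Hypothetical numbers, for illustration only (not measurements).
RATE = 200_000           # interrupts per second
VIA_HYPERVISOR_US = 2.0  # assumed cost when the hypervisor intercepts each interrupt
DIRECT_US = 0.1          # assumed cost when the guest receives the interrupt directly

print(f"via hypervisor:  {useful_fraction(RATE, VIA_HYPERVISOR_US):.0%} of the core left")
print(f"direct delivery: {useful_fraction(RATE, DIRECT_US):.0%} of the core left")
```

Under these assumed numbers, most of the lost capacity comes back once the per-interrupt cost shrinks, which is the intuition behind delivering interrupts straight to the guest.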

The solution has already sped up workloads running in virtual environments by up to 66 percent, bringing them to as much as 97 percent of the performance they achieve on physical machines. In short, by reducing the cost of each interrupt to almost nothing, workloads can run on virtual machines without paying the price of lower performance.

Take, for example, a call center that processes data from millions of customer calls. An application that reads and analyzes this data to gain insights or shape new marketing plans would take hours to run on a computer without virtualization. If the call center moves these computations to virtual machines in the cloud, the same data crunching could take twice as long. With the new technology from IBM, the workload can now be analyzed in the cloud in about half that time.
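The arithmetic behind this example can be sketched in a few lines. The four-hour baseline is a hypothetical figure, and we assume that "twice as long" corresponds to running at half of bare-metal throughput; the 97 percent figure is the best case quoted above.

```python
def run_time_hours(bare_metal_hours: float, relative_throughput: float) -> float:
    """Run time when the workload achieves only a fraction of bare-metal throughput."""
    if not 0 < relative_throughput <= 1:
        raise ValueError("relative_throughput must be in (0, 1]")
    return bare_metal_hours / relative_throughput


BARE_METAL_HOURS = 4.0  # hypothetical baseline, for illustration only

slow = run_time_hours(BARE_METAL_HOURS, 0.50)  # assumed: half of bare-metal throughput
fast = run_time_hours(BARE_METAL_HOURS, 0.97)  # best case quoted in the article

print(f"virtualized, interrupts via hypervisor: {slow:.2f} hours")
print(f"virtualized, at 97% of bare metal:      {fast:.2f} hours")
print(f"ratio: {fast / slow:.2f}")
```

With these assumptions the second run takes a little over half as long as the first, which matches the "half the time" claim to within rounding.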

Extending storage systems by moving functions to the virtual machines

This technique can also be used to extend storage systems: the controller runs services such as file serving, compression, deduplication, analytics, or various types of encryption in virtual machines. The services can then be provided at much lower cost, since they neither need to run on separate hardware nor require tight coupling with the base controller code, both of which have significant drawbacks. The IBM Research team, working closely with storage development groups, has already demonstrated such combinations. In the demonstration, the controller consistently ran at maximum speed by using the hypervisor to isolate and prioritize services.

Senior executives in IBM’s storage division are excited by the possibilities, which have the potential to drive the rapid deployment of new storage services and data protection for a broad range of offerings.

The bottom line is that more efficient processing for data-centric applications can translate directly into less wasted energy, greener data centers, and reduced costs—alongside making the cloud more accessible.

* The IOLanes project is supported by partners Foundation for Research and Technology – Hellas (FORTH) – Coordinator, Barcelona Supercomputing Centre (BSC), University of Madrid (UPM), INTEL Performance Labs Ireland, IBM Research – Haifa, and Neurocom S.A. IOLanes is funded by the E.C. under the 7th Framework program, and is part of the portfolio of the Embedded Systems Unit – G3 Directorate General Information Society.

About Michael Factor

Michael Factor is an IBM Distinguished Engineer with Storage and Systems, IBM Research.

One Response to Faster cloud computing

  1. abraham says:

    I agree that I/O optimization is obviously critical for any data-intensive cloud computing environment. The up-front planning and workload selection analysis should flag these requirements, and the solution architects must address them in their design, maybe by adding more vCPUs to a VM. We have much slower data processing in cloud environments, such as data traveling through the network via VLANs, which is much slower than I/O but critical to system availability. I see that in most cases VM performance problems can be resolved by putting the VMs in a shared resource pool with a reservation quota. The idea of improving I/O performance sounds very good, but we are introducing another layer of complexity to an already complex system. Which hypervisor will come out and support this feature: VMware, Hyper-V, Xen, or KVM? I am sure IBM P system does!
