Virtualisation has been gaining popularity for several years. The last two to three years in particular have seen customers virtualising their environments in order to cut costs.
Simply put, virtualisation takes a lot of machines and combines them onto one piece of hardware, where each copy of the operating system runs within its own virtual machine.
I have been asked a number of times, 'how does this affect the overall performance of my database-centric applications?' The answer is that our benchmarking and testing have found it to have a significant effect. To understand why, we have to look at why people virtualise in the first place.
Industry experts Gartner and IDC agree that the typical server is only utilised 10 to 20% of the time. Therefore, 80 to 90% of the time that machine is doing nothing other than warming the room.
The promise of virtualisation is to take those spare computing cycles, and to put them into production in a way that delivers return on your investment for that hardware.
We have seen companies that have put as many as 120 virtual machines on a single piece of hardware. However, it is important to understand the impact of virtualisation on data access. Data access for any application, or more specifically retrieving data from the database, can be very CPU and memory intensive, and accounts for as much as 75 to 95% of all the time spent in a data-centric application.
When the CPU is idle 80% of the time, there are spare cycles that can offset bad algorithms, bad data access code, or a bad JDBC driver. That cushion disappears when utilisation rises to 80% or 90%.
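The kind of data access inefficiency that spare cycles can hide is easy to sketch. As an illustrative stand-in for a JDBC or ODBC driver (the article names no specific code, and the table and data here are invented), the snippet below uses Python's built-in sqlite3 module: the wasteful version drags every row across the driver and filters in application code, while the leaner version lets the database do the filtering, so far fewer rows are converted and copied.

```python
import sqlite3

# In-memory database standing in for a production DBMS (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (region, total) VALUES (?, ?)",
    [("EMEA" if i % 4 else "APAC", i * 1.5) for i in range(10_000)],
)

# Wasteful pattern: pull every row across the driver, then filter in
# application code. The driver converts and copies 10,000 rows even though
# only a quarter of them are needed.
all_rows = conn.execute("SELECT region, total FROM orders").fetchall()
apac_client_side = [row for row in all_rows if row[0] == "APAC"]

# Leaner pattern: let the database filter, so the driver only moves the
# rows the application actually wants.
apac_server_side = conn.execute(
    "SELECT region, total FROM orders WHERE region = ?", ("APAC",)
).fetchall()

assert apac_client_side == apac_server_side
```

On a lightly loaded machine both versions feel identical; on a heavily consolidated host, the extra row conversion and copying in the first pattern is exactly the kind of CPU and memory overhead that starts to hurt.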
Despite its benefits, virtualisation has limits, particularly around scalability. Once the hardware is pushed to its limits, inefficient drivers or code create a bottleneck and scalability drops off rapidly.
Often an enterprise has a large number of users running an application that works well, but once the system is virtualised the application no longer performs. The problem usually turns out to be inefficiently written code or a driver that uses too much CPU or memory.
For example, an enterprise might have an application that performs its data access in an application server, connecting to Oracle, and runs fine with 100 users. We have seen many times that once that environment is virtualised and excess CPU and memory are no longer available, the application suddenly starts under-performing.
It usually turns out that either the data access code (Hibernate, JDBC, .NET, ODBC, etc) is not written efficiently, or some piece of middleware – some driver – is written inefficiently and uses too much CPU or too much memory. Within the virtualised environment, that excessive use of CPU, memory and disk (and sometimes the network) becomes a serious issue.
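One concrete form of that inefficiency is statement handling. The sketch below (again with Python's sqlite3 standing in for whatever driver the application uses, and an invented table) contrasts building a fresh SQL string for every row, which forces the driver and database to parse and plan each statement separately, with a single parameterised statement that is prepared once and re-executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, payload TEXT)")
rows = [(i, f"evt-{i}") for i in range(1000)]

# CPU-hungry pattern: a fresh SQL string per row means 1,000 distinct
# statements to parse and plan (and an SQL injection risk besides).
for i, payload in rows:
    conn.execute(f"INSERT INTO events (id, payload) VALUES ({i}, '{payload}')")

conn.execute("DELETE FROM events")

# Leaner pattern: one parameterised statement, re-executed with bound
# values, so the parse/plan cost is paid once.
conn.executemany("INSERT INTO events (id, payload) VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Either way the rows arrive; the difference is the CPU burned per statement, which is invisible with spare cycles and very visible on a saturated virtualised host.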
Ten or 15 years ago, when hardware was slower and more expensive, code had to be written better, or at least more efficiently.
This requirement faded as hardware became so much cheaper and faster. With virtualisation, those same problems matter again: code has to be better, algorithms have to be better, and the database middleware has to be better. In fact, every component of the stack has to be better if the fruits of virtualisation are to be realised.
The cost of inefficient techniques becomes clear when you consider the scalability lost, which is usually the reason for virtualising in the first place. The good practices discussed in The Data Access Handbook are even more essential for successful virtualisation.
Rob Steward is Vice President of Research and Development at Progress DataDirect Technologies. He has spoken on data access performance at many industry events, including Microsoft PDC, Devscovery, WinDev, and Virtualization World. Rob is also co-author of The Data Access Handbook.