Data centre decisions are never easy, no matter what the size of your company. When it comes to making the most of your facility, why not follow the lead of the big players?
We talked to executives at some of the tech industry's largest companies to find out how they are innovating in brand new data centres, including one that Google built in Belgium and Cisco's new state-of-the-art facility in Texas. Intel and Yahoo also weighed in with their best practices.
Google: All about efficiency
Google operates "dozens" of data centres all over the world. The firm's primary focus is on making its data centres more efficient than industry averages, says Bill Weihl, green energy czar at Google. According to EPA estimates, many data centres run at a PUE (power usage effectiveness) of around 2.0, meaning the facility consumes twice as much energy as the IT equipment itself actually needs. PUE is the total energy consumed by a data centre divided by the energy consumed by its IT equipment.
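The metric itself is simple arithmetic; a minimal sketch, using illustrative energy figures rather than any company's actual measurements:

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy divided by
    the energy delivered to IT equipment. 1.0 would be the ideal."""
    return total_facility_kwh / it_equipment_kwh

# Illustrative figures: a facility drawing 2,000 kWh to support
# 1,000 kWh of IT load sits at the EPA's typical estimate of 2.0.
print(pue(2000, 1000))                # → 2.0
print(round(pue(1180, 1000), 2))      # → 1.18, Google's reported average
```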
Google, for its part, runs at around 1.18 PUE across all its data centres, Weihl says. One of the ways Google has become more efficient is by using so-called "free cooling" for its servers.
"We manage airflow in our facilities to avoid any mixing of hot and cold air. We have reduced the overall costs of cooling a typical data centre by 85%," Weihl says, adding that the reduction comes from a combination of new cooling techniques and the power backup methods described below. The average cold-aisle temperature in Google's data centres is 80 degrees Fahrenheit (about 27°C), instead of the typical 70°F or below. Hot-aisle temperature varies based on the equipment used; Google would not elaborate on specific hot-aisle temperatures or name specific equipment.
Further, Google uses evaporative cooling towers in every data centre, including its new facility in Belgium, according to Weihl. The towers pump hot water to the top of the tower through a material that speeds evaporation. While evaporative cooling is doing the work, the chillers used to cool the data centre are needed less often, or not at all.
"We have data centres all around the world, in Oregon where the climate is cool and dry and in the southwestern and midwest part of the US. Climates are all different, some are warmer and wetter, but we rely on evaporative cooling almost all of the time," he says.
Weihl says the facility in Belgium, which opened in early 2010, does not even have backup chillers, relying instead on evaporative cooling. He says it would take a "100 year event" for evaporative cooling alone to be insufficient, so Google chose to forgo backup chillers to reduce the facility's electrical load. The centre runs at maximum load most of the time, he says. On the infrequent hot days that do occur, administrators idle or shut down a few servers.
He advises companies to look seriously at "free cooling" technologies such as the evaporative cooling towers described above. Another option is to use towers to redirect outside air to servers, then allow the server temperatures to rise within acceptable ranges and use less direct cooling on the racks.
In terms of power management, Google steps down the utility's AC feed and converts it to the DC voltage its servers use with as few conversion stages as possible. Google also uses local backup power, essentially a battery on each server, instead of a traditional UPS, largely to avoid the extra conversions a central UPS introduces.
Google uses a transformer to step down power from the utility lines before it is sent to servers. Traditionally, each server's individual power supply has handled the AC-to-DC conversion, but that tack has proven inefficient, industry experts agree.
"Google, Facebook and many others have begun reducing the number of AC/DC conversions from when power hits the building to when it's delivered to the servers," says David Cappuccio, an analyst at Gartner. This can take the form of DC-based power distribution systems that move the conversion away from individual servers and to the top of each rack. Typically this shaves a few percentage points off energy use, he explains.
Google also uses server power supplies and voltage regulators that are 93% efficient, Weihl says. Making the regulators any more efficient would be prohibitively expensive.
"We use a single output power supply for a 12-volt rail that draws virtually no power when it is charged. The backup draw is less than 1% as opposed to a typical draw of 15% or more," says Weihl, citing the EPA estimates on typical data centre energy draw.
Another interesting Google technology involves custom software tools for managing its servers and the data they generate. Weihl says much of the management is automated, with tools that help determine why a server is drawing too much power or how it may be misconfigured. The company uses a proprietary system called Bigtable, which stores tabular data sets and allows IT managers to find detailed information about server performance.
Google claims that its data centres run at an overall efficiency overhead of 19%, compared to the EPA estimate of 96% for most data centres. The overhead percentage indicates how much extra power goes to cooling, power distribution and other facility loads rather than to running the servers themselves.
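The overhead figure maps directly onto PUE; a quick sketch of the relationship (my arithmetic, not a published formula from Google or the EPA):

```python
def overhead_pct(pue: float) -> float:
    """Non-IT overhead as a percentage of the IT load: (PUE - 1) * 100,
    rounded to one decimal place for readability."""
    return round((pue - 1.0) * 100.0, 1)

print(overhead_pct(1.19))  # → 19.0, Google's reported overhead
print(overhead_pct(1.96))  # → 96.0, matching the EPA estimate for typical centres
```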
Cisco and the "downsized upgrade"
Like other organisations, Cisco has implemented the concept of a "downsized upgrade" achieved through virtualisation and consolidation: reducing the overall size of the data centre and compacting equipment into smaller chassis to save energy, while at the same time actually increasing the data centre's performance.
At Cisco's new centre in Texas, for instance, the company mapped out enough space for a massive cluster of computers that can scale with rapid growth. The basic concept: cram as much computing power as possible into a small space while still getting high performance.
A cluster, by Cisco's definition, is a rack holding five Cisco UCS (Unified Computing System) chassis, with eight server blades in each chassis. The data centre as a whole has the potential to house 14,400 blades. Each blade has two sockets supporting eight processor cores, and each core supports multiple virtualised OS instances. To date, Cisco has installed 10 clusters, which hold 400 blades.
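Those figures can be checked with a little arithmetic; a quick sketch using the chassis and blade counts above (the deployed-cluster numbers are Cisco's, the derived cluster capacity is my calculation):

```python
CHASSIS_PER_CLUSTER = 5   # one rack of UCS chassis per cluster
BLADES_PER_CHASSIS = 8

blades_per_cluster = CHASSIS_PER_CLUSTER * BLADES_PER_CHASSIS  # 40 blades

# 10 clusters installed to date matches Cisco's count of 400 blades.
print(10 * blades_per_cluster)        # → 400

# Full build-out potential of 14,400 blades implies this many clusters:
print(14_400 // blades_per_cluster)   # → 360
```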
Another way Cisco has improved is with cable management. John Manville, Cisco's vice president of IT, says Cisco has saved $1 million by reducing the number of cables in its data centres.
"Most people don't realise" that cabling accounts for 10% to 15% of total costs, says Manville. "That reduction in cables also keeps the airflow moving better, and with the new cooling technology we installed, we expect to save $600,000 per year in cooling costs."
Beyond this consolidation, Cisco is also working to reduce hardware and management costs for each operating system and each server. Manville says the cost today is around $3,700 per physical server per quarter. Through virtualisation, he expects to cut that to $1,600 per physical server per quarter, and eventually hopes to reduce the figure further, to $1,200 per server per quarter.
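Those per-server savings compound quickly at scale; a back-of-the-envelope sketch using Cisco's quarterly figures (the 1,000-server fleet size is hypothetical, chosen only to show the arithmetic):

```python
def annual_savings(per_server_before: float, per_server_after: float,
                   servers: int) -> float:
    """Quarterly cost reduction per server, annualised across a fleet."""
    return (per_server_before - per_server_after) * 4 * servers

# Cisco's figures: $3,700/quarter today, $1,600 via virtualisation,
# $1,200 as the longer-term goal. The fleet of 1,000 is illustrative.
print(annual_savings(3700, 1600, 1000))  # → 8400000.0, i.e. $8.4m/year
print(annual_savings(3700, 1200, 1000))  # → 10000000.0, i.e. $10m/year
```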
The Texas data centre is actually two separately located facilities that operate as one, a concept called Metro Virtual Data Centers, which Cisco developed internally and does not sell publicly. The company plans to open two more MVDC facilities in the Netherlands by the end of 2012, for a total of four operating as one.
The MVDC approach is not about cost savings or energy conservation, because both data centres run the same applications at the same time. Instead, Cisco uses the technique for replication. If a natural disaster takes out one data centre, operations continue unabated in real time.
Like Google, Cisco is highly focused on efficient operations. Manville says the Texas facility goes a few steps further than most. For instance, power is distributed at 415V for a savings of about 10% compared to the typical lower voltage systems used in other places. The facility also uses all-LED lighting for about a 40% savings in energy use compared to incandescent lights, he says.
LED lights are expensive, about where compact fluorescent bulbs were when they first appeared, says Charles King, an analyst with Pund-IT. "Over time, as costs come down, LED will become a no-brainer, so Cisco deserves kudos for pushing the envelope."
Yahoo sites data centres in remote locales
The traditional approach is to locate at least one major data centre in a city, or at least in a reasonably large population centre, so IT administrators have easy access to servers and storage. According to Scott Noteboom, Yahoo's senior director of data centre engineering and operations, that thinking has changed dramatically in recent years.
Yahoo instead operates larger centres in more remote areas; the company is building five new facilities in North America, Europe and Asia. This location strategy would not be possible without improvements in management software, which has advanced to the point that IT employees at Yahoo headquarters can manage the finer details, such as storage and virtual servers, remotely. Noteboom declined to provide further details about how this works.
Siting data centres remotely allows Yahoo to tap into lower utility costs in, say, Washington or Oregon. Noteboom says the build time is much less for these remote facilities, in some cases just six months compared to the more typical 18 to 24 months, and that also reduces costs compared to building in a large urban location.
The main benefit of building faster, he says, is greater accuracy in forecasting how much computing power Yahoo will need when the facility is complete: it is simply easier to do capacity planning for six months down the road than for two years hence. "The faster you build, the less you need to rely on the crystal ball looking that much more forward into the future."
In addition, Noteboom says Yahoo has started using a new approach in which services can be scaled up or down dramatically as computing needs change. In the past, an entire data centre would be rated for a single uptime and capability; now those ratings can be much more granular. Using the company's software, called Yahoo Data Center Flex Tier QOS Design, IT workers can specify which power utility should be used, which backup generators and UPS units to buy, and what level of redundancy is required.
An example Noteboom gave: Email or search might require high availability, while a new beta service for checking stock quotes might not need as much. Yahoo can set different QoS levels for those applications. In the past, all applications would be locked into the same QoS.
The software also allows Yahoo to move applications and services to servers with higher redundancy, or to fall back on a cluster where possible, when one or two nodes go down for a short time.
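Yahoo did not describe Flex Tier's internals, but the per-application tiering idea can be illustrated with a simple mapping; everything below (the tier names, the application names, the lookup function) is a hypothetical sketch, not Yahoo's actual design:

```python
# Hypothetical QoS tiers in the spirit of Yahoo's Flex Tier approach:
# each tier bundles a redundancy level and backup-power entitlements.
QOS_TIERS = {
    "platinum": {"redundancy": "2N", "ups": True, "generators": 2},
    "bronze":   {"redundancy": "N",  "ups": False, "generators": 0},
}

# High-availability services get the expensive tier; a beta service
# that can tolerate downtime is pinned to the cheap one.
APP_TIER = {
    "mail": "platinum",
    "search": "platinum",
    "stock-quote-beta": "bronze",
}

def qos_for(app: str) -> dict:
    """Look up the infrastructure profile an application is entitled to."""
    return QOS_TIERS[APP_TIER[app]]

print(qos_for("stock-quote-beta")["redundancy"])  # → N
print(qos_for("mail")["generators"])              # → 2
```

The point of the sketch is the decoupling: redundancy is purchased per application tier rather than once for the whole building, which is what lets a facility avoid running every UPS and server full bore all day.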
Another way Yahoo scales is by keeping more options open with local utility companies. A new data centre might start off using only 1 or 2 megawatts of power but could scale up to 20 megawatts. Before construction begins, Yahoo contracts with multiple utilities for the same data centre, or negotiates flexible contracts with a single utility, and can even arrange different tax incentives for each level of service.
"From a cost perspective, this gives us more bang for the buck instead of having redundant investments," Noteboom says.
So far, the new flexible QoS approach is working well. A just-completed data centre in upstate New York has a PUE rating of 1.08, says Noteboom, mostly due to the ability to adjust services to application needs. Not every UPS and every server is running full bore all day. Rather, the IT staff adjusts to the current QoS needs of the application.
Pund-IT's King says Yahoo is onto something. "This is an interesting approach, especially the granular adjustment of QoS according to an app's importance. It could be a notable addition for cloud service providers, a way to better structure their deals for end customers," he says.
However, King is less certain about switching utility feeds on the fly. He says that although it's "novel," its effectiveness "depends largely on locale. There aren't a lot of places where companies have a choice of utility provider. But that situation is also likely to change as more alternative power sources come online," he says.
Intel's chimney stack
Intel engineers have developed a unique chimney stack system that works like a plastic curtain to expel hot air from the server racks. Intel now licenses the chimney stack technology. "This is a cost effective way to keep the cold air in the cold aisle and saves a lot on power consumption," says Kim Stevenson, vice president of IT at Intel.
Stevenson says the chimney stacks, located above each rack, were necessary because many of Intel's data centres sit inside manufacturing facilities. Some of those buildings are not exactly new, so the stacks were a way to work within the existing structure while dealing with the heat issues at the same time. The alternative, relocating the servers to resolve the cold-aisle heat dissipation problem, would have cost far more.
Further, Intel has embarked on a strategy to view the "entire data centre, software, servers, storage, networking and facilities, as a system that is optimised for specific business needs," according to Intel's most recent annual IT report. For example, silicon design teams need to run millions of highly compute-intensive jobs each week. Intel IT "met these unique requirements with a high performance computing grid optimised for design."
In general, the company analyses the performance of its data centres based on four key metrics: efficiency, quality, capacity and velocity. This year, the report says, Intel plans "to implement business intelligence tools that will enable us to apply supply chain concepts to our private cloud, helping us better understand demand signals to improve capacity planning."
Another innovation concerns how Intel manages remote access, which relates to Intel's changing business model. Traditionally, the company developed only highly technical products, including motherboards and processors, and its data centre was closed off and guarded. Now Intel is expanding its business to provide products and services that are more open.
For example, Intel is selling applications for netbooks, and the company must validate users who are purchasing the software. Intel uses its own Intel SOA Expressway product to validate all incoming user accounts. SOA Expressway helped IT create a seamless security policy enforcement architecture that simplified the messaging design and reduced unnecessary authentications, Stevenson says.
Stevenson explains that the only alternative would have been to invest in an expensive appliance for token authentication related to the outside transactions.