Genetics research for cancer, diabetes and heart disease boosted by new High Performance Computer

Scientists at the Wellcome Trust Centre at Oxford University are able to analyse the genetics behind illnesses such as diabetes, heart disease and cancer faster than ever, thanks to a new high performance computing (HPC) cluster.


The data the scientists analyse supports international studies of cancers, type-2 diabetes, obesity, malaria and the spread of bacteria (which could help track resistance to antibiotics, for example).

The university's statistical genetics department, one of the largest genome sequencing facilities in England, runs two clusters simultaneously to ensure a constant stream of genomic information. It has replaced one of them with a Fujitsu BX900 blade-based high-performance cluster and a new network and storage system. The new cluster will support the work of more than 100 researchers, who can use its computing power simultaneously and without interruption.

The cluster has given researchers an almost three-fold increase in performance over its predecessor. It reads block data at 20 GB per second thanks to a combination of Intel Ivy Bridge CPUs and a Mellanox FDR InfiniBand network that links the compute nodes to a DDN GRIDScaler SFA12K storage system.

This infrastructure, combined with genomic analysis software, has allowed researchers to shorten some data analyses that previously took months.

How much memory does a genome take up?

The department produces more than 500 genomes a year. A single genome takes up about 30GB on disk when compressed. The department stores around 20,000 genomes, occupying roughly half a petabyte.
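
As a rough check on those figures, the arithmetic can be sketched in a few lines of Python. The constants come from the article; the decimal unit conversion (1,000 GB per TB) and the function names are assumptions made for illustration only, and real storage use will vary with compression and file formats.

```python
# Figures quoted in the article (all approximate).
GENOME_SIZE_GB = 30        # compressed size of one genome on disk
GENOMES_STORED = 20_000    # genomes currently held by the department
GENOMES_PER_YEAR = 500     # stated minimum yearly output

# Assumption: decimal units, which the article does not specify.
GB_PER_TB = 1_000
TB_PER_PB = 1_000


def archive_size_pb(genomes, size_gb=GENOME_SIZE_GB):
    """Estimated disk footprint of a genome archive, in petabytes."""
    return genomes * size_gb / GB_PER_TB / TB_PER_PB


def yearly_growth_tb(genomes_per_year=GENOMES_PER_YEAR, size_gb=GENOME_SIZE_GB):
    """Estimated new storage needed per year, in terabytes."""
    return genomes_per_year * size_gb / GB_PER_TB


if __name__ == "__main__":
    print(f"Archive: {archive_size_pb(GENOMES_STORED):.1f} PB")
    print(f"Growth:  {yearly_growth_tb():.0f} TB per year")
```

Under these assumptions the archive comes to about 0.6 PB, in line with the "half a petabyte" quoted, and 500 new genomes a year would add around 15 TB.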

With the new cluster, researchers have an extra 1,728 cores and 27TB of memory to hold genomes, so that individual research teams can use the data to study the genetic basis of human diseases.
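
To put those numbers in perspective, the memory available per core can be estimated as follows. This is a back-of-the-envelope sketch: the article does not say whether "TB" is decimal or binary, nor how memory is distributed across nodes, so both conversions are shown and the function name is hypothetical.

```python
# Figures quoted in the article.
CORES = 1_728
MEMORY_TB = 27


def memory_per_core_gb(memory_tb, cores, gb_per_tb):
    """Average memory per core, assuming memory is spread evenly (an assumption)."""
    return memory_tb * gb_per_tb / cores


if __name__ == "__main__":
    # Decimal units: 1 TB = 1,000 GB
    print(memory_per_core_gb(MEMORY_TB, CORES, 1_000))
    # Binary units: 1 TB = 1,024 GB
    print(memory_per_core_gb(MEMORY_TB, CORES, 1_024))
```

Either way the result is roughly 16GB of memory per core, on average.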

Different teams can use the cluster from different servers

Dr Robert Esnouf, head of the research computing core at the Wellcome Trust Centre for Human Genetics, said: “Each research group can use their own server to submit jobs to, and receive results from, the cluster. If it runs on the server it can easily be redirected to the cluster. Users don’t need to log on directly to the cluster or be aware of other research groups using it. We try to isolate groups so they don’t slow each other down and have as simple an experience as possible. Users have Linux skills, but they do not need to be HPC experts to use the system safely and effectively. It is a deliberate design goal.”

Additionally, the University’s infection and immunity research is now classed as world-class, as the cluster allows scientists to create some of the world’s highest resolution electron microscopy reconstructions, which will help scientists to understand how and why we get ill.

Professor David Stuart, professor of structural biology at Oxford University, said: “Advances in detector design and processing algorithms over the past two years have revolutionised electron microscopy, making it the method of choice for studying the structure of complex biological processes such as infection. However, we thought we could not get sufficient access to the necessary compute to exploit these advances fully. The new genetics cluster provided such a fast and cost-effective solution to our problems that we invested in expanding it immediately.”

The cluster and storage systems were designed by the Wellcome Trust Centre for Human Genetics and integrated by OCF, who also provided training for the new system.

Another university in the UK uses HPC for more novel purposes: University Campus Suffolk re-equipped its game engineering department with HPC workstations to ensure high availability for students who were throwing “more at the hardware than it could manage”.

Image credit: iStock/Nicolas