AstraZeneca has been using AWS cloud storage to process more than nine terabytes of human genome data to supercharge its cancer research team.
Michael Heimlich, life science solution delivery manager at AstraZeneca, told Computerworld UK during AWS re:Invent in Las Vegas that scientists approached his team asking for a means to reprocess 20,000 genome samples using the company's new homegrown variant caller, VarDict. The company claims VarDict identifies nearly 20 percent more cell mutations and variations in the genome than the standard algorithms.
"This will provide our scientists with many more target areas to design medications which benefit patients," Heimlich explained to Computerworld UK. The existing drug development process can take at least fifteen years and cost upwards of tens of billions of dollars, and most drugs never even make it to market. AstraZeneca has also recently announced investments in a drug discovery robot called NiCoLA-B, which it claims can test 300,000 compounds a day.
There are also a number of startups applying complex deep learning algorithms to human genome data to speed up drug discovery, with London-based Benevolent Bio already working on its own drug discovery platform to increase speed to market.
"This kind of project you couldn't even dream of doing on-premise," Heimlich said, "not only because we didn't have the storage but it would have meant shutting down entire production runs on our system."
Speaking of the benefits, Heimlich continued: "This is what makes the cloud so ideal for this kind of work. You need to be able to scale up, do the processing and shut it down with no footprint. We had a time constraint of roughly 20 days between Thanksgiving and Christmas. During that time we were able to take our existing process, re-engineer it for the cloud using managed services and EC2 instances."
The researchers started small but "when we were convinced it would flow we flipped the switch, ramped it up with auto scaling groups" and got the project done.
Naturally, doing the processing where the data sits is a lot more efficient than having to move petabytes around all the time, particularly in a globally collaborative environment like genomic research. "In the old days it was: take the data, put it on a hard disk, ship it over and transfer it onto your system, hope you have enough space for it and work on it," Heimlich explained. "Now the paradigm has totally shifted where it's transferred to the cloud to work directly on that data."
Leigh Bennett, VP technical innovation and delivery excellence, added that AstraZeneca still has a large on-premise presence, but is always looking for ways to streamline. "We still have a lot of services on-prem and we still have three data centres, but we are always looking at cost and speed."
AstraZeneca isn't stopping with genomics, or with AWS as a vendor. Bennett said: "Our transformation journey has taken us through various cloud providers not just AWS." This includes point solutions like Workday for HR, ServiceNow for IT and Box for document storage and collaboration, as well as pure infrastructure plays.
When it comes to infrastructure, Bennett seems resistant to vendor lock-in, saying that although AWS is strong on security and compliance, which are massively important when dealing with patient data, "it doesn't all have to be AWS".
"If you look at what Microsoft and Google are doing they are moving into life science positions so we keep a very open mind about our workloads," Bennett said.
Then there are the advanced use cases: applying techniques like machine learning and deep learning to the company's research data, and using advanced technology in its manufacturing plants.
Bennett said that AstraZeneca is looking at a lot of machine learning use cases because "deep learning and machine learning is great for science."
When it comes to the manufacturing plants, Bennett said augmented reality is an interesting technology. The company is already experimenting with Amazon's Alexa personal assistant too. "Alexa has arrived on our production lines to help manufacturing teams talk about their standard operating procedures to ask what they do next."