The financial sector may be one of the more cautious industries when it comes to adopting the cloud. But for HSBC the ability to analyse large volumes of information and access machine learning tools via APIs has served as a catalyst for its own cloud ambitions.
Banks are, by their nature, data-intensive organisations, and HSBC - one of the world's largest banks, with 37 million customers and billions of dollars in assets - is certainly no exception.
The 150-year-old lender has around 100 petabytes of information across its organisation, and that figure is growing fast as customers change the way they bank, favouring digital interactions over traditional methods.
“We have more and more demands to do more with data, every day,” says HSBC chief architect, David Knott. “And we also have more coming our way as well. Banks are data organisations at heart and we have a lot of data to manage, and increasing demand for deriving insight and value from that data.”
While the growth of data has its benefits, allowing the bank to understand its business and customers better than ever, there are also challenges in managing and gaining usable insight from the vast amount of information. Machine learning tools will play a key role in achieving this for HSBC, says Knott, and using them involves adopting a ‘cloud-first’ approach for its analytics requirements.
“As we have seen the power of machine learning and the ease of consumption through machine learning APIs, we realise that is going to be a huge part of our future,” says Knott.
“We also realise that is something we would really struggle to service by trying to do it all on-premise,” he continues. “We didn't have the native machine learning capability, we are not going to build a ground-up machine learning engineering capability - there are only so many people in the world that can do that. But people like Google have made it easy to consume from the cloud, so we can go there and that is going to make a huge difference to us.”
Cloud proofs of concept
HSBC is now on its way to running data analytics and machine learning in the cloud, having completed a set of five proof of concept (PoC) projects in partnership with Google. CIO Darryl West revealed the pilot projects earlier this year at Google Cloud Next, with work centring on areas including anti-money laundering and risk simulations. West said that it will help the bank become a "simpler, better and faster organisation" and respond more quickly to customer demands.
“We did these PoCs very quickly, they were successful,” Knott says, adding that the bank has a number of other projects lined up once the initial projects are fully up and running. “We are literally a few weeks away from going live with the first set of five use cases and then hot on the back of that we will say to all the other hundred people or so that have been waiting: ‘you can now start deploying’.”
From Hadoop to BigQuery and CloudML
The bank had previously run all of its analytics on-premise, progressing from SQL to traditional data warehouses, before investing in Hadoop around 2011. “We had built what most people had built: a set of big data and analytics capabilities using various parts of the Hadoop ecosystem,” Knott says. This involved a mixture of open source and commercial technologies “which we had selected and then integrated together to basically build ourselves data lakes and analytics clusters and all that kind of stuff”.
However, the Hadoop systems had limitations in areas such as scalability and flexibility.
“We got some value out of that but to be honest we found it hard to keep on top of, just hard to build skills at the pace required to integrate new technologies,” Knott says.
“No matter how hard we ran there is always something new coming in that we wanted to get access to, but we couldn’t get there quite fast enough to have really finished deploying what we were deploying previously.
“So it was hard to manage, hard to keep on top of, and also hard to scale. We had reasonable success but we were having these challenges.”
The aim for the company was to access machine learning capabilities, but without the need to run the systems on-premise.
On Google Cloud, HSBC is using a variety of tools. These include Bigtable and BigQuery for data analytics, Dataflow, and Pub/Sub for event handling, as well as a range of Google’s machine learning APIs, including one for Data Loss Prevention.
“Around last year we started a conversation with all of the cloud providers to say ‘show us what you have got’,” Knott says, “and after some conversations we decided to work with Google on a series of PoCs, to answer three questions which were: if we bring some big data use cases to you, will they work, can we do the things we are trying to do? [Secondly] are they economic - can we do them at least the same price but hopefully a cheaper price than beforehand. Thirdly, is it easier, basically, which was really the big one.”
Knott says that Google delivered on each of these categories.
“The answer to those three was, the first two a very firm yes, and the third one a very, very strong yes. It was enormously easier to build this stuff, basically because we were just consuming a set of services sitting on top of the cloud infrastructure that already existed.
“We found that by using the cloud, we are taking a huge amount of friction out of the process, and basically all the people who are focused on deriving value from data really do data science or software engineering rather than infrastructure managing and provisioning and procurement and all that kind of stuff.”
Machine learning tools maturing
Google is not the only large cloud provider offering machine learning tools, with Amazon Web Services (AWS) and Microsoft Azure selling their own services, but Knott found that the search giant had a clear advantage in this area. “They invented this stuff,” Knott says.
HSBC had previously trialled machine learning tools within its organisation. However, the tools available were difficult to manage and the initiatives had limited success.
“A year to 18 months ago, we did have a number of smaller companies selling machine learning solutions into us, and they were fairly niche and they were quite difficult to adopt and consume. You kind of needed real hardcore AI PhDs and people like that to really get to grips with it,” he says.
“What we have seen is that, not just on the cloud but generally some other more commercial platforms as well, we have seen a real…I hesitate to use the word democratisation, it is more commercialisation and industrialisation of that marketplace, to actually make it consumable to normal engineers.
“So as one of the people I work with has put it: ‘you don't have to learn how to build machine learning from the ground up’, you now have to learn how to use machine learning - there is still learning to do, but that is a much more doable proposition.”
He says that the number of tech staff working directly with machine learning systems is “probably in the ‘tens’” at the moment. However, he says HSBC now recognises the need to “establish the ability to use machine learning as a core competency” within the organisation. Knott expects the number of staff with knowledge of machine learning systems to expand into the hundreds quite swiftly.
Knott acknowledges that accessing skills is a challenge in a fast growing area.
“We look at different ways to get hold of those skills, including training programmes and scholarships,” he says. “But to be honest it is a young industry - AI and machine learning have been around for decades, the new generation of machine learning is still very new, and I think there is a distinction between those.
“So if I look in Google at Fei-Fei Li, who runs the machine learning product line, there aren't many of her in the world. But fortunately we don't need her in our organisation, because Google are making the product of her research available to us through the cloud.”
The partnership with Google is just one part of HSBC’s extensive cloud strategy - itself something of a rarity among the large multinational banks. HSBC is also using Azure to move some of its Microsoft applications to the cloud, and is using AWS for dev and test environments. It is also using software-as-a-service tools such as Oracle Fusion financial apps.
On-premise, it has a mix of legacy systems running alongside more modern, virtualised applications. It is also using Pivotal’s Cloud Foundry platform-as-a-service tools to further automate delivery of its infrastructure to developers.
“Like any large technology organisation, we have managed a very large and diverse portfolio of suppliers for decades - that is bread and butter of being an enterprise technology business. So in some ways managing multiple providers who do similar things is not new to us, we have managed those kinds of relationships for a long time.”
Nevertheless, managing multiple cloud vendors presents some challenges.
“There are some specific nuances that the cloud brings,” he says. “One of those is one that the cloud industry hasn’t quite figured out how to address yet, which is service management across multiple clouds and across our own services as well.
“It is going to be increasingly a fact of life where certain business services will depend on services that come partly from us, partly from our cloud provider, potentially from multiple cloud providers.
“When something goes wrong, you need to make sure that everybody is going to show up and play their part in fixing it quickly. So through our cloud journey, one of the reasons why we have been a bit deliberate about some of our adoption is that we have wanted to make sure that we really understand what happens when something breaks at three o'clock in the morning - because it is always at three o'clock in the morning.
“Beyond that I don't think we have seen huge challenges in adopting the multicloud world. I am sure that as it matures we will find other stuff, but the one top of mind at the moment is how do we do service management across multiple clouds.”
HSBC is now considering which application workloads can move to the cloud too, but Knott says the current big data and analytics plans are a “pretty bold move” in themselves, and one “that is going to keep us busy for a while”.
“So we will start with that and we will keep an eye on all the other stuff. We are still forging relationships, we are still seeing what works and what doesn't work. That is the stuff we think adds most value in the near term, so we will get on with that first and then see what comes later.”
Generally, though, he sees that more and more of the bank’s technology will be running in the cloud in future. “We are definitely accelerating,” he says. “I think if you have been watching HSBC for the past couple of years you will have seen that, so we probably wouldn't have been having this conversation a year ago about our adoption of cloud.
“I see that our peers are probably in a similar position to what we are, in that they have partly in the past been driven by cost, and the cloud has often been seen as a cost play. I think the cloud can be a cost play, but it is not the most compelling reason to move.
“We are moving for capability - what we are accessing is a big data and analytics capability and some of that capability you can only really realistically have on the cloud. So I think if other banks want that capability then they are going to have to accelerate their appetites.”