TfL selects AWS over Azure to handle 20x spikes in journey planner website traffic

Transport for London (TfL) has picked Amazon Web Services (AWS) over Windows Azure as the public cloud platform to power its online services, such as its popular journey planner tool, allowing it to scale computationally demanding services to meet huge seasonal spikes in demand.

A large number of London commuters using various modes of public transport rely on TfL’s website each day. The site, revamped in 2012, receives three million page views and 600,000 unique views on an average day.

However, when there is disruption to services, such as a snowfall in the capital, the demand for its services can increase dramatically as travellers check their smartphones. According to Dan Mewett, solution architect for Transport for London, by hosting its website in the AWS cloud, TfL is able to quickly scale to cope with the additional strain on its systems – a situation that would have been impossible without ‘huge’ outlays on hardware.

“The services such as journey planning, they are complex. If I wanted [to] provision the hardware to handle that I would have to buy a huge amount,” he said at the AWS Summit in London on Wednesday.

“So the number one reason why we chose AWS, or cloud in general, was to deal with the ‘snow day’ phenomenon, and we wanted to use auto-scaling to meet the demands. We basically unlock a massive amount of cost savings.”

He added: “If you have to scale your current deployment 20 times to deal with a spike that you might see five times a year, the cost of that in physical hardware compared to just running up 20 times when you need it in AWS, the cost savings were phenomenal, and that really kind of sealed it.”
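
The arithmetic behind that comparison can be sketched roughly as follows. Only the 20x multiplier and the five spikes a year come from Mewett's remarks; the baseline of 10 capacity units, the flat cost of 1.0 per unit-hour and the assumption that each spike lasts a full day are hypothetical placeholders, and the sketch ignores the real difference between owning hardware and renting cloud instances.

```python
HOURS_PER_YEAR = 365 * 24


def fixed_capacity_cost(baseline_units, peak_multiplier, unit_hour_cost):
    """Cost of running peak-sized capacity all year round."""
    return baseline_units * peak_multiplier * HOURS_PER_YEAR * unit_hour_cost


def elastic_cost(baseline_units, peak_multiplier, unit_hour_cost, spike_days):
    """Cost of running the baseline all year plus extra capacity on spike days only."""
    spike_hours = spike_days * 24
    base = baseline_units * HOURS_PER_YEAR * unit_hour_cost
    extra = baseline_units * (peak_multiplier - 1) * spike_hours * unit_hour_cost
    return base + extra


# Hypothetical inputs: 10 baseline units at a cost of 1.0 per unit-hour.
# From the article: 20x scale-up, roughly five spikes a year (assumed one day each).
fixed = fixed_capacity_cost(10, 20, 1.0)
elastic = elastic_cost(10, 20, 1.0, spike_days=5)
ratio = fixed / elastic

print(f"fixed: {fixed:,.0f}  elastic: {elastic:,.0f}  ratio: {ratio:.1f}x")
```

Under these illustrative assumptions, permanently provisioning for the peak costs roughly 16 times as much as scaling up only on spike days. The actual figure would depend on TfL's real pricing and spike durations, neither of which was disclosed.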

Choosing AWS over Azure

TfL has traditionally been a Microsoft shop, and has used the vendor’s Azure platform in the past to host data feeds for third parties to build their own services.

However, TfL chose AWS for its website requirements because of the ability to add on tools. These include Varnish, the high-availability front-end caching software, which is used to sustain very high traffic and handle loads of 2,000 to 3,000 requests per second.

“We get three million website views a day, and we have to show you the status information across all modes of transport. We also provide bike status,” he said.

“We have to provide this at volume. So we have to have some very smart engineering. We have done that – we use some of the AWS technologies [such as auto-scaling], but we also use a caching technology called Varnish.

“One of the reasons we chose AWS is that it is very good at allowing you to build what you need. So we considered things like Azure, and the majority of our [systems] are Microsoft based, but it can’t allow us to mix and match the solutions we want to deliver these kinds of volumes. That is one plus point for AWS that really helped.”

Data complexity

One of the challenges faced by TfL is managing the complexity of its data, with the need to deliver real-time information on journeys that cover 18,000 bus stops and 8,000 vehicles.

“We want to be able to tell when a bus is going to arrive from one of the routes coming to the stop. That creates a situation where there is 130,000 predictions every 30 seconds,” he said, adding that latency is a ‘key’ requirement.

“So we had to ensure that we could deliver this information on our website, and the implementation uses some smart technologies. We use a technology called WebSockets and this allowed us to push the information to citizens as soon as the information is available.

“This is good because the alternative is for [users] to poll [the site] every few minutes, and if there is a strike day with two million people polling you every five minutes, then you are going to crumble.”
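
To put rough numbers on that comparison, the short calculation below uses only the figures quoted in this article (two million users polling every five minutes, and 130,000 predictions every 30 seconds). The function names are illustrative, and the sketch assumes polls are spread evenly over the interval, which is optimistic for a real strike day.

```python
def polling_requests_per_second(users, poll_interval_seconds):
    """Average request rate if every user polls once per interval."""
    return users / poll_interval_seconds


def predictions_per_second(predictions, window_seconds):
    """Average rate at which new predictions are produced."""
    return predictions / window_seconds


# Figures quoted in the article.
polling_rps = polling_requests_per_second(2_000_000, 5 * 60)
prediction_rate = predictions_per_second(130_000, 30)

print(f"polling load: about {polling_rps:,.0f} requests per second")
print(f"prediction feed: about {prediction_rate:,.0f} predictions per second")
```

On these figures, polling alone would average about 6,700 requests per second, more than double the 2,000 to 3,000 requests per second cited for the Varnish cache, before accounting for any bunching of requests.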

He added: “In order to use WebSockets, you can’t just use any old type of load balancer; we need to use a TCP load balancer. But with AWS we didn’t have this problem and we were able to build what we wanted to, and now we can offer this service to everybody.”
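
The push model Mewett describes can be illustrated with a minimal in-process sketch. This is not TfL's implementation and does not use a real WebSocket library: asyncio queues stand in for open client connections, and the `ArrivalHub` class, the stop identifiers and the route number are all hypothetical. The point is only that the server sends each prediction once, to the clients watching that stop, as soon as it is available.

```python
import asyncio
from collections import defaultdict


class ArrivalHub:
    """Hypothetical publish/subscribe hub; each queue stands in for one open connection."""

    def __init__(self):
        self._subscribers = defaultdict(set)

    def subscribe(self, stop_id):
        """Register a client for one stop and return its message queue."""
        queue = asyncio.Queue()
        self._subscribers[stop_id].add(queue)
        return queue

    def unsubscribe(self, stop_id, queue):
        """Drop a client, for example when its connection closes."""
        self._subscribers[stop_id].discard(queue)

    def publish(self, stop_id, prediction):
        """Push a prediction to every client watching the stop; return the delivery count."""
        queues = self._subscribers[stop_id]
        for queue in queues:
            queue.put_nowait(prediction)
        return len(queues)


async def demo():
    hub = ArrivalHub()
    first = hub.subscribe("stop-1")
    second = hub.subscribe("stop-1")
    other = hub.subscribe("stop-2")

    # A new prediction arrives and is pushed immediately; no client has to ask for it.
    delivered = hub.publish("stop-1", {"route": "73", "minutes": 4})
    received = [await first.get(), await second.get()]
    return delivered, received, other.empty()


result = asyncio.run(demo())
print(result)
```

In a real deployment each queue would be replaced by a long-lived WebSocket connection, which is why the load balancer in front of the servers has to pass those connections through rather than treat each message as a separate short request.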