Sky swaps Oracle for Cassandra to reduce online shopping errors

Broadcasting giant Sky has stopped using an Oracle relational database to support its online shopping services and is instead using a Cassandra NoSQL alternative in a bid to boost performance for customers purchasing goods via its website.


When Sky’s online sales were beginning to ramp up in 2011, it found the shopping journey of some customers was being disrupted. When customers filled their online shopping basket with products provided by the broadcasting company, sometimes these would disappear before they got to checkout.

Sky began experiencing these errors when it was using Ehcache, an open source technology that allowed it to cache session data (shopping baskets) server-side. Since this wasn’t proving effective, the broadcaster began using an Oracle database instead, as it was already using Oracle elsewhere in the stack.

“Actually that worked for a while, but when under load we were seeing issues and errors. So rather than trying to work through the issues with Oracle and fix the bugs, we decided to go with Cassandra,” Sky’s database consultant Paul Makkar told Computerworld UK.

“The thing with a relational database is that there are a lot of control structures and a typical shopping basket was more than a standard row could take in Oracle, which meant we would have had to use something special called a LOB segment. All of this meant extra overheads for something that should have been quite simple.”

He added: “It should be a case of just dumping the data in-memory, retrieving it from in-memory, and be done with it. Not having to get into all the heavy stuff that comes with relational databases.”
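The access pattern Makkar describes, writing a whole basket under one key and reading it back in one step, can be sketched in a few lines. This is only an illustration: the `BasketStore` class, the session ID and the basket contents are hypothetical, and a plain in-memory dictionary stands in for the database.

```python
import json


class BasketStore:
    """Hypothetical key-value store: one serialized basket per session ID."""

    def __init__(self):
        # A plain dict stands in for the real data store (assumption).
        self._data = {}

    def put(self, session_id, basket):
        # The whole basket is written as a single value under one key.
        self._data[session_id] = json.dumps(basket)

    def get(self, session_id):
        # One lookup by key returns the whole basket; no joins or row limits.
        raw = self._data.get(session_id)
        return json.loads(raw) if raw is not None else None


store = BasketStore()
store.put("session-123", {"items": ["hd-box", "sports-pack"], "total_pence": 4500})
basket = store.get("session-123")
```

The point of the sketch is that the basket is never split across rows or columns: it goes in and comes out as one value.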

Having established that a move away from relational databases was the way forward, Sky carried out a market assessment of alternative data storage technologies. Cassandra looked the most promising because it came with disaster recovery out of the box. Redis and Voldemort were also considered.

Cassandra’s distributed NoSQL architecture allows companies to run on cheap commodity machines. Capacity can be scaled out quickly by adding extra nodes, and a cluster can be spread across multiple data centres or across both on-premises and cloud environments.

This means that if a number of machines are taken out for whatever reason, the company’s database isn’t knocked offline and doesn’t lose its performance capabilities – there is no single point of failure.
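A minimal sketch of why losing machines need not take the data offline is given below. It is a deliberately simplified model, not Cassandra's actual implementation: it assumes one token per node, uses MD5 purely as a stable hash, and the node names and replication factor of three are illustrative assumptions rather than details of Sky's cluster.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a ring position (MD5 is used only as a stable hash)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Simplified hash ring: each key is copied to several distinct nodes."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        self.ring = sorted((token(n), n) for n in nodes)

    def replicas(self, key):
        # Walk clockwise from the key's position and take the next nodes.
        tokens = [t for t, _ in self.ring]
        start = bisect.bisect_right(tokens, token(key))
        count = min(self.replication_factor, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def readable(self, key, down):
        # The key can still be served while at least one replica is up.
        return any(node not in down for node in self.replicas(key))


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e", "node-f"])
owners = ring.replicas("basket:session-123")
still_served = ring.readable("basket:session-123", down=set(owners[:2]))
```

In this model a basket remains readable after two of its three replica nodes fail, because every node can serve requests and no single machine holds the only copy.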

Big boost in performance for Sky shoppers

Since implementing Cassandra, Sky says it is getting ten times the performance it had on Oracle, with no errors occurring on the shopping basket.

Sky’s implementation of Cassandra took approximately three months, which Makkar described as cautious, given that his team of 200 developers were unfamiliar with the technology.

He said: “We took our time implementing it, we weren’t getting horrendous failures with Oracle, so there was no rush to get it into production. It’s a whole change of architectural framework and our engineering team weren’t familiar with the technology.”

“Getting the developers up to speed is a complete paradigm change in terms of how you use the database. Don’t be fooled into thinking you can use it like a relational database. For example, it doesn’t flex easily in terms of querying. It’s an education thing, but in time this will begin to mature.”

Makkar is now urging the team to consider broadening the use of Cassandra, because it now has the choice of Oracle’s relational database and a NoSQL alternative in the stack.

“We are using it for session store, the shopping basket, some bulk read only data, surveys – stuff that fits naturally in Cassandra,” he said.

Cheaper storage is a big plus for Cassandra

However, Sky isn’t yet looking at migrating anything out of the Oracle database into Cassandra. 

“Just as new things come along we have been pushing stuff in that direction. This might be because we haven’t reviewed the need for this, which if we did, we may come to the conclusion that 30 percent could quite easily go into Cassandra,” said Makkar.

“But if it’s working okay now in the relational database then leave it in there, there’s more pressing concerns for the business.”

Sky found that the biggest advantage of using Cassandra is the reduced cost of storage: the database doesn’t need access to expensive storage area networks (SANs), because it uses local disk, which is much cheaper.

“If you need more storage you can just scale out your nodes with Cassandra, because it uses local disk off the computer. Obviously there’s only so many disks you can fit into a computer itself, that’s why traditional databases tend to use SANs on the back end if they need to grow their storage,” said Makkar.

“Cassandra solves this by saying just add more nodes to your cluster, you will get those extra terabytes. Local disk is much cheaper than using SANs.”
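The arithmetic behind "add more nodes, get more terabytes" can be sketched roughly as follows. The figures are hypothetical, not Sky's, and the model assumes every node has the same local disk, that each piece of data is stored on a fixed number of replicas, and that overheads such as compaction headroom are ignored.

```python
import math


def usable_capacity_tb(nodes, disk_tb_per_node, replication_factor):
    """Raw local disk across the cluster divided by the number of copies kept."""
    return nodes * disk_tb_per_node / replication_factor


def nodes_needed(target_tb, disk_tb_per_node, replication_factor):
    """Smallest node count whose usable capacity reaches the target."""
    return math.ceil(target_tb * replication_factor / disk_tb_per_node)


# Hypothetical cluster: six nodes with 2 TB of local disk each, three copies.
current = usable_capacity_tb(6, 2.0, 3)   # 4.0 TB usable
required = nodes_needed(10, 2.0, 3)       # 15 nodes for 10 TB usable
```

Under these assumptions usable capacity grows linearly with the node count, which is the property Makkar contrasts with growing a SAN behind a single database server.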

In other news, having bought O2’s broadband and fixed line business in the UK earlier this year, Sky announced in its quarterly results last week that, before integration, the consolidation resulted in a net neutral impact to its operating profit, comprising £19 million of revenue and £19 million of operating cost.