Channel 4 is using the scalability of cloud computing to handle a 10-fold growth in data, enabling the broadcaster to gain better insight into customer habits.
The broadcaster launched its 4oD on-demand service in 2006, as part of a digital strategy which has included the continual development of the Channel4.com website and a range of apps centred around popular programmes.
The digital services have since proved popular with UK viewers, with recent figures indicating that more than six million people registered directly with Channel 4 via 4oD in 2012, alongside more than six million 4oD app downloads.
As user numbers have grown, the broadcaster has had to contend with masses of customer data. This includes web logs and app logs, which provide feedback such as who visited, how long they stayed and what they watched. With the public increasingly accessing programmes through mobile devices, there is more data available to the company than ever, explains Channel 4 CTO Bob Harris.
"Everything we own generates an amount of data, and as you move towards consumption of media and video on demand on mobile devices, it means you will have data about what people are doing coming from more and more devices," he says.
In order to make sense of the data generated by website visits and app usage, Channel 4 previously used a range of on-premise software, including "conventional" tools like Oracle databases, SAP BusinessObjects, and SPSS.
However, this approach gave little insight into the growing volumes of data the company was accruing, says Harris, and much of the data was simply "thrown away".
"Those tools are great, but they had a finite capacity, so we outsourced the analysis of our web logs," Harris says. "The best insight we got was how many people watched how many programmes yesterday."
Public cloud strategy launched
To gain more value from the data being collated, Harris began to store and process it in the Amazon Web Services (AWS) cloud. The services in use include S3 storage, EC2 compute, DynamoDB NoSQL databases, data warehousing tool Redshift and Amazon's Hadoop-as-a-service offering, Elastic MapReduce.
Using the public cloud services has enabled the broadcaster to crunch data that would previously have been deleted due to the cost of investing in physical hardware.
For example, before Channel 4 began its big data strategy, its data warehouse demands were in the region of 10 terabytes, while a recent test with Amazon Redshift involved more than 160 terabytes of data.
"We are now into a situation where we are crunching and analysing well over 10 times the data compared to two years ago, and heading towards two orders of magnitude of volume," says Harris.
"We regularly process multibillion row datasets and we do that in a matter of hours. We are heading to up to 10 times more data volumes in the next couple of years, easily."
With its improved insight into viewer behaviour, Channel 4 is creating a more tailored experience for its customers. This ties in with the company's wider business strategy to create a "deeper relationship" with viewers to generate larger audiences and increase advertising revenues.
"Largely what we are doing is about audience segmentation, taking each viewer and realising that this is perhaps a person who turns up and always looks at comedy, against a person who always looks at news or current affairs programmes," says Harris.
He adds: "It is all very well saying that someone likes comedy – but do they like slapstick comedy or satirical comedy? To work that out you need more data."
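The kind of segmentation Harris describes can be illustrated with a minimal sketch. This is not Channel 4's implementation, which the broadcaster has not published. It assumes a hypothetical viewing log of (viewer, genre) pairs and simply labels each viewer with the genre they watch most often. The function name, viewer IDs and genre labels are all invented for illustration.

```python
from collections import Counter, defaultdict


def dominant_genres(view_log):
    """Label each viewer with the genre they watched most often.

    view_log is an iterable of (viewer_id, genre) pairs; the format is
    an assumption made for this example, not a real Channel 4 schema.
    """
    counts = defaultdict(Counter)
    for viewer_id, genre in view_log:
        counts[viewer_id][genre] += 1
    # most_common(1) returns a one-element list of (genre, count)
    return {viewer: c.most_common(1)[0][0] for viewer, c in counts.items()}


# Hypothetical sample data
sample_log = [
    ("viewer_a", "comedy"),
    ("viewer_a", "comedy"),
    ("viewer_a", "news"),
    ("viewer_b", "news"),
    ("viewer_b", "current affairs"),
    ("viewer_b", "news"),
]

segments = dominant_genres(sample_log)
print(segments)
```

Telling slapstick from satire, as Harris notes, would need finer-grained labels in the log than a single genre field, which is where the extra data volume comes in.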
With the cloud strategy under way, Harris expects that all of Channel 4's infrastructure supporting online services will soon be placed in the cloud, with the potential for wider business applications to be moved off-premise in future too.
"We haven't put a new server into our online DC in over a year, because everything new is going on cloud. What will happen over the next year is that that data centre which was running our historical online platform will almost certainly disappear completely. So from an online point of view it will be 100 percent cloud."