Sportingbet decides against Hadoop to analyse 60 TB of data

Online gaming site Sportingbet has decided against using popular big data analytics software Hadoop in favour of technology provided by British-born SME Logscape, which it will use to analyse up to 60 TB of data within one year.

Computerworld UK spoke to Andrew Coates, enterprise integration architect at Sportingbet, who explained that Hadoop’s centralised approach to data analysis wasn’t suited to monitoring Oracle Coherence, a data grid system, and that the company needed a tool that could monitor problems at the end-point.

Coates said that Logscape allows Sportingbet’s IT team to run queries on vast amounts of data, without the need for expensive storage and costly, process-heavy infrastructure - which wouldn’t be the case had it opted for Hadoop.

“The way it works is that you have a Logscape server running on dedicated hardware. Then you have a little agent running on every single machine you want to monitor, which means when you run a query, it runs the query on every single machine that you have told it to run it on,” he said.

“Each agent on each machine knows what the log files are, it’s indexed them, and then you run a query on it and it returns the necessary data back to the server. The server then aggregates across all the results.”

He added: “Other tools tend to move your logs off and onto a centralised repository. The problem with that is that you need a lot of storage and then you need a big processing machine to crunch that data. Logscape is quite clever, it leaves the data on each of the servers.”
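
The pattern Coates describes, where a query is sent to an agent on each machine and only the results travel back to a server for aggregation, can be sketched roughly as follows. This is an illustrative sketch only, not Logscape's actual API: the `Agent` class, the `server_query` function, the host names and the sample log lines are all hypothetical, and the agents here are in-process objects rather than networked processes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class Agent:
    """Hypothetical stand-in for an agent holding one machine's log lines locally."""

    def __init__(self, host, lines):
        self.host = host
        self.lines = lines  # the raw data stays with the agent

    def run_query(self, term):
        # Search locally and return only a small summary, not the raw logs.
        hits = sum(1 for line in self.lines if term.lower() in line.lower())
        return self.host, hits


def server_query(agents, term):
    """Fan the query out to every agent, then aggregate the per-host results."""
    totals = Counter()
    with ThreadPoolExecutor() as pool:
        for host, hits in pool.map(lambda agent: agent.run_query(term), agents):
            totals[host] += hits
    return totals


# Made-up sample data for illustration.
agents = [
    Agent("web-01", ["INFO started", "ERROR timeout", "error retry failed"]),
    Agent("web-02", ["INFO started", "INFO healthy"]),
    Agent("db-01", ["WARN slow query", "ERROR disk full"]),
]
result = server_query(agents, "error")
print(dict(result), "total:", sum(result.values()))
```

In this sketch the only thing that crosses from agent to server is a host name and a count, which is the reason such a design would not need a large central store for the logs themselves.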

Coates explained that his IT team could check whether any errors had occurred on any system integrated with Logscape simply by limiting the query to a specific time period and typing the word ‘error’ on Sportingbet’s production system. He said: “It will run on every single machine, hitting every single file, to see if that word has appeared in any of the log files.”
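
A keyword search limited to a time window, of the kind described above, might look something like this on a single machine. It is a sketch under stated assumptions rather than Logscape's query syntax: the `search_window` function is hypothetical, the log lines are invented, and each line is assumed to begin with an ISO 8601 timestamp followed by a space.

```python
from datetime import datetime


def search_window(lines, term, start, end):
    """Return lines containing `term` whose timestamp falls in [start, end)."""
    matches = []
    for line in lines:
        # Assumed line format: "<ISO timestamp> <message>"
        stamp, _, message = line.partition(" ")
        when = datetime.fromisoformat(stamp)
        if start <= when < end and term.lower() in message.lower():
            matches.append(line)
    return matches


# Invented sample log lines for illustration.
log = [
    "2013-01-10T09:58:12 INFO cache warmed",
    "2013-01-10T10:05:40 ERROR node unreachable",
    "2013-01-10T10:20:03 WARN slow response",
    "2013-01-10T11:15:27 ERROR write failed",
]
hits = search_window(
    log,
    "error",
    start=datetime(2013, 1, 10, 10, 0),
    end=datetime(2013, 1, 10, 11, 0),
)
print(hits)
```

Only the second line matches here: the first error falls inside the one-hour window, while the later one falls outside it.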

It also enables Sportingbet to quickly analyse what has changed when a new code release takes place.

Coates said: “We can actually say this release put an additional load somewhere else where we weren’t expecting. We are able to work out why and qualify it to see whether it is a legitimate load or whether someone has done something wrong and a new patch release is needed to improve the way we are using our systems.”

He went on to explain why Hadoop wasn’t suitable for Sportingbet’s environment.

“So, it’s not Hadoop, or that kind of product. Personally I’m not a big fan of Hadoop. Although it does have its applications, this is more your standard DevOps monitoring of a system. It also monitors the host, so will tell you the CPU, the memory use of the disk, the network IO of the host,” said Coates.

“I just don’t think Hadoop is the right direction [for us]. You need a big dedicated infrastructure to run it and it’s the wrong processing pattern. Coherence, for example, you have the data and you pass the algorithm to the data – because the algorithm is tiny and that data is big – leave the data where it is and leave the algorithm to it.”

He added: “Logscape follows a similar pattern. It leaves the data on each storage node and you run the processing algorithm on the data. That’s much better, especially as datasets continue to grow and grow.”
