The concept of Big Data is never an easy one to explain to high-ranking, non-technical executives. Few understand what Big Data actually is, its potential for the business, what the current problems are and how to approach solving them. “It requires some pretty hard discussions with the business,” says analyst Clive Longbottom at Quocirca.
Big Data is not about size, but complexity.
“Big data is not really about volume - a gazillion bytes of data in a database is a large data problem. But just one gigabyte of data that is spread across databases and file and block storage is a big data issue - a key point that many do not understand,” says Longbottom.
Quocirca identifies Big Data by ‘The Three Vs’: Variety, Volume and Velocity. Variety, says Longbottom, is the biggest challenge. The variety of data extends beyond traditional databases and files, into data streams, social media and every type of data in motion. This includes a lot of information that may be under other people's control - across the value chain or out on the internet.
This creates challenges for the enterprise around security, privacy and copyright. It also means that powerful new tools are required to create the necessary data federation. Big Data analysis requires new tools and platforms such as SAP HANA and Hadoop, and NoSQL databases such as Cassandra or MongoDB.
In-memory computing techniques like HANA give companies the power to number-crunch the other V, volume. Their work is cut out, though: IDC research indicates that between now and 2020 the volume of data most companies manage will multiply thirty-five-fold.
We are already past 1.8ZB (1.8 trillion gigabytes), says IDC, and social interactions, mobile devices, facilities, equipment, R&D, simulations and physical infrastructure all contribute to the flow.
Still, SAP claims some global clients using HANA can now create intelligence reports in minutes where they used to take weeks. Hadoop aggregates smaller units of processing power and makes these resources universally affordable. So the general improvement in processing power makes short work of the rise in data volumes.
“It means companies can continually learn and improve their way of doing business by the hour,” says Matt Quinn, CTO for Big Data analytics company TIBCO. The four essential disciplines that should be applied to data, says Quinn, are to capture it, create rules, model behaviour (based on past behaviour) and act on it. At its simplest, this could mean that the systems predict that a network is about to have a problem and dispatch an engineer in advance to nip the crash in the bud.
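The four disciplines Quinn describes can be pictured as a small pipeline. The Python sketch below is purely illustrative and is not TIBCO's method: the latency figures, the 200ms limit and the naive trend model are all assumptions chosen to mirror the network example.

```python
from statistics import mean

# Rule (assumed for illustration): latency above this limit signals trouble.
LATENCY_LIMIT_MS = 200


def capture(stream):
    """Capture: collect raw latency measurements from a (hypothetical) feed."""
    return list(stream)


def predict_next(history, window=3):
    """Model: extrapolate the next value from the average recent step."""
    recent = history[-window:]
    steps = [b - a for a, b in zip(recent, recent[1:])]
    return recent[-1] + mean(steps)


def act(predicted_ms):
    """Act: apply the rule to the prediction and choose a response."""
    if predicted_ms > LATENCY_LIMIT_MS:
        return "dispatch engineer"
    return "no action"


# Hypothetical latency readings in milliseconds, oldest first.
history = capture([80, 85, 90, 120, 160, 195])
predicted = predict_next(history)
decision = act(predicted)
print(predicted, decision)
```

No reading has yet crossed the limit, but the trend suggests the next one will, so the engineer is sent out before the fault occurs. A production system would use a far richer model than this straight-line extrapolation.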
It is early days for Big Data, but there are obvious applications emerging. As the possibilities are explored, many more unexpected use cases will emerge, says Alys Woodward, IDC’s research director for European business analytics.
“Data collected by some organisations will find a market itself, for example credit checking information, aggregation of social information and retail sales information,” says Woodward.
In the utility market, smart meters measure usage of electricity (or gas or water) with real-time or near-real-time sensors, and regularly communicate the usage information back to the utility supplier. They can also include power quality measures and outage notifications. Measurements can be taken as often as every 15 minutes, and the supplier is typically updated on a daily basis.
Specialist Big Data companies, such as i2O Water, are emerging in each utility to make efficiency savings.
Before the existence of Big Data technologies, smart metering systems were limited to operational management of individual meters. “The Big Data opportunity for smart meters lies in helping utilities to spot anomalies,” says Woodward. Power outages, excessive usage and other unusual situations could indicate expensive setbacks such as fraud, illegal activity or service issues.
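As a rough illustration of what spotting anomalies in meter data can mean, the Python sketch below flags readings that sit far from a household's typical usage. The readings, the threshold and the choice of a median-based score are assumptions for the example, not a description of any utility's system.

```python
from statistics import median


def flag_anomalies(readings, threshold=3.5):
    """Return the positions of readings far from the typical value.

    Uses the modified z-score, which is built on the median absolute
    deviation (MAD) and so is not distorted by the outliers themselves.
    """
    med = median(readings)
    mad = median(abs(r - med) for r in readings)
    if mad == 0:
        # All typical readings are identical: anything different stands out.
        return [i for i, r in enumerate(readings) if r != med]
    return [
        i for i, r in enumerate(readings)
        if 0.6745 * abs(r - med) / mad > threshold
    ]


# Hypothetical kWh readings taken every 15 minutes from one meter.
readings = [0.30, 0.32, 0.29, 0.31, 0.30, 2.75, 0.33, 0.00, 0.31, 0.30]
anomalies = flag_anomalies(readings)
print(anomalies)
```

Here the spike (possible excessive usage) and the zero reading (possible outage or tampering) are both flagged for investigation. The Big Data challenge is running checks of this kind across millions of meters rather than one.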
As utility markets become deregulated and move toward a B2C model similar to that which exists in telecoms - where consumers can switch easily between suppliers - utilities will need to extend their information architectures to monitoring customer satisfaction and managing customer churn.
Customer behaviour on ecommerce and other web sites presents huge opportunities for Big Data analysis and optimisation, but only for those organisations that can handle the data volume and variety involved.
Clickstream analysis can be used to track customers' paths round web sites to test interfaces, function and advertising effectiveness.
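A minimal sketch of path tracking, in Python, might group click events by session and count how often visitors move from one page to another. The event format, session identifiers and page names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical click events: (session id, time in seconds, page visited).
events = [
    ("s1", 10, "/home"), ("s1", 25, "/product"), ("s1", 60, "/basket"),
    ("s2", 5, "/home"), ("s2", 12, "/search"), ("s2", 40, "/product"),
    ("s3", 7, "/home"), ("s3", 30, "/product"), ("s3", 55, "/basket"),
]


def session_paths(events):
    """Rebuild each visitor's path as an ordered list of pages."""
    by_session = defaultdict(list)
    for session_id, _, page in sorted(events, key=lambda e: (e[0], e[1])):
        by_session[session_id].append(page)
    return dict(by_session)


def transition_counts(paths):
    """Count page-to-page moves across all sessions."""
    counts = Counter()
    for pages in paths.values():
        counts.update(zip(pages, pages[1:]))
    return counts


paths = session_paths(events)
transitions = transition_counts(paths)
for (source, target), n in transitions.most_common(2):
    print(source, "->", target, n)
```

Counts like these show which routes through a site are most used and where visitors drop out, which is the raw material for testing interfaces and advertising placement.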
When Big Data is connected with purchasing information this gives a full view of customer behaviour and how it leads to monetisation for the supplier.
Online retailer OTTO (which owns UK brands such as Grattan) used predictive analysis vendor Blue Yonder to increase the accuracy of its sales forecasting by 40%.
“Companies can analyse their Big Data, interpret the results and make insightful decisions that cut surplus stock and maximise profitability,” says Blue Yonder founder Dr Michael Feindt.
Website retail recommendation engines can offer the user additional purchases by making and presenting predictions based on purchases by similar customers, or purchases that match goods already bought, or other options. “Integrating this transactional information with social data gives richer information about the individual customer's likes and how they interact with your brands,” says IDC’s Woodward.
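To make the idea of predictions based on purchases by similar customers concrete, here is a small Python sketch. The customers, the baskets and the choice of Jaccard similarity are assumptions for illustration. Real recommendation engines are considerably more sophisticated.

```python
def jaccard(a, b):
    """Similarity of two baskets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, baskets, top_n=2):
    """Suggest items bought by customers whose baskets resemble the target's."""
    own = baskets[target]
    scores = {}
    for customer, items in baskets.items():
        if customer == target:
            continue
        similarity = jaccard(own, items)
        # Each item the target lacks is weighted by how similar its buyer is.
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + similarity
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


# Hypothetical purchase histories.
baskets = {
    "alice": {"kettle", "toaster", "mugs"},
    "bob": {"kettle", "toaster", "teapot"},
    "carol": {"kettle", "mugs", "teapot", "tray"},
    "dave": {"garden hose"},
}
suggestions = recommend("alice", baskets)
print(suggestions)
```

The customer with nothing in common contributes nothing to the ranking, while items bought by the two similar customers rise to the top.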
The more information that is available for segmentation, the better. Some of this information is needed in real time, such as retail recommendations. Social information is challenging to integrate because of its unstructured format, while predictions based on social information need to take into account the incomplete nature of social information. Traditional transaction data was (and is) structured, whereas social information only tells simple stories and the analyst cannot draw inferences from anything that is not explicit.
Which is why the Big Data opportunity for customer behaviour analysis lies in the ability to ingest as much relevant data as possible, then to build predictions of customer behaviour on which systems and individuals can act.
In financial services, valuing a mortgage portfolio involves a complex network of calculations, predictions and simulations that feed into a valuation model.
Organisations have long used such a model to predict cash flow, the likelihood of default due to bad credit and the likelihood of strategic default, as well as the model serving as a basic standard metric for the value of the business. Big Data technologies allow these calculations to be conducted more quickly and at a far greater level of detail, without the need to predefine the dimensions for analysis.
The public sector can benefit too. Cambridge University Hospitals, for example, has to calculate its patient level costs by looking at a number of variables, such as patient numbers, types of operations, length of stay, doctor performance and number of beds available. It used Qlikview business discovery tools to integrate all the above data to spot trends and see which patients were costing more and what types of operations or patients were taking up the bed space.
Using this data analysis, the hospital recognised that, for certain low-risk operations, either asking the patient to come in on the morning (rather than stay overnight) or scheduling the operation at midday instead of first thing in the morning meant that more beds could be freed up overnight. This has driven down the cost of each patient stay and provided more resources for other patients.
Restaurant chain EAT uses Big Data analysis to minimise the waste of foods that have only a one-day shelf life, combining its offerings to encourage the perishables to be sold. Latte and croissants can be more profitable together than as standalone products. EAT can up-sell or position combination products in store accordingly to maximise sales and make the most of its products' one-day shelf life. It can also see where one store might need more of one product than another.
Gwent Police uses Big Data analysis to map crimes and monitor staff and force performance more effectively, with the data available in minutes. Using the data analysis, Gwent Police could delve into police performance to inform its daily management meetings.
Now it bases its patrol strategies on the previous day’s incidents and can match appropriate resources to crime hotspots, report them more accurately to the Home Office and save 15.5 hours per individual. This cut its overall costs by £350,000.
There are many other types of analysis that Big Data tools allow: web application optimisation, legal discovery, natural resource exploration, healthcare analysis, fraud detection, revenue detection, churn analysis and traffic flow optimisation among others.
Many of the use cases are specific to an industry, and were already being calculated before the rise of Big Data tools, but they are being completed far quicker and more efficiently. When HANA and Hadoop start to devour Big Data even more voraciously, more possibilities for real time analysis emerge. The industry is currently at the beginning of a learning process and no one knows where it will take us.
The era of Big Data has arrived, increasing the world's store of electronic information by about five trillion bits per second, but we have yet to fully understand the system we have constructed, says science historian George Dyson.