Companies that keep watching open source Hadoop development instead of deploying the data processing system are like characters from Samuel Beckett's play Waiting for Godot, according to analyst firm Gartner.
The indecision of the play's tragic characters is being mirrored by companies that have yet to try out Hadoop and big data solutions in general, Gartner analyst Merv Adrian said in a presentation at this week's Teradata Partners user conference in Nashville.
He said: "Waiting for Godot reminds me of IT, big data investments are on the rise but it is still surprising how many organisations are still waiting. In Godot, one character says 'let's go', and the other says 'no, let's wait a little longer', and it goes on."
Those planning for or deploying big data, he said, want to monetise data, use it for marketing and to achieve sales growth, support new products and services, manage risk and improve fraud detection, and improve operational and financial performance.
"On big data, Hadoop is the area we get the most enquiries about every month, but with the ever growing Hadoop distributions and features, the biggest complication to adopting Hadoop is building your stack. The early adopters just downloaded the system from Apache themselves, but as we move on we have to get away from command line deployments for wider enterprise use."
Adrian said 70 percent of companies that have invested in big data have done so mainly for pilots, with only 12 percent using big data in full production environments. He said 2015 will see a "rapid rise" in full big data deployments, and that Hadoop will help lead the way.
Adrian said YARN for Hadoop would kickstart deployments, as it provides cluster resource management that supports batch processing, interactive SQL, in-memory working, streaming and event processing, and search, as well as NoSQL operations. "YARN changes the game, it all starts here. It is certain, even though in Godot they say 'Nothing is certain'."
He also tipped Apache Spark as a major driver of adoption in 2015, as it is an in-memory and on-disk execution engine that allows data to be re-used multiple times and supports interactive data mining. Spark will come to market next year.