Cloudera cofounder Mike Olson talks Hortonworks merger and the future of Big Data

Cloudera cofounder Mike Olson

The Cloudera cofounder sat down with Computerworld UK to discuss the future of the big data market following the mega merger between his company and former rival vendor Hortonworks


Back in October it was announced that Cloudera and Hortonworks had agreed on a merger, bringing together two heavily VC-backed companies specialising in enterprise versions of open source big data technologies, specifically Hadoop and Spark, under one roof.

Over coffee at the Ham Yard Hotel in London last week, Mike Olson, the cofounder and chief strategy officer at Cloudera, said: "I cofounded Cloudera a decade ago with the conviction that big data would be a big deal for large enterprises and I am genuinely excited at the prospect of bringing the companies together."

Olson added that the merger will not "fundamentally" change how he thinks about the industry and where it is headed.

"We have been focused on storing and managing and analysing data at scale and have been active participants and contributors to the open source ecosystem for 10 years now," he said. "All of that is driven out of a conviction that large enterprises have more data than ever before and need to analyse it in ways that weren't available."

Olson believes the two companies are ultimately complementary, despite some product overlap (more on that later), particularly given their different investment strategies over the past few years.

"We made our investments in data warehousing at large scale and that's been good for us and our customers, we have made some significant advances and investments in machine learning and that will and does remain a focus for us," he said.

"One thing we highlighted at the time of the announcement was the good work that Hortonworks has done with its IoT and edge investments. So we view those investments as nicely complementary. Then of course we have both been active in the Hadoop and Hive and Spark ecosystems, so there is a lot of shared code in our core platforms that we think will stand us in good stead."


According to Olson, both companies will remain in Silicon Valley for the foreseeable future, and he envisions playing an "active role" in the newly merged company. Just a handful of leadership positions have been announced at this point, with Cloudera CEO Tom Reilly leading the combined business.

Better as one

Both companies have struggled to monetise their enterprise big data solutions, and their top-line financials are strikingly similar. In 2017 Cloudera reported revenues of $261 million but still made a loss of $187 million. Hortonworks made an operating loss of $199 million for the full year of 2017, on revenues of $262 million.

Olson declined to discuss whether the newly merged company will be able to overcome the monetisation problems that have dogged the two businesses separately, as doing so would put him in dangerous territory at this stage of the transaction. What he would say is: "We wouldn't be doing this if we didn't believe there were advantages to be gained."


Olson believes that the newly merged company will continue to focus on hybrid cloud, a priority for Cloudera in recent years.

"As part of our thinking leading up to the merger announcement we have grown convinced that hybrid and private cloud are going to be important," he said. "I expect that incorporating technologies like Docker for containerisation and Kubernetes for orchestration will let customers run in their own data centres with the same simplicity and ease of use as the public cloud."

The big public cloud companies now offer their own flavours of these open source-backed solutions running on their own infrastructure, such as Amazon Elastic MapReduce (EMR) and Google Cloud Dataproc.

"The public cloud providers have great products, we want to be sure our customers can deploy and run wherever they want to, whether it is their own or any of the hyper scale cloud providers," he added. "We think it is a really compelling strategic opportunity and the platform has succeeded on premises but not in the kind of elastic and self-serve way that cloud providers have been able to deliver services, so we would love to make that possible."

He added that Hortonworks aligns nicely with this aim, as shown by its recent work with the Cloud Native Computing Foundation and partnerships with IBM and Red Hat to expand on containerisation and Kubernetes.

Product consolidation

Cloudera and Hortonworks may be happy families now, but that didn't stop rival vendor MapR from trying to rain on their parade. CEO and chairman John Schroeder issued a statement shortly after the merger announcement saying: "Customers will not gain innovation benefits through this merger. The merger is about cost cutting.

"They are claiming a next generation data platform without the underlying technology. MapR has delivered on a next generation platform based on nine years of hard engineering. We support a broader set of workloads from Hadoop, to Spark to AI/ML and already provide hybrid cloud and containerisation with Kubernetes. MapR customers are already implementing what the merged company says they are hoping to deliver."

Olson refused to respond to Schroeder's comments directly, but earlier in our conversation he did address the idea that Cloudera and Hortonworks have "several redundant competing technologies, for example Ambari and Cloudera Manager or Sentry and Ranger," as Schroeder put it.

"We need to sit down and decide in cases where we have similar service but different implementations, how we resolve that," Olson said. "But much of the product is the same code, it's the Apache Hadoop project and Hortonworks just released version 3 of its platform in the summer and we released version 6 of ours and our two code lines are closer together than they have been in some years as a result of that."

Olson argued that the two companies' open source heritage gives them "a lot in common", meaning that "once we can plan things in detail it allows us to bring the products together pretty well."

On that point, Hortonworks has been packaging up its various capabilities for more specific use cases over the past couple of years to ease industry adoption. See: Hortonworks shifting to packaged solutions for IoT and cyber security

For example, Hortonworks has recently expanded into IoT and cyber security-specific products and Cloudera has been moving into the AI and machine learning model training space with its Data Science Workbench.

"Our hope and ambition is to produce a single, unified platform soon after the merger closes that includes the complementary pieces for both of us," Olson added. "Then over time to enhance, extend and improve that so that it is a functional set of what each of us has done separately over time."