Hortonworks, a vendor of data management solutions, has launched a new product called Data Steward Studio (DSS) to help enterprise customers find, identify, secure and connect data for consistent governance across cloud and on-premise data lakes.
The product is an extension of the vendor’s security and governance tool, Hortonworks DataPlane Services, and is aimed at helping anyone within an organisation with data responsibilities, from a data steward all the way up to the chief data officer (CDO), to find and curate data, collaborate on it, and eventually secure and report on data and its context.
Data can be more easily searched and organised according to its business classification, purpose, or the level of protection needed, giving the relevant people the ability to classify and secure sensitive or personal data and set permissions using a single user interface. The tool also gives analysts all the relevant metadata for a clear view of security policies, data protection status and any anonymisation rules that have been put in place.
Speaking in Berlin this week at the Dataworks Summit, which Hortonworks hosts, Scott Gnau, chief technology officer at Hortonworks, said DSS allows "the data steward to find, identify data sources out there. It’s a tool that they can use to reduce redundancy, make decisions about how data gets stored, managed, replicated and moved across the infrastructure and proactively search through that data".
Also speaking in Berlin, John Kreisa, VP of marketing, said that Hortonworks customers tend to run on a hybrid architecture, “with some amount of on-premise and some in the cloud”. “We see most practical organisations are taking this hybrid approach so we are bringing products to market and have brought products to market to help satisfy and deliver in a modern day architecture,” he said. “So this area is where we see the most growth and interest in the market.”
In the Hortonworks stack, this consists of the Hortonworks Data Platform (HDP), which is underpinned by Hadoop technology; Hortonworks Data Flow (HDF) for streaming data analytics; and now the DataPlane Services (DPS) layer on top for governance.
DPS was launched last year to give customers a single view, and will evolve as new demands appear, with additions such as a data lifecycle management tool to maintain the provenance of data as it is moved between clusters, and now DSS for visibility across data types and tiers.
Kreisa added that Hortonworks refers to the DataPlane Services release as ‘Hortonworks 3.0’ and that it is showing the most growth potential for the company.
Kreisa identified the current enterprise requirements the vendor is seeing, namely: “Common access, security, governance and ops across multiple clusters with data from many data sources, an increasing number of data types and multiple tiers of data across diverse environments”.
Essentially, Hortonworks believes that these customers need a unified, single view of their data security and a single governance model, whether that data is held on-premise, in the cloud or even at the edge.
Nadeem Asghar, global field CTO, added: “When you are getting the data, whatever policies are applied they will be available in Data Steward Studio. So essentially you govern who can access that data being generated at enterprise level.”
“Think of it as the next generation of [Apache] Atlas,” he added. Atlas is an open source software solution designed to exchange metadata for governance control across a Hadoop stack for compliance reasons.
This all comes in handy for organisations preparing for the upcoming General Data Protection Regulation (GDPR), where data provenance, governance and visibility of data lineage are paramount for compliance.
However, director of strategy and innovation Abhas Ricky told press earlier in the day that he doesn’t believe a single vendor can solve all GDPR issues for customers, which is why leveraging Hortonworks solutions and working with partners is their recommended approach.
Luis Caldeira, chief architect at financial services firm Orwell Group, spoke about the importance of cataloging data as it moves around the organisation, especially in the new regulatory environment. During a customer panel at the Dataworks Summit, he said it’s vital for him and his team to track “information as it moves around”.
“So cataloging the information you have and what it is used for, which the legislation which is coming down the line will force us to do,” he said. “Making sure that you’re using data in a way with metadata associated to it, labelling it correctly so that when you decentralise it you know where the data comes from and what it is used for.”
A technical preview of DSS is available now and should become generally available, along with the next version of the data lifecycle management tool (DLM), in Q2 2018.