Graph database vendors: Who they are, what they do and who their customers are
Graph databases are growing in popularity in the enterprise space. According to Forrester analyst Noel Yuhanna, in his market overview, the graph database market is “embryonic but will grow significantly”.
Gartner started including graph database vendors in its Magic Quadrant for operational database management systems (ODBMS) in 2014, with the inclusion of Neo4j and GraphDB, and expects to see more entrants in the coming years.
First, a quick recap of what a graph database is:
A graph database is a flavour of NoSQL database built upon graph theory, a branch of mathematics and computer science that models data points, known as objects or nodes, and the connections between them as a 'graph'. So, where a traditional, relational database stores data in rows and columns, a NoSQL database stores large sets of unstructured data. A graph database goes a step further by storing the connections between those data points as well, essentially building up a network of data.
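To make the recap concrete, here is a minimal in-memory sketch of nodes and the connections between them. It is not how any vendor on this list stores data; the `Graph` class, the node names and the relationship labels are all made up for illustration.

```python
class Graph:
    """Toy graph: nodes carry properties, edges are labelled and directed."""

    def __init__(self):
        self.nodes = {}  # node id -> dict of properties
        self.edges = {}  # node id -> list of (relationship, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, relationship, target):
        self.edges.setdefault(source, []).append((relationship, target))

    def neighbours(self, node_id, relationship=None):
        """Follow outgoing edges, optionally filtered by relationship label."""
        return [
            target
            for rel, target in self.edges.get(node_id, [])
            if relationship is None or rel == relationship
        ]


graph = Graph()
graph.add_node("alice", kind="person")
graph.add_node("bob", kind="person")
graph.add_node("acme", kind="company")
graph.add_edge("alice", "KNOWS", "bob")
graph.add_edge("alice", "WORKS_AT", "acme")
graph.add_edge("bob", "WORKS_AT", "acme")

print(graph.neighbours("alice"))           # every node alice points to
print(graph.neighbours("alice", "KNOWS"))  # only the people alice knows
```

The point of the sketch is that the relationship is stored as data in its own right, so following a connection is a lookup rather than a join across tables.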
Here are some of the leading graph database providers in the market right now, from open source to established vendors and a few noteworthy pilots worth keeping an eye on…
1. Neo Technology
Emil Eifrem, the Swedish co-founder and CEO of Neo Technology, started building graph databases eight years ago when he realised the limitations of relational database technology while building content management systems for enterprise customers.
He explains: “The key reason it was working against us is because the information we were working with didn’t really fit, the information was big and messy and interconnected and ever changing and evolving.”
The result was Neo4j, one of the first commercially available graph databases, which now has a broad range of customers. Eifrem says: “When it comes to actual customers we now see this very horizontally across any aspect of the software industry. So we have a lot of pure software companies.
“Also financial services, retail, media and entertainment and social networks are an obvious example, telecoms and healthcare.”
2. DataStax
DataStax announced its graph database product at its London summit in April 2016, adding it to its existing DataStax Enterprise platform, DSE, which is built on Apache Cassandra.
CEO Billy Bosworth told ComputerworldUK: “Think of [Cassandra] as our backbone. We can now do things like model the data differently on top of it, introduce different workloads like search indexing, real-time analytics, basic transactions. Here’s the real key: to never have any of those collide.”
The addition of graph comes on the back of DataStax’s acquisition of open-source graph database specialist Aurelius, and the product is essentially a new version of Aurelius’s Titan database.
Speaking at the DataStax summit, Matthias Broecheler, director of engineering for DSE Graph, said: “DataStax Enterprise Graph is us taking the ideas from Titan and placing them on an enterprise grade platform.”
DataStax isn’t naming any customers for this product yet, but Dutch bank ING has shown early interest.
3. Oracle
Oracle offers spatial and graph options for its flagship database product. As one of the first major IT vendors to enter the graph database space, adding graph support to its NoSQL database more than two years ago, Oracle has a first-mover advantage over competitors such as SAP and IBM.
Oracle’s Big Data Spatial and Graph analytics tools are compatible with Apache HBase and Oracle NoSQL Database.
Oracle offers a property graph, as well as an RDF graph and a network data model graph. The Oracle solution tends to be used more for knowledge management, such as metadata, taxonomy and ontology, with customers like Bloomberg and Thomson Reuters attesting to these use cases.
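For readers unfamiliar with the RDF model mentioned above: an RDF graph is a set of subject-predicate-object statements, known as triples, which suits taxonomies and ontologies. The sketch below is a rough illustration in plain Python rather than Oracle's API; the `ex:` terms and the `match` helper are invented, and the wildcard matching only loosely mirrors what a SPARQL triple pattern does.

```python
# A tiny taxonomy expressed as (subject, predicate, object) triples.
# All "ex:" identifiers are hypothetical examples.
triples = {
    ("ex:Bond", "rdfs:subClassOf", "ex:Security"),
    ("ex:Equity", "rdfs:subClassOf", "ex:Security"),
    ("ex:Security", "rdfs:subClassOf", "ex:FinancialInstrument"),
    ("ex:Bond", "rdfs:label", "Bond"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for s, p, o in store
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    )


# Which classes are declared as direct subclasses of ex:Security?
for s, _, _ in match(triples, predicate="rdfs:subClassOf", obj="ex:Security"):
    print(s)
```

A property graph, by contrast, attaches key-value properties directly to nodes and edges instead of expressing everything as triples.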
Oracle says the product is good for: “Social network analysis to linked open data and network graphs used in transportation, utilities, energy and telcos and drive-time analysis for sales and marketing applications.”
4. OrientDB
OrientDB has also been recognised by Gartner in its Operational DBMS Magic Quadrant, alongside Neo4j. According to OrientDB itself, the database differs from Neo4j in that: “While Neo4j is a pure graph database, OrientDB has a hybrid document-graph engine that adds some compelling features to the graph database model.”
OrientDB was first released in 2010, and the open-source database supports schema and schemaless modes. Similar to DataStax, OrientDB is keen to be seen as a multi-model engine for a range of use cases, be it transactional or operational, such as fraud detection, customer 360 and social networking.
Customers deploying OrientDB include CenturyLink, Ericsson, Pitney Bowes, Sky, and Warner Music.
5. MongoDB
NoSQL database vendor MongoDB brought graph database capabilities to its customers with its 3.4 update in November 2016. Developers can now run graph and faceted search queries against data stored in MongoDB without turning to a specialist vendor. This allows new data sets to be analysed and queried alongside other operational data held in MongoDB, without the need for data duplication.
A white paper by the vendor states: “Applications storing data in MongoDB frequently contain data that represents graph or tree type hierarchies. These connections can be as simple as a management reporting chain in a HR application, or as complex as multi-directional, deeply nested relationships maintained by social networks, master data management, recommendation engines, disease taxonomy, fraud detection, and more.”
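The simplest case in that quote, a management reporting chain, can be sketched in a few lines. MongoDB 3.4 exposes this sort of recursive traversal through its `$graphLookup` aggregation stage; the code below does not call MongoDB at all, and only mimics the idea in plain Python. The employee documents, the `reportsTo` field and the `reporting_chain` function are assumptions for illustration.

```python
# Hypothetical HR documents: each employee points at their manager.
employees = [
    {"name": "Dev", "reportsTo": "Eliot"},
    {"name": "Eliot", "reportsTo": "Ron"},
    {"name": "Ron", "reportsTo": "Andrew"},
    {"name": "Andrew", "reportsTo": None},
]


def reporting_chain(documents, start):
    """Follow reportsTo links upwards from `start`, nearest manager first."""
    by_name = {doc["name"]: doc for doc in documents}
    chain = []
    seen = {start}  # guards against cycles in bad data
    manager = by_name.get(start, {}).get("reportsTo")
    while manager is not None and manager not in seen:
        chain.append(manager)
        seen.add(manager)
        manager = by_name.get(manager, {}).get("reportsTo")
    return chain


print(reporting_chain(employees, "Dev"))
```

Because the links live inside the documents themselves, the hierarchy can be walked without copying the data into a separate graph store, which is the benefit the vendor is pointing to.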
The update brings native graph analytics and faceted search directly to customers, and fully integrates with existing security, management, availability and disaster recovery capabilities.
6. Stardog
Stardog is the ‘smart graph database’ from US software company Complexible. Stardog has a particular focus on OWL and RDF-based systems, and the latest release supports the SPARQL and Gremlin query languages. A key feature of Stardog is graph versioning to track changes.
Featured customers include NASA, JP Morgan Chase and Viacom. Stardog is priced for community, developer and enterprise tiers, with community free to download and use for up to four users and ten databases.
7. AllegroGraph
AllegroGraph is software developed by semantic web specialists Franz Inc. According to Forrester’s Noel Yuhanna: “Franz is the key developer, contributor, and supporter of AllegroGraph, a semantically enhanced graph database that focuses on W3C standards and is commercially licensed with open source extensions.”
The AllegroGraph product has a strong use case for event processing where “applications benefit from ontologies, reasoning, rules, and linked open data”, says Yuhanna.
AllegroGraph supports SPARQL, RDFS++ and Prolog reasoning, and clients include Goldman Sachs, Pfizer, Siemens and Wells Fargo.
8. FlockDB
FlockDB started out as an open-source project at Twitter and lives on through GitHub, where developers can use the software for social media and website use cases.
According to its GitHub page: “FlockDB is much simpler than other graph databases such as Neo4j because it tries to solve fewer problems. It scales horizontally and is designed for on-line, low-latency, high throughput environments such as web-sites.”
9. Objectivity
Objectivity Inc, as the name suggests, started life in the niche object database space and went on to create InfiniteGraph.
The US software maker has since been busy creating a new graph database product called ThingSpan, focused on industrial internet of things (IoT) use cases.
ThingSpan is built to bring graph capabilities natively into Apache Spark and the Hadoop Distributed File System (HDFS). Users can run mixed workloads on massive data sets, including navigation and pathfinding queries as well as parallel pattern-finding and predictive analytics.
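As a rough idea of what a pathfinding query involves, the sketch below runs a breadth-first search over a small, made-up network of IoT devices and returns the route with the fewest hops. It is a generic textbook technique in plain Python, not ThingSpan's API; the device names and the `shortest_path` function are hypothetical.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns the fewest-hop path or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, []):
            if neighbour in previous:
                continue  # already visited
            previous[neighbour] = node
            if neighbour == goal:
                # Walk the back-pointers to rebuild the route.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical directed topology: sensors feed gateways, gateways feed a hub.
network = {
    "sensor-1": ["gateway-a"],
    "sensor-2": ["gateway-b"],
    "gateway-a": ["hub"],
    "gateway-b": ["hub"],
    "hub": ["datacentre"],
}

print(shortest_path(network, "sensor-1", "datacentre"))
```

On a single machine this is trivial; the engineering challenge the vendors are addressing is running the same kind of traversal across data sets too large for one node.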
A note on the Objectivity website reads: “InfiniteGraph is a highly specialized graph database. Its functionality is being migrated into ThingSpan. However, Objectivity will continue to support licensed users and will recommend it to Java developers who wish to use graph analytics outside of a Spark environment.”
10. Teradata
Teradata’s Aster SQL-GR is a native graph processing engine, based on bulk synchronous processing, that lets developers run complex graph analysis on their data. Teradata says Aster allows users to run graph analysis in parallel with text and statistical analysis.
Like Oracle, Teradata stands out on this list as it isn’t traditionally an open-source company. Aster Analytics can be installed on native hardware or an appliance, on the Teradata cloud or on Hadoop.
Use cases include social network and influencer analysis, fraud detection, supply chain management, network analysis, threat detection and money laundering detection.
11. Microsoft
Microsoft Research has been working on a graph project, previously named Trinity, since 2012, and finally released Graph Engine 1.0 for public preview in May 2015.
Graph Engine is available to Microsoft customers through Visual Studio and on the Azure cloud.
12. Apache Giraph
Apache's Giraph project has also been in development since 2012. According to the Apache Software Foundation, it is: “An iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyse the social graph formed by users and their connections.
“Giraph originated as the open-source counterpart to Pregel, the graph processing architecture developed at Google.”
Being open source, Giraph is available for download to all, and Apache says: “Pre-built packages will soon be available through Apache's Maven repositories, making it easier to include Giraph in your projects.”
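To give a flavour of the iterative, Pregel-style processing described above, the sketch below simulates the model on one machine: in each round (a "superstep") every vertex reads the messages sent to it, updates its own value and messages its neighbours only if that value changed. This is an illustration in plain Python, not Giraph's Java API; the `propagate_max` function and the sample graph are made up.

```python
def propagate_max(values, edges):
    """Spread the largest vertex value through the graph in supersteps.

    values: vertex -> number; edges: vertex -> list of out-neighbours.
    """
    values = dict(values)  # work on a copy

    # Superstep 0: every vertex sends its value to its out-neighbours.
    inbox = {vertex: [] for vertex in values}
    for vertex, value in values.items():
        for neighbour in edges.get(vertex, []):
            inbox[neighbour].append(value)

    # Later supersteps run until no messages are in flight.
    while any(inbox.values()):
        outbox = {vertex: [] for vertex in values}
        for vertex, messages in inbox.items():
            if messages and max(messages) > values[vertex]:
                values[vertex] = max(messages)
                for neighbour in edges.get(vertex, []):
                    outbox[neighbour].append(values[vertex])
            # Otherwise the vertex stays quiet ("votes to halt").
        inbox = outbox  # barrier: messages arrive in the next superstep
    return values


# Hypothetical chain a - b - c - d, with edges in both directions.
edges = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(propagate_max({"a": 3, "b": 6, "c": 2, "d": 1}, edges))
```

In a real deployment the vertices are partitioned across many workers and the barrier between supersteps is what keeps them in step, which is where the scalability claims come from.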