The NoSQL movement is growing, but what is it?


The NoSQL movement combines an architectural approach to storing data with software products (such as Tokyo Cabinet, CouchDB, and Redis) that can store data without using SQL. Hence the term NoSQL.

The idea is pretty simple: Not all applications need a traditional relational database management system (RDBMS) that uses SQL to perform operations on data. Rather, data can be stored and retrieved using a single key. The NoSQL products that store data using keys are called Key-Value stores (aka KV stores).
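To make the single-key idea concrete, here is a minimal sketch of a KV store's interface. The `KVStore` class and its method names are illustrative assumptions, not the API of any product named above; the point is only that every operation takes exactly one key and the store treats the value as opaque.

```python
class KVStore:
    """Toy in-memory key-value store: one key maps to one opaque value."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # The store never inspects the value; it only files it under the key.
        self._data[key] = value

    def get(self, key, default=None):
        # Retrieval needs nothing but the key: no query language, no joins.
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KVStore()
store.put("user:42", b'{"name": "Ada"}')
profile = store.get("user:42")
```

Anything beyond lookup by key, such as "find all users named Ada", has to be built by the application or handed to a separate index.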

Because these KV stores are not relational and lack SQL, they may be faster than RDBMSs: they don't have to maintain indexes, enforce relationship constraints, or parse SQL. The downside of NoSQL is that you cannot easily perform queries against related data.

Bravo To the NoSQL Approach

As an analyst who focuses on helping clients achieve massive scale and blazing fast performance, I will be one of the first ones to endorse this approach for many Web applications because:

  • Scaling is easier. When data is not directly related to any other data you can store it anywhere. That means that you can handle more data by adding additional nodes.
  • The engines are faster. There is less overhead because the KV store does not have to parse SQL or maintain multiple indexes to support relationships. Often a hashing algorithm can be used to retrieve data instead of a more expensive B-tree type algorithm.
  • It is easier to change data structures. Need to add a field? No biggy. Many of these NoSQL products store data as blobs. If your data is stored as XML, you may only need to add an attribute or tag rather than thinking about the impact of adding a field to a table in your database.
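The scaling and hashing points above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function `node_for_key` and the node names are hypothetical, and plain modulo placement is used for brevity, whereas real products often use consistent hashing so that adding a node moves fewer keys.

```python
import hashlib


def node_for_key(key, nodes):
    """Pick the node that owns a key by hashing the key (no B-tree lookup)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then map it to a node.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical node names; any list of node identifiers would do.
nodes = ["node-a", "node-b", "node-c"]
placement = {key: node_for_key(key, nodes) for key in ("user:1", "user:2", "user:3")}
```

Because each record is independent of every other record, the owning node can be computed from the key alone, and capacity grows by lengthening the node list.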

Many Web applications simply don't need to represent data as a set of related tables. Rather, data can be represented as an object graph or byte stream identified by a single key. For example, a user profile can be represented as an object graph (such as a POJO) with the user id as its single key. Another example: documents or media files can be stored under a single key, with indexing of metadata handled by a separate search engine.
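As a rough illustration of the user-profile example, the sketch below serializes a small object to a blob and files it under the user id. The `UserProfile` fields, the helper names, and the choice of JSON as the blob format are all assumptions made for the example; a plain dict stands in for the KV store.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class UserProfile:
    user_id: str
    name: str
    interests: list = field(default_factory=list)


store = {}  # stand-in for a KV store: key -> opaque bytes


def save_profile(profile):
    # The whole object graph is stored as one blob under one key.
    store[profile.user_id] = json.dumps(asdict(profile)).encode("utf-8")


def load_profile(user_id):
    return json.loads(store[user_id].decode("utf-8"))


save_profile(UserProfile("u42", "Ada", ["chess"]))

# Adding a field later touches only this record; there is no table to alter.
record = load_profile("u42")
record["timezone"] = "UTC"
store["u42"] = json.dumps(record).encode("utf-8")
```

The trade-off is the one noted earlier: the store cannot answer "which users are interested in chess?" without scanning every blob or consulting a separate index.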

Elastic Caching Platforms Are KV Stores On Steroids

Elastic caching platforms such as IBM eXtremeScale, GigaSpaces, Terracotta, Microsoft Velocity, Hazelcast, NCache, and Infinispan are essentially in-memory KV stores that provide most of the benefits of NoSQL KV stores but add the following features:

  • Lower latency. These platforms store data in-memory, which significantly reduces the latency of data operations. In-memory storage is a downside, though, if you need to persist objects over time or store large objects such as video or documents.
  • Reliability. Distributed caching platforms employ clever data replication algorithms that store the data on multiple nodes. If one of the nodes goes down, the platform will serve the data from a backup node.
  • Scale-out. Most of the elastic caching platforms let you add and remove nodes during operations. The platforms use sophisticated algorithms to re-balance the data to optimize the use of all the nodes in the grid.
  • Code execution. Some, but not all, of the platforms also let developers distribute the execution of code across the grid. Using distributed code execution, developers can distribute the workload to where the data resides rather than moving the data to the application.
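The reliability feature above can be sketched with a toy replicated cache. This is not how any of the named platforms is implemented; the `Node` and `ReplicatedCache` classes are hypothetical, the replication scheme (one primary plus the next node as backup) is an assumption chosen for brevity, and at least two nodes are required.

```python
class Node:
    """One in-memory cache node; `alive` simulates the node being up or down."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicatedCache:
    def __init__(self, nodes):
        self.nodes = nodes

    def _owners(self, key):
        # The primary is chosen by hashing the key; the backup is the next node.
        i = hash(key) % len(self.nodes)
        return self.nodes[i], self.nodes[(i + 1) % len(self.nodes)]

    def put(self, key, value):
        # Write to both the primary and the backup so one failure loses nothing.
        for node in self._owners(key):
            if node.alive:
                node.data[key] = value

    def get(self, key):
        # Try the primary first, then fall back to the backup copy.
        for node in self._owners(key):
            if node.alive and key in node.data:
                return node.data[key]
        return None


cache = ReplicatedCache([Node("a"), Node("b"), Node("c")])
cache.put("session:1", "cart=3")
primary, backup = cache._owners("session:1")
primary.alive = False              # simulate the primary going down
value = cache.get("session:1")     # still served, now from the backup
```

A real platform would also re-replicate the data after a failure and rebalance keys as nodes join or leave, which this sketch leaves out.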
