Compressing and de-duping NEC grid storage

Could NEC steal a march on bigger storage players by launching a de-duping grid storage product?


A report has emerged of an NEC grid storage project called Hydrastor. It's said to be a highly scalable, grid-based backup device that shrinks data volumes through both compression and some form of de-duplication technology.

NEC's website lists various areas of research by NEC Laboratories America. One is High-Availability Grid Storage, with the aim of producing a 'scalable storage system based on commodity hardware that is self-managing, can recover from multiple failures and provides efficient data compression as well as duplicate elimination.' Elsewhere the site talks of 'survivable grid storage.'

A micro-kernel for grid data structures is described on NEC's website.

Previously NEC has seen grid computing as an aspect of its high-performance computing (HPC) efforts.

A research project such as Hydrastor will have had a long gestation. A contributing factor in NEC's decision to start the project may have been the vast storage needs of international enterprises such as United Airlines, as an NEC publication noted a few years ago.

In March 2006 NEC Labs filed a patent entitled 'Content-Based Information Retrieval Architecture.'

It has pending patents in the data de-duplication area, suggesting it has developed its own de-duplication technology instead of buying it in.

A scalable, survivable, grid-like storage facility is on the development radar screen of several storage suppliers, but no-one has yet delivered it. For example:

- IBM's Storage Tank
- HP's Smart Cells with its RISS implementation.

Neither of these has had de-duplication associated with it.

There are also potential intelligent storage facilities with large capacities such as Sun's Thumper and Honeycomb, both of which use commodity storage building blocks, but neither of which has yet been combined in a grid-like cluster.

Sun has a nomenclature problem in that it defines grid storage and grid computing as IT services delivered over the network, like an electricity grid utility, rather than as specific IT architectures. Grid storage for Sun is simply selling storage capacity at, say, $1/GB for some time period.

Still, Sun has the building blocks in place. All it needs to do (!) is to cluster them together in some way and add scalability, manageability and robustness.

NEC's addition of de-duplication is an interesting tactic. It implies that the facility it is developing has a lot of CPU power to handle the de-dupe and unde-dupe processing load.
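To see where that CPU load comes from, here is a minimal Python sketch of one common way to combine de-duplication with compression. It is purely illustrative and is not based on anything NEC has disclosed about Hydrastor: the fixed 4KB chunk size, the use of SHA-256 fingerprints and zlib, and the `DedupStore` class name are all assumptions made for the example. Every write has to hash each chunk and compress the new ones, and every read has to decompress and reassemble them.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen for illustration


class DedupStore:
    """Toy content-addressed store: each unique chunk is kept once, compressed."""

    def __init__(self):
        self.chunks = {}  # fingerprint (hex digest) -> compressed chunk bytes

    def write(self, data):
        """Split data into chunks; return the list of fingerprints (the 'recipe')."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()  # CPU cost: hashing
            if digest not in self.chunks:
                # Only previously unseen chunks are compressed and stored.
                self.chunks[digest] = zlib.compress(chunk)  # CPU cost: compression
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its recipe."""
        return b"".join(zlib.decompress(self.chunks[d]) for d in recipe)

    def stored_bytes(self):
        """Total bytes actually held after de-duplication and compression."""
        return sum(len(c) for c in self.chunks.values())


if __name__ == "__main__":
    store = DedupStore()
    backup = b"A" * CHUNK_SIZE * 3 + b"B" * CHUNK_SIZE  # three identical chunks plus one other
    recipe = store.write(backup)
    print(len(backup), "bytes written,", store.stored_bytes(), "bytes stored")
    assert store.read(recipe) == backup
```

A real product would likely use more sophisticated chunking and would spread the fingerprint index across grid nodes, but the basic trade of processor cycles for disk capacity is the same.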

NEC is a relatively small player in enterprise data storage. If it can launch an attractive and highly scalable de-duping storage product then, to mix metaphors, the small NEC feline will set several large storage pigeons rattling in their cages.

That might happen later this year if the report mentioned above is right.

