
Big Data - 8 real-world deployments

The amount of data in the world doubles every 18 months. Here's a look at eight real-world big data deployments in a variety of industries

The amount of data in the world is increasing exponentially; it doubles every 18 months. There's much discussion about Big Data, both in terms of the problems it causes and the potential utility it represents. But some people are doing more than talking. Here are eight real-world Big Data deployments.

National Oceanic and Atmospheric Administration (NOAA) National Weather Service

NOAA has been in the Big Data business for 50 years. It now manages 30 petabytes of new data per year, collecting more than 3.5 billion observations per day from satellites, ships, aircraft, buoys and other sensors. It combines direct measurement of atmospheric, oceanographic and terrestrial data with complex, high-fidelity predictive modelling to support the National Weather Service (NWS). NWS models generate millions of products per day: weather warnings and guidance provided to public and private sector forecasters, including government agencies like the Department of Defense and NASA.

AM Biotechnologies DNA Sequence Analysis Solution

Based in Houston, AM Biotechnologies is focused on developing a proprietary new technology for producing chemically modified, DNA-based molecular entities called aptamers. Aptamers have uses ranging from the diagnostic quantification of a particular analyte in a blood sample to the targeted delivery of drugs to specific sites in the body. Developing these aptamers requires analysing up to tens of billions of short DNA sequences. The company uses web-based Big Data analysis tools from CD-HIT and Galaxy to crunch its data.

NARA Electronic Records Archive

The National Archives and Records Administration (NARA) is the official record keeper of the US. It manages 142TB (and growing) of information, which represents more than 7 billion objects, including records from across the federal agency ecosystem, Congress and several presidential libraries. The records that are digitised exist in more than 4,800 different formats. NARA is also in the process of digitising more than four million cubic feet of traditional archival holdings. By 2016, 95 percent of the electronically archived information must be available to researchers. NARA has built the Electronic Records Archive (ERA) as a "system of systems" to perform the various archival functions and records management governed by different legal frameworks.

Vestas Wind Energy Turbine Placement and Maintenance

Danish firm Vestas uses supercomputers and a Big Data modeling solution to pinpoint the optimal location for its wind turbines to maximise power generation and reduce energy cost. It uses a wind library that incorporates data from global weather systems with data collected from its existing turbines. The wind library currently holds nearly 2.8PB of data. Current parameters include temperature, barometric pressure, humidity, precipitation, wind direction and velocity from the ground level up to 300 feet, and the company's recorded historical data. Vestas plans to add global deforestation metrics, satellite images, historical metrics, geospatial data and data on phases of the moon and tides.

IRS Compliance Data Warehouse

In 1996, the Internal Revenue Service (IRS) initiated a project to upload a single year of tax return data for analysis. The project has resulted in the Compliance Data Warehouse (CDW), which contains more than 1PB of information. Most of the legacy data is structured, but new data from electronically filed tax returns, international tax treaty partners and third parties comes in XML or other semi-structured or unstructured formats. The IRS research group runs analytics on the data for jobs ranging from estimating the US tax gap to predicting identity theft, measuring the taxpayer burden and simulating the effects of policy changes on tax behaviour.

University of Ontario Institute of Technology (UOIT) Medical Monitoring

UOIT, in conjunction with IBM, has undertaken Project Artemis, an effort to improve medical monitoring technology so that it can detect warning indicators before vital signs reach critical levels. One example is nosocomial infection, which is life-threatening to premature infants and first presents as a pulse that is within acceptable limits but not varying as it should. Project Artemis is based on Streams analytic software, an information processing architecture that enables near real-time decision support through continuous analysis of streaming data.
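The warning sign described here, a pulse that stays within acceptable limits but stops varying, can be illustrated with a small sliding-window check over a stream of readings. This is only a sketch of the general idea, not Project Artemis's method or the Streams API: the function name, window size, heart-rate limits and variability threshold are all hypothetical values chosen for illustration and have no clinical basis.

```python
from collections import deque
from statistics import pstdev


def low_variability_alerts(readings, window=5, low=100, high=180, min_stdev=1.0):
    """Yield the index of each reading that completes a suspicious window.

    A window is suspicious when every reading in it lies inside the
    [low, high] range but the readings barely vary (population standard
    deviation below min_stdev). All thresholds are illustrative
    assumptions, not clinical values.
    """
    recent = deque(maxlen=window)  # keeps only the latest `window` readings
    for i, bpm in enumerate(readings):
        recent.append(bpm)
        if len(recent) < window:
            continue  # not enough data to judge variability yet
        in_range = all(low <= r <= high for r in recent)
        if in_range and pstdev(recent) < min_stdev:
            yield i


# Hypothetical beats-per-minute streams, made up for this example.
healthy = [140, 146, 138, 150, 143, 137, 149]
flat = [140, 146, 141, 141, 141, 141, 141, 141]

print(list(low_variability_alerts(healthy)))
print(list(low_variability_alerts(flat)))
```

Because the function is a generator that holds only the most recent window in memory, it can consume readings as they arrive, which is the property that matters for continuous analysis of streaming data.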

TerraEchos Perimeter Intrusion Detection

TerraEchos specialises in technology designed to protect and monitor critical infrastructure. One of its clients is the US Department of Energy Labs, which relies on it to protect its scientific intelligence, technology and resources. The client needed a solution that would detect, classify, locate and track potential threats (mechanical and biological), essentially distinguishing the sound of a whisper from that of the wind from miles away. To do so, the solution uses sensors, analytic software and high-performance computing to continuously consume and analyse massive amounts of information-in-motion, from human and animal movement to atmospheric conditions.

NASA Human Spaceflight Imagery Collection, Archival and Hosting

NASA's Johnson Space Center (JSC) is the hub for the US astronaut corps and home to International Space Station (ISS) mission operations. Since 1959, it has collected more than 4 million still photographs, 9.5 million feet of 16mm film and 85,000 video tapes and files representing 81,616 hours of video in analogue and digital formats. The collection is used for media content as well as by the scientific and engineering community. NASA has created an application called Imagery Online (IO) which links imagery file names to all of the metadata associated with them. But the agency still faces a big challenge in making the collection available to the public in both its raw, native form and transcoding it into smaller, more accessible...

