WASHINGTON -- The self-described nerds of President Obama's campaign last year were back using big data analytics, this time to help Newark Mayor Cory Booker achieve a landslide win Tuesday in the New Jersey Democratic primary for a vacant U.S. Senate seat.
But, notably, the Obama data scientists are doing this work as consultants, through their own recently formed firm, BlueLabs.
BlueLabs built a turnout model for the Booker campaign, predicting how likely each Democratic voter in New Jersey was to vote in the primary.
The primary results "proved that our model was spot on," said BlueLabs co-founder Chris Wegrzyn, one of the former senior members of the 2012 Obama campaign's analytical department.
The proof, one supposes, is in the victory. But the Republican data scientists aren't ceding anything.
At about the same time BlueLabs was formed, the chief data scientist for Mitt Romney's campaign, Alex Lundry, co-founded Deep Root Analytics.
Lundry credits the Obama campaign's data effort, saying "that campaign, without a doubt, in 2012, had data and analytics more fully integrated into their structure."
But since last year's election, "what you are seeing is a flurry of activity on the right to make sure that we not only catch them, but surpass them," Lundry said.
Indeed, while the Democrats were counting votes Tuesday, Deep Root announced a partnership with FourthWall Media, a major source of cable set-top box viewing data.
That data, which is anonymized, records what people watch. Change a channel and a new row of data is created. The idea is to take this data, combine it with insights about the voters, and then place ads on TV shows most likely to reach certain voters, such as swing voters. Lundry said this will improve the efficiency of campaign advertising spending.
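To make the idea concrete, here is a minimal Python sketch of how viewing rows might be combined with voter insights to rank shows. Everything in it is an assumption for illustration: the household keys, the show names, the "swing" and "base" labels and the `swing_share_by_show` function are hypothetical, and it assumes the anonymized viewing data can be matched to a modeled voter segment at the household level. It is not a description of Deep Root's or FourthWall's actual systems.

```python
from collections import defaultdict

# Hypothetical anonymized viewing rows: one row per channel change.
viewing_rows = [
    {"household": "h1", "show": "Evening News"},
    {"household": "h2", "show": "Evening News"},
    {"household": "h2", "show": "Cooking Hour"},
    {"household": "h3", "show": "Cooking Hour"},
    {"household": "h4", "show": "Late Movie"},
]

# Hypothetical modeled voter segment for each anonymized household.
segments = {"h1": "swing", "h2": "swing", "h3": "base", "h4": "base"}


def swing_share_by_show(rows, segments, target="swing"):
    """Rank shows by the share of their viewing households in the target segment."""
    viewers = defaultdict(set)
    for row in rows:
        # Count each household once per show, however often it tuned in.
        viewers[row["show"]].add(row["household"])

    shares = {}
    for show, households in viewers.items():
        hits = sum(1 for h in households if segments.get(h) == target)
        shares[show] = hits / len(households)

    # Highest share of target-segment households first.
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


ranking = swing_share_by_show(viewing_rows, segments)
for show, share in ranking:
    print(f"{show}: {share:.0%} of viewing households are swing")
```

In this toy data, the show watched only by swing households ranks first, which is the kind of signal an ad buyer would use to decide where a dollar reaches the most target voters.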
Political campaigns have been using data for years to develop a sophisticated understanding of voters. But the combination of relatively low-cost cloud computing and large quantities of data collected online, from public repositories, from sensors and from other sources gave rise to big data analysis, as researchers correlated these data sets in search of new insights.
"You are collecting everything you can, and essentially comparing it every way you can," said James Hendler, a professor in the computer and cognitive science departments at Rensselaer Polytechnic Institute, and head of its Institute for Data Exploration and Applications.
"When you do a poll and you talk to 1,000 people who represent 100,000 people, you get a margin of error plus or minus 3%," said Hendler. That's helpful, but it's not nearly as helpful as having 70,000 of those 100,000 people. "You get much more precise, and start identifying sub-communities that you can't do in a poll."
This field is new. The first graduate program in analytics was created in 2007, and universities are rushing to establish programs.
Big data use came of age in the 2012 campaign, Lundry said. It "was definitely the first cycle in which the term 'data scientists' was part of the org chart in any campaign."
Wegrzyn said BlueLabs assembled a creative team of problem solvers, engineers, statisticians, data scientists and domain experts, and wants campaigns to see analytics as "an agile, team-driven, creative process."
Wegrzyn was surprised by the attention the analytical effort received during the Obama campaign. He led the selection and deployment of the Hewlett-Packard Vertica platform that the campaign used.
"Usually the nerds in the back room don't warrant a great deal of attention, especially in politics," said Wegrzyn, "but the world is changing."