Bloomberg journalists have been breaking business news since 1990, but these days their reporting relies increasingly on data science.
This change has thrust the head of data science, Gideon Mann, into a key role in the newsroom.
The computer science graduate spent seven years as a staff research scientist at Google before he joined Bloomberg in 2014, but had little prior experience of finance and was initially surprised to see the influence that journalists have on markets.
"Before I started at Bloomberg, I didn't understand the nature of how news moves markets," Mann told Computerworld UK from Bloomberg's new £1 billion European headquarters in the heart of the City of London.
"Things happen in the real world and usually there's a journalist that's writing and talking about them and spreading the word, and that's how that information gets disseminated."
Mann's team develops data science tools and techniques that help Bloomberg's journalists analyse news, social media, financial documents and press releases more quickly than ever, and discover the financial insights that form the basis of their articles.
When Bloomberg News was founded in 1990, its reports focused on fundamental economic information such as market pricing and stock exchange data, but in recent years unstructured data has become a crucial element of them.
The augmented reporter
Bloomberg News today often relies on the combined efforts of humans and machines. Individual tasks are often automated but there are very few jobs without a human contribution.
"We produce automated news, but even more than that we do a lot of human-computer hybrid stories," says Mann. "The computer will come up with a first story and then a journalist will take it and elaborate it, put it into context, and explain the whole narrative act.
"The computer will have fairly simple heuristics; it'll look for major shifts or deviations. The people who are writing these programmes are a mix of journalists and computer scientists, so they use their editorial decision to say whenever anything like this happens is of interest, and then a person comes in and moulds it and forms it."
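As a rough illustration of the kind of "fairly simple heuristic" Mann describes, the sketch below flags any value that deviates sharply from its recent history. It is not Bloomberg's code: the function name `flag_major_moves`, the five-point window, the three-standard-deviation threshold and the sample figures are all assumptions made for the example.

```python
from statistics import mean, stdev


def flag_major_moves(values, window=5, threshold=3.0):
    """Return the indices of values that sit more than `threshold`
    standard deviations away from the mean of the preceding `window` values.

    Hypothetical helper: the window and threshold are illustrative defaults,
    standing in for an editorial judgement about what counts as newsworthy.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        spread = stdev(history)
        if spread == 0:
            # A perfectly flat history gives no scale to measure against.
            continue
        z_score = (values[i] - mean(history)) / spread
        if abs(z_score) > threshold:
            flagged.append(i)
    return flagged


# Invented daily percentage moves; only the 4.5 stands out from its history.
daily_changes = [0.2, -0.1, 0.3, 0.0, -0.2, 0.1, 4.5, 0.2]
print(flag_major_moves(daily_changes))  # prints [6]
```

In a hybrid workflow of the sort described, a flagged index would only be the trigger for a first draft, which a journalist would then put into context.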
Journalists can initially be sceptical about the influence of automation, but this changes when they realise the benefits to their work.
"In a financial context, what reporters had to do a lot was write these fairly formulaic articles. They look almost the same every time. The numbers would change and sometimes the companies would change. Once you have an automated system, it takes away something no one really wanted to do in the first place."
The data science developments mean journalists can now move markets with greater speed and intricacy, as their analysis reacts more quickly to new information and provides deeper insights.
Mann believes that their changing influence is more useful than destructive.
"I think really what happens is that the effects of news ripple through very quickly," he says. "I don't think it means that there's a more chaotic market, it just means that a piece of information enters the market and the market very quickly adapts and that time period of adaptation is just very rapid."
The numbers behind the charts
One type of unstructured information that has been particularly difficult to process is the data found in charts, as existing software couldn't identify the numbers behind a visual representation.
Mann's team responded by developing a system called Scatteract, which works backwards from scatter plot charts to recover the data that generated each dot.
The system uses optical character recognition (OCR) and deep learning techniques to take numerical data points from the image of a chart and then convert the results into tables.
Bloomberg claims this is the first system that uses machine learning to extract numerical data from charts. It can analyse the data from 78 percent of scatter plots found on the web and then use the results to inform a secondary analysis.
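The last step of such a pipeline, turning detected pixel positions into a table of values, can be sketched in a few lines. The example assumes that OCR has already read the axis tick labels and that a detector has already located each dot; it then fits a straight line per axis to map pixels to values. The function names, the plain least-squares fit and the sample coordinates are assumptions for illustration, not a description of how Scatteract is implemented.

```python
def fit_axis(pixels, values):
    """Fit value = slope * pixel + intercept by ordinary least squares.

    `pixels` are tick positions along one axis of the image and `values`
    are the numbers OCR read from the matching tick labels.
    """
    n = len(pixels)
    mean_p = sum(pixels) / n
    mean_v = sum(values) / n
    cov = sum((p - mean_p) * (v - mean_v) for p, v in zip(pixels, values))
    var = sum((p - mean_p) ** 2 for p in pixels)
    slope = cov / var
    return slope, mean_v - slope * mean_p


def pixels_to_table(points_px, x_ticks, y_ticks):
    """Convert detected dot positions (in pixels) into data coordinates.

    `x_ticks` and `y_ticks` are lists of (pixel, value) pairs.
    """
    x_slope, x_intercept = fit_axis(*zip(*x_ticks))
    y_slope, y_intercept = fit_axis(*zip(*y_ticks))
    return [
        (round(x_slope * px + x_intercept, 6), round(y_slope * py + y_intercept, 6))
        for px, py in points_px
    ]


# Hypothetical inputs: tick labels as (pixel, value) pairs and two detected dots.
# Image y-pixels grow downwards, so the y-axis slope comes out negative.
x_ticks = [(100, 0.0), (300, 10.0), (500, 20.0)]
y_ticks = [(400, 0.0), (200, 50.0), (0, 100.0)]
dots = [(200, 300), (400, 100)]
table = pixels_to_table(dots, x_ticks, y_ticks)
print(table)
```

A production system would need something more robust than a plain least-squares fit, since a single misread tick label would skew the whole mapping.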
Another major change for Bloomberg's journalists is the rise of alternative data. Journalists and traders have traditionally relied on established sources of information for their analysis, but now anyone with a social media account can disseminate influential data found in surprising places.
In 2013, grocery shoppers in New Delhi helped journalists understand global market shifts when a shortage of onions in India was blamed for a big rise in inflation.
A San Francisco start-up called Premise Data Corporation devised a new way to understand the effect of the shortage. The company paid citizens to take photos of onion prices in their local stores. The data was then aggregated to understand how the cost of onions in India is linked to inflation rates, creating an effective and early predictor of pricing patterns.
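The aggregation step can be pictured with a small sketch. Premise's actual method is not described here, so this is only one plausible approach: take the median of the prices reported in each period, so that one mistyped or outlying photo does not distort the figure, then compute the change between periods. The function names and the sample prices are invented for the example.

```python
from collections import defaultdict
from statistics import median


def median_price_by_period(observations):
    """Group (period, price) observations and return the median per period,
    ordered by period label."""
    by_period = defaultdict(list)
    for period, price in observations:
        by_period[period].append(price)
    return {period: median(prices) for period, prices in sorted(by_period.items())}


def percent_change(series):
    """Return the percentage change from each period to the next."""
    periods = list(series)
    return {
        later: 100.0 * (series[later] - series[earlier]) / series[earlier]
        for earlier, later in zip(periods, periods[1:])
    }


# Invented crowdsourced prices per kilogram; the 90 is a deliberate outlier.
observations = [
    ("week 1", 40), ("week 1", 42), ("week 1", 90),
    ("week 2", 60), ("week 2", 63), ("week 2", 66),
]
medians = median_price_by_period(observations)
print(medians)                  # prints {'week 1': 42, 'week 2': 63}
print(percent_change(medians))  # prints {'week 2': 50.0}
```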
"In the eighties, the goal was just getting the data into one place," says Mann. "Now it's figuring out what piece of information you want inside of all that data and so a lot of the goal for us as a company is now putting things into context and surfacing the right needles in the haystack for customers."
Social media on markets
The tools used to analyse these emerging data formats are also changing news reporting.
Natural Language Processing (NLP) has become a particularly powerful technique. When a journalist is monitoring Twitter, a machine learning model gives them a specialised feed of the tweets most relevant to their beat.
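The idea of a specialised feed can be shown with a deliberately simple stand-in. Instead of a trained machine learning model, the sketch below scores each post with hand-set keyword weights, drops anything below a cut-off and returns the highest-scoring posts first. The weights, the threshold and the sample posts are all hypothetical; a real system would learn its notion of relevance from data.

```python
import re


def relevance(text, weights):
    """Sum the weights of the known terms that appear in the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sum(weights.get(token, 0.0) for token in tokens)


def build_feed(posts, weights, min_score=1.0, limit=3):
    """Return up to `limit` posts scoring at least `min_score`, best first."""
    scored = [(relevance(post, weights), post) for post in posts]
    kept = [item for item in scored if item[0] >= min_score]
    kept.sort(key=lambda item: item[0], reverse=True)
    return [post for _, post in kept[:limit]]


# Hand-set weights standing in for what a trained model would learn.
beat_weights = {"merger": 2.0, "earnings": 1.5, "ceo": 1.0}
posts = [
    "Lovely weather in London today",
    "CEO confirms merger talks are under way",
    "Earnings call moved to Thursday",
]
feed = build_feed(posts, beat_weights)
print(feed)
```

Here the weather post scores zero and is dropped, while the merger post outranks the earnings post.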
"Twitter is a medium where you can express any kind of material information, in the same way that a press release was, and so CEOs and companies make announcements directly on Twitter," says Mann.
"That can count for regulatory requirements but people also go on Twitter and say all kinds of things. Elon Musk as an example, is very vocal on Twitter, and sometimes people care and sometimes they don't care."
Reporters certainly cared when Musk announced on Twitter earlier this week that he was considering taking Tesla private. His tweet caused Tesla shares to rise 11 percent by the end of the day and prompted regulators to take the unusual step of suspending trading in the stock.
This type of story can also be big news for Bloomberg clients. To help them react to reports, Bloomberg sells them live data feeds with headlines that are connected to their business, almost like Google Alerts, which they can then incorporate into their risk models and trading algorithms.