Siemens has engineered trains for almost 150 years, including the first electric passenger locomotive in 1879. However, its more recent innovations on the track are driven by data analytics. Using sensors to analyse information on trains and tracks has helped it move railway maintenance methods from reactive to proactive.
By assessing the condition of components through diagnostic sensor data, the company can spot patterns that indicate when a failure is likely to arise. Then, by monitoring information in near real-time, Siemens can react quickly to concerns before they disrupt services. If an anomaly is detected, the component is sent for inspection.
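The article does not describe how Siemens detects anomalies, so the following is only a minimal sketch of one common generic technique: flagging a reading that deviates sharply from a rolling window of recent values (a rolling z-score). The function name `flag_anomalies`, the window size, the threshold and the temperature values are all illustrative assumptions, not Siemens' method or data.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate strongly from the preceding window.

    A reading is flagged when it lies more than `threshold` sample standard
    deviations from the mean of the `window` readings before it.
    """
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change at all as a deviation.
            if readings[i] != mu:
                flagged.append(i)
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical bearing temperatures in degrees Celsius (made-up values).
temps = [61.0, 61.4, 60.8, 61.2, 61.1, 61.3, 60.9, 68.5, 61.2]
anomalies = flag_anomalies(temps)
print(anomalies)  # [7] -> the 68.5 reading stands out from its neighbours
```

A production system would need far more than this, not least because, as the article goes on to note, failures are rare and the required prediction accuracy is very high.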
The benefits of this approach include a reduction in delays, increased mileage, lower labour costs and more efficient maintenance scheduling. This allows Siemens to start offering clients more performance-based maintenance contracts.
Applying data science to the track
A few years ago in a locomotive factory in Germany, Siemens brought together a team of data scientists and engineers to create algorithms that predict failures of train components and railway infrastructure.
"The reason for this is that industrial data behaves differently to internet data, and a lot of the classical analytical models that we use don't work very well in this environment," Gerhard Kress, director of mobility data services at Siemens, explains to Computerworld UK. "Also, because these components don't fail very often, you need extremely high prediction accuracies, much higher than anything else we've seen before."
In the last two years alone, his team has filed 30 different patents on new mathematical approaches.
In 2013, Siemens turned to big data vendor Teradata to develop these models into advanced data analytics capabilities. Siemens deployed its own version of the Teradata Unified Data Architecture (UDA) encompassing a data warehouse, Aster Discovery analytics tools, and an appliance for Hadoop.
The enhanced monitoring capabilities that predictive analytics brings have pushed the availability of Siemens' high-speed trains in Russia to 99.96 percent and metro locomotives in Thailand to 99.98 percent.
Siemens also uses this framework to provide proactive maintenance for numerous regional trains in the UK, including London's Thameslink railway system.
The sensor setup on a train
Kress breaks his data analytics strategy for trains down into three elements: understanding the condition of the different components to predict failure; supporting the passenger experience through climate, smoothness of the ride and functioning toilets; and maximising energy efficiency to cut running costs.
"The energy consumption of a train over its lifetime costs more than buying the train," he says. "You can easily reduce that by 10 percent if you do it right."
A locomotive typically has 150-200 sensors, and a high-speed train 300-350 sensors per carriage. These could include a couple of sensors in each brake alone, which analyse the brake pressure and hydraulic oils to guarantee the train is braking in time. The sensors measure component temperatures and pressures, and Siemens compares the data with the thousands of reports of failures and fixes in its records.
The risk of sensor failure means that installing too many could cause more problems than it solves, says Kress. "We try to go to the least number of sensors that we can, simply because the more you put on, the more that can fail."
Motors, gearboxes, bearings and wheels are all mechanically connected and may not all need their own individual sensors. Siemens can instead use a virtual sensor, which calculates errors on each part through algorithms assessing, for example, the rate of heat transfer.
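To make the virtual sensor idea concrete, here is a deliberately simplified sketch. It assumes a toy steady-state model in which the temperature rise at a bearing is a fixed fraction of the rise at the motor it is mechanically coupled to. A large gap between the measured and expected bearing temperature then hints at a problem somewhere along the drivetrain, without a dedicated sensor on every part. The model, the `coupling` coefficient, the `tolerance` and all values are hypothetical; the article does not disclose Siemens' actual algorithms.

```python
def expected_bearing_temp(motor_temp, ambient_temp, coupling=0.6):
    """Toy steady-state model: the bearing warms by a fixed fraction of the motor's rise.

    `coupling` is a hypothetical, dimensionless heat-transfer coefficient that
    would in practice have to be fitted from historical data.
    """
    return ambient_temp + coupling * (motor_temp - ambient_temp)


def virtual_drivetrain_check(motor_temp, bearing_temp, ambient_temp,
                             coupling=0.6, tolerance=5.0):
    """Return (residual, suspicious) for one set of temperature readings.

    The residual is measured minus expected bearing temperature in degrees
    Celsius; the reading is suspicious when it exceeds `tolerance`.
    """
    expected = expected_bearing_temp(motor_temp, ambient_temp, coupling)
    residual = bearing_temp - expected
    return residual, abs(residual) > tolerance


# Hypothetical readings in degrees Celsius: the expected bearing temperature is 56.
healthy = virtual_drivetrain_check(motor_temp=80.0, bearing_temp=57.0, ambient_temp=20.0)
faulty = virtual_drivetrain_check(motor_temp=80.0, bearing_temp=66.0, ambient_temp=20.0)
print(healthy[1], faulty[1])  # False True
```

The design point is that two physical sensors plus a model can stand in for a third, which fits the stated aim of installing as few sensors as possible.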
They can also combine the data on different assets so that sensors on the train and the track monitor each other, reducing the quantity of checks that are required.
What have been the benefits of analytics?
Siemens previously relied on incident response and routine inspections to keep its trains running. This process would require technicians to open up the train to find the cause of the failure, and then fetch the spare parts and tools before returning to make the repair.
That approach took a heavy toll on repair times and delays. A single broken door on a train could add 10-15 seconds onto the time taken to travel between two stations. After 20 stations, the train could already be five minutes late, pushing the whole route behind schedule for the day. Siemens now monitors the doors on each train and can spot a potential failure before it emerges.
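The door-delay figures can be checked with simple arithmetic. The sketch below assumes the delay accrues at every station and that the timetable has no recovery time, which is a simplification; the function name is illustrative.

```python
def cumulative_delay_seconds(stations, delay_per_station):
    """Total delay if the same delay is added at every station, with no recovery."""
    return stations * delay_per_station


low = cumulative_delay_seconds(20, 10)   # 200 seconds
high = cumulative_delay_seconds(20, 15)  # 300 seconds
print(f"{low / 60:.1f} to {high / 60:.1f} minutes")  # 3.3 to 5.0 minutes
```

The five-minute figure therefore corresponds to the upper end of the 10-15 second range.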
"If there's a problem on a Thameslink door, we can tell you in some cases a week and a half in advance," says Kress.
"A technician can then look at door number five, the right door wing on that carriage, and they go there, check it, put some grease there, and then it goes out again and it does not fail."
Siemens also provides maintenance for Eurostar trains, which traditionally used sensors whose failure alerts would force the train to stop. These sensors, however, were prone to failures of their own.
"This happened to us a few years ago, when my team didn't exist yet and we had to evacuate 700 people on the track," remembers Kress.
"We had the same thing about a year ago, it looked very similar. We realised, first of all, it was a sensor problem. We believe we figured this out a week and a half before the train would have seen that. We could say to the operator that you need to exchange that sensor on that bearing in that bogie on that side, and they did and there was no disturbance of operation."
"[Teradata] was the only company in the market that had understood that the world is more than a data warehouse," says Kress. "There was competition for Teradata, but given the structure of our data we needed to have a system that can do more than that, so the UDA for us was the main thing."
Siemens uses a combination of frameworks including Apache Spark and TensorFlow to develop specific machine learning methods for each individual analytic task. Experimentation with these models is encouraged in a separate and secure working environment.
"What we have to create an analytical model is a sandbox, where data scientists can play with the data and identify the structures of the model," says Kress. "Once that is clear and we want to operationalise the model, then we have a classical three-tier structure of develop, test and operationalise."
This continuous integration and deployment process uses the same underlying data lake, so even when scientists are in the sandbox they can see all the data that exists, and understand how to combine data points to discover the insights that they need. That creative process results in an analytics model that can be continuously implemented in their railway monitoring.
Siemens absorbs slightly over 50,000 data points per second for its rail services, and has to store the data for extended periods of time. The complexity and variety of the Siemens analytical workloads make Teradata essential when the models are deployed.
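For a sense of scale, the ingest rate can be turned into daily volumes. The article gives only the rate of roughly 50,000 data points per second; the 16 bytes per stored point used below is a purely hypothetical figure for illustration, since the real record size is not stated.

```python
POINTS_PER_SECOND = 50_000      # approximate rate quoted in the article
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
BYTES_PER_POINT = 16            # hypothetical record size, not from the article

points_per_day = POINTS_PER_SECOND * SECONDS_PER_DAY
gigabytes_per_day = points_per_day * BYTES_PER_POINT / 1e9

print(f"{points_per_day:,} points per day")          # 4,320,000,000 points per day
print(f"{gigabytes_per_day:.2f} GB per day (assumed record size)")  # 69.12 GB per day
```

At more than four billion points a day, long retention periods add up quickly, whatever the true record size is.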
"I need to be able to balance all those different workloads and keep the system stable," Kress says. "If I would do that on Hadoop and one of my guys comes in there and puts a big workload there, no customer for the next two days will get any response. That's not acceptable."