Machine learning APIs and frameworks: 15 machine learning tools for data scientists and developers
More and more businesses are looking at how they can use machine learning in their operations
Matthew Finnegan and Christina Mercer
With businesses increasingly keen on incorporating artificial intelligence into their operations, machine learning – the ability for a system to learn from large data sets rather than following preset rules – offers a number of benefits. This might mean building predictive models for fraud prevention in financial services, for example, or retailers making better recommendations to their customers.
Google, Microsoft, IBM and AWS all offer machine learning APIs via their respective cloud platforms, making it easier for developers to build services by abstracting some of the complexity of their algorithms. There is also a growing number of open source deep learning frameworks for data scientists who want to work at a deeper level.
Earlier this week Amazon announced that it is open-sourcing its deep learning library, Deep Scalable Sparse Tensor Network Engine (DSSTNE), which is now available on GitHub. And on Thursday, Google opened its SyntaxNet neural network framework so that developers can build applications that process human language.
Here are some of the top machine learning tools...
1. Machine learning tools: Amazon DSSTNE
Amazon's open source deep learning library, DSSTNE (pronounced 'destiny'), allows data scientists to train and deploy deep neural networks using GPUs. It can be seen as a response to Google's open sourcing of TensorFlow.
DSSTNE was built by the retail giant's engineers to power its recommendations engine that makes product suggestions to the hundreds of millions of customers on its websites each day.
Amazon said: "We are releasing DSSTNE as open source software so that the promise of deep learning can extend beyond speech and language understanding and object recognition to other areas such as search and recommendations.
"We hope that researchers around the world can collaborate to improve it. But more importantly, we hope that it spurs innovation in many more areas."
2. Machine learning tools: Amazon Web Services Machine Learning API
AWS launched its Amazon Machine Learning service in Europe last August, with the aim of making it easier for developers of all skill levels to access complex algorithms. The service was built on the technology used by its own internal data scientists.
AWS says its machine learning service can generate billions of predictions a day, tapping into AWS data from services such as Redshift, S3 and its Relational Database Service.
3. Machine learning tools: Google APIs
Google has a host of machine learning tools on its Cloud Platform. These include its popular Prediction API, which allows users to tap the search giant’s algorithms to analyse data and predict future outcomes. Google has been adding more APIs to allow users to build their own machine learning-based services, including Speech, Translate and Vision.
Its GPU-based Machine Learning platform also helps to create machine learning models, and is integrated with other cloud services such as Google Cloud Dataflow and BigQuery.
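4. Machine learning tools: Google TensorFlow
TensorFlow, Google's open source machine learning library, expresses computations as data flow graphs that can be built through its Python or C++ APIs and run on CPUs or GPUs. The nodes of a graph represent operations, and the edges carry the data that flows between them.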
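The idea of a data flow graph can be sketched in a few lines of plain Python. This is a toy illustration only and does not use TensorFlow's API: the `Node`, `placeholder` and `evaluate` names are hypothetical, chosen for this example. It builds a graph for y = w * x + b first, and only computes a value when the graph is run with concrete inputs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Node:
    """One operation in the graph; `inputs` are the edges feeding it."""
    name: str
    op: Callable[..., float]
    inputs: Tuple["Node", ...] = ()


def placeholder(name: str) -> Node:
    """A node whose value must be supplied when the graph is run."""
    def missing() -> float:
        raise KeyError(f"no value fed for placeholder {name!r}")
    return Node(name, missing)


def evaluate(node: Node, feed: Dict[str, float],
             cache: Optional[Dict[str, float]] = None) -> float:
    """Evaluate `node` by pulling values along its incoming edges."""
    if cache is None:
        cache = {}
    if node.name in feed:          # value supplied by the caller
        return feed[node.name]
    if node.name not in cache:     # compute each node at most once
        args = [evaluate(parent, feed, cache) for parent in node.inputs]
        cache[node.name] = node.op(*args)
    return cache[node.name]


# Build the graph for y = w * x + b. Nothing is computed at this point.
x = placeholder("x")
w = placeholder("w")
b = placeholder("b")
product = Node("product", lambda a, c: a * c, (w, x))
y = Node("y", lambda a, c: a + c, (product, b))

# Run the graph by feeding concrete values into the placeholders.
result = evaluate(y, {"x": 3.0, "w": 2.0, "b": 1.0})
print(result)  # 7.0
```

Separating the description of the computation from its execution is what allows a real framework to decide where each operation runs, for example on a CPU or a GPU.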
5. Machine learning tools: Microsoft Distributed Machine Learning Toolkit (DMTK)
Microsoft's machine learning toolkit - which is available on GitHub - aims to ease crowded machine learning clusters, making it easier to run multiple (and differing) machine learning applications at the same time.
"Bigger models tend to generate better accuracies in various applications," Microsoft said. "However, it remains a challenge for common machine learning researchers and practitioners to learn big models."
6. Machine learning tools: Microsoft Computational Network Toolkit (CNTK)
Also from Microsoft, the Computational Network Toolkit enables users to create neural networks described as directed graphs. Although it was originally built for speech recognition, since April 2015 it has become a more general machine learning toolkit that supports image and text workloads as well as the training of recurrent neural networks (RNNs).
7. Machine learning tools: IBM Watson Analytics
The Watson Analytics cloud service was unveiled in 2014 as part of IBM’s plans to turn Watson from a part-time game show contestant into a bona fide enterprise software proposition.
It aims to help organisations that have little or no experience of predictive analytics put their business data to good use.
IBM had already launched its Watson Developer Cloud in 2013, offering access to APIs via its Bluemix platform-as-a-service cloud and allowing developers to create their own applications based on Watson’s smarts.
8. Machine learning tools: BigML
It is not only big IT firms that are moving into artificial intelligence in the cloud. BigML is one of a number of startups in the market aiming to open artificial intelligence to a wider audience.
Founded in Oregon in 2011, BigML offers a simple user interface, allowing users to upload data sets to start making predictions.
9. Machine learning tools: Apache Spark MLlib and Singa
Apache Spark is an in-memory data processing framework, and MLlib is its machine learning library. MLlib offers a large and growing set of useful algorithms and utilities covering classification, regression, clustering, collaborative filtering and more.
Singa is an open source framework within the Apache Incubator, providing a programming tool for training deep learning networks across numerous machines.
10. Machine learning tools: Veles
Veles is Samsung's distributed deep learning platform, which is written in C++ and uses Python for coordination between nodes. Veles offers an API enabling immediate use of trained models and can be used for data analysis.
11. Machine learning tools: Alibaba’s Aliyun
In August 2015, Chinese ecommerce giant Alibaba announced that its cloud computing business, Aliyun, would offer a machine learning service to help enterprise customers streamline analytics software development.
The service is based on Aliyun’s Open Data Processing Service (ODPS) technology, which is capable of processing 100 petabytes of data in six hours.
The DT PAI platform offers a drag and drop interface to simplify the process for developers.
"What used to take days can be completed in minutes," said Xiao Wei, senior product expert with Alibaba's cloud business, as the service was announced.
12. Machine learning tools: Caffe
Caffe is a deep learning framework written in C++ and initially created for machine vision, the use of imaging for tasks such as automatic inspection. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community developers.
The framework is already used as part of "academic research projects, startup prototypes, and even large-scale industrial applications in vision, speech, and multimedia".
Google and Pinterest have also used Caffe in their operations.
13. Machine learning tools: Neon
Neon is Nervana's open source, Python-based machine learning library.
The deep learning startup, founded in 2014, has also launched a cloud service based on Neon, which it claims is ten times faster than competing services. This means that businesses can build, train and deploy deep-learning technologies much more quickly.
14. Machine learning tools: Wise.io
Wise.io also aims to democratise the use of artificial intelligence with 'machine learning as a service' that is ready for enterprise use. Founded in 2012, the Californian startup's algorithms were initially developed to help astronomers discover and map new stars, before being put to use by businesses.