How Salesforce brought artificial intelligence and machine learning into its products with Einstein

Salesforce spent all week during its Dreamforce conference talking about AI and machine learning and how it was making its range of SaaS CRM cloud products smarter for customers. We asked them how they did it.


Intelligence was the talk of Dreamforce - Salesforce's annual tech conference in San Francisco this month - with the SaaS giant's latest announcement 'Einstein' promising to bring complex data science techniques and predictive algorithms seamlessly into all of their cloud-based CRM products.

Here's how Salesforce used a spending spree on artificial intelligence (AI) startups and talent to bring these smart features to customers, all without opening up their precious data.

© Salesforce

Project Einstein

Before he went on an AI acquisition binge, Salesforce CEO Marc Benioff said there was anxiety within the organisation around applying predictive algorithms to customer data they can't see, because customers want to keep their data private and secure.


The lightbulb moment for Benioff came "when we made acquisitions and they said we can provide intelligence on customer data without seeing it.

"Up to that point, if you couldn’t see or normalise the data, you can’t apply the intelligence. We have massive amounts of data, petabytes and petabytes, so we have the data that we need and the answer is that we can now operate on that data without interfering with the trust relationship with our customers."

General capabilities

Now that it could apply various machine learning techniques to this huge pool of customer data, general manager of the Einstein group John Ball and his team had to work out how to not only make the data models generally available while maintaining trust, but also make all the insights unique to their customers' domains.

Ball summed the problem up: "Given we have hundreds of thousands of customers across eight clouds we have millions of predictive models to build, and there aren't enough data scientists in the world to do that."

Ball had to ask himself: "How can we simplify this?" The solution was to automate the data wrangling process - a task which by most measures takes up 80 percent of a data scientist's day-to-day job - by using the metadata.

He explained: "We quickly realised that we have metadata so we know if a field is an email or an opportunity, or if a lead object is linked to another. So we can do a bunch of automatic data preparation to feed into the predictive model."
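As a rough illustration of the idea Ball describes, the sketch below uses declared field types to decide how each field is prepared for a model. It is a minimal sketch, not Salesforce's implementation: the field names, the `FIELD_TYPES` mapping, the `prepare_record` function and the specific transformations are all hypothetical.

```python
from datetime import date

# Hypothetical metadata: field name -> declared field type.
FIELD_TYPES = {
    "contact_email": "email",
    "amount": "currency",
    "stage": "picklist",
    "close_date": "date",
}


def prepare_record(record, field_types, today):
    """Build model features for one record, driven only by field metadata."""
    features = {}
    for name, kind in field_types.items():
        value = record.get(name)
        if kind == "email":
            # Keep only the lower-cased domain rather than the raw address.
            features[name + "_domain"] = value.split("@")[-1].lower() if value else None
        elif kind == "currency":
            # Parse to a number; treat a missing amount as zero (an assumption).
            features[name] = float(value) if value is not None else 0.0
        elif kind == "picklist":
            # One-hot style flag for the selected picklist value.
            if value is not None:
                features[f"{name}={value}"] = 1
        elif kind == "date":
            # Express the date as a number of days relative to "today".
            features[name + "_days_away"] = (
                (date.fromisoformat(value) - today).days if value else None
            )
    return features


record = {
    "contact_email": "Ann@Example.com",
    "amount": "1200.50",
    "stage": "Negotiation",
    "close_date": "2016-11-01",
}
features = prepare_record(record, FIELD_TYPES, today=date(2016, 10, 10))
print(features)
```

The point of the sketch is that no one hand-writes preparation code per customer: the same function works for any record whose fields carry type metadata.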

Data Concerns

Since then Ball and the Salesforce executive team have been busy assuring customers that their data will be secure within Einstein's underlying general models. Naturally, companies don't want their data to be used to power a predictive model which may benefit their rivals in any way, especially when the big Salesforce pitch to move to the cloud revolved around proprietary data ownership and security.

Ball disagrees with that thought process, though, saying: "I have spoken to big companies that are fine with the general models." The problem, Ball insists, is finding the right language, and maths, to dispel these concerns. "It is a nuanced discussion, the math gets super nuanced," he said.

The key is customer choice. "They can opt in," Ball says. "You can build an org-specific model only using the data in that organisation, then there is a customer who would opt-in to a global or generalised data pool where they are still getting org-specific results. You are always getting an org-specific model, the difference is training only on your data or training on anonymised and sampled data."
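The opt-in choice Ball outlines can be pictured as a rule for selecting training data per organisation. The following is a minimal sketch under stated assumptions, not a description of how Einstein works: the `training_rows` and `anonymise` functions, the row fields, the sampling rate and the decision to combine an organisation's own rows with the pooled sample are all hypothetical.

```python
import random

# Hypothetical identifying fields that are stripped before pooling.
IDENTIFYING_FIELDS = ("org_id", "contact_email")


def anonymise(row):
    """Return a copy of the row without the identifying fields."""
    return {k: v for k, v in row.items() if k not in IDENTIFYING_FIELDS}


def training_rows(org_id, rows, opted_in, sample_rate=0.5, seed=0):
    """Select the rows used to train one organisation's model."""
    own = [r for r in rows if r["org_id"] == org_id]
    if org_id not in opted_in:
        # Not opted in: train only on the organisation's own data.
        return own
    rng = random.Random(seed)
    # Opted in: add an anonymised sample from other opted-in organisations.
    pooled = [
        anonymise(r)
        for r in rows
        if r["org_id"] != org_id
        and r["org_id"] in opted_in
        and rng.random() < sample_rate
    ]
    return own + pooled


rows = [
    {"org_id": "A", "contact_email": "a@a.example", "amount": 100.0},
    {"org_id": "B", "contact_email": "b@b.example", "amount": 200.0},
    {"org_id": "C", "contact_email": "c@c.example", "amount": 300.0},
]
opted_in = {"A", "B"}

# sample_rate=1.0 keeps every eligible pooled row, so the example is deterministic.
rows_for_a = training_rows("A", rows, opted_in, sample_rate=1.0)
rows_for_c = training_rows("C", rows, opted_in, sample_rate=1.0)
print(rows_for_a)
print(rows_for_c)
```

In both cases the result feeds a model built for that one organisation. Only the training set differs, and data from an organisation that has not opted in never enters another organisation's pool.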


Ball calls Einstein "AI for CRM", and because Salesforce has eight separate cloud products it started to go out and acquire AI startups that were operating within very specific domains to help, be it marketing, sales or customer service.

"We started looking at our eight clouds where the use case is very different, so that's when we identified some great companies breaking new ground in very specific areas," he said.


Salesforce acquired Implisit in May, which was focused on AI for sales reps. "They built a lot of models to figure out how to use natural language processing to extract signal from email that are relevant for a sales process, like detecting a competitor was mentioned," Ball explained.

Then there is the deep learning specialist MetaMind, "which has incredible applicability in certain domains but doesn't solve all problems, you need a lot of data." MetaMind helped Salesforce build out its image recognition and classification capabilities, which are now baked into the Marketing, Services and App Clouds.

Then there is the analytics specialist BeyondCore, which helped bring predictive capabilities into the Analytics Cloud and Wave apps.

On top of all the domain-specific acquisitions there are the likes of PredictionIO, an open source machine learning model management tool, which is useful when you have "millions of models to manage," as Ball said.


It is becoming a cliche in itself that the best AI is the AI you don't see (think Amazon product recommendations) and many Einstein features aren't there yet. The predictive features are suggestions, and as many sales reps and marketers will know, professionals aren't necessarily the best at being told how to do things differently.

Salesforce has managed to neatly package up its AI capabilities under a single brand, and a cuddly avatar. Now it needs to see engagement figures to match. Einstein will need to prove itself effective to earn the trust of its users. The first step was baking it into the platform; the proof will be in the eating.
