Introduction to AI
AI stands for Artificial Intelligence, and it is already woven into our daily lives. Whether you are using Siri on your MacBook, Cortana on Windows, playing Call of Duty, driving a smart car, or browsing movie recommendations, Artificial Intelligence is at work behind the scenes predicting outcomes. Artificial Intelligence also powers e-commerce recommendation feeds, Facebook feeds, fraud detection in banking transactions, and many more use cases.
Salesforce CRM is one of the most widely used CRMs today, and there are tons of applications built on top of the Salesforce App Cloud platform across various verticals, such as healthcare (Salesforce Health Cloud), financial services (Financial Services Cloud and FinancialForce), insurance (Vlocity), and so on. Adding Artificial Intelligence to these types of applications makes the CRM and the apps smarter. This book is an attempt to introduce developers on the Salesforce platform to the capabilities of Salesforce Einstein (Artificial Intelligence for CRM), to show how to bring Artificial Intelligence into Salesforce apps, and to explain how Einstein can be used across marketing, sales, service, community, and the various other Cloud offerings of Salesforce. We also cover PredictionIO, an open source machine learning server for building smarter applications.
Before we dive deep into the Einstein offerings for developers and data scientists, this chapter covers the basics of Artificial Intelligence and key terminology in the world of Artificial Intelligence. We will also see how to use the Google Prediction API to build a simple demonstration of machine learning and Artificial Intelligence in conjunction with Salesforce data to support the relevant theory.
In this chapter, we will cover the following topics:
- Artificial Intelligence key terms
- Programming languages used for machine learning
- Practical machine learning with Google Prediction API and Salesforce
Artificial Intelligence key terms
Artificial Intelligence refers to computerized systems designed to mimic how humans think, learn, process, and perceive information. In simple terms, it is about first understanding and then recreating the human mind.
There are some common terminologies that we need to understand before we proceed further.
Machine Learning
As per Wikipedia:
"Machine learning provides computers with the ability to learn without being explicitly programmed"
Machine learning in general comprises three major steps:
- We collect a lot of examples that specify the correct output for a given input.
- Based on this input dataset, we apply algorithms to form a model, that is, a mathematical function that can predict the outcome.
- We pass new input to the mathematical function obtained in the previous step to obtain the predicted results.
Consider the following diagram:

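To make the three steps concrete, the following is a minimal, self-contained sketch in Apex (the language used later in this chapter), assuming a tiny made-up dataset: it fits a straight line y = a + b*x by least squares and uses the fitted function to predict the outcome for a new input. The class name and the numbers are purely illustrative and are not part of any Salesforce or Google API:

//Illustrative only: fit y = a + b*x to example points and predict a new value
public class SimpleRegressionDemo {
    public static void run() {
        //Step 1: collect examples that pair each input with the correct output
        List<Double> x = new List<Double>{1, 2, 3, 4, 5};
        List<Double> y = new List<Double>{12, 19, 31, 42, 49};
        //Step 2: apply an algorithm (here, least squares) to form the model y = a + b*x
        Double n = x.size();
        Double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (Integer i = 0; i < x.size(); i++) {
            sumX += x[i];
            sumY += y[i];
            sumXY += x[i] * y[i];
            sumXX += x[i] * x[i];
        }
        Double b = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
        Double a = (sumY - b * sumX) / n;
        //Step 3: pass a new input to the fitted function to predict its outcome
        System.debug('Prediction for x = 6: ' + (a + b * 6));
    }
}

Executing SimpleRegressionDemo.run() from the Developer Console's Execute Anonymous window writes the predicted value to the debug log.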
In this chapter, we will cover a simple experiment using Google's Prediction API with Salesforce data. In later chapters, we will introduce you to PredictionIO, part of the Einstein offerings from Salesforce: an open source Machine Learning Server that allows developers and data scientists to capture data via its Event Server, build predictive models with algorithms, and then deploy them as a web service.
Neural networks
A neural network is a set of algorithms designed to recognize patterns. Neural networks are loosely modeled on how the human brain works.
They consist of a set of nodes (similar to human brain neurons) arranged in multiple layers, with weighted interconnections between them. Each neuron combines a set of input values to produce an output value, which in turn is passed on to other neurons downstream. Artificial neural networks are used in Deep Learning.
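As a rough sketch of that description, here is a single artificial neuron in Apex, assuming made-up inputs and weights; a real network wires many of these together in layers, with each output feeding the neurons downstream:

//Illustrative only: one neuron combining weighted inputs via a sigmoid activation
public class NeuronDemo {
    //Combine the weighted inputs and a bias, then squash the sum
    public static Double fire(List<Double> inputs, List<Double> weights, Double bias) {
        Double total = bias;
        for (Integer i = 0; i < inputs.size(); i++) {
            total += inputs[i] * weights[i];
        }
        //The sigmoid activation maps the weighted sum into the range (0, 1)
        Double expOfNegativeSum = Math.exp(-total);
        return 1 / (1 + expOfNegativeSum);
    }
}

For example, NeuronDemo.fire(new List<Double>{0.5, 0.9}, new List<Double>{0.8, -0.2}, 0.1) returns a single output value that would be passed downstream to the neurons in the next layer.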
Deep Learning
In Deep Learning, the neural network has multiple layers. At the top layer, the network trains on a specific set of features and then sends that information to the next layer. That layer combines the information with other features and passes it on to the following layer, and so on.
Deep Learning has increased in popularity because it has proven to outperform other methodologies for machine learning. Due to the advancement of distributed computing resources and businesses generating an influx of image, text, and voice data, Deep Learning can deliver insights that weren't previously possible.
Consider the following diagram:

To borrow an example from a U.S. government report: in an image recognition application, a first layer of units might combine the raw data of the image to recognize simple patterns in the image; a second layer of units might combine the results of the first layer to recognize patterns of patterns; a third layer might combine the results of the second layer, and so on. We train neural networks by feeding them lots of big data to learn from.
Salesforce Einstein offers Predictive Vision Services (currently in Pilot) for training models and solving image recognition use cases. We will discuss in detail how to use these services to bring the power of image recognition to CRM apps.
Natural language processing
Natural language processing (NLP) is the ability of computers to understand human language and speech. Good examples of this are Google Translate and Google Voice Search. Modern NLP systems use machine learning to detect patterns.
Cognitive computing
Cognitive computing involves self-learning systems that use data mining (big data), pattern recognition (machine learning), and natural language processing to mimic the way the human brain works. The difference between Artificial Intelligence and cognitive computing boils down to this: the former tells the user what course of action to take based on its analysis, while the latter provides information to help the user decide. The goal of cognitive computing is to create automated systems capable of solving problems without requiring human assistance.
Pattern recognition
Humans have been finding patterns everywhere, ranging from astronomy to biology and physics. A pattern is a set of objects, concepts, or phenomena whose elements resemble one another in certain respects.
Statistical and structural patterns form the basis of machine learning.
Data mining
Data mining is the process of finding patterns or correlations among dozens of fields in a relational database.
Data mining consists of the following five major elements:
- Extracting, transforming, and loading (ETL) data onto the data warehouse system
- Storing and managing the data in a multidimensional database system
- Providing data access to business analysts and IT professionals
- Analyzing the data with application software
- Presenting the data in a useful format, such as charts and dashboards
GPUs
Graphics processing units (GPUs) help computers work much faster than those operating with a central processing unit (CPU) alone. Some companies have built their own variations on the GPU; for example, Google has a chip it calls the Tensor Processing Unit (TPU), which supports TensorFlow, the software engine that drives its Deep Learning services.
Programming languages used for machine learning
The programming languages used for machine learning depend on your requirements and the expected predictions. MATLAB, R, and Python are commonly used because they provide powerful functions for statistical analysis.
If you want to explore and write your own machine learning algorithms, you may want to learn these languages; however, using Einstein, PredictionIO, or any other topic discussed in this book does not require you to know R or Python. Instead, we will keep it simple with Apex, Java, Scala, or Node.js, which most developers are already familiar with.
Practical machine learning with Google Prediction API and Salesforce
To understand machine learning concepts practically, we will build a simple Proof of Concept (PoC) demo that uses the Google Prediction API and applies the predictive results to Salesforce records. The aim of this exercise is to help you understand the basic steps of machine learning without digging into the minute details, and to give you an idea of how we can leverage external machine learning algorithms on Salesforce data and the power this adds to it.
Google offers a simple Prediction API built around predefined model types. Its general algorithms can be categorized as follows:
- Given a new item, predict a numeric value for that item, based on similar valued examples in its training data (Regression Model)
- Given a new item, choose a category that describes it best, given a set of similar categorized items in its training data (Categorical Model)
The step-by-step integration of the Prediction API with Salesforce is covered in the rest of this section.
Business scenario
Universal Containers (a fictitious company) wants to find the probability of opportunity closure using its existing data. We will help them integrate the Google Prediction API with Salesforce to predict the probability of closure of a Salesforce opportunity based on its expected revenue and opportunity type, which in turn can help them forecast revenue better.
Let's take an opportunity dataset from Salesforce, upload it to Google Cloud Storage, build a regression type predictive model, train the model, and use it to predict the probability of opportunity closure.
Prerequisites
This section covers the steps required to experiment with Google Prediction API and Salesforce:
- Make sure you have Salesforce login credentials. If you do not have any, sign up at https://developer.salesforce.com/signup.
- Sign up for a Google account at https://accounts.google.com/SignUp?hl=en.
- Enable the Prediction API by visiting https://console.cloud.google.com/home/dashboard.
- Create a Google Cloud project, as shown in the following screenshot. Once you create a project, note the Project ID, as we will be using it in the API requests:

- Sign up for a free Google cloud storage at https://console.cloud.google.com/storage/browser.
- Create a folder called salesforceeinstein in Google Cloud Storage and upload the provided CSV there (the CSV is shared in the git repository at https://github.com/PacktPublishing/Learning-Salesforce-Einstein/blob/master/Chapter1/SFOpportunity.csv). Name the file SFOpportunity.csv:
- Open the Prediction API explorer (https://developers.google.com/apis-explorer/#s/prediction/v1.6/) to train the model via the API. We will need to first enable OAuth for the project and use the right scope. The following screenshot shows the OAuth 2.0 scope enablement screen; you will need to select the auth/prediction checkbox:
- We will be using version v1.6 of the Prediction API. Training and prediction are covered in the next section.
Note that the CSV data here is a report extract of opportunity data from Salesforce. You can extract such data using the standard Salesforce reporting interface. You will also need to create a custom probability field on the Salesforce Opportunity object to track the probability returned by the Prediction API.
Check the following screenshot of the dataset sample. The data samples can be taken from your own Salesforce organization; in case you want to use the one used in this book, you can get it from the git repository (https://github.com/PacktPublishing/Learning-Salesforce-Einstein/blob/master/Chapter1/SFOpportunity.csv):

Training and prediction
A regression type predictive model produces a numeric answer to a question based on previous samples, and it is the model type we will use for this example. Take a look at the file located at https://github.com/PacktPublishing/Learning-Salesforce-Einstein/blob/master/Chapter1/SFOpportunity.csv. The first column is the probability, the second column is the opportunity type, and the last column is the opportunity revenue. If you look carefully, you will notice that there is some correlation between the type of opportunity, the expected revenue, and the probability.
If you observe the dataset sample closely, you will see that for opportunities of type Existing Customer, the higher the expected revenue, the higher the probability.
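For illustration only, the rows follow this shape (these values are hypothetical and not taken from the shared CSV):

90,Existing Customer,1000000
55,Existing Customer,150000
20,New Customer,25000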
To train the Dataset, we will leverage the Prediction API provided by Google. The complete set of APIs is listed in the following table:
| API | Description |
| --- | --- |
| prediction.hostedmodels.predict | This submits an input and requests an output against a hosted model. |
| prediction.trainedmodels.analyze | This gets an analysis of the model and of the data the model was trained on. |
| prediction.trainedmodels.delete | This deletes a trained model. |
| prediction.trainedmodels.get | This checks the training status of your model. |
| prediction.trainedmodels.insert | This trains a Prediction API model. |
| prediction.trainedmodels.list | This lists the available models. |
| prediction.trainedmodels.predict | This submits a model ID and requests a prediction. |
| prediction.trainedmodels.update | This adds new data to a trained model. |
If you recall, machine learning consists of three steps:
- Load the sample dataset.
- Train a model on the data.
- Use the generated function to predict the outcome for a new dataset.
Integration architecture
The following diagram shows the Integration Architecture that we have adopted for experimenting with Google Prediction API and Salesforce Data:

The data is loaded into a Google Cloud bucket and trained manually via the Prediction API console, forming a model. We then query the resulting model by triggering an HTTP request to the Google Prediction API from Salesforce. The architecture is purposely kept very simple to help you grasp the fundamentals before we approach the Prediction API in detail.
The following are the steps required to train the dataset via the Google Prediction API. Note that we will use prediction.trainedmodels.insert to train the model:
- Load the sample dataset: At this point, the assumption is that you have extracted data from the Salesforce Opportunity object and loaded it into Google Cloud Storage. The steps are covered in the Prerequisites section.
- Train the dataset: Let's use the API explorer to train on the sample data via the API. The Google API explorer is located at https://developers.google.com/apis-explorer/#s/prediction/v1.6/.
The following screenshot shows how one can train the model:

The request and response JSON are shown as follows. Note that learned-maker-155103 is the Project ID; replace it with your own Project ID.
Request Payload:
POST https://www.googleapis.com/prediction/v1.6/projects/learned-maker-155103/trainedmodels
{
  "modelType": "regression",
  "id": "opportunity-predictor",
  "storageDataLocation": "salesforceeinstein/SFOpportunity.csv"
}
Note that storageDataLocation points to the location in Cloud Storage where our data resides. Once the request succeeds, we get a response from the API, which is shown as follows:
Response Payload:
200
{
  "kind": "prediction#training",
  "id": "opportunity-predictor",
  "selfLink": "https://www.googleapis.com/prediction/v1.6/projects/learned-maker-155103/trainedmodels/opportunity-predictor",
  "storageDataLocation": "salesforceeinstein/SFOpportunity.csv"
}
We can monitor the status of the training via the prediction.trainedmodels.get API.
The request to execute in the console is as follows:
Request Payload:
GET https://www.googleapis.com/prediction/v1.6/projects/learned-maker-155103/trainedmodels/opportunity-predictor
Response Payload:
200
{
  "kind": "prediction#training",
  "id": "opportunity-predictor",
  "selfLink": "https://www.googleapis.com/prediction/v1.6/projects/learned-maker-155103/trainedmodels/opportunity-predictor",
  "created": "2017-01-18T19:10:27.752Z",
  "trainingComplete": "2017-01-18T19:10:48.584Z",
  "modelInfo": {
    "numberInstances": "18",
    "modelType": "regression",
    "meanSquaredError": "79.61"
  },
  "trainingStatus": "DONE"
}
The trainingStatus field of the response shows the status of the training, which here is:
DONE
Note that if your data does not have any correlation, you will see a very high value of meanSquaredError.
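For reference, the mean squared error is the average of the squared differences between the predicted values and the actual values across the n training rows, that is, MSE = (1/n) * Σ(actual - predicted)²; the lower the value, the better the model fits the training data.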
- Using the trained model to predict the outcome: For this, we will create a simple trigger on the Opportunity object in Salesforce that makes an asynchronous call to the Google Prediction API to predict the probability of opportunity closure.
Before we add the trigger to Salesforce, make sure that the Predicted_Probability__c field has been created on the Opportunity object. To create a field in Salesforce, follow these steps:
- Navigate to SETUP | Object Manager | Opportunity | Fields & Relationships | New in Lightning Experience, or Setup | Customize | Opportunities | Fields | New in Classic.
- Select the field type Number (with a length of 5 and 2 decimal places), follow the defaults, and save:

The trigger code uses Apex, a Java-like language provided by the Salesforce platform for writing business logic. For the purposes of demonstration, we will keep the code very simple:
//Trigger that makes an API call to the Google Prediction API to predict opportunity probability
//Please note that this trigger is written for demonstration purposes only and is not bulkified or batched
trigger opportunityPredictor on Opportunity (after insert) {
    if (Trigger.isInsert && Trigger.isAfter) {
        OpportunityTriggerHelper.predictProbability(Trigger.new[0].Id);
    }
}
If you are using Salesforce Classic, the navigation path to add the trigger on Opportunity is SETUP | Customize | Opportunities | Triggers.
In Lightning Experience, use the Developer Console to create the trigger.
Also note that since the trigger depends on Apex classes, save the dependent Apex classes first, before saving the trigger.
The Apex class that is invoked from the trigger is as follows (in Classic, Apex classes are created via SETUP | Develop | Apex Classes):
//Apex class to make a callout to the Google Prediction API
public with sharing class OpportunityTriggerHelper {

    @future(callout=true)
    public static void predictProbability(Id opportunityId) {
        Opportunity oppData = [SELECT Id, Amount, Type, Predicted_Probability__c
                               FROM Opportunity WHERE Id = :opportunityId];
        //The Google_Auth named credential handles authentication and is assumed
        //to point at the trained model's predict endpoint
        HttpRequest req = new HttpRequest();
        req.setEndpoint('callout:Google_Auth');
        req.setMethod('POST');
        req.setHeader('content-type', 'application/json');
        //Form the request body from the opportunity type and amount
        PredictionAPIInput apiInput = new PredictionAPIInput();
        PredictionAPIInput.csvData csvData = new PredictionAPIInput.csvData();
        csvData.csvInstance = new List<String>{oppData.Type, String.valueOf(oppData.Amount)};
        apiInput.input = csvData;
        Http http = new Http();
        req.setBody(JSON.serialize(apiInput));
        HttpResponse res = http.send(req);
        System.debug(res.getBody());
        if (res.getStatusCode() == 200) {
            //Parse the prediction and store it back on the opportunity record
            Map<String, Object> result = (Map<String, Object>) JSON.deserializeUntyped(res.getBody());
            oppData.Predicted_Probability__c = Decimal.valueOf((String) result.get('outputValue'));
            update oppData;
        }
    }
}
The Apex class used for serializing and parsing the JSON is as follows:
public class PredictionAPIInput {
    public csvData input;

    public class csvData {
        public String[] csvInstance;
    }
}
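For example, for an opportunity of type Existing Customer with an Amount of 100000 (illustrative values), JSON.serialize(apiInput) in the helper class produces a request body of the following shape:

{
  "input": {
    "csvInstance": ["Existing Customer", "100000"]
  }
}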
Salesforce Apex offers HTTP classes that allow us to call an external API. We are leveraging these, together with the configuration in the named credential, to make an HTTP request to the Google Prediction API.
Setting up authentication for calling the API from SFDC
To simplify authentication, we will use a named credential in Salesforce along with the auth settings to keep things configurable. Note that this is not a scalable approach, but it is a quick way to integrate Salesforce with Google for demonstration purposes only. Consider a service account or a full OAuth 2.0 setup if you need a scalable approach.
The steps to authorize Salesforce to access the Google Prediction API are as follows:
- Generate a client ID and client secret via the Google Auth screen. To generate these, navigate to the console URL (https://console.developers.google.com/apis/credentials), choose the OAuth consent screen subtab, fill in the Product name shown to users as Salesforce, and save the OAuth consent screen form.
The following screenshot shows the OAuth consent screen and the details one has to fill in to generate the Consumer Key and Consumer Secret:

Once the OAuth consent screen is saved, create a web application credential and note the Consumer Key and Consumer Secret. These will be used on the Salesforce Auth. Provider screen:

- Create an Auth. Provider in Salesforce to set up authorization, as shown in the following screenshot. The path for it is SETUP | Security Controls | Auth. Providers | New.
Select the Provider Type Google. The following screenshot shows the form. Note that we use the Consumer Key and Consumer Secret from step 1 as input to the Auth. Provider form:

The following screenshot shows the saved Auth. Provider record. Note the Callback URL as it will be fed back to the Google OAuth consent screen:

- Note down the Callback URL from the Auth. Provider screen; it needs to be added back on the Google Auth screen. The following screenshot shows the redirect URL configuration in the Google OAuth screen:

- Create a Named Credential so that we avoid handling the authorization token ourselves. Carefully note the scope defined in the setup. The path in Classic is Setup | Security Controls | Named Credentials (a sketch of the assumed configuration follows the screenshot):

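As a rough guide, the named credential could be configured as follows. The URL is an assumption based on the fact that the Apex class calls callout:Google_Auth without appending a path, which means the named credential itself must point at the trained model's predict endpoint (substitute your own Project ID and model ID):

Label: Google_Auth
URL: https://www.googleapis.com/prediction/v1.6/projects/learned-maker-155103/trainedmodels/opportunity-predictor/predict
Identity Type: Named Principal
Authentication Protocol: OAuth 2.0
Authentication Provider: the Auth. Provider created in the earlier step
Scope: https://www.googleapis.com/auth/prediction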
- Test the results: The final step is to test our model during real-time record creation in Salesforce. Create an opportunity record, specifying its type and amount. Once the record is inserted, the trigger fires, calls the Prediction API with the new data, and the result is stored back in the custom probability field.
The API that the Apex trigger code hits is as follows:
https://www.googleapis.com/prediction/v1.6/projects/learned-maker-155103/trainedmodels/opportunity-predictor/predict
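A successful call returns the predicted value in the outputValue field, which the Apex class parses and writes back to the opportunity record. The response looks roughly as follows (the value shown is illustrative):
Response Payload:
200
{
  "kind": "prediction#output",
  "id": "opportunity-predictor",
  "outputValue": "85.73"
}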
To test, create a new opportunity record in Salesforce, fill in fields such as Type and Amount, and monitor the custom probability field on the opportunity record.
The following screenshot shows how to switch to the Salesforce Lightning Experience:

The following screenshot shows the results obtained in Lightning Experience, and, clearly, the Predicted_Probability field is populated:

Drawback of this approach
While the preceding approach serves as a great experiment for understanding the fundamentals of a machine learning process, it is not scalable.
From the previous experiment, we can draw the following conclusions:
- A prediction system uses a large dataset; hence, considering the data limits on the Salesforce platform, it is always better to have a big data server collect the data and form the model
- The data needs to be retrained periodically to obtain an appropriate model and keep it up to date
- Machine learning uses statistical analysis under the hood; in this scenario, we used the regression model. As an app developer, there is no need to really dig into the mathematics, although there is no harm in doing so
Summary
The aim of this chapter was to introduce Salesforce developers to the world of AI and machine learning. In the next chapter, we will look at how AI can make CRM and Cloud applications smarter with simple declarative features. We will also see the various Cloud offerings from Salesforce and how Einstein adds AI across Cloud products, such as the Sales, Service, Community, and Marketing Clouds.