Chapter 1: Understanding Essential Artificial Intelligence Basics for RPA Developers
In this chapter, we will cover some key artificial intelligence (AI) concepts that are relevant to your daily work as a robotic process automation (RPA) developer. We will discover where an RPA developer can make the most impact when implementing cognitive automation in RPA use cases, without becoming a data scientist. We will also look at real business problems that AI solves today.
We will cover the following main topics:
- Understanding key AI concepts
- Understanding cognitive automation
- Exploring out-of-the-box (OOTB) machine learning (ML) models for RPA developers
By the end of the chapter, you will be equipped with common AI fundamentals, and you will be inspired by real-life examples to help you start thinking about how to apply AI to your potential use cases.
Understanding key AI concepts
You may have come across many terms when you started exploring the topic of AI. We will demystify AI and only present those concepts that are most relevant to you as an RPA developer. Please note that you may come across other material with slightly different definitions based on a different context.
Differentiating between artificial intelligence, machine learning, and deep learning
AI, ML, and deep learning (DL) are related but not the same. The following figure illustrates how the three nest within one another:

Figure 1.1 – AI, ML, and DL
- AI: This is equivalent to giving a machine or a robot the ability to think. It encompasses ML and DL.
- ML: This refers to how a machine or a robot learns to think through algorithms without explicit programming. ML is a subset of AI.
- DL: This refers to how an ML algorithm leverages artificial neural networks to mimic learning. DL is a subset of ML.
Next, we will look at three key considerations when choosing between ML and DL. They are listed here:
- Data requirement and availability
- Computational power
- Training time
The following figure shows a comparison of ML and DL:

Figure 1.2 – Comparison of ML and DL
In ML, the features of the studied subjects are fed into the algorithms for the machine to learn. We can think of features as hints that we give to the algorithm. Supplying these hints up front allows for a smaller dataset, lower computational power, and less training time.
In DL, features are determined by artificial neural networks. The network needs to work much harder to figure out the features and patterns to learn on its own. As a result, DL requires a large amount of data, high computational power, and a long training time.
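To make the idea of features as hints more concrete, here is a minimal sketch of hand-picked features for a customer email. The function name and the three features are hypothetical and chosen only for illustration; a real use case would pick features that suit its own data.

```python
def extract_features(email_text):
    """Turn raw email text into a few hand-picked numeric features (the 'hints')."""
    words = email_text.lower().split()
    return {
        # Total number of whitespace-separated words
        "word_count": len(words),
        # Number of exclamation marks anywhere in the text
        "exclamation_count": email_text.count("!"),
        # 1 if the exact word "refund" appears, otherwise 0
        "mentions_refund": int("refund" in words),
    }


features = extract_features("I want a refund now! This is unacceptable!")
print(features)
```

In classical ML, a person decides that these three values matter and computes them before training. In DL, the network would be given the raw text and left to discover useful patterns itself, which is why it needs more data and computing power.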
Although DL is valuable, developing DL models to solve business problems is beyond the reach of most businesses. Fortunately, many DL models have been pre-trained by companies with the time and budget to do so, and then made accessible to a large user base.
The implication is that your role as an RPA developer is not to create these models. You, as the RPA developer, are the trainer of these models. It is therefore important to understand the role of training in AI.
Appreciating the relevance of supervised learning, unsupervised learning, and reinforcement learning in AI
As we learned in the previous section, AI is about training a machine or a robot to think. Just like a human being, a robot needs to learn. There are three different types of learning for a robot.
The following figure gives some analogies for supervised learning, unsupervised learning, and reinforcement learning:

Figure 1.3 – Supervised learning, unsupervised learning, and reinforcement learning analogies
The following list explains the various analogies:
- Supervised learning: This is based on past data in which the trainer pairs each input with its known outcome, so that the model learns to predict future outcomes. This type of training is analogous to an instructor-led training course. It requires the trainer to supervise the student or the model to achieve the desired learning outcome. Classification and regression are types of supervised learning methods:
- Classification refers to the process of categorizing a given set of data into classes. For example, a set of pictures of different animals is fed into the ML model. Each picture is labeled with an animal name. The ML model is trained to identify animals from an image.
- Regression helps in the prediction of a continuous variable. A profit prediction ML model is an example of a regression model. Training data consisting of R&D, marketing, and administrative spending, geographic location, and profit is fed into the model. The ML model then predicts the profit.
- Unsupervised learning: This relies on an algorithm to identify unknown patterns from data. This type of training is analogous to a self-study course. It requires the students or the model to synthesize the information to achieve the desired learning outcome. Clustering is a type of unsupervised learning method:
- Clustering refers to the method used to find similarity and relationship patterns within a training dataset, and then group the data points into clusters based on the features they share. For example, the clustering technique is commonly used in market segmentation. The ML model looks at features such as sex, age, race, and geographic location to group customers into segments to better understand their buying habits.
- Reinforcement learning: This uses a reward-and-punishment system to learn. There is no training data or trainer. The algorithm improves over time based on feedback in the form of reward and punishment. This type of training is analogous to on-the-job training. If the worker is doing the job well, the worker gains a pay raise or promotion. If the worker is performing poorly, the worker receives no raise or promotion. This approach is commonly used when no data or specific expertise is available.
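The difference between supervised and unsupervised learning can be sketched with a few lines of plain Python. The following is a minimal illustration, not how an AI platform implements these methods: it assumes made-up customer data points of the form (age, monthly spend), uses a simple nearest-centroid rule for classification, and uses a basic k-means loop for clustering. All names and numbers are hypothetical.

```python
from math import dist


def centroid(points):
    """Return the mean position of a list of (x, y) points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def train_classifier(labeled_points):
    """Supervised: the trainer supplies the labels; we learn one centroid per label."""
    return {label: centroid(points) for label, points in labeled_points.items()}


def classify(point, model):
    """Predict the label whose centroid is closest to the new point."""
    return min(model, key=lambda label: dist(point, model[label]))


def k_means(points, k, iterations=10):
    """Unsupervised: no labels are given; the algorithm finds k groups itself."""
    centers = list(points[:k])  # assumption: start from the first k points
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: dist(p, centers[i]))
            clusters[idx].append(p)
        # Recompute each center; keep the old one if a cluster ended up empty
        centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters


# Supervised learning: every training example carries a label
labeled = {
    "budget": [(22, 30), (25, 35), (24, 28)],
    "premium": [(55, 90), (60, 95), (58, 85)],
}
model = train_classifier(labeled)
print(classify((50, 80), model))  # prints "premium"

# Unsupervised learning: the same kind of data, but without labels
unlabeled = [(22, 30), (55, 90), (25, 35), (60, 95), (24, 28), (58, 85)]
clusters = k_means(unlabeled, k=2)
print(clusters)
```

Notice that both approaches look at the same features. The only difference is whether a trainer has told the algorithm which group each example belongs to.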
Practical tips
AI platform providers have a mission to make AI accessible. Part of that mission is developing product features that hide the complexity of AI from the user. Specifically, these are some notable democratization efforts in AI:
- Increased availability of pre-trained models to accelerate the time to result
- Simplification of the technical complexity of the ML training life cycle
We have presented the key AI concepts in an easily digestible format. This overview prepares you to pick up an AI platform such as UiPath quickly, so that you can build, deploy, and maintain your first AI+RPA use cases. You no longer need to spend years mastering AI to build a model from scratch. Instead, you are the trainer of the robots, teaching them the different skills that they need to master. Most importantly, you have tools that do the most complex tasks for you.
Now that you have a good understanding of the key AI concepts, let's explore cognitive automation, which is the combination of AI and RPA.
Understanding cognitive automation
Cognitive automation, or intelligent process automation (IPA), refers to the use of AI and RPA together. It provides the machine or the robot with the brain (AI) and the limbs (RPA).
Although the general software development life cycle (SDLC) looks the same at a high level for RPA development and cognitive automation development, there are two important differences:
- The role of the RPA developer across the SDLC
- The final output of the RPA and cognitive automation life cycles
Let's now take a look at these differences in detail.
Understanding the expanded roles the RPA developer plays in the cognitive automation life cycle
An RPA developer plays expanded roles in the cognitive automation SDLC. A detailed comparison between a representative RPA SDLC and a representative cognitive automation SDLC is given in the following figure:

Figure 1.4 – Differences in RPA developer roles in the RPA and cognitive automation SDLCs
In the RPA SDLC, an RPA developer is like a traditional developer for any other software package. The typical sequence of the process is as follows:
- The business analyst collects the end-to-end business requirements of a business workflow detailing inputs, process steps, and output.
- The RPA developer codes the RPA workflow and tests the code.
- The business user conducts a user-acceptance test of the RPA robot.
- Finally, the RPA developer creates a package to deploy to the production environment.
- Post-production, the administrator manages the operations of the RPA bots.
- The RPA developer updates the code if the business user suggests enhancements or reports bugs.
The RPA developer plays a heavy role in selected steps of the RPA SDLC (build, deploy, and improve) by converting business requirements into RPA language.
In the cognitive automation SDLC, the RPA developer has a role in almost every step, which is described as follows:
- The RPA developer collects data-specific requirements to prepare for ML model training/re-training.
- The RPA developer does not usually build the ML model. Instead, the RPA developer either uses the ML model developed by the data scientist or uses an available OOTB model.
- The RPA developer prepares the datasets for training and evaluation to train/re-train the ML model according to the specific use cases.
- When the training result is acceptable, the RPA developer creates the ML package to deploy to the production environment.
- The ML skills are then available for the RPA developer to plug and play in any RPA workflow.
- Post-production, the administrator manages the operations of the RPA bots and the ML skills.
- The RPA developer continues to re-train the model with new data points to improve the model.
In cognitive automation, an RPA developer plays a broader role across the SDLC as a trainer and a data steward.
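As an illustration of the dataset preparation step described above, here is a minimal sketch of splitting a collection of records into a training set and an evaluation set. The function name, the 20% evaluation fraction, and the file names are assumptions made for this example; an AI platform may handle this step for you or expect a different split.

```python
import random


def split_dataset(records, eval_fraction=0.2, seed=42):
    """Shuffle the records and split them into (training, evaluation) lists."""
    shuffled = records[:]  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    eval_size = int(len(shuffled) * eval_fraction)
    return shuffled[eval_size:], shuffled[:eval_size]


# Hypothetical document names standing in for real training data
records = [f"invoice_{i}.pdf" for i in range(10)]
train, evaluation = split_dataset(records)
print(len(train), "training records,", len(evaluation), "evaluation records")
```

Keeping the evaluation records out of training is what lets you judge whether the training result is acceptable before the ML package is deployed.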
Understanding the final output of the cognitive automation life cycle and the RPA life cycle
Another important distinction between RPA and cognitive automation is related to the characteristics of the final output produced. The RPA life cycle produces RPA bots. The cognitive automation life cycle produces ML skills that are leveraged by the RPA bot. The following figure illustrates the differences in what stakeholders should expect of an RPA bot and an ML skill at initial deployment:

Figure 1.5 – Expectations of an RPA bot and an ML skill in the initial deployment
An RPA robot performs according to a set of rules set out by the RPA developer. The result is black and white. Only the correctly coded robot is deployed into production. The output of the cognitive automation life cycle is a trained ML skill combined with an RPA workflow. The ML skill is trained up to the acceptable threshold of confidence to be deployed into production. In almost all cases, the ML skill is not 100% correct when it is first deployed. The ML skill is expected to improve over time.
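The idea of an acceptable confidence threshold can be sketched as a simple routing rule inside an RPA workflow. This is a minimal illustration under stated assumptions: the function name, the 0.8 threshold, and the labels are hypothetical, and the right threshold depends on your use case and stakeholders.

```python
def route_prediction(label, confidence, threshold=0.8):
    """Decide whether the bot can act on an ML prediction or needs a human."""
    if confidence >= threshold:
        # The ML skill is confident enough: the bot continues on its own
        return ("automate", label)
    # Below the threshold: hand the item to a person to confirm or correct
    return ("human_review", label)


print(route_prediction("invoice", 0.93))
print(route_prediction("receipt", 0.55))
```

Items sent for human review are also a natural source of new data points for re-training, which is how the ML skill is expected to improve over time.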
Practical tips
Businesses have seen the power and reaped the benefits of automation through RPA. However, RPA has its limitations. RPA can only automate rule-based tasks, which limits the scope of the processes it can automate. In addition, rule-based tasks are usually lower-value work. To move up the value chain and maintain a competitive advantage, businesses need to combine RPA with AI. Here are some of the key takeaways to bring to your leadership:
- Technology companies have simplified AI technologies to make them accessible for consumption. AI is no longer a tool that only data scientists can leverage.
- The existing RPA team can start incorporating AI without a heavy investment in standing up a new team.
- There are impactful cognitive automation use cases throughout the organization.
- It is now time to give the machine or the robot a brain.
Now that you have a good understanding of cognitive automation, let's explore the most commonly used OOTB models that you can try as a beginner in AI.
Exploring relevant OOTB models for RPA developers
You have three options when it comes to ML models. You can use widely available OOTB models, re-training them with your data where needed. You can develop your own ML models from scratch. Lastly, you can collaborate with the data scientists in your company on custom-built ML models.
In this book, we will provide tips on how to engage with these options. We recommend that you begin with the OOTB models. We will give you an overview of the most commonly used OOTB models in this section.
The commonly used OOTB models
OOTB ML models apply to a wide variety of use cases. They are pre-trained with a large amount of data. Some OOTB models can be retrained with your specific dataset, while others are not retrainable. Most automation platforms now include OOTB models. Selecting the right OOTB models can save you time and accelerate your project. The following figure illustrates the different categories of the OOTB models:

Figure 1.6 – OOTB ML models by category
These OOTB ML models convert various forms of unstructured data into a usable format. Using these models reduces the hours that humans spend reading, processing, comprehending, and analyzing unstructured data. Unstructured data can come in the form of images, language, tabular text, and documents.
Let's take a closer look at each of these models:
- Image analysis: There are two image analysis OOTB models. The following figure summarizes the key characteristics of the two models:
Figure 1.7 – Image analysis OOTB models
These two OOTB image analysis models are useful for many use cases that involve analyzing an image to determine the next steps. For example, the image moderation model is often used in social media feed moderation. The OOTB image moderation model reviews millions of images and flags potentially problematic ones for humans to verify.
- Language translation: As the name suggests, language translation replaces the tedious work of translation from one language to another. The following figure summarizes the key characteristics of the model:
Figure 1.8 – Language translation OOTB models
This ML skill can be used in a variety of use cases and is commonly used in customer support. For example, many chatbots are powered by an OOTB language translation model to handle inquiries in different languages.
- Language comprehension: Language comprehension is complex. It refers to the ability to extract meaning from text, just like a human. The following figure summarizes the key characteristics of the three available models:
Figure 1.9 – Language comprehension OOTB models
Language comprehension ML models can mimic the thinking of a human and make inferences. They have widespread practical usage. For example, the semantic similarity OOTB model provides recommendations based on preferences indicated by the users. The question answering OOTB model is often used as a basis to build an automated frequently asked questions (FAQ) database. Finally, the text summarization OOTB model draws insights from books and articles.
- Language analysis: Language analysis refers to the skill of drawing meaning from text. It enables a machine or a robot to understand sentences and paragraphs. The following figure summarizes the key characteristics of the three kinds of models:
Figure 1.10 – Language analysis OOTB models
Language analysis ML models know how to draw context and relationships between individual words. They have widespread practical usage. For example, the sentiment analysis OOTB model is often used in managing emails from customers. The model prioritizes negative emails for humans to review. One popular usage of the text classification model is spam email classification. Finally, a named entity recognition model is often used to extract key parts from customer feedback.
- Tabular data: The Tree-based Pipeline Optimization Tool (TPOT) is a tool that searches for the best ML pipeline for your data. The following figure summarizes the key characteristics of the two available models:

Figure 1.11 – Tabular data OOTB models
This OOTB tool automates the most tedious part of pipeline building. It is also a good starting point for a beginner who wants to create a custom model.
- Documents: Processing documents is time-consuming and tedious. Many businesses spend many hours and considerable human resources digitizing analog documents and extracting structured information from them. The following figure summarizes the key characteristics of the three kinds of models:

Figure 1.12 – Documents OOTB models
There are many OOTB document models available to tackle document digitization. They are often pre-trained with a large dataset of the relevant document type. They can be used to accelerate cognitive automation involving documents.
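To show how the output of an OOTB model might drive the next step in a workflow, here is a minimal sketch of the email prioritization use case mentioned under language analysis. The scoring function is a deliberately naive keyword-based placeholder that stands in for a real sentiment analysis model; it is not how an OOTB model works, and all names and word lists are hypothetical.

```python
# Hypothetical word list used only by the placeholder scorer below
NEGATIVE_WORDS = {"angry", "broken", "refund", "terrible"}


def placeholder_sentiment_score(text):
    """Stand-in for an OOTB sentiment model: higher score means more negative."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    # Fraction of words that appear in the negative word list
    return sum(w in NEGATIVE_WORDS for w in words) / len(words)


def prioritize(emails):
    """Order emails so that the most negative ones are reviewed first."""
    return sorted(emails, key=placeholder_sentiment_score, reverse=True)


emails = [
    "Thanks for the quick delivery.",
    "The product arrived broken and I want a refund!",
    "Where can I find the manual?",
]
for email in prioritize(emails):
    print(email)
```

In a real workflow, you would replace the placeholder with a call to the ML skill and keep the surrounding logic, which is the part the RPA developer owns.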
Practical tips
As we learned in this section, there are many OOTB models readily available. They have been widely used and proven to be effective. They are also easy to try. Think of a simple use case that involves AI skills and try your hand at any of the OOTB models mentioned in this section. Practice makes the theory you read in this book come alive.
Summary
In this chapter, you learned about the key AI concepts to start your immersion into AI. In addition, you learned about the power of cognitive automation to extend automation benefits and your role in cognitive automation implementation. Finally, you are now aware of the commonly used OOTB models with which you can start hands-on exploration.
In the next chapter, we will dive into exploring the automation spectrum, the available technologies, and a framework to reimagine and solve a business problem with the relevant application of cognitive automation.
Further reading
- MIT OpenCourseWare – Artificial Intelligence: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-034-artificial-intelligence-fall-2010/index.htm
- McKinsey's An executive's guide to AI: https://www.mckinsey.com/~/media/McKinsey/Business%20Functions/McKinsey%20Analytics/Our%20Insights/An%20executives%20guide%20to%20AI/An-executives-guide-to-AI.ashx