
Generative Fill with Adobe Firefly (Part II)

Joseph Labrecque
24 Aug 2023
9 min read
Adobe Firefly Overview

Adobe Firefly is a new set of generative AI tools that can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly, have a look at the FAQ.

Image 1: Adobe Firefly

For more information about using Firefly to generate images, text effects, and more, have a look at the previous articles in this series:

- Animating Adobe Firefly Content with Adobe Animate
- Exploring Text to Image with Adobe Firefly
- Generating Text Effects with Adobe Firefly
- Adobe Firefly Feature Deep Dive
- Generative Fill with Adobe Firefly (Part I)

This is the conclusion of a two-part article. You can catch up by reading Generative Fill with Adobe Firefly (Part I). Here, we'll continue our exploration of Firefly's Generative fill module by looking at how to use the Insert and Remove features, and more.

Generative Fill – Part I Recap

In part I of our Firefly Generative fill exploration, we uploaded a photograph of a cat, Poe, to the AI and began working with the various tools to remove the background and replace it with prompt-based generative AI content.

Image 2: The original photograph of Poe

Note that the original photograph includes a set of electric outlets exposed within the wall. When we remove the background, Firefly recognizes that these objects are distinct from the general background and retains them.

Image 3: A set of backgrounds is generated for us to choose from

You can select any of the four generated variations from the set of preview thumbnails beneath the photograph. If you'd like to view these processes in detail, check out Generative Fill with Adobe Firefly (Part I).

Insert and Replace with Generative Fill

We covered generating a background for our image in part I of this article.
Now we will focus on other aspects of Firefly Generative fill, including the Remove and Insert tools.

Consider the image above and note that the original photograph included a set of electric outlets exposed within the wall. When we removed the background in part I, Firefly recognized that they were distinct from the general background and retained them. The AI took them into account when generating the new background, but we should remove them. This is where the Remove tool comes into play.

Image 4: The Remove tool

Switching to the Remove tool allows you to brush over an area of the photograph you'd like to remove. Firefly fills in the removed area with AI-generated pixels for a seamless result.

1. Select the Remove tool now. Note that when switching between the Insert and Remove tools, you will often encounter a save prompt, as seen below. If there are no changes to save, this prompt will not appear.

Image 5: When you switch tools, you may be asked to save your work

2. Click the Save button to continue, as choosing the Cancel button will halt the tool selection.

3. With the Remove tool selected, you can adjust the Brush Settings from the toolbar at the bottom of the screen.

Image 6: The Brush Settings overlay

4. Zoom in closer to the wall outlet and brush over the area by clicking and dragging with your mouse. Depending upon your brush settings, the size of your brush will appear as a circular outline. You can change the size of the brush by tapping the [ or ] keys on your keyboard.

Image 7: Brushing over the wall outlet with the Remove tool

5. Once you are happy with the selection you've made, click the Remove button within the toolbar at the bottom of the screen.

Image 8: The Remove button appears within the toolbar

6. The Firefly AI uses Generative fill to replace the brushed-over area with new content based upon the surrounding pixels. A set of four variations appears below the photograph. Click on each one to preview, as they can vary quite a bit.

Image 9: Selecting a fill variant

7. Click the Keep button in the toolbar to save your selection and continue editing. Remember, if you attempt to switch tools before saving, Firefly will prompt you to save your edits via a small overlay.

The outlet has now been removed and the wall is all patched up.

Aside from removing objects with Generative fill, we can also perform insertions based on text prompts. Let's add some additional elements to our photograph using these methods.

1. Select the Insert tool from the left-hand toolbar.

2. Use it in a similar way to the Remove tool to brush in a selection of the image. In this case, we'll add a crown to Poe's head, so brush in an area that contains the top of his head and some space above it. Try to visualize a crown shape as you do this.

3. In the prompt input that appears beneath the photograph, type a descriptive text prompt similar to the following: "regal crown with many jewels"

Image 10: A selection is made, and a text prompt inserted

4. Click the Generate button to have the Firefly AI perform a Generative fill insertion based upon our text prompt within the selected area.

Image 11: Poe is a regal cat

5. A crown is generated in accordance with our text prompt and the surrounding area, along with a set of four variations to choose from. Note how integrated they appear against the original photographic content.

6. Click the Keep button to commit and save your crown selection.

7. Let's add a scepter as well. Brush the general form of a scepter across Poe's body, extending from his paws to his shoulder.

8. Type in the text prompt: "royal scepter"

Image 12: Brushing in a scepter shape

9. Click the Generate button to have the Firefly AI perform a Generative fill insertion based upon our text prompt within the selected area.

Image 13: Poe now holds a regal scepter in addition to his crown

10. Remember to choose a scepter variant and click the Keep button to commit and save your scepter selection.

Okay! That should be enough regalia to satisfy Poe. Let's download our creation for distribution or use in other software.

Downloading your Image

Click the Download button in the upper right of the screen to begin the download process for your image.

Image 14: The Download button

As Firefly begins preparing the image for download, a small overlay dialog appears.

Image 15: Content credentials are applied to the image as it is downloaded

Firefly applies metadata to any generated image in the form of content credentials, and the image download process begins. Once the image is downloaded, it can be viewed and shared just like any other image file.

Image 16: The final image from our exploration of Generative fill

Along with content credentials, a small badge is placed in the lower right of the image, visually identifying it as having been produced with Adobe Firefly.

That concludes our set of articles on using Generative fill to remove and insert objects in your images with the Adobe Firefly AI. We have a number of additional articles on Firefly procedures on the way, including Generative recolor for vector artwork!

Author Bio

Joseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by Design.

Joseph is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.

Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers including LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC, a digital media production studio and distribution vehicle for a variety of creative works.

Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor's degree in communication from Worcester State University and a master's degree in digital media studies from the University of Denver.

Author of the book: Mastering Adobe Animate 2023


Generative Fill with Adobe Firefly (Part I)

Joseph Labrecque
24 Aug 2023
8 min read
Adobe Firefly AI Overview

Adobe Firefly is a new set of generative AI tools that can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly, have a look at the FAQ.

Image 1: Adobe Firefly

For more information about using Firefly to generate images, text effects, and more, have a look at the previous articles in this series:

- Animating Adobe Firefly Content with Adobe Animate
- Exploring Text to Image with Adobe Firefly
- Generating Text Effects with Adobe Firefly
- Adobe Firefly Feature Deep Dive

In the next two articles, we'll continue our exploration of Firefly with the Generative fill module. We'll begin with an overview of accessing Generative fill from a generated image and then explore how to use the module on our own personal images.

Recall from the previous article Exploring Text to Image with Adobe Firefly that when you hover your mouse cursor over a generated image, overlay controls will appear.

Image 2: Generative fill overlay control from Text to image

One of the controls in the upper right of the image frame will invoke the Generative fill module and pass the generated image into that view.

Image 3: The generated image is sent to the Generative fill module

Within the Generative fill module, you can use any of the tools and workflows that are available when invoking Generative fill from the Firefly website. The only difference is that you are passing in a generated image rather than uploading an image from your local hard drive. Keep this in mind as we continue to explore the basics of Generative fill in Firefly, as we'll begin the process from scratch.

Generative Fill

When you first enter the Firefly web experience, you will be presented with the various workflows available. These appear as UI cards, each presenting a sample image, the name of the procedure, a description, and either a button to begin the process or a label stating that it is "in exploration".
Those which are in exploration are not yet available to general users. We want to locate the Generative fill module and click the Generate button to enter the experience.

Image 4: The Generative fill module card

From there, you'll be taken to a view that prompts you to upload an image into the module. Firefly also presents a set of sample images you can load into the experience.

Image 5: The Generative fill getting started prompt

Clicking the Upload image button summons a file browser for you to locate the file you want to use Generative fill on. In my example, I'll be using a photograph of my cat, Poe. You can download the photograph of Poe [[ NOTE – LINK TO FILE Poe.jpg ]] to work with as well.

Image 6: The photograph of Poe, a cat

Once the image file has been uploaded into Firefly, you will be taken to the Generative fill user experience and the photograph will be visible. Note that this is exactly the same experience as when entering Generative fill from a prompt-generated image, as we saw above. The only real difference is how we get to this point.

Image 7: The photograph is loaded into Generative fill

You will note that there are two sets of tools available within the experience. One set runs along the left side of the screen and includes the Insert, Remove, and Pan tools.

Image 8: Insert, Remove, and Pan

Switching between the Insert and Remove tools changes the function of the current process. The Pan tool allows you to pan the image around the view. Along the bottom of the screen is the second set of tools, which are focused on selections. This set contains the Add and Subtract tools, access to Brush Settings, a Background removal process, and a selection Invert toggle.

Image 9: Add, Subtract, Brush Settings, Background removal, and selection Invert

Let's perform some Generative fill work on the photograph of Poe.

1. In the larger overlay along the bottom of the view, locate and click the Background option. This is an automated process that will detect and remove the background from the image loaded into Firefly.

Image 10: The background is removed from the selected photograph

2. A prompt input appears directly beneath the photograph. Type in the following prompt: "a quiet jungle at night with lots of mist and moonlight"

Image 11: Entering a prompt into the prompt input control

3. If desired, you can view and adjust the settings for the generative AI by clicking the Settings icon in the prompt input control. This summons the Settings overlay.

Image 12: The generative AI Settings overlay

Within the Settings overlay, you will find three items that can be adjusted to influence the AI:

- Match shape: You have two choices here, freeform or conform.
- Preserve content: A slider that can be set to include more of the original content or produce new content.
- Guidance strength: A slider that can be set to give more weight to the original image or to the given prompt.

I suggest leaving these at the default settings for now.

4. Click the Settings icon again to dismiss the overlay.

5. Click the Generate button to generate a background based upon the entered prompt.

A new background is generated from our prompt, and it now appears as though Poe is visiting a lush jungle at night.

Image 13: Poe enjoying the jungle at night

Note that the original photograph included a set of electric outlets exposed within the wall. When we removed the background, Firefly recognized that they were distinct from the general background and retained them. The AI took them into account when generating the new background and has, interestingly, propped them up with a couple of sticks. It also rendered a realistic shadow cast by Poe.

Before moving on:

1. Click the Cancel button to bring the transparent background back. Clicking the Keep button would commit the changes, and we do not want that, as we wish to continue exploring other options.

2. Clear out the prompt you previously wrote within the prompt input control so that there is no longer any prompt present.

Image 14: Click the Generate button with no prompt present

3. Click the Generate button without a text prompt in place.

The photograph receives a different background from the one generated with a text prompt. When clicking the Generate button with no text prompt, you are basically allowing the Firefly AI to make all the decisions based solely on the visual properties of the image.

Image 15: A set of backgrounds is generated based on the remaining pixels present

You can select any of the four generated variations from the set of preview thumbnails beneath the photograph. If you'd like Firefly to generate more variations, click the More button. Select the one you like best and click the Keep button.

Okay! That's pretty good, but we are not done with Generative fill yet. We haven't even touched the Insert and Remove functions, and there are Brush Settings to manipulate, and much more. In the next article, we'll explore the remaining Generative fill tools and options to further manipulate the photograph of Poe.


Adobe Firefly Feature Deep Dive

Joseph Labrecque
23 Aug 2023
9 min read
Adobe Firefly

Adobe Firefly is a new set of generative AI tools that can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly, have a look at the FAQ.

Image 1: Adobe Firefly

For more information about Firefly, have a look at the previous articles in this series:

- Animating Adobe Firefly Content with Adobe Animate
- Exploring Text to Image with Adobe Firefly
- Generating Text Effects with Adobe Firefly

In this article, we'll be exploring some of the more detailed features of Firefly in general. While we will be doing so from the perspective of the text-to-image module, much of what we cover will be applicable to other modules and procedures as well.

Before moving on to the visual controls and options, let's consider accessibility. Here is what Adobe has to say about accessibility within Firefly:

"Firefly is committed to providing accessible and inclusive features to all individuals, including users working with assistive devices such as speech recognition software and screen readers. Firefly is continuously enhanced to strive to meet the needs of all types of users, including individuals with visual, hearing, cognitive, motor, or other impairments, and is designed to conform to worldwide accessibility standards." -- Adobe

You can use the following keyboard shortcuts across the Firefly interface to navigate and control the software in a non-visual way:

- Tab: navigates between user interface controls.
- Space/Enter: activates buttons.
- Enter: activates links.
- Arrow Keys: navigates between options.
- Space: selects options.

As with most accessibility practices, these additional controls within Firefly can also benefit users who are not otherwise impaired, much as sighted users make use of captions when watching video content.

For our exploration of the various additional controls and options within Firefly, we'll start off with a generated set of images based on a prompt. To review how to achieve this, have a look at the article "Exploring Text to Image with Adobe Firefly". Choose one of the generated images to work with and hover your mouse across the image to reveal a set of controls.

Image 2: Image Overlay Options

We will explore each of these options one by one as we continue along with this article.

Rating and Feedback Options

Adobe is very open to feedback with Firefly. One reason is to gather general user feedback to improve the experience of using the product; the other is to influence the generative models so that users receive the output they expect.

Giving a simple thumbs-up or thumbs-down is the most basic level of feedback and is meant to rate the results of your prompt.

Image 3: Rating the generated results

Once you provide a thumbs-up or thumbs-down, the overlay changes to request additional feedback. You don't necessarily need to provide more, but clicking the Feedback button allows you to explain in more depth why you provided the initial rating.

Image 4: Additional feedback prompt

Clicking the Feedback button will summon a much larger overlay where you can indicate via checkboxes why you rated the results the way you did. You also have the option to add a short note as well.

Image 5: Additional feedback form

Clicking the Submit Feedback button or the Cancel button will close the overlay and bring you back to the experience.

Additionally, there is an option to Report the image to Adobe.
This is always a negative action, meaning that you find the results offensive or inappropriate in some way.

Image 6: Report prompt

Clicking on the Report option will summon a form similar to the additional feedback form, but the options will, of course, be different.

Image 7: Report feedback form

Here, you can report via checkboxes and add an optional note as part of the report. Adobe has committed to making sure that violence and things like copyrighted or trademarked characters are not generated by Firefly. For instance, if you use a prompt such as "Mickey Mouse murdering a construction worker with a chainsaw", you will receive a message like the following:

Image 8: Firefly will not render trademarked characters or violence

With Adobe being so careful in filtering certain words right now, I hope that in the future users will be able to selectively choose exclusions in place of the general list of censored terms that exists now. While the prompt above is meant to be absurd, there are legitimate artistic reasons for many of the word categories that are currently banned.

General Image Controls

The controls in this section include some of the most used in Firefly at the moment, including the ability to download your generated image.

Image 9: Image options

We have the following controls exposed; from left to right they are named:

- Options
- Download
- Favorite

Options

Starting at the left-hand side of this group of controls, we begin with an ellipsis representing Options which, when clicked, will summon a small overlay with additional choices.

Image 10: Expanded options

The menu that appears includes the following items:

1. Submit to Firefly gallery
2. Use as a reference image
3. Copy to clipboard

Let's examine each of these in detail.

You may have noticed that the main navigation of the Firefly website includes a number of options: Home, Gallery, Favorites, About, and FAQ.
The Gallery section contains generated images that users have submitted to be featured on this page. Clicking the Submit to Firefly gallery option will summon a submission overlay through which you can request that your image be included in the Gallery.

Image 11: Firefly Gallery submission

Simply read over the details and click Continue or Cancel to return.

The second item, Use as reference image, brings up a small overlay that includes the selected image to use as a reference, along with a strength slider.

Image 12: Reference image slider

Moving the slider to the left will favor the reference image, and moving it to the right will favor the raw prompt instead. You must click the Generate button after adjusting the slider to see its effect.

The final option is Copy to clipboard, which does exactly as you'd expect. Note that Content Credentials are applied in this case just the same as they are when downloading an image. You can read more about this feature in the Firefly FAQ.

Download

Back up in the set of three controls, the middle option allows you to initiate a Download of the selected image. As Firefly begins preparing the image for download, a small overlay dialog appears.

Image 13: Download applies content credentials, similar to the Copy to clipboard option

Firefly applies metadata to any generated image in the form of content credentials, and the image download process begins. We've covered exactly what this means in previous articles. The image is then downloaded to your local file system.

Favorite

Clicking the Favorite control will add the generated image to your Firefly Favorites so that you can return to the generated set of images for further manipulation or to download later on.

Image 14: Adding a favorite

The Favorite control works as a toggle.
Once you declare a favorite, the heart icon will appear filled and the control will allow you to un-favorite the selected image instead.

That covers the main set of controls overlaying the right of your image, but there is a smaller set of controls on the left that we must explore as well.

Additional Manipulation Options

This alternative set numbers only two controls, but they are both very powerful. To the left is the Show similar control and to the right is Generative fill.

Image 15: Show similar and Generative fill controls

Clicking the Show similar control will retain the chosen image while regenerating the other three to conform more closely to it.

Image 16: Show similar will refresh the other three images

As you can see when comparing the sets of images in the figures above and below, you can exert great influence over your set of generated images through this control.

Image 17: The original image stays the same

The final control we will examine in this article is Generative fill, located right next to the Show similar control. The Generative fill view presents us with a separate view and a number of all-new tools for making selections in order to add or remove content from our images.

Image 18: Generative fill brings you to a different view altogether

Generative fill is actually its own proper procedure in Adobe Firefly, and we'll explore how to use this feature in full in the next article!


Generative Recolor with Adobe Firefly

Joseph Labrecque
23 Aug 2023
10 min read
Adobe Firefly Overview

Adobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ:

Image 1: Adobe Firefly

For more information about the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series:

- Animating Adobe Firefly Content with Adobe Animate
- Exploring Text to Image with Adobe Firefly
- Generating Text Effects with Adobe Firefly
- Adobe Firefly Feature Deep Dive
- Generative Fill with Adobe Firefly (Part I)
- Generative Fill with Adobe Firefly (Part II)

This current Firefly article will focus on a unique use of AI prompts via the Generative recolor module.

Generative Recolor and SVG

While most procedures in Firefly are focused on generating imagery through text prompts, the service also includes modules that use prompt-driven AI a bit differently. The subject of this article, Generative recolor, is a perfect example of this.

Generative recolor works with vector artwork in the form of SVG files. If you are unfamiliar with SVG, it stands for Scalable Vector Graphics and is an XML-based format, so it uses text-based nodes similar to HTML:

Image 2: An SVG file is composed of vector information defining points, paths, and colors

As the name indicates, we are dealing with vector graphics here and not photographic pixel-based bitmap images. Vectors are often used for artwork, logos, and such – as they can be infinitely scaled and easily recolored.

One of the best ways of generating SVG files is by designing them in a vector-based design tool like Adobe Illustrator. Once you have finished designing your artwork, you'll save it as SVG for use in Firefly:

Image 3: Cat artwork designed in Adobe Illustrator

To convert your Illustrator artwork to SVG, perform the following steps:

1. Choose File > Save As to open the Save As dialog.
2. Choose SVG (svg) for the file format:

Image 3: Selecting SVG (svg) as the file format

3. Browse to the location on your computer you would like to save the file to.
4. Click the Save button.

You now have an SVG file ready to recolor within Firefly. If you desire, you can download the provided cat.svg file that we will work on in this article.

Recolor Vector Artwork with Generative Recolor

Generative recolor, like all Firefly modules, can be found directly at https://firefly.adobe.com/ so long as you are logged in with your Adobe ID. From the main Firefly page, you will find a number of modules for different AI-driven tasks:

Image 4: Locate the Generative recolor Firefly module

Let's explore Generative recolor in Firefly:

1. Locate the module named Generative recolor.
2. Click the Generate button to get started.

You are taken to an intermediate view where you are able to upload your chosen SVG file for the purposes of vector recolor based upon a descriptive text prompt:

Image 5: The Upload SVG button prompt appears, along with sample files

3. Click the Upload SVG button and choose cat.svg from your file system. Of course, you can use any SVG file you want if you have another in mind. If you do not have an SVG file you'd like to use, you can click on any of the samples presented below the Upload SVG button to load one up into the module.

The SVG is uploaded and a control appears which displays a preview of your file along with a text input where you can write a short text prompt describing the color palette you'd like to generate:

Image 6: The Generative recolor input requests a text prompt

4. Think of some descriptive words for an interesting color palette and type them into the text input. I'll input the following simple prompt for this demonstration: "northern lights".
5. Click the Generate button when ready.

You are taken into the primary Generative recolor user interface experience and a set of four color variants is immediately available for preview:

Image 7: The Firefly Generative recolor user interface

The interface appears similar to what you might have seen in other Firefly modules – but there are some key differences here, since we are dealing with recoloring vector artwork.

The larger, left-most area contains a set of four recolor variants to choose from. Below this is the prompt input area which displays the current text prompt and a Refresh button that allows the generation of additional variants when the prompt is updated. To the right of this area, various additional options are presented within a clean user interface that scrolls vertically. Let's explore these from top to bottom.

The first thing you'll see is a thumbnail of your original artwork with the ability to replace the present SVG with a new file:

Image 8: You can replace your artwork by uploading a new SVG file

Directly below this, you will find a set of sample prompts that can be applied to your artwork:

Image 9: Sample prompts can provide immediate results

Clicking upon any of these thumbnails will immediately overwrite the existing prompt and cause a refresh – generating a new set of four recolor variants.

Next is a dropdown selection which allows the choice of color harmony:

Image 10: A number of color harmonies are available

Choosing to align the recolor prompt with a color harmony will impact which colors are used, based on a combination of the raw prompt guided by harmonization rules.
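To make the harmony idea concrete: a complementary pairing simply takes hues that sit 180° apart on the hue wheel. The sketch below (illustrative only, using Python's standard colorsys module) derives the complement of any hex color:

```python
import colorsys

def complement(hex_color: str) -> str:
    """Return the complementary color: same saturation and value, hue rotated 180 degrees."""
    # Parse "#rrggbb" into three floats in the 0.0-1.0 range.
    r, g, b = (int(hex_color.lstrip("#")[i:i + 2], 16) / 255 for i in (0, 2, 4))
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    r2, g2, b2 = colorsys.hsv_to_rgb((h + 0.5) % 1.0, s, v)
    return "#{:02x}{:02x}{:02x}".format(round(r2 * 255), round(g2 * 255), round(b2 * 255))

print(complement("#00ff00"))  # pure green -> #ff00ff (magenta)
```

Other harmonies follow the same pattern with different hue offsets (for example, thirds of the wheel for a triadic scheme); when you select a harmony here, Firefly presumably nudges the prompt's raw colors into a relationship of this kind.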
An indicator will be added along with the text prompt. For more information about color and color harmonies, check out Understanding color: A visual guide – from Adobe.

Below is a set of eighteen color swatches to choose from:

Image 11: Color chips can add bias to your text prompt

Clicking on any of these swatches will add that color to the options below your text prompt to help guide the recolor process. You can select one or many of these swatches to use.

Finally, at the very bottom of this area is a toggle switch that allows you to either preserve black and white colors in your artwork or to recolor them just like any other color:

Image 12: You can choose to preserve black and white during a recolor session or not

That is everything along the right-hand side of the interface. We'll return to this area shortly – but for now… let's see the options that appear when hovering the mouse cursor over any of the four recolor variants:

Image 13: The Generative recolor overlay

Hovering over a recolor variant will reveal a number of options:

- Prominent colors: Displays the colors used in this recolor variant.
- Shuffle colors: Will use the same colors… but distribute them differently across the vector artwork.
- Options: Copy to clipboard is the only option available via this menu.
- Download: Enables the download of this particular recolor variant.
- Rate this result: Provide a positive or negative rating of this result.

We'll make use of the Download option in a bit – but first… let's make use of some of the choices present in the right side panel to modify and guide our recolor.

Modifying the Prompt

You can always change the text prompt however you wish and click the Refresh button to generate a different set of variants.
Let's instead keep this same text prompt but see how various choices can influence the recolor results:

Image 14: A modified prompt box with options added

Focus again on the right side of the user interface and make the following selections:

1. Select a color harmony: Complementary
2. Choose a couple of colors to weight the prompt: Green and Blue violet
3. Disable the Preserve black and white toggle
4. Click the Refresh button to see the results of these options

A new set of four recolor variants is produced. This set of variants is guided by the extra choices we made and is vastly different from the original set, which was recolored solely based upon the text prompt:

Image 15: A new set of recolor variations is generated

Play with the various options on your own to see what kind of variations you can achieve in the artwork.

Downloading your Recolored Artwork

Once you are happy with one of the generated recolor variants, you'll want to download it for use elsewhere. Click the Download button in the upper right of the selected variant to begin the download process for your recolored SVG file.

The recolored SVG file is immediately downloaded to your computer. Note that unlike other content generated with Firefly, files created with Generative recolor do not contain a Firefly watermark or badge:

Image 17: The resulting recolored SVG file

That's all there is to it! You can continue creating more recolor variants and freely download any that you find particularly interesting.

Before we conclude… note that another good use for Generative recolor – similar to most applications of AI – is for ideation.
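On the subject of the downloaded file: because it is plain SVG (XML), the palette Firefly chose is easy to inspect or reuse programmatically. The sketch below (illustrative only, standard library; the inline markup stands in for a downloaded file such as the recolored cat.svg) lists each distinct fill color:

```python
import xml.etree.ElementTree as ET

# Stand-in for a recolored SVG downloaded from Firefly.
svg = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10">
  <path d="M0 0h10v10H0z" fill="#1b3b5f"/>
  <circle cx="5" cy="5" r="3" fill="#7fffd4"/>
  <rect x="4" y="4" width="2" height="2" fill="#1b3b5f"/>
</svg>"""

root = ET.fromstring(svg)
# Walk every element and collect distinct fill values in first-seen order.
palette = list(dict.fromkeys(el.get("fill") for el in root.iter() if el.get("fill")))
print(palette)  # ['#1b3b5f', '#7fffd4']
```

The same approach works in reverse: rewriting the fill attributes and re-serializing the tree gives a quick, scriptable recolor outside of Firefly.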
If you are stuck with a creative block when trying to decide on a color palette for something you are designing… Firefly can help kick-start that process for you.

Author Bio

Joseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by Design.

Joseph Labrecque is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.

Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.

Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor's degree in communication from Worcester State University and a master's degree in digital media studies from the University of Denver.

Author of the book: Mastering Adobe Animate 2023
Joseph Labrecque
23 Aug 2023
12 min read
Adobe Firefly Integrations in Illustrator and Photoshop

Adobe Firefly Overview

Adobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ:

Image 1: Adobe Firefly

For more information about the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series:

- Animating Adobe Firefly Content with Adobe Animate
- Exploring Text to Image with Adobe Firefly
- Generating Text Effects with Adobe Firefly
- Adobe Firefly Feature Deep Dive
- Generative Fill with Adobe Firefly (Part I)
- Generative Fill with Adobe Firefly (Part II)
- Generative Recolor with Adobe Firefly
- Adobe Firefly and Express (beta) Integration

This current Firefly article will focus on Firefly integrations within the release version of Adobe Illustrator and the public beta version of Adobe Photoshop.

Firefly in Adobe Illustrator

Version 27.7 is the most current release of Illustrator at the writing of this article, and this version contains Firefly integrations in the form of Generative Recolor (Beta).

To access this, design any vector artwork within Illustrator or open existing artwork to get started. I'm using the cat.ai file that was used to generate the cat.svg file used in the Generative Recolor with Adobe Firefly article:

Image 2: The cat vector artwork with original colors

1. Select the artwork you would like to recolor. Artwork must be selected for this to work.
2. Look to the Properties panel and locate the Quick Actions at the bottom of the panel. Click the Recolor quick action:

Image 3: Choosing the Recolor Quick action

3. By default, the Recolor overlay will open with the Recolor tab active. Switch to the Generative Recolor (Beta) tab to activate it instead:

Image 4: The Generative Recolor (Beta) view

4. You are invited to enter a prompt. I've written "northern lights green and vivid neon" as my prompt that describes colors I'd like to see. There are also sample prompts you can click on below the prompt input box.
5. Click the Generate button once a prompt has been entered:

Image 5: Selecting a Recolor variant

A set of recolor variants is presented within the overlay. Clicking on any of these will recolor your existing artwork according to the variant look:

Image 6: Adding a specific color swatch

If you would like to provide even more guidance, you can modify the prompt and even add specific color swatches you'd like to see included in the recolored artwork.

That's it for Illustrator – very straightforward and easy to use!

Firefly in Adobe Photoshop (beta)

Generative Fill through Firefly is also making its way into Photoshop. While in Illustrator Firefly is part of the current version of the software (albeit with a beta label on the feature), with Photoshop things are a bit different:

Image 7: Generative Fill is only available in the Photoshop public beta

To make use of Firefly within Photoshop, the current release version will not cut it. You will need to install the public beta from the Creative Cloud Desktop application in order to access these features.

With that in mind, let's use Generative Fill in the Photoshop public beta to expand a photograph beyond its bounds and add in additional objects.

1. First, open a photograph in the Photoshop public beta. I'm using the Poe.jpg photograph that we previously used in the articles Generative Fill with Adobe Firefly (Parts I & II):

Image 8: The original photograph in Photoshop

2. With the photograph open, we'll add some extra space to the canvas to generate additional content and expand the image beyond its bounds. Summon the Canvas Size dialog by choosing Image > Canvas Size… from the application menu.
3. Change both the width and height values to 200 Percent:

Image 9: Expanding the size of the canvas

4. Click the OK button to close the dialog and apply the change.

The original canvas is expanded to 200 percent of its original size while the image itself remains exactly the same:

Image 10: The photograph with an expanded canvas

Generative Fill, when used in this manner to expand an image, works best by selecting portions to expand bit by bit rather than all the expanded areas at once. It is also beneficial to select parts of the original image you want to expand from. This feeds and directs the Firefly AI.

5. Using the Rectangular Marquee tool, make such a selection across either the top, bottom, left, or right portions of the document:

Image 11: Making a selection for Generative Fill

6. With a selection established, click Generative Fill within the contextual toolbar:

Image 12: Leaving the prompt blank allows Photoshop to make all the decisions

7. The contextual toolbar will now display a text input where you can enter a prompt to guide the process. However, in this case, we want to simply expand the image based upon the original pixels selected – so we will leave this blank with no prompt whatsoever. Click Generate to continue.
8. The AI processes the image and displays a set of variants to choose from within the Properties panel. Click the one that conforms closest to the imagery you are looking to produce and that is what will be used upon the canvas:

Image 13: Choosing a Generative Fill variant

Note that if you look to the Layers panel, you will find a new layer type has been created and added to the document layer stack:

Image 14: Generative Layers are a new layer type in Photoshop

The Generative Layer retains both the given prompt and variants so that you can continue to make changes and adjustments as needed – even following this specific point in time.

The resulting expansion of the original image as performed by Generative Fill can be very convincing!
As mentioned before, this often works best by performing fills in a piece-by-piece patchwork manner:

Image 15: The photograph with a variant applied across the selection

Continue selecting portions of the image using the Rectangular Marquee tool (or any selection tool, really) and generate new content the same way we have done so already – without supplying any text prompt to the AI:

Image 16: The photograph with all expanded areas filled via generative AI

Eventually, you will complete the expansion of the original image and produce a very convincing deception.

Of course, you can also guide the AI with actual text prompts. Let's add an object to the image as a demonstration.

1. Using the Lasso tool (or again… any selection tool), make a selection across the image in the shape a standing lamp of some sort might occupy:

Image 17: Making an additional selection

2. With a selection established, click Generative Fill within the contextual toolbar.
3. Type in a prompt that describes the object you want to generate. I will use the prompt "tall rustic wooden and metal lamp".
4. Click the Generate button to process the Generative Fill request:

Image 18: A lamp is generated from our selection and text prompt

A set of generated lamp variants is presented within the Properties panel. Choose the one you like the most and it will be applied within the image.

You will want to be careful with how many Generative Layers are produced as you work on any single document. Keep an eye on the Layers panel as you work:

Image 19: Each Generative Fill process produces a new layer

Each time you use Generative Fill within Photoshop, a new Generative Layer is produced. Depending upon the resources and capabilities of your computer… this might become burdensome as everything becomes more and more complex. You can always flatten your layers to a single pixel layer if this occurs to free up additional resources.

That concludes our overview of Generative Fill in the Photoshop public beta!

Ethical Concerns with Generative AI

I want to make one additional note before concluding this series, and that has to do with the ethics of generative AI. This concern goes beyond Adobe Firefly specifically – as it could be argued that Firefly is the least problematic and most ethical implementation of generative AI that is available today.

See https://firefly.adobe.com/faq for additional details on steps Adobe has taken to ensure responsible AI through their use of Adobe Stock content to train their models, through the use of Content Credentials, and more:

"Like all our AI capabilities, Firefly is developed and deployed around our AI ethics principles of accountability, responsibility, and transparency.

Data collection: We train our model by collecting diverse image datasets, which have been curated and preprocessed to mitigate against harmful or biased content. We also recognize and respect artists' ownership and intellectual property rights. This helps us build datasets that are diverse, ethical, and respectful toward our customers and our community.

Addressing bias and testing for safety and harm: It's important to us to create a model that respects our customers and aligns with our company values. In addition to training on inclusive datasets, we continually test our model to mitigate against perpetuating harmful stereotypes. We use a range of techniques, including ongoing automated testing and human evaluation.

Regular updates and improvements: This is an ongoing process. We will regularly update Firefly to improve its performance and mitigate harmful bias in its output. We also provide feedback mechanisms for our users to report potentially biased outputs or provide suggestions into our testing and development processes. We are committed to working together with our customers to continue to make our model better."

-- Adobe

I have had discussions with a number of fellow educators about the ethical use of generative AI and Firefly in general. Here are some paraphrased takeaways to consider as we conclude this article series:

- "We must train the new generations in the respect and proper use of images or all kinds of creative work."
- "I don't think AI can capture that sensitive world that we carry as human beings."
- "As dire as some aspects of all of this are, I see opportunities."
- "Thousands of working artists had their life's work unknowingly used to create these images."
- "Professionals will be challenged, truly, by all of this, but somewhere in that process I believe we will find our space."
- "AI data expropriations are a form of digital colonialism."
- "For many students, the notion of developing genuine skill seems pointless now."
- "Even for masters of the craft, it's dispiriting to see someone type 10 words and get something akin to what took them 10 years."

I've been using generative AI for a few years now and can appreciate and understand the concerns expressed above - but also recognize that this technology is not going away. We must do what we can to address the ethical concerns brought up here and make sure to use our awareness of these problematic issues to further guide the direction of these technologies as we rapidly advance forward. These are very challenging times, right now.

Author Bio

Joseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by Design.

Joseph Labrecque is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions.
He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.

Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.

Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor's degree in communication from Worcester State University and a master's degree in digital media studies from the University of Denver.

Author of the book: Mastering Adobe Animate 2023
Rohan Chikorde
23 Aug 2023
11 min read
Getting Started with AWS CodeWhisperer

Introduction

Efficiently writing secure, high-quality code within tight deadlines remains a constant challenge in today's fast-paced software development landscape. Developers often face repetitive tasks, code snippet searches, and the need to adhere to best practices across various programming languages and frameworks. However, AWS CodeWhisperer, an innovative AI-powered coding companion, aims to transform the way developers work. In this blog, we will explore the extensive features, benefits, and setup process of AWS CodeWhisperer, providing detailed insights and examples for technical professionals.

At its core, CodeWhisperer leverages machine learning and natural language processing to deliver real-time code suggestions and streamline the development workflow. Seamlessly integrated with popular IDEs such as Visual Studio Code, IntelliJ IDEA, and AWS Cloud9, CodeWhisperer enables developers to remain focused and productive within their preferred coding environment. By eliminating the need to switch between tools and external resources, CodeWhisperer accelerates coding tasks and enhances overall productivity.

A standout feature of CodeWhisperer is its ability to generate code from natural language comments. Developers can now write plain English comments describing a specific task, and CodeWhisperer automatically analyses the comment, identifies relevant cloud services and libraries, and generates code snippets directly within the IDE. This not only saves time but also allows developers to concentrate on solving business problems rather than getting entangled in mundane coding tasks.

In addition to code generation, CodeWhisperer offers advanced features such as real-time code completion, intelligent refactoring suggestions, and error detection. By analyzing code patterns, industry best practices, and a vast code repository, CodeWhisperer provides contextually relevant and intelligent suggestions.
Its versatility extends to multiple programming languages, including Python, Java, JavaScript, TypeScript, C#, Go, Rust, PHP, Ruby, Kotlin, C, C++, Shell scripting, SQL, and Scala, making it a valuable tool for developers across various language stacks.

AWS CodeWhisperer addresses the need for developer productivity tools by streamlining the coding process and enhancing efficiency. With its AI-driven capabilities, CodeWhisperer empowers developers to write clean, efficient, and high-quality code. By supporting a wide range of programming languages and integrating with popular IDEs, CodeWhisperer caters to diverse development scenarios and enables developers to unlock their full potential. Embrace the power of AWS CodeWhisperer and experience a new level of productivity and coding efficiency in your development journey.

Key Features and Benefits of CodeWhisperer

A. Real-time code suggestions and completion

CodeWhisperer provides developers with real-time code suggestions and completion, significantly enhancing their coding experience. As developers write code, CodeWhisperer's AI-powered engine analyzes the context and provides intelligent suggestions for function names, variable declarations, method invocations, and more. This feature helps developers write code faster, with fewer errors, and improves overall code quality. By eliminating the need to constantly refer to documentation or search for code examples, CodeWhisperer streamlines the coding process and boosts productivity.

B. Intelligent code generation from natural language comments

One of the standout features of CodeWhisperer is its ability to generate code snippets from natural language comments. Developers can simply write plain English comments describing a specific task, and CodeWhisperer automatically understands the intent and generates the corresponding code.
This powerful capability saves developers time and effort, as they can focus on articulating their requirements in natural language rather than diving into the details of code implementation. With CodeWhisperer, developers can easily translate their high-level concepts into working code, making the development process more intuitive and efficient.

C. Streamlining routine or time-consuming tasks

CodeWhisperer excels at automating routine or time-consuming tasks that developers often encounter during the development process. From file manipulation and data processing to API integrations and unit test creation, CodeWhisperer provides ready-to-use code snippets that accelerate these tasks. By leveraging CodeWhisperer's automated code generation capabilities, developers can focus on higher-level problem-solving and innovation, rather than getting caught up in repetitive coding tasks. This streamlining of routine tasks allows developers to work more efficiently and deliver results faster.

D. Leveraging AWS APIs and best practices

As an AWS service, CodeWhisperer is specifically designed to assist developers in leveraging the power of AWS services and best practices. It provides code recommendations tailored to AWS application programming interfaces (APIs), allowing developers to efficiently interact with services such as Amazon EC2, Lambda, and Amazon S3. CodeWhisperer ensures that developers follow AWS best practices by providing code snippets that adhere to security measures, performance optimizations, and scalability considerations. By integrating AWS expertise directly into the coding process, CodeWhisperer empowers developers to build robust and reliable applications on the AWS platform.

E. Enhanced security scanning and vulnerability detection

Security is a top priority in software development, and CodeWhisperer offers enhanced security scanning and vulnerability detection capabilities.
It automatically scans both generated and developer-written code to identify potential security vulnerabilities. By leveraging industry-standard security guidelines and knowledge, CodeWhisperer helps developers identify and remediate security issues early in the development process. This proactive approach to security ensures that code is written with security in mind, reducing the risk of vulnerabilities and strengthening the overall security posture of applications.

F. Responsible AI practices to address bias and open-source usage

AWS CodeWhisperer is committed to responsible AI practices and addresses potential bias and open-source usage concerns. The AI models behind CodeWhisperer are trained on vast amounts of publicly available code, ensuring accuracy and relevance in code recommendations. However, CodeWhisperer goes beyond accuracy and actively filters out biased or unfair code recommendations, promoting inclusive coding practices. Additionally, it provides reference tracking to identify code recommendations that resemble specific open source training data, allowing developers to make informed decisions and attribute sources appropriately. By focusing on responsible AI practices, CodeWhisperer ensures that developers can trust the code suggestions and recommendations it provides.

Setting up CodeWhisperer for individual developers

If you are an individual developer who has acquired CodeWhisperer independently and will be using AWS Builder ID for login, follow these steps to access CodeWhisperer from your JetBrains IDE:

1. Ensure that the AWS Toolkit for JetBrains is installed. If it is not already installed, you can install it from the JetBrains plugin marketplace.
2. In your JetBrains IDE, navigate to the edge of the window and click on the AWS Toolkit icon. This will open the AWS Toolkit for JetBrains panel.
3. Within the AWS Toolkit for JetBrains panel, click on the Developer Tools tab. This will open the Developer Tools Explorer.
4. In the Developer Tools Explorer, locate the CodeWhisperer section and expand it. Then, select the "Start" option.
5. A pop-up window titled "CodeWhisperer: Add a Connection to AWS" will appear. In this window, choose the "Use a personal email to sign up" option to sign in with your AWS Builder ID.
6. Once you have entered your personal email associated with your AWS Builder ID, click on the "Connect" button to establish the connection and access CodeWhisperer within your JetBrains IDE.
7. A pop-up titled "Sign in with AWS Builder ID" will appear. Select the "Open and Copy Code" option.
8. A new browser tab will open, displaying the "Authorize request" window. The copied code should already be in your clipboard. Paste the code into the appropriate field and click "Next."
9. Another browser tab will open, directing you to the "Create AWS Builder ID" page. Enter your email address and click "Next." A field for your name will appear. Enter your name and click "Next." AWS will send a confirmation code to the email address you provided.
10. On the email verification screen, enter the code and click "Verify." On the "Choose your password" screen, enter a password, confirm it, and click "Create AWS Builder ID." A new browser tab will open, asking for your permission to allow JetBrains to access your data. Click "Allow."
11. Another browser tab will open, asking if you want to grant access to the AWS Toolkit for JetBrains to access your data. If you agree, click "Allow."
12. Return to your JetBrains IDE to continue the setup process.

CodeWhisperer in Action

Example Use Case: Automating Unit Test Generation with CodeWhisperer in Python (Credits: aws-solutions-library-samples)

One of the powerful use cases of CodeWhisperer is its ability to automate the generation of unit test code. By leveraging natural language comments, CodeWhisperer can recommend unit test code that aligns with your implementation code.
This feature significantly simplifies the process of writing repetitive unit test code and improves overall code coverage.

To demonstrate this capability, let's walk through an example using Python in Visual Studio Code:

1. Begin by opening an empty directory in your Visual Studio Code IDE.
2. (Optional) In the terminal, create a new Python virtual environment:

python3 -m venv .venv
source .venv/bin/activate

3. Set up your Python environment and ensure that the necessary dependencies are installed:

pip install pytest pytest-cov

4. Create a new file in your preferred Python editor or IDE and name it "calculator.py".
5. Add the following comment at the beginning of the file to indicate your intention to create a simple calculator class:

# example Python class for a simple calculator

6. Once you've added the comment, press the "Enter" key to proceed.
7. CodeWhisperer will analyze your comment and start generating code suggestions based on the desired functionality.
8. To accept the suggested code, simply press the "Tab" key in your editor or IDE.

Picture Credit: aws-solutions-library-samples

In case CodeWhisperer does not provide automatic suggestions, you can manually trigger it to generate recommendations using the following keyboard shortcuts:

- For Windows/Linux users, press "Alt + C".
- For macOS users, press "Option + C".

If you want to view additional suggestions, you can navigate through them by pressing the Right arrow key. To access previous suggestions, press the Left arrow key. If you wish to reject a recommendation, you can either press the ESC key or use the backspace/delete key.

To continue building the calculator class, proceed by selecting the Enter key and accepting CodeWhisperer's suggestions, whether they are provided automatically or triggered manually.
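To preview where these steps lead: assembled from typical suggestions, the implementation and the kind of pytest tests CodeWhisperer can then recommend might resemble the sketch below (illustrative only; actual suggestions vary from session to session):

```python
# calculator.py
# example Python class for a simple calculator
import math

class Calculator:
    def add(self, a, b):
        return a + b

    def subtract(self, a, b):
        return a - b

    def multiply(self, a, b):
        return a * b

    def divide(self, a, b):
        if b == 0:
            raise ValueError("Cannot divide by zero")
        return a / b

    def square(self, a):
        return a ** 2

    def cube(self, a):
        return a ** 3

    def square_root(self, a):
        return math.sqrt(a)

# test_calculator.py -- tests in the style CodeWhisperer suggests alongside
def test_add():
    assert Calculator().add(2, 3) == 5

def test_divide_by_zero():
    try:
        Calculator().divide(1, 0)
        assert False, "expected ValueError"
    except ValueError:
        pass
```

With pytest and pytest-cov installed as shown earlier, running pytest --cov discovers these test_* functions and reports both the results and the coverage they achieve.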
CodeWhisperer will propose basic functions for the calculator class, including add(), subtract(), multiply(), and divide(). In addition to these fundamental operations, it can also suggest more advanced functions like square(), cube(), and square_root(). By following these steps, you can leverage CodeWhisperer to enhance your coding workflow and efficiently develop the calculator class, benefiting from a range of pre-generated functions tailored to your specific needs.

Conclusion

AWS CodeWhisperer is a groundbreaking tool that has the potential to revolutionize the way developers work. By harnessing the power of AI, CodeWhisperer provides real-time code suggestions and automates repetitive tasks, enabling developers to focus on solving core business problems. With seamless integration into popular IDEs and support for multiple programming languages, CodeWhisperer offers a comprehensive solution for developers across different domains. By leveraging CodeWhisperer's advanced features, developers can enhance their productivity, reduce errors, and ensure the delivery of high-quality code. As CodeWhisperer continues to evolve, it holds the promise of driving accelerated software development and fostering innovation in the developer community.

Author Bio

Rohan Chikorde is an accomplished AI Architect with a postgraduate degree in Machine Learning and Artificial Intelligence. With almost a decade of experience, he has successfully developed deep learning and machine learning models for various business applications. Rohan's expertise spans multiple domains, and he excels in programming languages such as R and Python, as well as analytics techniques like regression analysis and data mining. In addition to his technical prowess, he is an effective communicator, mentor, and team leader. Rohan's passion lies in machine learning, deep learning, and computer vision.

LinkedIn
Co-Pilot & Microsoft Fabric for Power BI
Sagar Lad
23 Aug 2023
8 min read
Introduction

Fabric is Microsoft's data platform solution for the modern era. Microsoft's three primary data analytics tools (Power BI, Azure Data Factory, and Azure Synapse) are all covered under Fabric. Advanced artificial intelligence capabilities built on machine learning and natural language processing (NLP) are made available to Power BI customers through Copilot. In this article, we will deep dive into how Copilot and Microsoft Fabric will transform the way we develop and work with Power BI.

Co-Pilot and Fabric with Power BI

Both Microsoft Fabric and Copilot aspire to address the urgent requirement for businesses to turn their data into value. Big Data continues to fall short of its initial promises even after years have passed. Every year, businesses generate more data, yet a recent IBM study found that 90% of this data is never successfully exploited for any kind of strategic purpose. So, more data does not mean more value or business insight. Data fragmentation and poor data quality are the key obstacles to releasing the value of data. These problems are what Microsoft hopes to address with Microsoft Fabric, a human-centric, end-to-end analytics product that brings together all of an organization's data and analytics in one place. Copilot has now been integrated into Power BI. Large multimodal artificial intelligence models based on natural language processing have gained attention since the publication of ChatGPT. Beyond the buzz, Microsoft Fabric and Copilot share a trait: each aims to transform the Power BI user interface.

●       Microsoft Fabric and Power BI

Microsoft Fabric is, in essence, Synapse and Power BI together. By combining the benefits of the Power BI SaaS platform with the various Synapse workload types, Microsoft Fabric creates an environment that is more cohesive, integrated, and easier to use for all of the associated profiles.
However, Power BI Premium users will get access to new opportunities for data science, data engineering, and more. Power BI will continue to function as it does right now. Data analysts and Power BI developers are not required to begin using Synapse Data Warehouse if they do not want to. Microsoft wants to combine all of its data offerings into one, called Fabric, just like it did with Office 365:

Image 1: Microsoft Fabric (Source: Microsoft)

Let's understand in detail how Microsoft Fabric will make life easier for Power BI developers.

1.     Data Ingestion

There are various methods by which we can connect to data sources in Fabric in order to consume data, for example, by utilising Spark notebooks or pipelines. These may be unfamiliar to the Power BI realm, though.

Image 2: Data Transformation in Power BI

Instead, we can ingest the data using Dataflows Gen2, which will save it on OneLake in the proper format.

2.     Ad Hoc Query

Once one or more dataflows have been successfully published and refreshed, they will show in the workspace along with a number of other artifacts. The SQL Endpoint artifact is one of them. After opening it, we can begin creating on-demand SQL queries and saving them as views. As an alternative, we can also create visual queries, which will let us familiarise ourselves with the dataflow diagram view. Above all, however, this interface shares many characteristics with Power BI Datamarts, making it a familiar environment for those who know Power BI:

Image 3: Power BI - One Data Lake Hub

3.     Data Modelling

With the introduction of web modelling for Power BI, we can introduce new metrics and start establishing relationships between different tables right away in this interface. The default workspace where the default dataset is kept will automatically contain the data model.
The new storage option, Direct Lake, is advantageous for datasets created in this manner via the cloud interface. By keeping just one copy of data in OneLake, this storage mode prevents data duplication and unnecessary data refreshes.

●       Co-Pilot and Power BI

Copilot, a new artificial intelligence framework for Power BI, is an offering from Microsoft. Copilot is Power BI's large multimodal artificial intelligence model built on natural language processing. It might be compared to the ChatGPT of Power BI. Thanks to the addition of Copilot to Power BI, users will be able to ask questions about data and generate visuals and DAX measures simply by providing a brief description of what they need. For instance, a brief statement of the user's preferences for the report:

"Add a table of the top 500 MNC IT Companies by total sales to my model."

The DAX code required to generate measures and tables is generated automatically by the algorithm.

Copilot enables:

●       Creating and customizing Power BI reports to provide insights.
●       Creating and improving DAX computations.
●       Asking questions about your data.
●       Publishing narrative summaries.
●       Ease of use.
●       Faster time to market.

Key features of the Power BI Copilot are as follows:

●       Automated report generation
Copilot can create well-designed dashboards, data narratives, and interactive components automatically, saving time and effort compared to manually creating reports.

●       Conversational language interface
We can use everyday language to express data requests and inquiries, making it simpler to connect with your data and gain insights.

●       Real-time analytics
Copilot's real-time analytics capabilities can be used by Power BI customers to view data and react swiftly to shifts and trends.
Let's look at the step-by-step process of how to use Copilot for Power BI:

Step 1: Open Power BI and go to the Copilot tab screen.
Step 2: Type a query pertaining to the data, for example to produce a financial report, or pick from a list of suggestions that Copilot has automatically prepared for you.
Step 3: Copilot sorts through and analyses data to provide the information.
Step 4: Copilot compiles a visually stunning report, successfully converting complex data into easily comprehended, practical information.
Step 5: Investigate data even further by posing queries, writing summaries to present to stakeholders, and more.

There are also a few limitations to using the Copilot features with Power BI:

●       Reliability of the recommendations
Copilot has been trained on all programming languages available in public sources, which underpins the quality of its proposals. The quantity of training data available for a given language, however, may have an impact on the quality of the suggestions. Suggestions for specialized programming languages such as APL and Erlang won't be as useful as those for more widely used ones like Python, Java, etc.

●       Privacy and security issues
There are worries that the model, which was trained on publicly accessible code, can unintentionally recommend code fragments that have security flaws or were intended to be private.

●       Dependence on comments and naming
The user is responsible for accuracy, because the AI provides more accurate suggestions when given specific comments and descriptive variable names.

●       Lack of original solutions
Unlike a human developer, the tool cannot creatively solve problems; it can only make code suggestions based on the training data.

●       Inefficiency on large codebases
The tool is not designed for going through and comprehending big codebases.
It works best when recommending code for straightforward tasks.

Conclusion

The combination of Microsoft Copilot and Fabric with Power BI has the ability to completely alter the data modelling field. It blends sophisticated generative AI with data to speed up the discovery and sharing of insights by everyone. By enabling both data engineers and non-technical people to examine data using AI models, it is transforming Power BI into a human-centered analytics platform.

Author Bio

Sagar Lad is a Cloud Data Solution Architect with a leading organization and has deep expertise in designing and building Enterprise-grade Intelligent Azure Data and Analytics Solutions. He is a published author, content writer, Microsoft Certified Trainer, and C# Corner MVP.

Medium, Amazon, LinkedIn
Getting Started with Azure Speech Service
M.T. White
22 Aug 2023
10 min read
Introduction

Commanding machines to do your bidding was once sci-fi. Being able to command a machine to do something with mere words graced the pages of many sci-fi comics and novels. It wasn't until recently that science fiction became science fact. With the rise of devices such as Amazon's Alexa and Apple's Siri, being able to vocally control a device has become a staple of the 21st century. So, how does one integrate voice control in an app? There are many ways to accomplish that. However, one of the easiest ways is to use an Azure AI tool called Speech Service. This tutorial is going to be a crash course on how to integrate Azure's Speech Service into a standard C# app. To explore this AI tool, we're going to use it to build a simple profanity filter that demonstrates the Speech Service.

What is Azure Speech Service?

There are many ways to create a speech-to-text app. One could create one from scratch, use a library, or use a cloud service. Arguably the easiest way to create a speech-to-text app is with a cloud service such as the Azure Speech Service. This Azure AI tool analyzes speech that is picked up by a microphone and converts it to a text string in the cloud. The resulting string is then sent back to the app that made the request. In other words, the speech-to-text service that Azure offers is an AI developer tool that allows engineers to quickly convert speech to a text string.

It is important to understand that the Speech Service is a developer's tool. Since the rise of systems like ChatGPT, what counts as an AI tool has been ambiguous at best. When one thinks of modern AI tools, one thinks of tools where you can provide a prompt and get a response. However, when a developer thinks of a tool, they usually think of a tool that can help them get a job done quickly and efficiently.
As such, the Azure Speech Service is an AI tool that can help developers integrate speech-to-text features into their applications with minimal setup. The Azure Speech Service is a very powerful tool that can be integrated into almost anything. For example, you can create a profanity filter with minimal code, make voice requests to an LLM like ChatGPT, or do any number of other things. Now, it is important to remember that the Azure Speech Service is an AI tool that is meant for engineers. Unlike tools like ChatGPT or LLMs in general, you will have to understand the basics of code to use it successfully. With that, what do you need to get started with the Speech Service?

What do you need to use Azure Speech Service?

Setting up an app that can utilize the Azure service requires relatively little. All you will need is the following:

- An Azure account
- Visual Studio (preferably the latest version)
- Internet connectivity
- The Microsoft.CognitiveServices.Speech NuGet package

This project is going to be a console-based application, so you won't need to worry about anything fancy like creating a Graphical User Interface (GUI). When all that is installed and ready to go, the next thing you will want to do is set up a simple speech-to-text service in Azure.

Setup Azure Speech Service

After you have your environment set up, you're going to want to set up your service. Setting up the speech-to-text service is quick and easy, as there is very little that needs to be done on the Azure side. All one has to do is perform the following steps:

1.     Log into Azure and search for Speech Services.
2.     Click the Create button in Figure 1 and fill out the wizard that appears:

Figure 1. Create Button

3.     Fill out the wizard to match Figure 2. You can name the instance anything you want and set the resource group to anything you want. As far as the pricing tier goes, you will usually be able to use the service for free for a time.
However, after the trial period ends you will eventually have to pay for the service. Regardless, once you have the wizard filled out, click Review + Create:

Figure 2. Speech Service

4.     Keep following the wizard until you see the screen in Figure 3. On this screen, you will want to click the manage keys link that is circled in red:

Figure 3. Instance Service

This is where you get the keys necessary to use the AI tool. Clicking the link is not strictly necessary, as the keys are also at the bottom of the page; however, the link will bring you directly to them. At this point, the service is set up. You will need to capture the key info, which can be viewed in Figure 4:

Figure 4. Key Information

You can capture the key data by simply clicking the Show Keys button, which will unmask KEY 1 and KEY 2. Each instance you create will generate a new set of keys. As a safety note, you should never share your keys with anyone, as they'll be able to use your service, which in turn means they will rack up your bill, among other cyber-security concerns. As such, you will want to unmask the keys and grab KEY 1, and copy the region as well.

C# Code

Now comes the fun part of the project: creating the app. The app will be relatively simple. The only hard part will be installing the NuGet package for the speech service. To do this, simply add the NuGet package found in Figure 5.

Figure 5. NuGet Package

Once that package is installed you can start to implement the code. To start off, we're simply going to make an app that can dictate back what we say to it.
To do this, input the following code:

// See https://aka.ms/new-console-template for more information
using Microsoft.CognitiveServices.Speech;

await translateSpeech();

static async Task translateSpeech()
{
    string key = "<Your Key>";
    string region = "<Your Region>";

    var config = SpeechConfig.FromSubscription(key, region);
    using (var recognizer = new SpeechRecognizer(config))
    {
        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine(result.Text);
    }
}

When you run this program it will open a prompt. You will be able to speak into the computer mic, and whatever you say will be displayed. For example, run the program and say "Hello World". After the service is finished translating your speech, you should see the following display on the command prompt:

Figure 6. Output From App

Now, this isn't the full project. This is just a simple app that will dictate what we say to the computer. What we're aiming for in this tutorial is a simple profanity filter. For that, we need to add another function to the project to help filter the returned string. It is important to remember that what is returned is a text string. The text string is just like any other text string that one would use in C#.
As such, we can modify the program to the following to filter profanity:

// See https://aka.ms/new-console-template for more information
using Microsoft.CognitiveServices.Speech;

await translateSpeech();

static async Task translateSpeech()
{
    string key = "<Your Key>";
    string region = "<Your Region>";

    var config = SpeechConfig.FromSubscription(key, region);
    using (var recognizer = new SpeechRecognizer(config))
    {
        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine(result.Text);
        VetSpeech(result.Text);
    }
}

static void VetSpeech(String input)
{
    Console.WriteLine("checking phrase: " + input);
    String[] badWords = { "Crap", "crap", "Dang", "dang", "Shoot", "shoot" };

    foreach (String word in badWords)
    {
        if (input.Contains(word))
        {
            Console.WriteLine("flagged");
        }
    }
}

Now, in the VetSpeech function, we have an array of "bad" words. In short, if the returned string contains a variation of these words, the program will display "flagged". As such, if we were to say "Crap Computer" when the program is run, we can expect to see the following output in the prompt:

Figure 7. Profanity Output

As can be seen, the program flagged the phrase because the word Crap was in it.

Exercises

This tutorial was a basic rundown of the Speech Service in Azure. This is probably one of the simplest services to use, but it is still very powerful. Now that you have a basic idea of how the service works and how to write C# code for it, create a ChatGPT developer token and pass the returned string to ChatGPT. When done correctly, this project will allow you to verbally interact with ChatGPT. That is, you should be able to verbally ask ChatGPT a question and get a response.

Conclusion

The Azure Speech Service is an AI tool. Unlike many other AI tools like ChatGPT and the like, this tool is meant for developers to build applications with.
Also, unlike many other Azure services, this is a very easy-to-use system with minimal setup. As can be seen from the tutorial, the hardest part was writing the code that utilized the service, and even that was not very difficult. The best part is that the code provided in this tutorial is the basic code you will need to interact with the service, meaning that all you have to do now is modify it to fit your project's needs. Overall, the power of the Speech Service is limited only by your imagination. This tool would be excellent for integrating verbal interaction with other tools like ChatGPT, creating voice-controlled robots, or anything else. In short, this is a relatively cheap and powerful tool that can be leveraged for many things.

Author Bio

M.T. White has been programming since the age of 12. His fascination with robotics flourished when he was a child programming microcontrollers such as Arduino. M.T. currently holds an undergraduate degree in mathematics, and a master's degree in software engineering, and is currently working on an MBA in IT project management. M.T. is currently working as a software developer for a major US defense contractor and is an adjunct CIS instructor at ECPI University. His background mostly stems from the automation industry where he programmed PLCs and HMIs for many different types of applications. M.T. has programmed many different brands of PLCs over the years and has developed HMIs using many different tools.

Author of the book: Mastering PLC Programming
Getting Started with AutoML
M.T. White
22 Aug 2023
7 min read
Introduction

Tools like ChatGPT have been making headlines as of late. ChatGPT and other LLMs have been transforming the way people study, work, and, for the most part, do anything. However, ChatGPT and other LLMs are for everyday users. In short, ChatGPT and similar systems can help engineers and data scientists, but they are not designed to be engineering or analytics tools. Though ChatGPT and other LLMs are not designed to be machine-learning tools, there is a tool that can assist engineers and data scientists. Enter the world of AutoML for Azure. This article is going to explore AutoML and how it can be used by engineers and data scientists to create machine learning models.

What is AutoML?

AutoML is an Azure tool that builds the optimal model for a given dataset. In many senses, AutoML can be thought of as a ChatGPT-like system for engineers. AutoML is a tool that allows engineers to quickly produce optimal machine-learning models with little to no technical input. In short, ChatGPT and other similar systems are tools that can answer general questions about anything, but AutoML is specifically designed to produce machine-learning models.

How does AutoML work?

Though AutoML is a tool designed to produce machine learning models, it doesn't actually use AI or machine learning in the process. The key to AutoML is parallel pipelines. A pipeline can be thought of as the logic in a machine-learning model. For example, the pipeline logic will include things such as cleaning data, splitting data, using a model for the system, and so on. When a person utilizes AutoML, it will create a series of parallel pipelines with different algorithms and parameters. When a model "fits" the data the best, the search will cease, and that pipeline will be chosen. Essentially, AutoML in Azure is a quick and easy way for engineers to cut out all the skilled and time-consuming development that can easily hinder non-experienced data scientists or engineers.
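That selection loop can be illustrated with a toy sketch in plain Python: fit several candidate models on the same data, score each with the same metric, and keep the winner. This is only an illustration of the idea; Azure's real implementation and algorithm catalogue are far more sophisticated, and the sample (hours, story points) pairs are taken from the toy dataset used later in this article:

```python
import statistics

# Two toy "pipelines": each trains on (x, y) pairs and returns a predictor.
def mean_model(xs, ys):
    # baseline: always predict the mean of y
    mu = statistics.mean(ys)
    return lambda x: mu

def linear_model(xs, ys):
    # ordinary least squares for y = a*x + b
    mx, my = statistics.mean(xs), statistics.mean(ys)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b

def mse(predict, xs, ys):
    # mean squared error -- lower is better
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Sample (hours, story points) training data
xs = [16, 15, 15, 13, 22, 28, 30, 10, 21, 11, 12, 25, 24, 23]
ys = [13, 12, 11, 4, 8, 18, 19, 3, 14, 7, 9, 19, 17, 15]

# Try every candidate pipeline and keep the one with the lowest error
candidates = {"mean": mean_model, "linear": linear_model}
scores = {name: mse(fit(xs, ys), xs, ys) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(best)  # the linear pipeline wins on this data
```

Azure runs this kind of search in parallel across many real algorithms and hyperparameter settings, but the pick-the-best-scoring-pipeline logic is the same.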
To demonstrate how AutoML in Azure works, let's build a model using the tool.

What do you need to know?

Azure's AutoML takes a little bit of technical knowledge to get up and running, especially if you're using a custom dataset. For the most part, you're going to need to know approximately what type of analysis you're going to perform. You're also going to need to know how to create a dataset. This may seem like a daunting task, but it is relatively easy.

Setup

To use AutoML in Azure you'll need to set up a few things. The first thing is to set up an ML workspace. This is done by simply logging into Azure and searching for ML like in Figure 1:

Figure 1

From there, click on Azure Machine Learning and you should be redirected to the following page. Once on the Azure Machine Learning page, click on the Create button and New Workspace:

Figure 2

Once there, fill out the form; all you need to do is select a resource group and give the workspace a name. You can use any name you want, but for this tutorial, the name Article 1 will be used. You'll be prompted to click Create; once you click that button, Azure will start to deploy the workspace. The workspace deployment may take a few minutes to complete. Once done, click Go to resource and then click on Launch studio like in Figure 3.

Figure 3

At this point, the workspace has been generated and we can move to the next step in the process: using AutoML to create a new model. Once you click Launch studio you should be met with Figure 4. The page in Figure 4 is Azure Machine Learning Studio. From here you can navigate to AutoML by clicking the link on the left sidebar:

Figure 4

Once you click AutoML you should be redirected to the page in Figure 5:

Figure 5

Once you see something akin to Figure 5, click on the New Automated ML Job button, which should redirect you to a screen that prompts you to select a dataset.
This step is one of the more in-depth steps in the process. Here you will need to select your dataset. You can opt to use a predefined dataset that Azure provides for test purposes; however, for a real-world application you'll probably want a custom dataset that was engineered for your task. Azure will allow you to use either a pre-built dataset or your own. For this tutorial we're going to use the following custom dataset (the block of 14 rows below is repeated four times in the CSV, for 56 data points in total):

Hours, Story Points
16, 13
15, 12
15, 11
13, 4
22, 8
28, 18
30, 19
10, 3
21, 14
11, 7
12, 9
25, 19
24, 17
23, 15

To use this dataset, simply copy and paste it into a CSV file, then select the "from a file" option and follow the wizard. Note that for custom datasets you'll need at least 50 data points.

Continue to follow the wizard and give the experiment a name, for example E1. You will also have to select a Target Column; for this tutorial, select Story Points. If you do not already have a compute instance available, click the New button at the bottom and follow the wizard to set one up. Once that step is complete you should be directed to a page like in Figure 6:

Figure 6

This is where you select the general type of analysis to be done on the dataset. For this tutorial select Regression, click the Next button in Figure 6, and then click Finish. This will start the process, which will take several minutes to complete. The whole process can take up to about 20 or so minutes depending on which compute instance you use. Once done, you will be able to see the metrics by clicking on the Models tab. This will show all the models that were tried out. From here you can explore the models and the associated statistics.

Summary

In all, Azure's AutoML is an AI tool that helps engineers quickly produce an optimal model.
Though not the same, this tool can be used by engineers the same way ChatGPT and similar systems can be used by everyday users. The main drawback to AutoML is that, unlike ChatGPT, a user will need a rough idea of what they're doing. However, once a person has a rough idea of the basic types of machine-learning analysis, they should be able to use this tool to great effect.

Author Bio

M.T. White has been programming since the age of 12. His fascination with robotics flourished when he was a child programming microcontrollers such as Arduino. M.T. currently holds an undergraduate degree in mathematics, and a master's degree in software engineering, and is currently working on an MBA in IT project management. M.T. is currently working as a software developer for a major US defense contractor and is an adjunct CIS instructor at ECPI University. His background mostly stems from the automation industry where he programmed PLCs and HMIs for many different types of applications. M.T. has programmed many different brands of PLCs over the years and has developed HMIs using many different tools.

Author of the book: Mastering PLC Programming
ChatGPT for Everyday Use
M.T. White
22 Aug 2023
14 min read
Introduction

ChatGPT is a revolutionary new technology that is making a large impact on society. The full impact of ChatGPT cannot be fully known at the time of writing this article because of how novel the technology is. However, what can be said is that since its introduction many industries have been trying to leverage it and increase productivity. Simultaneously, everyday people are trying to learn to leverage it as well. Overall, ChatGPT and similar systems are very new, and the full picture of how to leverage them will take some more time to manifest. This article is going to explore how ChatGPT can be used in everyday life through a few use cases.

What is ChatGPT?

Before we begin, it is important to understand what ChatGPT is and what it isn't. In a lay sense, ChatGPT is a super advanced chatbot. More specifically, ChatGPT is a generative AI that uses Natural Language Processing (NLP) to create a dialog between a user and itself. ChatGPT and similar systems are what are known as Large Language Models (LLMs). In short, for AI models to work they have to be trained using data. To train LLMs, engineers use vast amounts of data such as books, articles, journals, and so on. The result is a system like ChatGPT that has a vast knowledge base on many different subjects. Before we can explore how to use ChatGPT for everyday life, we need to explore how NOT to use it.

How not to use ChatGPT?

ChatGPT is very powerful and can be used for many different things; however, it is important to understand that ChatGPT is neither a sage nor infallible. Remember, ChatGPT only knows about what it was trained on. This means that if the information it was taught was wrong or outdated, so too will be the response it provides. This means that, as of writing this article, ChatGPT cannot and should not be used as a replacement for humans.
Even with the answers ChatGPT gives, a decent level of domain knowledge is still required to properly format and use its response. As such, it is important to take its suggestions with a certain amount of skepticism. The best way to think of ChatGPT is as an extremely smart friend with a wide range of knowledge. Though that friend may be very knowledgeable, they can and will be wrong from time to time. With that, how does one use ChatGPT?

How to use ChatGPT for everyday life?

Using ChatGPT starts with what is called a prompt. A prompt can be thought of as a query that the user feeds into ChatGPT. This prompt is kind of like a Google search. It will be a statement that asks ChatGPT a question or to perform a certain task. Since ChatGPT uses NLP, the prompt does not have to be formatted a certain way, nor does it require a specific syntax. In other words, a prompt can be entered the same way a user would pose the instruction to another person. However, it is good to provide ChatGPT with enough information to adequately describe the problem. An example of a prompt is as follows:

How can ChatGPT be used for everyday life?

Once the prompt is run, ChatGPT will produce text with suggestions, answers, programming software, or whatever was asked for. The output that is produced is called a response, and it will contain the answer to the request/question.

There are a lot of ways ChatGPT can be used in everyday life. In terms of professional life, ChatGPT can make an excellent assistant. For most "text-based" jobs, that is, jobs that don't require design work such as drawing diagrams, ChatGPT can be an excellent assistant. For example, suppose a programmer needs a small piece of software for a project; assume they need a SQL query that can pull all the rows between two dates.
The user could use a prompt like the following:

Write a SQL query that will pull all rows between January 3 and March 31.

This prompt will produce the following response:

To pull all rows between January 3 and March 31 from a table, you'll need to use the WHERE clause to filter the results based on the date column. Assuming you have a date column named "date_column" in your table, here's the SQL query:

SELECT *
FROM your_table
WHERE date_column >= '2023-01-03'
  AND date_column <= '2023-03-31';

Replace "your_table" with the actual name of your table, and "date_column" with the name of the column that contains the dates you want to filter. The query will select all rows where the "date_column" is on or after January 3, 2023, and on or before March 31, 2023.

For those that are not familiar with SQL, the prompt produced a simple query with generic placeholders that will pull the correct data out of a database.

Not everyone is a programmer, so how could a non-programmer use ChatGPT as an assistant? One area in which ChatGPT can help is composing emails. Suppose a salesperson needs to inform Tom, their boss, via email that a $1 million deal was closed today. They could use a prompt like the following:

Write Tom an email saying that the deal with Company X went through, and we made $1 million.

The response would be something along the lines of:

Subject: Successful Deal with Company X - $1 Million Revenue Achieved

Hi Tom,

I hope this email finds you well. I am thrilled to share some exciting news with you! The deal with Company X has successfully gone through, and I am delighted to inform you that we have made an astounding $1 million in revenue from this collaboration. This achievement wouldn't have been possible without the hard work and dedication of our entire team. It reflects the trust and confidence our partners have in our capabilities, and I believe it marks a significant milestone for our company's growth and success.
I want to express my gratitude to you for your valuable contributions and expertise that played a crucial role in making this deal a reality. Your efforts have been commendable, and I am truly grateful for your commitment to the success of this venture. Let's take a moment to celebrate this triumph together. Your exceptional skills and commitment to excellence have made this accomplishment possible, and I am genuinely proud to have you as part of our team. Looking forward to even more prosperous ventures in the future. Once again, congratulations to you and the entire team! Best regards, [Your Name] [Your Title/Position] [Company Name] [Email] [Phone Number]

In this case, all the user would have to do is proofread the email, copy it, and send it.  For applications such as this, proofreading is very important.  As stated before, ChatGPT is not a mind reader or a sage, so it may produce emails with wrong information, illogical wording, or anything else.  It pays to read the message before sending it.

Learning Tool

A very powerful use of ChatGPT is as a learning tool.  For example, students can use it to help master a new subject, generate practice problems, and even check answers to an extent.  Suppose a student needs a few extra practice problems for their English homework; they could enter a prompt like the following:

Generate some grammar practice problems for a 1st year English class

This would yield the following output:

Of course! Here are some grammar practice problems suitable for a 1st-year English class:
1. Subject-Verb Agreement: Choose the correct verb form that agrees with the subject in each sentence: a) The cat (purr/purrs) when it is happy. b) My friend and I (is/are) going to the movies tonight. c) Neither the students nor the teacher (understands/understand) the assignment.
2. Pronoun Usage: Select the appropriate pronoun to complete each sentence: a) She invited Sarah and (I/me) to the party.
b) My sister and (her/she) are going shopping. c) The book is for (he/him) and his brother.
3. Articles (a/an/the): Insert the correct article (a, an, the, or no article) in each sentence: a) Can you pass me (a/an) apple, please? b) I need to buy (the) new shoes for the dance. c) My friend has (no article) car.
4. Prepositions: Choose the correct preposition to complete each sentence: a) We arrived (in/on/at) the airport fifteen minutes early. b) The cat is sleeping (under/below/above) the table. c) She's going to the store (by/with/for) her mom.
5. Verb Tenses: Identify the correct verb tense used in each sentence: a) They (will be/was) studying for the exam all night. b) By the time we arrived, the party (had started/has started). c) I (am/was) watching a movie right now.
6. Adjective vs. Adverb: Decide whether the word in parentheses should be an adjective or adverb in the sentence: a) She sings (beautiful/beautifully). b) The (happy/happily) children played in the park. c) The (quick/quickly) fox jumped over the fence.
7. Negation: Add the necessary negative word to each sentence: a) I have (no/not) seen him today. b) She does (not/none) like coffee. c) We will (not/none) be attending the party.
8. Passive Voice: Rewrite the following sentences in the passive voice: a) The chef cooked the meal. b) The teacher is explaining the lesson. c) The storm destroyed the old building.
9. Comparative and Superlative Forms: Complete each sentence with the correct comparative or superlative form of the adjective: a) This book is (interesting/more interesting/most interesting) than the last one. b) She is (tall/taller/tallest) than her sister. c) It was the (big/biggest/bigger) cake at the party.
10. Reported Speech: Rewrite the following sentences using reported speech: a) Mary said, "I am going to the store." b) "I will call you later," said John. c) "They have arrived," she said.
These grammar practice problems aim to reinforce essential grammar rules and concepts, helping 1st-year English students develop their language skills and proficiency.

As can be seen, ChatGPT generated a number of questions that one would usually find in an English 101 class.  If the user wanted something more specific, they could reword the prompt accordingly.

ChatGPT is also a great teaching tool.  Teachers can use it to generate lesson plans, write intros for learning modules, generate problems, and more.  Overall, ChatGPT can be an excellent tool for academia in general.

Now, it is important to remember that ChatGPT is not all-knowing.  It can only generate responses based on the data that it was trained on.  This means that if the user asks questions about a subject where there is little or no publicly available data, the response probably isn't going to be of the highest quality.  For example, if someone asked:

How do I build a working time machine?

They're going to get the following:

Building a working time machine, as depicted in science fiction, is currently beyond our technological capabilities and goes against the known laws of physics. Time travel, as popularly imagined, involves traveling backward or forward in time, similar to how we move through space. As of my last update in September 2021, time travel remains a concept explored mainly in theoretical physics, and no practical time machine has been built or proven to be possible. Several theories, such as the theory of relativity, suggest that time travel could be theoretically feasible under certain conditions, but these ideas are still purely theoretical and have not been put into practice. It's important to remember that building a time machine would present enormous scientific, technological, and ethical challenges.
It could raise paradoxes, such as the "grandfather paradox," where a time traveler could potentially alter past events and create inconsistencies in history. As of now, time travel remains a fascinating topic in science fiction and theoretical physics, but it is not something that can be practically accomplished with our current understanding and technology. The scientific community continues to explore the nature of time and the universe, but building a working time machine as portrayed in movies and books is not yet a reality.

ChatGPT gave some general information on time travel, but unfortunately, it didn't generate any schematics.  Again, this goes back to ChatGPT not being a sage: it only knows what it was "taught".  Since there is no data on how to build a working time machine, ChatGPT could not fully answer the prompt.

If one looks at the English prompt above, one will notice that the practice questions were quite broad.  It is common to have to "dig" with ChatGPT.  In other words, a person may have to refine their queries to get what they need.  To get some practice, try rewording the prompt to generate practice questions for the passive voice only.

Summary

ChatGPT is a tool, and like any other tool, what it's used for is up to the user.  As was seen in this article, ChatGPT is an excellent tool for helping a person through their day by generating software, emails, and so on.  ChatGPT can also be used as a learning or teaching device to help students and teachers generate practice problems, create lesson plans, and much more.  However, as has been stated numerous times, unless ChatGPT has been trained on a subject, it does not know about it.  This means that asking it things like how to build a time machine, or about niche domain-specific concepts, isn't going to return quality responses.  Also, even when ChatGPT has been trained on the subject of a prompt, it may not always generate a quality response.
No matter the use case, the response should be vetted for accuracy.  This may mean doing a little extra research on the response given, testing the output, or whatever else needs to be done to verify it.

Overall, at the time of writing this article, ChatGPT is less than a year old.  This means the full implications of using ChatGPT are not yet understood, and neither is how to fully leverage it.  What can be said is that ChatGPT and similar LLM systems will probably be the next Google.  In terms of everyday use, the only true inhibitors are the user's imagination and the data that was used to train ChatGPT.

Author Bio

M.T. White has been programming since the age of 12. His fascination with robotics flourished when he was a child programming microcontrollers such as Arduino. M.T. currently holds an undergraduate degree in mathematics, and a master's degree in software engineering, and is currently working on an MBA in IT project management. M.T. is currently working as a software developer for a major US defense contractor and is an adjunct CIS instructor at ECPI University. His background mostly stems from the automation industry where he programmed PLCs and HMIs for many different types of applications. M.T. has programmed many different brands of PLCs over the years and has developed HMIs using many different tools.

Author of the book: Mastering PLC Programming
ChatGPT and Azure Low Code Machine Learning

M.T. White
22 Aug 2023
12 min read
Introduction

ChatGPT can do many amazing things. It can easily troubleshoot code, it can generate source code, and much more.  However, software development, and by extension data engineering, is comprised of much more than just text-based programming.  For example, Azure offers a low/no-code tool that can be used to generate machine learning models without having to spend countless hours writing millions of lines of code.  However, there is a caveat to this service; in short, a person has to know what they are doing to use it.  For many, building a machine learning system is a complex task.  This is where ChatGPT comes into play.  ChatGPT can guide a user through building the pipeline needed to meet their goals.  What is especially interesting about this notion is that we can use AI to help build other AI!  To demonstrate this, we're going to use ChatGPT as a guiding tool to build our own linear regression model in Azure.

Expectations

At the time of writing this article, ChatGPT cannot produce actual diagrams; however, what it can do is guide you. This means that you still have to be familiar with the overall gist of the technology, but you don't have to be a data science expert to figure out how to make a model from scratch.  ChatGPT is an excellent tool to get you up and running quickly, but as stated in many other articles throughout the internet, ChatGPT is not a replacement for developers.  On a side note, this tutorial is meant to be interactive.  ChatGPT will only generate a series of steps that must be interpreted. This means that if you don't know how to carry out a task, there are a lot of opportunities to create your own prompts.  With that, let's start!

Setup

This tutorial assumes that you have a basic knowledge of Azure and the Azure Machine Learning services.  However, we're going to use ChatGPT as a guide from start to finish, so if you're not sure what you're doing, you can always modify the prompts or create new ones to help you along.
As an exercise, if you do get stuck, it is a good idea to try writing some prompts on your own to get a feel for ChatGPT.  With that, let's start by having ChatGPT guide us in creating the necessary environment for our machine learning pipeline.

Environment

To begin, let's get our environment set up.  To figure out how to do this, we'll use the following prompt:

What infrastructure will I need to build a no-code Azure machine learning pipeline?

Once you run this prompt, you should be met with something akin to the following:

To build a no-code Azure Machine Learning pipeline, you will need the following infrastructure components:
1. Azure Subscription: You need an active Azure subscription to access and utilize Azure Machine Learning services.
2. Azure Machine Learning Workspace: Create an Azure Machine Learning workspace, which serves as the top-level resource for organizing and managing your machine learning assets. The workspace provides a centralized location to store your pipelines, datasets, models, and other resources.
3. Compute Resources: Azure Machine Learning provides various compute options for running your pipeline. Depending on your workload and requirements, you can choose from options such as Azure Machine Learning Compute, Azure Databricks, or remote VMs. These resources will execute the pipeline steps and perform the required computations.
4. Data Storage: Azure Machine Learning supports multiple data storage options, such as Azure Blob Storage, Azure Data Lake Storage, or Azure SQL Database. You need a suitable data storage solution to store your input datasets, intermediate data, and any other data required for your pipeline.
5. Networking and Security: Ensure that your Azure subscription and associated resources are properly secured. Azure provides various security features, such as Virtual Network (VNet) integration, network security groups (NSGs), role-based access control (RBAC), and Azure Active Directory (AAD) authentication, to protect your resources and control access.
6. Monitoring and Logging: Azure Machine Learning offers monitoring and logging capabilities to track the execution of your pipelines, monitor performance metrics, and troubleshoot issues. You can use Azure Monitor, Azure Log Analytics, or other monitoring tools to gain insights into pipeline performance and health.
7. Integration with DevOps: If you have a DevOps workflow, you may want to integrate your no-code pipeline with your existing CI/CD processes. Azure Machine Learning can be integrated with Azure DevOps, GitHub Actions, or other tools to automate the deployment and management of your pipelines.
Remember to evaluate your specific workload requirements and choose the appropriate infrastructure components accordingly. The exact infrastructure setup will vary depending on factors such as data size, computational needs, security requirements, and integration with existing systems.

This prompt generated a lot of the infrastructure components necessary for a production-level pipeline.  However, our pipeline will only be used for practice, so all we need to do is follow steps 1-3.  In other words, all we need for this tutorial is an Azure instance, an Azure Machine Learning workspace, and a compute instance.  If you can create these by yourself, do so; if you can't, try writing your own prompt.  For example, try engineering a prompt that gives you the steps necessary to set up a compute instance.  You could try something like the following:

How do I set up a compute instance in Azure?

If this prompt doesn't provide you with the necessary information, try rewording it and giving it more details until you can complete the task with the provided steps.

Pipeline

The next step is building out the pipeline.
The pipeline is the logic for the model.  In other words, the pipeline is the algorithm that will load the data for the model, clean the data, train the model, and evaluate the model if necessary.  In short, the best way to think of the pipeline is as a graphical machine learning program.  As stated before, ChatGPT is mostly limited to text-based responses.  However, we can still get step-by-step directions with the following prompt:

Build a minimal linear regression model using Azure Machine Learning Studio

This will output the following response:

As an AI language model, I don't have direct access to external tools or platforms like Azure Machine Learning Studio, and my capabilities are limited to providing instructions and guidance. However, I can walk you through the steps to build a minimal linear regression model using Azure Machine Learning Studio. Please note that the steps might slightly vary based on updates to the platform since my last update in September 2021. Make sure to check the official Azure Machine Learning documentation for any changes. Here's a general outline of the steps:
1. Set up Azure Machine Learning Studio: If you don't have an Azure Machine Learning account, sign up for one. Once you have access to Azure Machine Learning Studio, create a new project or workspace.
2. Create a new Experiment: In Azure Machine Learning Studio, create a new experiment by clicking on "Experiments" and then "New." Give your experiment a descriptive name.
3. Data Ingestion: Upload your dataset to Azure Machine Learning Studio. You can do this by clicking on "Datasets" and then "New" to upload your dataset.
4. Data Cleaning and Preparation: Prepare your data by cleaning and transforming it if needed. This might include removing missing values, encoding categorical variables, and splitting the data into training and testing sets.
5. Add Linear Regression Module: Drag and drop the "Linear Regression" module from the left panel to the canvas in your experiment.
6. Connect Data to the Model: Connect the dataset you uploaded earlier to the "Linear Regression" module by clicking and dragging from the dataset to the input of the linear regression module.
7. Train the Model: Run the experiment to train the linear regression model on your dataset.
8. Evaluate the Model: After the training is complete, you can use evaluation modules like "Score Model" and "Evaluate Model" to assess the performance of the trained linear regression model.
9. Deploy the Model (Optional): If you want to use the trained model for predictions in real-world scenarios, you can deploy the model as a web service in Azure Machine Learning Studio.
10. Test the Deployed Model (Optional): Test the deployed model to ensure it is working correctly and providing accurate predictions.
Remember, this is just a minimal example of building a linear regression model using Azure Machine Learning Studio. Depending on your dataset and specific use case, you may need to perform additional steps like feature engineering, hyperparameter tuning, and more thorough model evaluation. Always refer to the official documentation and resources from Azure Machine Learning Studio for the most up-to-date and detailed instructions.

Analyzing the output, the first two steps are self-explanatory.  The third step will require some preparation, since we're going to use a custom dataset for this tutorial. This tutorial uses a dataset that consists of 14 points.  Normally, you'll want a dataset of at least 30 points to justify assuming a normal distribution; however, for this example, we're going to assume the data has a normal distribution.
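For comparison, the "train a line" idea at the heart of this pipeline can be sketched in a few lines of plain Python using the closed-form least-squares formulas (the five (x, y) pairs below are placeholder data, since the article's 14-point dataset is not reproduced here):

```python
# Minimal least-squares linear regression with no libraries.
# The (x, y) pairs are placeholder data, not the article's 14-point dataset.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = covariance(x, y) / variance(x); the intercept falls out of the means
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

def predict(x):
    return slope * x + intercept

print(round(slope, 3), round(intercept, 3))  # 1.99 0.09
```

This is essentially what the "Linear Regression" and "Train Model" components do behind the designer canvas, minus the data splitting and scoring machinery.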
With that, to train the model for this tutorial, copy the following data into either a plain text file or an Excel spreadsheet and save it as a .csv file. Once you have the data saved, you'll need to create a data asset and upload the file to Azure.  If you don't know the steps to upload the dataset, try writing another prompt for directions.  If you need more information on this step, you could try a prompt like:

How do I upload a custom dataset to Azure Machine Learning Studio?

Moving on to the rest of the instructions, we can construct a pipeline that resembles the one in Figure 1:

Figure 1 – Completed Pipeline

This pipeline is, for the most part, a one-to-one representation of the instructions.  However, it expands on Step 4: we added a Select Columns in Dataset module to ensure we include only the data necessary for the model, and from there ran the data into a Split Data component, as suggested by ChatGPT.  Other than that, the model is exactly as described, with the exception of the last two steps for deployment and testing, that is, the two steps ChatGPT labeled as "optional".

For this tutorial, build the model as seen in Figure 1 and run it.  After you run the pipeline, you can see how well the model performed. To see the statistics, click the Evaluate Model component and navigate to the Metrics tab.  There is a lot of information to unpack, but if you used the same dataset, your numbers should be around the values in Figure 2:

Figure 2 – Linear Regression Outputs

At this point, ChatGPT has guided us in building a linear regression model.  Overall, the linear regression model that ChatGPT guided us to build is a very simple model that, all things considered, is fairly accurate.
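The kinds of numbers the Evaluate Model component reports can also be computed by hand; here is a sketch of mean absolute error and R² in plain Python (the actual/predicted values below are illustrative, not the values from Figure 2):

```python
# Hand-rolled regression metrics; the values are illustrative only and are
# not the numbers shown in Figure 2.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.8, 5.1, 7.3, 8.9]

n = len(actual)

# Mean Absolute Error: average magnitude of the prediction errors
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n

# Coefficient of determination (R^2): 1 - residual SS / total SS
mean_a = sum(actual) / n
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
ss_tot = sum((a - mean_a) ** 2 for a in actual)
r_squared = 1 - ss_res / ss_tot

print(round(mae, 3), round(r_squared, 4))  # 0.175 0.9925
```

An R² near 1 means the fitted line explains most of the variance in the data, which is what a healthy Evaluate Model readout indicates.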
Summary

This tutorial has been a crash course on how ChatGPT can be used to build no-code solutions in Azure Machine Learning Studio.  What's incredible about this tutorial is that we used AI to help build another AI system.  However, as was seen throughout, ChatGPT was only a guide.  For graphical systems, ChatGPT can, at best, be used as a guide.  This means that for systems like Azure Machine Learning Studio, a basic understanding of the system is required.  As such, for graphical systems, ChatGPT is best utilized by people who know the system but need guidance for the task at hand.  For example, if this were a real project, the ideal engineer would be a person who knows how to use Azure Machine Learning Studio but needs help creating the pipeline logic.

In terms of graphical programming, ChatGPT is a little ironic.  For text-based programming in Java, Python, or whatever other language, ChatGPT can generate working code.  However, because ChatGPT currently cannot generate graphical programs for systems like Azure Machine Learning Studio, a person will require a more in-depth knowledge of the system.  As ChatGPT matures this may change, but for now, it is best to have a knowledgeable engineer driving ChatGPT and implementing its solutions.  Overall, ChatGPT is an excellent assistant, but it requires a person who is knowledgeable about the technology being used.

ChatGPT as a Debugging Tool

M.T. White
22 Aug 2023
14 min read
Introduction

No matter the technology or application, debugging is a major part of software development.  Every developer who has ever written a program of any significant size knows that the application is going to have some kind of defect in it and probably won't build the first few times it is run.  In short, a vast amount of time and energy is spent debugging software.  In many cases, debugging code can be more challenging than writing the code in the first place.  With the advent of systems like ChatGPT, spending hours debugging a piece of code may be a thing of the past, at least for relatively small code blocks.  This tutorial is going to explore prompts that we can use to have ChatGPT troubleshoot defective code for us.

Expectations

Before we can explore troubleshooting with ChatGPT, we need to set some realistic expectations.  To begin, ChatGPT works off a series of inputs known as prompts.  For ChatGPT to fix a code block, you'll first have to submit the code and the issue as a prompt.  At first glance, this may not seem like a big deal; however, modern applications are conglomerates of many smaller components that rely on each other to function correctly.  On top of that, many of these "smaller" components may be, and usually are, composed of hundreds, if not thousands, of lines of code or more.  This means that a defect may not stem from the current code block but from a service or a line of code somewhere no one may consider.  As such, if the root of the defect is not included in the prompt, ChatGPT may not be of much use.  To properly use ChatGPT as a troubleshooting tool, it is therefore important to at least have a clue as to where the offending code is, because pasting in thousands, if not millions, of lines of code is impractical.  Ultimately, ChatGPT (at least the web version) cannot be used as a troubleshooting tool without a person guiding it who is knowledgeable about the codebase, debugging, and coding.
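Since submitting "the code and the issue" is ultimately just assembling a block of text, a small helper can format a troubleshooting prompt consistently; the function and field names below are invented purely for illustration:

```python
def build_debug_prompt(language: str, code: str, problem: str) -> str:
    """Assemble a troubleshooting prompt from the pieces ChatGPT needs:
    the language, the suspect code, and a description of the problem."""
    return (
        f"The following {language} code has a problem: {problem}.\n"
        f"What is wrong with this code?\n\n{code}"
    )

# Hypothetical usage: wrap a suspect snippet and its symptom into one prompt.
snippet = 'printf("Hello World")'
prompt = build_debug_prompt("C", snippet, "it will not compile")
print(prompt)
```

The point of the helper is simply that every prompt ends up carrying the language, the code, and a problem description, which, as this article argues, is the minimum ChatGPT needs to be useful.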
With that, let's start troubleshooting!

Troubleshooting with ChatGPT

Before we begin, let's define the bugs we're going to look at in this example.  Though there are many different types of software defects, this tutorial is going to use ChatGPT to troubleshoot compilation and logical errors.  As such, we're going to define these two terms as follows:

Compilation Error: Any error that will prevent the source code from compiling.
Logic Error: A software defect that gives an unanticipated result.

For these examples, we're going to use C and Python as the programming languages.  However, due to the nature of ChatGPT, the prompts can be modified to accommodate other languages.  We're also going to assume that the program fits in a single file; that is, we're going to have ChatGPT analyze code snippets.  For now, let's start by exploring some compilation errors!

Compilation Error

To begin with, let's take a look at a very simple Hello World program in C.  Consider the following code:

#include <stdio.h>
int main(){
    printf("Hello World")
}

If you look at the code, you will see that there is a missing semicolon on the print line.  This is an obvious issue that the compiler will pick up, but for grins, let's see what ChatGPT has to say.  For this example, we're going to use the following prompt:

Why will this C code not compile? #include <stdio.h> int main(){ printf("Hello World") }

This prompt will output the following response:

The C code you provided will not compile because there is a missing semicolon (;) at the end of the printf statement in the main function. Here's the corrected code:

#include <stdio.h>
int main() {
    printf("Hello World");
    return 0;
}

In C, semicolons are used to terminate statements, including function calls like printf(). The absence of a semicolon after printf("Hello World") causes a compilation error. Adding the semicolon at the end resolves this issue.

As can be seen, ChatGPT instantly caught the error.
Now, this is a very simple error that basic tools such as the compiler would normally catch. Next, let's explore how ChatGPT can troubleshoot a program in Python.  As many developers know, Python has a unique syntax compared to C-like languages such as C++, C#, or Java.  Instead of using curly braces, Python uses indentation to denote hierarchy, which can often lead to problems that are hard to diagnose, especially for the inexperienced.  As such, consider the following code block:

some_number = 2
if some_number <= 2:
print("some number is less than 2")

An experienced Python developer would probably catch this issue quickly.  However, for this example, we're going to let ChatGPT troubleshoot the code using the following prompt:

What is the syntax error with this Python code block?
some_number = 2
if some_number <= 2:
print("some number is less than 2")

This prompt will output the following response:

The syntax error in the given Python code block is that the print statement is not indented properly. In Python, indentation is crucial to define the structure and hierarchy of code blocks. Here's the corrected version:

some_number = 2
if some_number <= 2:
    print("some number is less than or equal to 2")

In this corrected code, the print statement is indented with four spaces (or a tab) to align it with the if statement. This ensures that it is part of the code block executed when the condition is true.

In this output, ChatGPT gave us the correct syntax as well as an explanation of the problem and solution.  Similar to the C code, these are problems that are usually picked up by the IDE or the compilation system.  However, it is not uncommon to have to make code modifications in a plain text editor or to get tracebacks that aren't very useful.  In these cases, ChatGPT can be an awesome tool for quickly fixing issues.
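As a side note, Python itself can flag an indentation problem like this before any AI is involved; a quick sketch using the built-in compile() to parse, but not run, the offending snippet:

```python
# Ask Python to parse (but not execute) the snippet; a bad indent surfaces
# as a SyntaxError (IndentationError is a subclass of SyntaxError).
bad_source = 'some_number = 2\nif some_number <= 2:\nprint("too flat")\n'

try:
    compile(bad_source, "<snippet>", "exec")
    error = None
except SyntaxError as exc:
    error = exc.msg

print(error)  # a message about an expected indented block
```

This kind of local check is a cheap first pass; ChatGPT earns its keep when the traceback alone is not enough to explain the fix.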
Now that we've explored compilation issues, let's take a look at logical errors.

Logical Errors

Logical errors are often the hardest to troubleshoot, and many stem from very simple mistakes, such as using the wrong variable in a control statement, using the wrong math equation, or anything of the sort.  These are often the hardest issues to find because, unless they are an oversight, the developer will think their logic is sound.  As such, let's explore how ChatGPT can be used to check a developer's logic.  Consider the following C code:

#include <stdio.h>
int main(){
    int a = 2;
    int b = 3;
    if (b > a){
        printf("b < a");
    }else{
        printf("a > b");
    }
    return 0;
}

The purpose of this code is to show which value is greater, a or b.  However, when the code is run with the current values, it will return the following output:

Figure 1 – C Program Output

Obviously, there is a logic error in the code.  To troubleshoot it, let's use the following prompt in ChatGPT:

This code is producing the wrong results, it is saying that b is less than a when the inverse is true.  What is wrong with this code?
#include <stdio.h>
int main(){
    int a = 2;
    int b = 3;
    if (b > a){
        printf("b < a");
    }else{
        printf("a > b");
    }
    return 0;
}

When the prompt is executed, it will return the following:

The issue with the given C code is the incorrect comparison operator in the printf statements. The condition inside the if statement checks if b is greater than a, but the printf statements incorrectly state the opposite. To fix this, you need to update the printf statements to reflect the correct condition.
Here's the corrected code:

#include <stdio.h>
int main() {
    int a = 2;
    int b = 3;
    if (b > a) {
        printf("b > a");
    } else {
        printf("a >= b");
    }
    return 0;
}

Now the code will output "b > a" when b is indeed greater than a, and "a >= b" otherwise.

Again, ChatGPT gave a very clear and concise diagnosis and solution.  In short, the printf statements did not match the condition in the if statement.

Now, let's look at a snippet in Python. Consider the following code:

def testFunction():
    sum = 0
    for i in range(10):
        sum += i
        print(i)
    testFunction()

In short, if you tried to run this code, nothing would happen. A Python developer would probably catch this error quickly, but let's see what ChatGPT has to say. To start, let's use the following prompt:

This function will not execute
def testFunction():
    sum = 0
    for i in range(10):
        sum += i
        print(i)
    testFunction()

After running this prompt, you should get something akin to the following:

The Python function you provided has a recursive call to itself (testFunction()) within its own body. This recursive call causes an infinite recursion, which results in a "RecursionError: maximum recursion depth exceeded" error. If you intended to have a recursive function, you should define the base case that stops the recursion. However, in this case, it seems like you want to call the testFunction() only once, without recursion. To fix the code, you can remove the testFunction() call from within the function body. Here's the corrected version:

def testFunction():
    sum = 0
    for i in range(10):
        sum += i
        print(i)

testFunction()

Now the function will execute correctly, iterating through the numbers 0 to 9 and printing each number on a separate line, without causing any recursion errors.

This time the diagnosis isn't as clear, because there are a couple of errors that could be at issue.
The first possibility is that ChatGPT thinks this might be a recursive function, in which case it suggests adding a base case to exit the recursion. However, if you read the output a little closer, it recognizes that the developer is probably just trying to call the function once. It points out that the function call is embedded in the function body itself and provides correct, working code. As such, it both diagnosed and fixed the problem.

Key Takeaways

ChatGPT is an excellent way to troubleshoot code. It should be noted that the code in this tutorial was relatively simple and short. ChatGPT is excellent at troubleshooting small snippets, for example, methods or perhaps even whole classes. However, for extremely complex problems, that is, problems that require many lines of code to be examined, ChatGPT may not be the optimal tool, because all of those lines have to be pasted into the prompt. Depending on the problem, ChatGPT may get confused by the code, and complex prompts may have to be engineered to find the defect. However, if you have a rough idea of where the defect originates, such as which class file, it may be worthwhile to run that code through ChatGPT. If nothing else, it will probably give you a fresh perspective and, at the very least, point you in the right direction.

The key to using ChatGPT as a troubleshooting tool is giving it the proper information. As we saw, a compilation error only needed the source code; however, that prompt could have been improved with a description of the problem. On the other hand, to get the most out of logic errors, you're going to want to include the following at a minimum:

- The programming language
- The code (at least the suspected offending code)
- A description of the problem
- Any other relevant information

So far, the more information you provide to ChatGPT, the better the results, though as we saw, a short description of the problem was enough to handle the logic errors.
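The checklist above can be folded into a small helper that assembles a troubleshooting prompt before it is pasted into ChatGPT. This is a hypothetical sketch, not part of any ChatGPT API; the function and parameter names are made up for illustration:

```python
def build_troubleshooting_prompt(language, code, description, extra=""):
    """Assemble the minimum ingredients for a logic-error prompt:
    the language, the suspect code, and a description of the problem."""
    parts = [
        f"This {language} code has a logic error: {description}",
        "What is wrong with this code?",
        code,
    ]
    if extra:  # any other relevant information (inputs, expected output, ...)
        parts.append(f"Additional context: {extra}")
    return "\n\n".join(parts)
```

A template like this mainly guards against forgetting one of the ingredients; the wording itself can (and should) still be refined by hand.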
Now, you could get away without specifying the problem, but when it comes to logical errors, it is wise to at least give a short description of it. ChatGPT is not infallible, and as we saw with the Python function, it wasn't sure whether the function was meant to be recursive. Much like a human, it needs to know as much about the problem as possible to accurately diagnose it.

Summary

In all, ChatGPT is a great tool for troubleshooting code. It is ideal for compilation errors when tracebacks are not useful or not available. As a tool for troubleshooting logical errors, ChatGPT can also be very useful; however, more information is required for it to accurately diagnose the problem. Again, the examples in this tutorial are very simple and straightforward. The goal was simply to demonstrate what kind of prompts can be used and the results they produce. However, as was seen with the Python function, a complex code block can and probably will confuse the AI. This means that, as a user, you have to provide as much detailed information as you can to ChatGPT. It is also important to remember that no matter how you use the system, you will still need to apply critical thinking and detective work yourself to hunt down the problem. ChatGPT is by no means a replacement for human developers, at least not yet. It is best thought of as another set of eyes on a problem, not a one-stop solution.

Author Bio

M.T. White has been programming since the age of 12. His fascination with robotics flourished when he was a child programming microcontrollers such as Arduino. M.T. currently holds an undergraduate degree in mathematics and a master's degree in software engineering, and is currently working on an MBA in IT project management. M.T. is currently working as a software developer for a major US defense contractor and is an adjunct CIS instructor at ECPI University.
His background mostly stems from the automation industry, where he programmed PLCs and HMIs for many different types of applications. M.T. has programmed many different brands of PLCs over the years and has developed HMIs using many different tools.

Author of the book: Mastering PLC Programming
ChatGPT for Ladder Logic

M.T. White
22 Aug 2023
17 min read
Introduction

ChatGPT is slowly becoming a pivotal player in software development. It is being used by countless developers to help produce quality, robust code. However, most of these developers are using ChatGPT for text-based programming languages like C++ or Java. There are few, if any, tutorials on how ChatGPT can be utilized to write Ladder Logic code. As such, this tutorial is dedicated to exploring how and why ChatGPT can be used as a tool by traditional Ladder Logic programmers.

Why use ChatGPT for Ladder Logic?

The first step in learning how to leverage ChatGPT is understanding why to use the system. First of all, ChatGPT is not a programmer, nor is it designed to replace programmers in any way, shape, or form. However, it can be a handy tool for people who are not sure how to complete a task, need to produce some code in a crunch, and so on. To effectively use ChatGPT, a person has to know how to properly write a statement, refine that statement, and, if necessary, write subsequent statements with the right amount of information for ChatGPT to produce a useful result. In other words, a ChatGPT user still has to be competent, but when used correctly, the AI system can produce code much faster than a human can, especially if the human is inexperienced at a given task.

In terms of industrial automation, ChatGPT can be an especially attractive tool. It is no secret that many PLC programmers are not formally trained developers. It is common for PLC programmers to be maintenance technicians, electricians, or other types of engineers. In any case, many people who are asked to write complex PLC software have little more than previous experience to guide them. When faced with a complex situation and few resources, such a programmer can be left with no clear path to a solution.
This is where ChatGPT can be utilized, as a user can pose questions and task the system with finding solutions. With that, how do we use ChatGPT at a basic level?

How to use ChatGPT?

The key to using ChatGPT is writing what are called prompts. In a lay sense, a prompt is a query or command that ChatGPT runs; it can be thought of as a task you ask ChatGPT to do. For example, a person could input the following prompt into ChatGPT:

Tell me how to install a Beckhoff PLC.

In this case, ChatGPT will give you a high-level overview of the basics of installing a Beckhoff PLC. However, a prompt like this will usually not return any useful results unless you want general, high-level information. Generally, to get a viable answer, you'll need to hone the prompt with detailed information and exceptions. Learning to write effective prompts is a skill that is still in the early stages of development. It is important to remember that ChatGPT is a novel tool; the IT and automation industries as a whole are still learning how to use it effectively. However, a general flow for writing and refining prompts can be summarized in Figure 1.

Figure 1 – Prompt Refining Flow

This is just a general logic flow for massaging a prompt until it gives the desired result. Sometimes it is necessary, and easier, to run subsequent prompts instead of refining existing ones. Either way, it may take a couple of tries to get what you need out of ChatGPT. Regardless, to explore how to use ChatGPT, let's start by creating a basic AND gate.

AND Configuration

As everyone knows, an AND configuration is simply two contacts in series. For the coil to turn on, both contacts have to be on at the same time. This is a relatively simple Ladder Logic program to write and understand, so we're going to use it as a test for ChatGPT.
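The refine-or-accept loop of Figure 1 can be sketched in plain Python. Everything here is hypothetical scaffolding rather than a real ChatGPT client: ask_chatgpt and looks_acceptable are stand-ins the caller would supply (for example, a function that calls the chat API, and a human or scripted check on the reply):

```python
def refine(initial_prompt, ask_chatgpt, looks_acceptable, max_rounds=3):
    """Run a prompt and keep refining it until the reply is acceptable.

    ask_chatgpt sends a prompt and returns the reply text; looks_acceptable
    decides whether the reply is good enough to stop iterating.
    """
    prompt = initial_prompt
    reply = ""
    for _ in range(max_rounds):
        reply = ask_chatgpt(prompt)
        if looks_acceptable(reply):
            return reply
        # Refinement step: fold what was wrong back into the next prompt,
        # mirroring the "add detail and exceptions" advice above.
        prompt = (f"{initial_prompt}\n"
                  f"The previous answer included more than was asked for: {reply!r}. "
                  f"Only include what was asked for.")
    return reply  # best effort after max_rounds
```

The design choice worth noting is that each failed reply is folded back into the next prompt, which is exactly the manual workflow the flow chart describes.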
To begin, we're going to use the following as a prompt:

Write a ladder logic program for an AND gate

When this prompt is run, it'll generate the following network:

NETWORK
--| |--[ ]---[ ]---( )
  |
  |----[ ]---( )

Figure 2 – AND Gate: Attempt 1 With Description

In this example, an AND gate is produced, but ChatGPT also generates an extra rung that wasn't asked for and isn't necessary for the task at hand. It is not easy to determine what the extra rung does or why it was included; as such, this is a prime example of needing to refine the prompt to produce only what is needed. To do this, let's refine the prompt a little more to match the following:

Write only the necessary contacts and coils for an AND gate in Ladder Logic

When this query is run, it will produce the network in Figure 3:

NETWORK
--| |--[ ]---[ ]---( )

Figure 3 – Refined AND Gate: Attempt 2 With Description

In this refined attempt, ChatGPT produces a very simple network of contacts in series. There are no extra rungs or ambiguous instructions. The moral of this exercise is that we started with a general prompt that produced a working but cluttered result, refined that prompt, and ended with a clean network containing only the instructions necessary for a working AND gate. With that, let's look at something a little more complex: timer networks.

Timer Programs

As every PLC programmer knows, most PLC programs include a timer of some kind. So, with that in mind, let's explore how ChatGPT handles timer programs. To begin, let's have ChatGPT generate a simple 10-minute timer using the following prompt:

Write a ladder logic program for a 10 minute timer.

Depending on your usage of ChatGPT, you may get a result without the variables labeled, or even in Structured Text.
However, the prompt can be adjusted to produce the desired output by using subsequent prompts akin to the following:

I need this in ladder logic

or,

assign the variables in the diagram

As stated before, you won't have to re-write the initial prompt; you can run these statements as if they were new prompts. Depending on the prompts you run, your result should be something akin to the following:

VAR
    Timer: TON;         (* Timer for the 10-minute duration *)
    StartButton: BOOL;  (* Input for starting the timer *)
    Done: BOOL;         (* Output to indicate when the timer is done *)
END_VAR

NETWORK
-----[StartButton]----[/Timer]-----
   |                           |
---|---------------------------|----
 NO                            |
                               |
                          ---[Done]---

In the above ladder logic program:

- StartButton represents the input contact that starts the timer.
- /Timer represents the timer coil, indicating that the timer is active and counting.
- Done represents the output contact, which becomes closed (energized) when the timer reaches 10 minutes.

When the StartButton input contact is activated, the timer starts counting. Once the timer reaches 10 minutes, the Done output contact closes, indicating that the timer is done. When the StartButton input contact is deactivated, the timer stops counting, and the Done output contact opens.

Please note that the specific syntax and representation of timers may vary depending on the PLC programming software or the manufacturer's conventions. Adjust the syntax and parameters as per the specific requirements of your PLC system.

As anyone who has programmed in Ladder Logic before can see, this isn't standard notation for a timer function.
However, you probably noticed that each prompt generates a short description of how the snippet is supposed to work. If you read that subsequent output, you can piece together the symbolic meaning of the snippet. With that, let's move on to something a little more advanced: a cascading timer network with one timer set for 10 minutes and another set for 15, where an LED is powered when the final timer turns on. To do this, we can input the following prompt:

Write a cascading timer Ladder Logic Network that has one timer set for 10 minutes and the other set for 15 minutes. Turn on an LED when both timers turn on. Give the necessary variables as well.

This prompt will produce the following:

VAR
    Timer1: TON;     // Timer 1 (10 minutes)
    Timer2: TON;     // Timer 2 (15 minutes)
    LED: BOOL;       // Output LED
END_VAR

NETWORK
    // Timer 1 (10 minutes)
    Timer1(IN:= TRUE, PT:= T#10M);  // Set Timer 1 to 10 minutes duration

    // Timer 2 (15 minutes)
    Timer2(IN:= Timer1.Q, PT:= T#15M);  // Start Timer 2 when Timer 1 completes

    // LED turns on when both timers complete
    LED := Timer1.Q AND Timer2.Q;
END_NETWORK

In this ladder logic program:

- Timer1 is a timer variable of type TON set for a 10-minute duration.
- Timer2 is a timer variable of type TON set for a 15-minute duration.
- LED is a boolean variable representing the output LED.

Here's a breakdown of the logic:

- Timer1 is triggered to start with the input IN:= TRUE and is set to a preset time of 10 minutes (PT:= T#10M).
- Timer2 is started when Timer1 completes its timing sequence (IN:= Timer1.Q), and it is set to a preset time of 15 minutes (PT:= T#15M).
- The LED is turned on (LED := Timer1.Q AND Timer2.Q) when both Timer1 and Timer2 have completed their timing sequences.

You can adjust the timing durations of the timers according to your specific requirements in your PLC programming software.
Additionally, you can connect the LED output variable to the appropriate LED indicator in your setup.

As can be seen, this isn't our desired output. We want Ladder Logic, and this is Structured Text. In short, ChatGPT got confused; if you read the description, ChatGPT thinks its result is Ladder Logic. Since this isn't what we were looking for, we can add a subsequent prompt that reads:

I need this program in Ladder Logic

This prompt will produce the following output:

NETWORK
--|----[ ]----[/]---[ ]----[/]---[ ]----[/]---[ ]--+
  |              |              |              |
  +---[ ]----[/]---[ ]----[/]---[ ]---[/]---[ ]--+
     Timer1         Timer2
   +---[ ]---+    +---[ ]---+
   |         |    |         |
   +---[/]---+    +---[/]---+
     LED            LED

In this ladder logic program:

- Each [ ] represents a normally open (NO) contact.
- / represents a normally closed (NC) contact.
- The timers Timer1 and Timer2 are cascaded to achieve the desired timing sequence.
- The LED output is controlled based on the completion of both timers.

Here's a breakdown of the logic:

- Timer1 is a timer with a preset time of 10 minutes.
- Timer2 is a timer with a preset time of 15 minutes.
- When Timer1 completes its timing sequence, it energizes the output coil [/], and the LED associated with Timer1 turns on.
- Timer2 is enabled when the output coil [/] of Timer1 is energized.
- When Timer2 completes its timing sequence, it energizes the output coil [/], and the LED associated with Timer2 turns on.
- The LED output is controlled by the parallel combination of Timer1's output coil [/] and Timer2's output coil [/]. The LED turns on only when both timers have completed.

Please note that the ladder logic diagram provided is a visual representation, and the actual implementation may vary depending on the PLC programming software or hardware you are using.

This output is closer to what we're looking for, but depending on a person's experience level, it probably won't be much help.
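Before refining the diagram further, it is worth sanity-checking the logic the prompts are describing: Timer1 runs, Timer2 starts when Timer1 finishes, and the LED comes on only when both are done. The toy simulation below is a hypothetical sketch, not any vendor's TON semantics; it counts abstract ticks instead of minutes, and Timer2 receives its first tick on the same scan in which Timer1 completes:

```python
class TonTimer:
    """Minimal stand-in for a TON timer: counts while enabled, done after preset ticks."""
    def __init__(self, preset_ticks):
        self.preset = preset_ticks
        self.elapsed = 0

    def tick(self, enabled):
        if enabled and self.elapsed < self.preset:
            self.elapsed += 1
        return self.elapsed >= self.preset  # the timer's Q (done) output

def simulate(ticks, t1_preset=10, t2_preset=15):
    """Step the cascade and return the first tick at which the LED turns on."""
    timer1, timer2 = TonTimer(t1_preset), TonTimer(t2_preset)
    for t in range(1, ticks + 1):
        q1 = timer1.tick(enabled=True)   # Timer1 is always enabled (IN := TRUE)
        q2 = timer2.tick(enabled=q1)     # Timer2 is enabled by Timer1.Q
        led = q1 and q2                  # LED := Timer1.Q AND Timer2.Q
        if led:
            return t
    return None  # LED never came on within the simulated window
```

With the default presets, simulate(30) reports the LED turning on at tick 24, since under this sketch's scan order Timer2 starts counting on the same tick Timer1 finishes; the exact boundary behavior on a real PLC depends on the platform's scan semantics.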
This particular diagram isn't very descriptive of how everything is supposed to fit together, so if a person had to translate it to specific programming software, it would probably take a while to fully digest and implement. As such, we're going to add another prompt:

I need the variables assigned to the proper contacts.

This subsequent prompt will produce the following:

NETWORK
--|----[StartButton]----[/Timer1.Enabled]---[Timer1.TimerDN]----[/Timer2.Enabled]---[Timer2.TimerDN]--+
  |                                                                                                   |
  +---[Timer1.Timer.Q]----[/LED1]---+---[Timer2.Timer.Q]----[/LED2]---+
                                    |                                 |
                                    +---[/LED]------------------------+

In this ladder logic program:

- StartButton is a boolean variable representing the input contact [StartButton] that starts the timer sequence.
- Timer1.Enabled and Timer2.Enabled are boolean variables representing the input contacts [/Timer1.Enabled] and [/Timer2.Enabled] that enable the respective timers.
- Timer1.TimerDN and Timer2.TimerDN are boolean variables representing the output contacts [Timer1.TimerDN] and [Timer2.TimerDN] that indicate the completion of the respective timers.
- Timer1.Timer.Q and Timer2.Timer.Q are boolean variables representing the output contacts [Timer1.Timer.Q] and [Timer2.Timer.Q] that indicate the running state of the respective timers.
- LED1 and LED2 are boolean variables representing the output contacts [/LED1] and [/LED2] that control the LEDs associated with Timer1 and Timer2.
- LED is a boolean variable representing the output contact [/LED] that controls the final LED indicating the completion of both timers.

Please note that the actual variable names may vary depending on the PLC programming software or hardware you are using.
You can assign these variables in your PLC programming software and connect them to the appropriate input and output terminals to implement the desired cascading timer logic.

This result is much better, as it shows how to wire the network together while the description gives a clear overview of how everything works. With this, a person with a basic knowledge of Ladder Logic will be able to piece the rungs together and port them to other PLC programming environments.

Summary

ChatGPT can be leveraged as an excellent tool to help Ladder Logic developers concoct specialized programs. However, ChatGPT is NOT a replacement for programmers, and to effectively use the system, a person must be skilled enough to write descriptive prompts and interpret the results. Though ChatGPT is an excellent tool, it does not have the intuition or the skill to fully replace a programmer.

A big part of using ChatGPT is learning to write and refine prompts, as well as subsequent follow-up prompts. Prompt writing is a developing art form that will probably shape the next iteration of software development. For now, the practice of using ChatGPT and similar systems is novel, and there aren't any definitive standards that govern how to use them effectively, especially when it comes to graphical programming such as Ladder Logic. When used by a knowledgeable person with a basic grasp of PLC programming and ChatGPT, it can be a great way of getting over hurdles that could otherwise take hours or days to solve.

Author Bio

M.T. White has been programming since the age of 12. His fascination with robotics flourished when he was a child programming microcontrollers such as Arduino. M.T. currently holds an undergraduate degree in mathematics and a master's degree in software engineering, and is currently working on an MBA in IT project management. M.T. is currently working as a software developer for a major US defense contractor and is an adjunct CIS instructor at ECPI University.
His background mostly stems from the automation industry, where he programmed PLCs and HMIs for many different types of applications. M.T. has programmed many different brands of PLCs over the years and has developed HMIs using many different tools.

Author of the book: Mastering PLC Programming
ChatGPT as a Documentation Tool

M.T. White
22 Aug 2023
14 min read
It comes as no surprise that most developers do not like writing documentation. As a result, documentation is often pushed to the side and, more often than not, haphazardly put together. This is a serious problem, since written documentation is the primary way developers communicate how a software system should work and be utilized. A poorly documented system can be rendered useless and consigned to the cyber trash heap long before its time. However, with the rise of new tools like ChatGPT, poor documentation may no longer be a problem. This article is going to explore how ChatGPT can be leveraged as a code documentation tool and examine the results to see just how well it can document code.

What is quality documentation?

If you ask five developers what they consider to be good documentation, you'll probably get five different answers. What counts as good documentation will vary among languages, IDEs, organizational coding standards, and personal preferences. As such, for this article, "good" documentation will have the following parameters:

1. The documentation will provide context on how a block of code works or how to utilize it.
2. The documentation will be logical and consistent. This means it will be formatted and worded to provide context across all the code blocks that ChatGPT documents.
3. The autogenerated code documentation will not clutter the code or distract the reader from the task at hand.

Again, it is important to remember that there are many different opinions on what quality documentation looks like, and the points above are just high-level metrics to hit.

What is documentation?

Documentation is a very broad term that can mean a lot of things. It can mean simply the documentation that comes packaged with a codebase, like JavaDocs, or it could be a whole ecosystem that ranges from low-level code comments to YouTube videos and Wikipedia articles.
In short, documentation can be anything that helps relay information from one developer to another. For this article, the term documentation will be limited to code comments and high-level documentation like JavaDocs. Most languages have their own best practices for documentation, so here ChatGPT will generate documentation for Java. Though Java is the example language, only minor tweaks, if any, should be needed to generate documentation for any other language.

JavaDocs

JavaDocs should provide high-level documentation that relays how to use a piece of code to a developer who does not have access to the codebase. This means that key features like what the methods do, their arguments, and their return types should be noted. With that in mind, let's generate some JavaDocs for the following method:

public static void add(int a, int b) {
    System.out.println(a + b);
}

We're going to generate the JavaDocs for this method using the following prompt:

Generate some javadocs for this method public static void add(int a, int b) { System.out.println(a + b); }

When the prompt is executed, it will produce the following output:

/**
 * Calculates the sum of two integers and prints the result.
 *
 * @param a The first integer operand.
 * @param b The second integer operand.
 */
public static void add(int a, int b) {
    System.out.println(a + b);
}

In this case, ChatGPT clearly defined what the arguments are, the datatypes of the arguments, and what the method is meant to do, all from a straightforward prompt. This method is very simple, though, so the next thing to explore is a more complex method.

The next method to document will be the following:

public static double divide(int a, int b) {
    double div = 0;
    if (a > b) {
        div = a / b;
    } else {
        div = b / a;
    }
    return div;
}

For this method, the following prompt will be run.
Write only JavaDocs for this method public static double divide(int a, int b) { double div = 0; if (a > b) { div = a / b ; }else { div = b / a; } return div; }

After the prompt is run, we should be met with something similar to the following:

/**
 * Calculates the division of two numbers.
 *
 * @param a The numerator.
 * @param b The denominator.
 * @return The division result.
 */

These JavaDocs look very similar to the JavaDocs for the simple method. They provide the arguments, the return type, and the purpose of the method. The one thing that is ambiguous is that ChatGPT identified a as always the numerator and b as always the denominator. From the code, we can see this isn't always true; however, a developer checking the output should be able to correct it fairly easily.

Now, at first glance, this will appear to be more trouble than it is worth. A codebase of significant size will have many classes comprised of many different methods, so copying and pasting methods one by one would be a herculean task. However, ChatGPT can still be leveraged to speed up high-level code documentation. Consider the following prompt:

Write JavaDocs for these methods public static void add(int a, int b) { System.out.println(a + b); } public static double divide(int a, int b) { double div = 0; if (a > b) { div = a / b ; }else { div = b / a; } return div; }

This prompt will produce the following results:

/**
 * Adds two numbers and prints the result.
 *
 * @param a The first number.
 * @param b The second number.
 */

/**
 * Calculates the division of two numbers and returns the result.
 *
 * @param a The numerator.
 * @param b The denominator.
 * @return The division result.
 */

Again, even with multiple methods, ChatGPT returned JavaDocs with enough information for a developer to use, with the exception of the same numerator/denominator hiccup for the division method.
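One subtlety the generated JavaDocs never mention: in the Java divide method, a / b on two ints is integer division, so the fractional part is discarded before the result is widened to a double. A quick Python sketch of the same logic makes this visible (the function names are made up for illustration, and // is used to mimic Java's int division, which matches for positive operands):

```python
def divide_like_java(a: int, b: int) -> float:
    """Mirror the Java divide() method, where a / b on ints truncates."""
    if a > b:
        div = a // b  # Java-style int division: fraction is discarded
    else:
        div = b // a
    return float(div)

def divide_true(a: int, b: int) -> float:
    """What a reader of the generated JavaDocs might expect instead."""
    return a / b if a > b else b / a
```

For example, divide_like_java(3, 2) returns 1.0 while divide_true(3, 2) returns 1.5, a behavior gap the autogenerated documentation glosses over, which is one more reason a human pass over AI-written docs is still needed.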
In all, ChatGPT did a decent job of writing JavaDocs.

Used in this manner, ChatGPT can greatly enhance the speed and quality of development. Multiple methods can be pasted into ChatGPT, and all the necessary JavaDocs are created. As anyone who has had to write this type of documentation knows, it can be very time-consuming and mistake-prone. With ChatGPT, all one has to do is input the code and let the AI system do the rest. Ultimately, ChatGPT will free up developers for more pressing tasks while producing quality JavaDocs without rudimentary errors like forgetting to include a parameter or return type. With that, another question is how well ChatGPT can write code comments.

Code Comments

In many cases, it is difficult for a third party to write effective code comments; to provide the most context, the person who writes the code should also write the comments. As such, the first thing to explore is how ChatGPT writes code comments. To do this, consider the following code snippet:

public static void add(int a, int b) {
    System.out.println(a + b);
}

This code is very basic; however, in a quality codebase, even a method this simple should be documented. This method would probably take as much time to document as it did to write, which makes it a perfect candidate for ChatGPT. To have ChatGPT document the code snippet, run the following prompt:

Write a few comments to provide context for the following method public static void add(int a, int b) { System.out.println(a + b); }

After running this prompt, ChatGPT will produce the following:

public static void add(int a, int b) {
    // This method takes two integer parameters, 'a' and 'b'.
    // It adds the values of 'a' and 'b' together and prints the result.
    System.out.println(a + b);
    // The sum of 'a' and 'b' is calculated and printed to the console using System.out.println().
    // This method does not return any value (void).
}

At first glance, one may notice that ChatGPT generated several comments. This can be a con as much as a pro, since it can bloat the codebase. A couple of the comments are unnecessary: one says that the method takes in two integers, and another says that the method returns void, neither of which contributes to the overall context of the code. However, if the prompt is modified to:

Write a comment that describes what this method does public static void add(int a, int b) { System.out.println(a + b); }

It will produce the following:

public static void add(int a, int b) {
    // This method calculates the sum of two integers, 'a' and 'b',
    // and displays the result by printing it to the console.
}

This prompt produced two comment lines that, when combined, form one complete sentence describing what the method does.

Now, this was a relatively simple method. As any programmer knows, most methods will be larger and more complex than add. As such, let's consider the division method again:

public static double divide(int a, int b) {
    double div = 0;
    if (a > b) {
        div = a / b;
    } else {
        div = b / a;
    }
    return div;
}

This method is a little more complex than the add method. It returns the quotient of a and b if a is greater than b, or the quotient of b and a otherwise. With that, if the following prompt is inputted:

Write comments to add context to this method public static double divide(int a, int b) { double div = 0; if (a > b) { div = a / b ; }else { div = b / a; } return div; }

it will produce the following:

public static double divide(int a, int b) {
    // This method calculates the division of two integers and returns the result.
    double div = 0;  // Variable to store the division result.
    if (a > b) {
        div = a / b;
        // If 'a' is greater than 'b', divide 'a' by 'b' and assign the result to 'div'.
    } else {
        div = b / a;
        // If 'b' is greater than or equal to 'a', divide 'b' by 'a' and assign the result to 'div'.
    }
    return div;  // Return the division result.
}

In short, ChatGPT again documents every line of code. The comments do provide context, but they don't add anything that can't simply be read in the code. Modifying this prompt will, at the time of writing, still produce relatively useless comments.

Finally, if a prompt like the following is used:

Write only the necessary comments for a developer to use public static double divide(int a, int b) { double div = 0; if (a > b) { div = a / b ; }else { div = b / a; } return div; }

it will yield:

// Calculates the division of two numbers and returns the result.
// Parameters:
//   a: The numerator.
//   b: The denominator.
// Returns:
//   The division result.

Of all the queries explored thus far, this one arguably yielded the best results. The comments match what the earlier queries generated, but the format is much cleaner and easier to modify. This block could easily be pasted into the codebase and the unnecessary lines deleted.

In all, it took a very specific prompt to produce even marginally acceptable code comments. At best, the comments produced didn't provide any context that could not be deduced from reading the code, and they bloated the code in a way some may find confusing. As such, for code comments, ChatGPT probably isn't the best tool: a developer will have to remove unnecessary comment lines and probably rewrite many of them as well.
There is also the issue of having to produce a prompt that is specific enough to generate proper comments.

In all, whether to use ChatGPT as a code comment generator is up to the individual. In theory, the comments produced could be leveraged in places like education, where code examples need to be heavily commented to provide context for those who may not have a background in the language. However, in terms of production code, though it will ultimately depend on the organization's coding standard, ChatGPT will not produce code comments that would be mergeable in many places.

Key Takeaways

In terms of codebase comments, ChatGPT is hit-and-miss. The comments it produced were reminiscent of a college-level developer: it commented on every line of code and only stated the obvious. Since it commented on every line, it can be argued that it bloated the codebase to a degree. Only a very specific prompt produced comments similar to what would be found in JavaDocs and what is expected by many organizations. In terms of JavaDocs, however, ChatGPT shined. The JavaDocs it produced were all very well written and provided the correct amount of information for a developer to easily digest and apply. As such, a few things can be summarized from what was explored:

1. Queries have to be very specific when it comes to code comments.
2. ChatGPT tends to produce unnecessary code comments that can bloat the codebase.
3. Depending on the type and quality of code comments desired, ChatGPT may not be the ideal tool for automatic code documentation.
4. ChatGPT produces documentation akin to JavaDocs better than comments in the codebase.

Summary

In summary, what constitutes quality code documentation is often up to a team. However, by many standards, ChatGPT tends to produce unnecessary code comments that don't add much context and can easily bloat the codebase.
However, for higher-level documentation like JavaDocs, ChatGPT is an excellent tool that provides the proper amount of information. In all, it probably isn't the best idea to use ChatGPT to generate comments for software written by a human, but it can be used to quickly produce higher-level documentation such as JavaDocs. As was seen, multiple methods can easily be documented in a matter of seconds, so for higher-level documentation, ChatGPT can be a great productivity tool that helps speed up development.

Author Bio

M.T. White has been programming since the age of 12. His fascination with robotics flourished when he was a child programming microcontrollers such as Arduino. M.T. holds an undergraduate degree in mathematics and a master's degree in software engineering, and is currently working on an MBA in IT project management. M.T. works as a software developer for a major US defense contractor and is an adjunct CIS instructor at ECPI University. His background mostly stems from the automation industry, where he programmed PLCs and HMIs for many different types of applications. M.T. has programmed many different brands of PLCs over the years and has developed HMIs using many different tools.

Author of the book: Mastering PLC Programming
Hands-On Vector Similarity Search with Milvus

Alan Bernardo Palacio
21 Aug 2023
14 min read
Introduction

In the realm of AI and machine learning, effective management of vast, high-dimensional vector data is critical. Milvus, an open-source vector database, tackles this challenge using advanced indexing for swift similarity search and analytics, catering to AI-driven applications.

Milvus operates on vectorization and quantization, converting complex raw data into streamlined high-dimensional vectors for efficient indexing and querying. Its scope spans recommendation, image recognition, natural language processing, and bioinformatics, boosting result precision and overall efficiency. Milvus impresses not just with its capabilities but also with its design flexibility, supporting diverse storage backends such as MinIO, Ceph, AWS S3, and Google Cloud Storage, alongside etcd for metadata storage.

Local deployment becomes user-friendly with Docker Compose, which manages multi-container Docker applications and suits Milvus' distributed architecture well. The next section details deploying Milvus locally via Docker Compose. Let's get started.

Standalone Milvus with Docker Compose

Setting up a local instance of Milvus involves a multi-service architecture consisting of the Milvus server, metadata storage, and an object storage server. Docker Compose provides an ideal environment for managing such a configuration conveniently and efficiently.

The Docker Compose file for deploying Milvus locally consists of three services: etcd, minio, and milvus itself. etcd provides metadata storage, minio functions as the object storage server, and milvus handles vector data processing and search.
By specifying service dependencies and environment variables, we establish seamless communication between these components. The milvus, etcd, and minio services run in isolated containers, ensuring operational isolation and enhanced security.

To launch the Milvus application, all you need to do is execute the Docker Compose file. Docker Compose manages the initialization sequence based on service dependencies and launches the entire stack with a single command. The following docker-compose.yml specifies all of the aforementioned components:

version: '3'
services:
  etcd:
    container_name: milvus-etcd
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    command: etcd --advertise-client-urls=http://127.0.0.1:2379 --listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
  minio:
    container_name: milvus-minio
    image: minio/minio:RELEASE.2022-03-17T06-34-49Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    ports:
      - "9001:9001"
      - "9000:9000"
    command: minio server /minio_data --console-address ":9001"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3
  milvus:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.3.0-beta
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    ports:
      - "19530:19530"
      - "9091:9091"
    depends_on:
      - "etcd"
      - "minio"

After we have defined the docker-compose.yml, we can deploy the services by running docker compose up -d (no build step is needed, since all three services use prebuilt images). In the next section, we'll move on to a practical example — creating sentence embeddings.
This process leverages Transformer models to convert sentences into high-dimensional vectors. These embeddings capture the semantic essence of the sentences and serve as an excellent demonstration of the sort of data that can be stored and processed with Milvus.

Creating sentence embeddings

Creating sentence embeddings involves a few steps: preparing your environment, importing the necessary libraries, and finally generating and processing the embeddings. We'll walk through each step, assuming the code is executed in a Python environment where the Milvus database is running.

First, let's start with the requirements.txt file:

transformers==4.25.1
pymilvus==2.1.0
torch==2.0.1
protobuf==3.18.0

Now let's import the packages:

import numpy as np
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel
from pymilvus import (
    connections,
    utility,
    FieldSchema, CollectionSchema, DataType,
    Collection,
)

Here, we're importing all the necessary libraries for our task: numpy and torch are used for mathematical operations and transformations, transformers is for language-model-related tasks, and pymilvus is for interacting with the Milvus server.

Next, we specify a model checkpoint ("sentence-transformers/all-MiniLM-L6-v2") that will serve as our base model for sentence embeddings, define the list of sentences to generate embeddings for, and initialize a tokenizer and model using the checkpoint.
The tokenizer will convert our sentences into tokens suitable for the model, and the model will use these tokens to generate embeddings:

# Transformer model checkpoint
model_ckpt = "sentence-transformers/all-MiniLM-L6-v2"

# Sentences for which we will compute embeddings
sentences = [
    "I took my dog for a walk",
    "Today is going to rain",
    "I took my cat for a walk",
]

# Initialize tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
model = AutoModel.from_pretrained(model_ckpt)

The model produces one embedding per token, but we need to aggregate these to obtain sentence-level embeddings. For this, we'll use a mean pooling operation.

Mean Pooling Function Definition

This function aggregates the token embeddings into sentence embeddings. It takes the model output and the attention mask (which indicates which tokens are real rather than padding and should be considered for pooling) as inputs, performs a weighted average of the token embeddings according to the attention mask, and returns the aggregated sentence embeddings:

# Mean pooling function to aggregate token embeddings into sentence embeddings
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output.last_hidden_state
    input_mask_expanded = (
        attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    )
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(
        input_mask_expanded.sum(1), min=1e-9
    )
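To build intuition for what mean pooling does, here is a small NumPy re-implementation of the same masked average. The tensor shapes and values are toy examples invented for illustration; this is not part of the pipeline itself:

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    # token_embeddings: (batch, seq_len, dim); attention_mask: (batch, seq_len)
    mask = attention_mask[..., None].astype(float)   # expand mask over the embedding dim
    summed = (token_embeddings * mask).sum(axis=1)   # sum only the real (unmasked) tokens
    counts = np.clip(mask.sum(axis=1), 1e-9, None)   # clamp, mirroring torch.clamp(min=1e-9)
    return summed / counts

emb = np.array([[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]])  # one sentence, 3 tokens, dim 2
mask = np.array([[1, 1, 0]])                            # third token is padding
print(mean_pool(emb, mask))                             # [[2. 3.]] (the padding row is ignored)
```

Without the mask, the padding row would pull the average to [4.33, 5.0]; with it, only the two real tokens contribute.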
The attention mask is used to ignore the tokens corresponding to padding during the pooling operation.

Generating Sentence Embeddings

The next snippet tokenizes the sentences, padding and truncating them as necessary, and runs the transformer model to generate token embeddings. These token embeddings are pooled using the previously defined mean pooling function to create sentence embeddings, which are normalized to ensure consistency and finally converted into Python lists to make them compatible with Milvus:

# Tokenize the sentences and compute their embeddings
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    model_output = model(**encoded_input)
sentence_embeddings = mean_pooling(model_output, encoded_input["attention_mask"])

# Normalize the embeddings
sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)

# Convert the sentence embeddings into a format suitable for Milvus
embeddings = sentence_embeddings.numpy().tolist()

With the embeddings generated, normalized, and converted into a Milvus-friendly format, they are ready for insertion.

Inserting vector embeddings into Milvus

We're now ready to interact with Milvus. In this section, we will connect to our locally deployed Milvus server, define a schema for our data, and create a collection in the Milvus database to store our sentence embeddings.
First, we establish a connection to the Milvus server and define the schema for our collection:

# Establish a connection to the Milvus server
connections.connect("default", host="localhost", port="19530")

# Define the schema for our collection
fields = [
    FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="sentences", dtype=DataType.VARCHAR, is_primary=False,
                description="The actual sentences", max_length=256),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, is_primary=False,
                description="The sentence embeddings", dim=sentence_embeddings.size()[1]),
]
schema = CollectionSchema(fields, "A collection to store sentence embeddings")

The schema includes an auto-generated primary key field, a field for the sentences themselves, and a field for the sentence embeddings, whose dimensionality is taken from the embeddings we just computed.

Create Collection, Insert Data, and Create Index

With the connection established and the schema defined, we can create our collection, organize our data to match the schema, insert it, and build an index on the embeddings to optimize future search operations.
Finally, we print the number of entities in the collection to confirm the insertion was successful:

# Create the collection in Milvus
sentence_embeddings_collection = Collection("sentence_embeddings", schema)

# Organize our data to match our collection's schema
entities = [
    sentences,    # The actual sentences
    embeddings,   # The sentence embeddings
]

# Insert our data into the collection
insert_result = sentence_embeddings_collection.insert(entities)

# Create an index to make future search queries faster
index = {
    "index_type": "IVF_FLAT",
    "metric_type": "L2",
    "params": {"nlist": 128},
}
sentence_embeddings_collection.create_index("embeddings", index)

print(f"Number of entities in Milvus: {sentence_embeddings_collection.num_entities}")

Primary keys are generated as auto IDs, so we don't need to supply them. This way, the sentences and their corresponding embeddings are stored in a Milvus collection, ready to be used for similarity searches or other tasks.

Search Based on Vector Similarity

Now that we've stored our embeddings in Milvus, let's make use of them by searching for vectors similar to some sample vectors. Before conducting a search or a query, the collection's data must be loaded into memory:

# Load the data into memory
sentence_embeddings_collection.load()

Next, the search parameters are set, specifying the metric used to calculate similarity (L2 distance in this case) and the number of clusters to examine during the search operation.
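Because the sentence embeddings were L2-normalized earlier, the L2 metric chosen for this index orders results the same way cosine similarity would: for unit vectors, the squared L2 distance equals 2 - 2*cos(a, b). A quick NumPy check of that relationship, using toy vectors independent of the Milvus API:

```python
import numpy as np

def l2_normalize(v):
    # Scale a vector to unit length, as F.normalize(p=2) did for the embeddings
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

a = l2_normalize(np.array([3.0, 4.0]))
b = l2_normalize(np.array([6.0, 8.0]))   # same direction as a, larger magnitude
c = l2_normalize(np.array([4.0, -3.0]))  # orthogonal to a

# Vectors pointing the same way collapse to zero L2 distance after normalization
print(np.linalg.norm(a - b))                                          # 0.0
# Squared L2 distance equals 2 - 2*cosine for unit vectors
print(np.isclose(np.linalg.norm(a - c) ** 2, 2 - 2 * np.dot(a, c)))   # True
```

This is why normalizing before insertion lets an L2 index rank by semantic (cosine) similarity while keeping the simpler distance metric.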
The search operation is then performed, and the results are printed out:

# Vectors to search
vectors_to_search = embeddings[-2:]
search_params = {
    "metric_type": "L2",
    "params": {"nprobe": 10},
}

# Perform the search
result = sentence_embeddings_collection.search(
    vectors_to_search, "embeddings", search_params, limit=3, output_fields=["sentences"]
)

# Print the search results
for hits in result:
    for hit in hits:
        print(f"hit: {hit}, sentence field: {hit.entity.get('sentences')}")

Here, we're searching for the embeddings most similar to the last two embeddings in our list. For each query vector, the results are limited to the top 3 matches, and the corresponding sentences of those matches are printed out.

Once we're done with our data, it's good practice to clean up by deleting entities from the collection using their primary keys.

Delete Entities by Primary Key

The following code first gets the primary keys of the entities we want to delete, queries the collection before deletion to show the entities that will be removed, performs the deletion, and then runs the same query again to confirm that the entities have been deleted:

# Get the primary keys of the entities we want to delete
ids = insert_result.primary_keys
expr = f'pk in [{ids[0]}, {ids[1]}]'

# Query before deletion
result = sentence_embeddings_collection.query(expr=expr, output_fields=["sentences", "embeddings"])
print(f"Query before delete by expr=`{expr}` -> result:\n-{result[0]}\n-{result[1]}\n")

# Delete entities
sentence_embeddings_collection.delete(expr)

# Query after deletion
result = sentence_embeddings_collection.query(expr=expr, output_fields=["sentences", "embeddings"])
print(f"Query after delete by expr=`{expr}` -> result: {result}\n")

Here, we're deleting the entities corresponding to the first two primary keys in our collection.
Before and after the deletion, we perform a query so we can see the effect of the deletion operation. Finally, we drop the entire collection from the Milvus server:

# Drop the collection
utility.drop_collection("sentence_embeddings")

Dropping the collection removes its data, schema, and indexes from the server, leaving the deployment clean for future experiments.

Conclusion

Congratulations on completing this hands-on tutorial with Milvus! You've learned how to harness the power of an open-source vector database that simplifies and accelerates AI and ML applications. Throughout this journey, you set up Milvus locally using Docker Compose, transformed sentences into high-dimensional embeddings, and conducted vector similarity searches for practical use cases.

Milvus' advanced indexing techniques empower you to efficiently store, search, and analyze large volumes of vector data. Its user-friendly design and seamless integration capabilities ensure that you can leverage its powerful features without unnecessary complexity.

As you continue exploring Milvus, you'll uncover even more possibilities for its application in diverse fields such as recommendation systems, image recognition, and natural language processing. The high-performance similarity search and analytics offered by Milvus open doors to cutting-edge AI-driven solutions.

With your newfound expertise in Milvus, you are equipped to embark on your own AI adventures, leveraging the potential of vector databases to tackle real-world challenges. Continue experimenting, innovating, and building AI-driven applications that push the boundaries of what's possible. Happy coding!

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields.
His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst & Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a mechanical engineering degree from the National University of Tucumán in 2015, founded startups, and later earned a master's degree from the Faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn