
Packt
25 Aug 2014
16 min read

Solving problems – closest good restaurant

In this article by Steven F. Lott, author of Python for Secret Agents, we will use Python to meet our secret informant at a good restaurant that's a reasonable distance from our base. In order to locate a good restaurant, we need to gather some additional information. In this case, good means a passing grade from the health inspectors. Before we can even have a meeting, we'll need to use basic espionage skills to locate the health code survey results for local restaurants.

We'll create a Python application that combines several techniques to sort through the results. We'll perform the following steps:

1. We'll start with the restaurant health score information.
2. We need to geocode the restaurant addresses if it hasn't been done already. In some cases, geocoding is done for us. In other cases, we'll be using a web service for this.
3. We need to filter and organize restaurants by good scores. We'll also need to use our haversine() function to compute the distance from our base.
4. Finally, we need to communicate this to our network, ideally using a short NAC code embedded within an image that we post to a social media site.

In many cities, the health code data is available online. A careful search will reveal a useful dataset. In other cities, the health inspection data isn't readily available online. We might have to dig quite deep to track down even a few restaurants near our base of operations.

Some cities use Yelp to publicize restaurant health code inspection data. We can read about the Yelp API to search for restaurants at the following link: http://www.yelp.com/developers/documentation

We might also find some useful data on InfoChimps at http://www.infochimps.com/tags/restaurant.

One complexity we often encounter is the use of HTML-based APIs for this kind of information. This is not intentional obfuscation, but the use of HTML complicates analysis of the data.
Parsing HTML to extract meaningful information isn't easy; we'll need an extra library to handle this.

We'll look at two approaches: good, clean data and more complex HTML data parsing. In both cases, we need to create a Python object that acts as a container for a collection of attributes. First, we'll divert to look at the SimpleNamespace class. Then, we'll use this to collect information.

Creating simple Python objects

We have a wide variety of ways to define our own Python objects. We can use the central built-in types such as dict to define an object that has a collection of attribute values. When looking at information for a restaurant, we could use something like this:

    some_place = {'name': 'Secret Base', 'address': '333 Waterside Drive'}

Since this is a mutable object, we can add attribute values and change the values of the existing attributes. The syntax is a bit clunky, though. Here's what an update to this object looks like:

    some_place['lat'] = 36.844305
    some_place['lng'] = -76.29112

One common solution is to use a proper class definition. The syntax looks like this:

    class Restaurant:
        def __init__(self, name, address):
            self.name = name
            self.address = address

We've defined a class with an initialization method, __init__(). The name of the initialization method is special, and only this name can be used. When the object is built, the initialization method is evaluated to assign initial values to the attributes of the object.

This allows us to create an object more succinctly:

    some_place = Restaurant(name='Secret Base', address='333 Waterside Drive')

We've used explicit keyword arguments. The use of name= and address= isn't required. However, as class definitions become more complex, it's often more flexible and clearer to use keyword argument values.

We can update the object nicely too, as follows:

    some_place.lat = 36.844305
    some_place.lng = -76.29112

This works out best when we have a lot of unique processing that is bound to each object.
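To make the point about processing bound to an object concrete, here is a minimal sketch of the Restaurant class with one method attached. The is_geocoded() method and the default None coordinates are hypothetical additions for illustration only; they are not part of the original example.

```python
class Restaurant:
    """A restaurant record with a little behavior attached."""

    def __init__(self, name, address):
        self.name = name
        self.address = address
        # Assumption for this sketch: coordinates start out unknown.
        self.lat = None
        self.lng = None

    def is_geocoded(self):
        """Hypothetical helper: True once both coordinates are set."""
        return self.lat is not None and self.lng is not None


some_place = Restaurant(name='Secret Base', address='333 Waterside Drive')
before = some_place.is_geocoded()

# Plain attribute assignment updates the object.
some_place.lat = 36.844305
some_place.lng = -76.29112
after = some_place.is_geocoded()

print(before, after)
```

A class earns its keep once methods like this start to accumulate; with no methods at all, the definition is mostly ceremony.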
In this case, we don't actually have any processing to associate with the attributes; we just want to collect those attributes in a tidy capsule. The formal class definition is too much overhead for such a simple problem.

Python also gives us a very flexible structure called a namespace. This is a mutable object that we can access using simple attribute names, as shown in the following code:

    from types import SimpleNamespace
    some_place = SimpleNamespace(name='Secret Base', address='333 Waterside Drive')

The syntax to create a namespace must use keyword arguments (name='The Name'). Once we've created this object, we can update it using a pleasant attribute access, as shown in the following snippet:

    some_place.lat = 36.844305
    some_place.lng = -76.29112

The SimpleNamespace class gives us a way to build an object that contains a number of individual attribute values.

We can also create a namespace from a dictionary using Python's ** notation. Here's an example:

    >>> SimpleNamespace( **{'name': 'Secret Base', 'address': '333 Waterside Drive'} )
    namespace(address='333 Waterside Drive', name='Secret Base')

The ** notation tells Python that a dictionary object contains keyword arguments for the function. The dictionary keys are the parameter names. This allows us to build a dictionary object and then use it as the arguments to a function.

Recall that JSON tends to encode complex data structures as a dictionary. Using this ** technique, we can transform a JSON dictionary into SimpleNamespace, and replace the clunky object['key'] notation with a cleaner object.key notation.

Working with HTML web services – tools

In some cases, the data we want is tied up in HTML websites. The City of Norfolk, for example, relies on the State of Virginia's VDH health portal to store its restaurant health code inspection data.

In order to make sense of the intelligence encoded in the HTML notation on the WWW, we need to be able to parse the HTML markup that surrounds the data.
Our job is greatly simplified by the use of special higher-powered weaponry; in this case, BeautifulSoup.

Start with https://pypi.python.org/pypi/beautifulsoup4/4.3.2 or http://www.crummy.com/software/BeautifulSoup/.

If we have Easy Install (or pip), we can use these tools to install BeautifulSoup. We can use Easy Install to install BeautifulSoup like this:

    sudo easy_install-3.3 beautifulsoup4

Mac OS X and GNU/Linux users will need to use the sudo command. Windows users won't use the sudo command.

Once we have BeautifulSoup, we can use it to parse the HTML code looking for specific facts buried in an otherwise cryptic jumble of HTML tags.

Before we can go on, you'll need to read the quickstart documentation and bring yourself up to speed on BeautifulSoup. Once you've done that, we'll move to extracting data from HTML web pages.

Start with http://www.crummy.com/software/BeautifulSoup/bs4/doc/#quick-start.

An alternative tool is scrapy. For information see http://scrapy.org. Also, read Instant Scrapy Web Mining and Scraping, Travis Briggs, Packt Publishing, for details on using this tool. Unfortunately, as of this writing, scrapy is focused on Python 2, not Python 3.

Working with HTML web services – getting the page

In the case of VDH health data for the City of Norfolk, the HTML scraping is reasonably simple. We can leverage the strengths of BeautifulSoup to dig into the HTML page very nicely.

Once we've created a BeautifulSoup object from the HTML page, we will have an elegant technique to navigate down through the hierarchy of the HTML tags. Each HTML tag name (html, body, and so on) is also a BeautifulSoup query that locates the first instance of that tag.

An expression such as soup.html.body.table can locate the first <table> in the HTML <body> tag. In the case of the VDH restaurant data, that's precisely the data we want.

Once we've found the table, we need to extract the rows.
The HTML tag for each row is <tr> and we can use the BeautifulSoup table.find_all("tr") expression to locate all rows within a given <table> tag. Each tag's text is an attribute, .text. If the tag has attributes, we can treat the tag as if it's a dictionary to extract the attribute values.

We'll break down the processing of the VDH restaurant data into two parts: the web services query that builds Soup from HTML and the HTML parsing to gather restaurant information.

Here's the first part, which is getting the raw BeautifulSoup object:

    import urllib.parse
    import urllib.request
    from bs4 import BeautifulSoup

    scheme_host = "http://healthspace.com"

    def get_food_list_by_name():
        path = "/Clients/VDH/Norfolk/Norolk_Website.nsf/Food-List-ByName"
        form = {
            "OpenView": "",
            "RestrictToCategory": "FAA4E68B1BBBB48F008D02BF09DD656F",
            "count": "400",
            "start": "1",
        }
        query = urllib.parse.urlencode(form)
        with urllib.request.urlopen(scheme_host + path + "?" + query) as data:
            # Naming the parser explicitly avoids a warning in newer bs4 releases.
            soup = BeautifulSoup(data.read(), "html.parser")
        return soup

This repeats the web services queries we've seen before. We've separated three things here: the scheme_host string, the path string, and query. The reason for this is that our overall script will be using the scheme_host with other paths. And we'll be plugging in lots of different query data.

For this basic food_list_by_name query, we've built a form that will get 400 restaurant inspections. The RestrictToCategory field in the form has a magical key that we must provide to get the Norfolk restaurants. We found this via a basic web espionage technique: we poked around on the website and checked the URLs used when we clicked on each of the links. We also used the Developer mode of Safari to explore the page source.

In the long run, we want all of the inspections. To get started, we've limited ourselves to 400 so that we don't spend too long waiting to run a test of our script.

The response object was used by BeautifulSoup to create an internal representation of the web page.
We assigned this to the soup variable and returned it as the result of the function.

In addition to returning the soup object, it can also be instructive to print it. It's quite a big pile of HTML. We'll need to parse this to get the interesting details away from the markup.

Working with HTML web services – parsing a table

Once we have a page of HTML information parsed into a BeautifulSoup object, we can examine the details of that page. Here's a function that will locate the table of restaurant inspection details buried inside the page.

We'll use a generator function to yield each individual row of the table, as shown in the following code:

    def food_table_iter(soup):
        """Columns are 'Name', '', 'Facility Location', 'Last Inspection',
        Plus an unnamed column with a RestrictToCategory key
        """
        table = soup.html.body.table
        for row in table.find_all("tr"):
            columns = [td.text.strip() for td in row.find_all("td")]
            for td in row.find_all("td"):
                if td.a:
                    url = urllib.parse.urlparse(td.a["href"])
                    form = urllib.parse.parse_qs(url.query)
                    columns.append(form['RestrictToCategory'][0])
            yield columns

Notice that this function begins with a triple-quoted string. This is a docstring and it provides documentation about the function. Good Python style insists on a docstring in every function. The Python help system will display the docstrings for functions, modules, and classes. We've omitted them from most of our examples to save space. Here, we included one because the results of this particular iterator can be quite confusing.

This function requires a parsed Soup object. The function uses simple tag navigation to locate the first <table> tag in the HTML <body> tag. It then uses the table's find_all() method to locate all of the rows within that table.

For each row, there are two pieces of processing. First, a list comprehension is used to find all the <td> tags within that row. Each <td> tag's text is stripped of excess white space and the collection forms a list of cell values.
In some cases, this kind of processing is sufficient.

In this case, however, we also need to decode an HTML <a> tag, which has a reference to the details for a given restaurant. We use a second find_all("td") expression to examine each column again. Within each column, we check for the presence of an <a> tag using a simple if td.a: statement. If there is an <a> tag, we can get the value of the href attribute on that tag. When looking at the source HTML, this is the value inside the quotes of <a href="">.

The value of an HTML href attribute is a URL. We don't actually need the whole URL. We only need the query string within the URL. We've used the urllib.parse.urlparse() function to extract the various bits and pieces of the URL. The value of the url.query attribute is just the query string, after the ?.

It turns out, we don't even want the entire query string; we only want the value for the key RestrictToCategory. We can parse the query string with urllib.parse.parse_qs() to get a form-like dictionary, which we assigned to the variable form. This function is the inverse of urllib.parse.urlencode(). The dictionary built by the parse_qs() function associates each key with a list of values. We only want the first value, so we use form['RestrictToCategory'][0] to get the key required for a restaurant.

Since this food_table_iter() function is a generator, it must be used with a for statement or another generator function. We can use this function with a for statement as follows:

    for row in food_table_iter(get_food_list_by_name()):
        print(row)

This prints each row of data from the HTML table. It starts like this:

    ['Name', '', 'Facility Location', 'Last Inspection']
    ["Todd's Refresher", '', '150 W. Main St #100', '6-May-2014', '43F6BE8576FFC376852574CF005E3FC0']
    ["'Chick-fil-A", '', '1205 N Military Highway', '13-Jun-2014', '5BDECD68B879FA8C8525784E005B9926']

This goes on for 400 locations.

The results are unsatisfying because each row is a flat list of attributes.
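The query-string handling described above can be tried in isolation, without touching the network. The following is a minimal sketch; the href value is a made-up example shaped like the links in the VDH table, not a real URL from the site.

```python
from urllib.parse import urlparse, parse_qs, urlencode

# Made-up href for illustration; only its shape matters here.
href = ("/Clients/Example/Food-InspectionDetails"
        "?OpenView&RestrictToCategory=43F6BE8576FFC376852574CF005E3FC0")

url = urlparse(href)

# parse_qs() maps each key to a list of values.
# keep_blank_values=True keeps the bare "OpenView" flag in the result.
form = parse_qs(url.query, keep_blank_values=True)
category = form['RestrictToCategory'][0]

# urlencode() with doseq=True goes the other way, from lists back to text.
round_trip = urlencode(form, doseq=True)

print(url.path)
print(category)
print(round_trip)
```

Because parse_qs() always builds lists, the [0] index is needed even when a key appears only once in the query string.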
The name is in row[0] and the address in row[2]. This kind of reference to columns by position can be obscure. It would be much nicer to have named attributes. If we convert the results to a SimpleNamespace object, we can then use the row.name and row.address syntax.

Making a simple Python object from columns of data

We really want to work with an object that has easy-to-remember attribute names and not a sequence of anonymous columns. Here's a generator function that will build a SimpleNamespace object from a sequence of values produced by a function such as the food_table_iter() function:

    def food_row_iter(table_iter):
        heading = next(table_iter)
        assert ['Name', '', 'Facility Location', 'Last Inspection'] == heading
        for row in table_iter:
            yield SimpleNamespace(
                name=row[0], address=row[2], last_inspection=row[3],
                category=row[4]
            )

This function's argument must be an iterator like food_table_iter(get_food_list_by_name()). The function uses next(table_iter) to grab the first row, since that's only going to be a bunch of column titles. We'll assert that the column titles really are the standard column titles in the VDH data. If the assertion ever fails, it's a hint that VDH web data has changed.

For every row after the first row, we build a SimpleNamespace object by taking the specific columns from each row and assigning them nice names.

We can use this function as follows:

    soup = get_food_list_by_name()
    raw_columns = food_table_iter(soup)
    for business in food_row_iter(raw_columns):
        print(business.name, business.address)

The processing can now use nice attribute names, for example, business.name, to refer to the data we extracted from the HTML page. This makes the rest of the programming meaningful and clear.

What's also important is that we've combined two generator functions. The food_table_iter() function will yield small lists built from HTML table rows.
The food_row_iter() function expects a sequence of lists that can be iterated, and will build SimpleNamespace objects from that sequence of lists. This defines a kind of composite processing pipeline built from smaller steps. Each row of the HTML table that starts in food_table_iter() is touched by food_row_iter() and winds up being processed by the print() function.

Continuing down this path

The next steps are also excellent examples of the strengths of Python for espionage purposes. We need to geocode the restaurant addresses if it hasn't been done already. In some cases, geocoding is done for us. In other cases, we'll be using a web service for this. It varies from city to city whether or not the data is geocoded.

One popular geocoding service (Google) can be accessed using Python's standard urllib.request and json modules. In a few lines of code we can extract the location of an address.

We'll also need to implement the haversine formula for computing the distances between two points on the globe. This is not only easy, but the code is available on the web as a tidy example of good Python programming. It's well worth an agent's time to search for this code.

Once we have the raw data on good restaurants close to our secret lair, we still need to filter and make the final decision. Given the work done in the previous steps, it's a short, clear Python loop that will show a list of restaurants with top health scores within short distances of our lair.

As we noted above, we'll also need to communicate this. We can use steganography to encode a message into an image file. In addition to data scraping from the web, and using web services, Python is also suitable for this kind of bit-and-byte-level fiddling with the internals of a TIFF image.

Every secret agent can leverage Python for gathering, analyzing and distributing information.

Summary

In this article, we used SimpleNamespace to build simple Python objects and BeautifulSoup to extract restaurant inspection data from an HTML web service. We also outlined the remaining steps: geocoding, computing distances, filtering, and communicating the result.
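As a follow-up to the distance and filtering steps outlined under Continuing down this path, here is a minimal sketch of the haversine formula and a short filtering loop. The function signature, the mean Earth radius of 6371 km, the 5 km cutoff, and the two sample places are all assumptions made for illustration; the author's own haversine() function may differ.

```python
from math import radians, sin, cos, asin, sqrt
from types import SimpleNamespace

EARTH_RADIUS_KM = 6371.0  # mean radius; an approximation


def haversine(lat1, lng1, lat2, lng2, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = radians(lat2 - lat1)
    d_lambda = radians(lng2 - lng1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * radius * asin(sqrt(a))


base = SimpleNamespace(lat=36.844305, lng=-76.29112)

# Made-up places with made-up coordinates, standing in for geocoded rows.
places = [
    SimpleNamespace(name='Near Cafe', lat=36.85, lng=-76.29),
    SimpleNamespace(name='Far Diner', lat=37.5, lng=-77.4),
]

# Keep only the places within 5 km of the base.
nearby = [
    p.name for p in places
    if haversine(base.lat, base.lng, p.lat, p.lng) <= 5.0
]
print(nearby)
```

Passing the radius as a parameter makes it easy to switch units, for example to miles, without touching the formula itself.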

Packt
15 Oct 2009
20 min read

Measures and Measure Groups in Microsoft Analysis Services: Part 2

Measure groups

All but the simplest data warehouses will contain multiple fact tables, and Analysis Services allows you to build a single cube on top of multiple fact tables through the creation of multiple measure groups. These measure groups can contain different dimensions and be at different granularities, but so long as you model your cube correctly, your users will be able to use measures from each of these measure groups in their queries easily and without worrying about the underlying complexity.

Creating multiple measure groups

To create a new measure group in the Cube Editor, go to the Cube Structure tab and right-click on the cube name in the Measures pane and select 'New Measure Group'. You'll then need to select the fact table to create the measure group from and then the new measure group will be created; any columns that aren't used as foreign key columns in the DSV will automatically be created as measures, and you'll also get an extra measure of aggregation type Count. It's a good idea to delete any measures you are not going to use at this stage.

Once you've created a new measure group, BIDS will try to set up relationships between it and any existing dimensions in your cube based on the relationships you've defined in your DSV. Since doing this manually can be time-consuming, this is another great reason for defining relationships in the DSV. You can check the relationships that have been created on the Dimension Usage tab of the Cube Editor.

In Analysis Services 2005, it was true in some cases that query performance was better on cubes with fewer measure groups, and that breaking a large cube with many measure groups up into many smaller cubes with only one or two measure groups could result in faster queries. This is no longer the case in Analysis Services 2008.
Although there are other reasons why you might want to consider creating separate cubes for each measure group, this is still something of a controversial subject amongst Analysis Services developers.

The advantages of a single cube approach are:

- All of your data is in one place. If your users need to display measures from multiple measure groups, or you need to create calculations that span measure groups, everything is already in place.
- You only have one cube to manage security and calculations on; with multiple cubes the same security and calculations might have to be duplicated.

The advantages of the multiple cube approach are:

- If you have a complex cube but have to use Standard Edition, you cannot use Perspectives to hide complexity from your users. In this case, creating multiple cubes might be a more user-friendly approach.
- Depending on your requirements, security might be easier to manage with multiple cubes. It's very easy to grant or deny a role access to a cube; it's much harder to use dimension security to control which measures and dimensions in a multi-measure group cube a role can access.
- If you have complex calculations, especially MDX Script assignments, it's too easy to write a calculation that has an effect on part of the cube you didn't want to alter. With multiple cubes, the chances of this happening are reduced.

Creating measure groups from dimension tables

Measure groups don't always have to be created from fact tables. In many cases, it can be useful to build measure groups from dimension tables too. One common scenario where you might want to do this is when you want to create a measure that counts the number of days in the currently selected time period, so if you had selected a year on your Time dimension's hierarchy, the measure would show the number of days in the year.
You could implement this with a calculated measure in MDX, but it would be hard to write code that worked in all possible circumstances, such as when a user multi-selects time periods. In fact, it's a better idea to create a new measure group from your Time dimension table containing a new measure with AggregateFunction Count, so you're simply counting the number of days as the number of rows in the dimension table. This measure will perform faster and always return the values you expect. This post on Mosha Pasumansky's blog discusses the problem in more detail: http://tinyurl.com/moshadays

MDX formulas vs pre-calculating values

If you can somehow model a calculation into the structure of your cube, or perform it in your ETL, you should do so in preference to doing it in MDX, but only so long as you do not compromise the functionality of your cube. A pure MDX approach will be the most flexible and maintainable since it only involves writing code, and if calculation logic needs to change, then you just need to redeploy your updated MDX Script; doing calculations upstream in the ETL can be much more time-consuming to implement and if you decide to change your calculation logic, then it could involve reloading one or more tables. However, an MDX calculation, even one that is properly tuned, will of course never perform as well as a pre-calculated value or a regular measure. The day count measure, discussed in the previous paragraph, is a perfect example of where a cube-modeling approach trumps MDX. If your aim was to create a measure that showed average daily sales, though, it would make no sense to try to pre-calculate all possible values since that would be far too time-consuming and would result in a non-aggregatable measure. The best solution here would be a hybrid: create real measures for sales and day count, and then create an MDX calculated measure that divided the former by the latter.
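The arithmetic behind the hybrid approach can be shown outside Analysis Services. The following Python sketch is only a toy model with made-up numbers, not MDX and not an Analysis Services API: it keeps two aggregatable totals (sales and day count) and divides them at query time, then compares the result with naively averaging pre-calculated averages.

```python
# Made-up daily sales for two periods of different lengths.
sales_by_day = {
    "Jan": [100.0] * 31,  # 31 days at 100 per day
    "Feb": [200.0] * 28,  # 28 days at 200 per day
}


def average_daily_sales(months):
    """Divide the summed sales total by the summed day count."""
    total_sales = sum(sum(sales_by_day[m]) for m in months)  # aggregatable
    day_count = sum(len(sales_by_day[m]) for m in months)    # aggregatable
    return total_sales / day_count


jan = average_daily_sales(["Jan"])
feb = average_daily_sales(["Feb"])
both = average_daily_sales(["Jan", "Feb"])  # a multi-select of two months

# Averaging the two pre-calculated averages ignores the differing day counts.
naive = (jan + feb) / 2

print(jan, feb, both, naive)
```

The two totals add up correctly for any selection of periods, whereas the pre-calculated averages cannot simply be combined; this is what makes an average a non-aggregatable value.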
However, it's always necessary to consider the type of calculation, the volume of data involved and the chances of the calculation algorithm changing in the future before you can make an informed decision on which approach to take.

Handling different dimensionality

When you have different measure groups in a cube, they are almost always going to have different dimensions associated with them; indeed, if you have measure groups that have identical dimensionality, you might consider combining them into a single measure group if it is convenient to do so. As we've already seen, the Dimension Usage tab shows us which dimensions have relationships with which measure groups.

When a dimension has a relationship with a measure group it goes without saying that making a selection on that dimension will affect the values that are displayed for measures on that measure group. But what happens to measures when you make a selection on a dimension that has no relationship with a measure group? In fact, you have two options here, controlled by the IgnoreUnrelatedDimensions property of a measure group:

- IgnoreUnrelatedDimensions=False displays a null value for all members below the root (the intersection of all of the All Members or default members on every hierarchy) of the dimension, except the Unknown member.
- IgnoreUnrelatedDimensions=True repeats the value displayed at the root of the dimension for every member on every hierarchy of the dimension. This is the default state.
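The two settings described above can be mimicked with a few lines of Python. This is a toy model only, not an Analysis Services API: the function name, the "All" label for the root member, and the sample value are assumptions made for illustration, and the special case of the Unknown member is deliberately left out.

```python
def unrelated_dimension_value(root_value, member, ignore_unrelated=True):
    """Value shown for a member of a dimension unrelated to the measure group.

    Toy model: "All" stands for the root of the dimension. The Unknown
    member, which the text treats as an exception, is not modeled here.
    """
    if member == "All" or ignore_unrelated:
        # At the root, or when unrelated dimensions are ignored,
        # the root value is shown (repeated for every member).
        return root_value
    # Otherwise members below the root show null.
    return None


members = ["All", "2008", "2009"]
repeated = [unrelated_dimension_value(500, m, ignore_unrelated=True) for m in members]
nulled = [unrelated_dimension_value(500, m, ignore_unrelated=False) for m in members]

print(repeated)
print(nulled)
```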
The screenshot below shows what happens for two otherwise identical measures from measure groups which have IgnoreUnrelatedDimensions set to True and to False when they're displayed next to a dimension they have no relationship with:

It's usually best to keep IgnoreUnrelatedDimensions set to True since if the users are querying measures from multiple measure groups, then they don't want some of their selected measures suddenly returning null if they slice by a dimension that has a regular relationship with their other selected measures.

Handling different granularities

Even when measure groups share the same dimensions, they may not share the same granularity. For example, we may hold sales information in one fact table down to the day level, but also hold sales quotas in another fact table at the quarter level. If we created measure groups from both these fact tables, then they would both have regular relationships with our Time dimension but at different granularities.

Normally, when you create a regular relationship between a dimension and a measure group, Analysis Services will join the columns specified in the KeyColumns property of the key attribute of the dimension with the appropriate foreign key columns of the fact table (note that during processing, Analysis Services won't usually do the join in SQL, it does it internally). However, when you have a fact table of a higher granularity, you need to change the granularity attribute property of the relationship to choose the attribute from the dimension you do want to join on instead:

In the previous screenshot, we can see an amber warning triangle telling us that by selecting a non-key attribute, the server may have trouble aggregating measure values. What does this mean exactly? Let's take a look at the attribute relationships defined on our Time dimension again:

If we're loading data at the Quarter level, what do we expect to see at the Month and Date level?
We can only expect to see useful values at the level of the granularity attribute we've chosen, and for only those attributes whose values can be derived from that attribute; this is yet another good reason to make sure your attribute relationships have been optimized. Below the granularity attribute, we've got the same options regarding what gets displayed as we had with dimensions that have no relationship at all with a measure group: either repeated values or null values. The IgnoreUnrelatedDimensions property is again used to control this behavior.

Unfortunately, the default True setting for IgnoreUnrelatedDimensions is usually not the option you want to use in this scenario (users usually prefer to see nulls below the granularity of a measure in our experience) and this may conflict with how we want to set IgnoreUnrelatedDimensions to control the behavior of dimensions which have no relationship with a measure group. There are ways of resolving this conflict such as using MDX Script assignments to set cell values to null or by using the ValidMeasure() MDX function, but none are particularly elegant.

Non-aggregatable measures: a different approach

We've already seen how we can use parent/child hierarchies to load non-aggregatable measure values into our cube. However, given the problems associated with using parent/child hierarchies and knowing what we now know about measure groups, let's consider a different approach to solving this problem.

A non-aggregatable measure will have, by its very nature, data stored for many different granularities of a dimension. Rather than storing all of these different granularities of values in the same fact table, we could create multiple fact tables for each granularity of value. Having built measure groups from these fact tables, we would then be able to join our dimension to each of them with a regular relationship but at different granularities.
We'd then be in the position of having multiple measures representing the different granularities of a single, logical measure. What we actually want is a single non-aggregatable measure, and we can get this by using MDX Script assignments to combine the different granularities.

Let's say we have a regular (non-parent/child) dimension called Employee with three attributes Manager, Team Leader and Sales Person, and a logical non-aggregatable measure called Sales Quota appearing in three measure groups as three measures called Sales Amount Quota_Manager, Sales Amount Quota_TeamLead and Sales Amount Quota for each of these three granularities. Here's a screenshot showing what a query against this cube would show at this stage:

We can combine the three measures into one like this:

    SCOPE([Measures].[Sales Amount Quota]);
        SCOPE([Employee].[Salesperson].[All]);
            THIS=[Measures].[Sales Amount Quota_TeamLead];
        END SCOPE;
        SCOPE([Employee].[Team Lead].[All]);
            THIS=[Measures].[Sales Amount Quota_Manager];
        END SCOPE;
    END SCOPE;

This code takes the lowest granularity measure Sales Amount Quota, and then overwrites it twice: the first assignment replaces all of the values above the Sales Person granularity with the value of the measure containing Sales Amount Quota for Team Leaders; the second assignment then replaces all of the values above the Team Leader granularity with the value of the measure containing Sales Quotas for Managers. Once we've set Visible=False for the Sales Amount Quota_TeamLead and Sales Amount Quota_Manager measures, we're left with just the Sales Amount Quota measure visible, thus displaying the non-aggregatable values that we wanted. The user would then see this:

Using linked dimensions and measure groups

Creating linked dimensions and measure groups allows you to share the same dimensions and measure groups across separate Analysis Services databases, and the same measure group across multiple cubes.
To do this, all you need to do is to run the 'New Linked Object' wizard from the Cube Editor, either by clicking on the button in the toolbar on the Cube Structure or Dimension Usage tabs, or by selecting it from the right-click menu in the Measures pane of the Cube Structure tab.

Doing this has the advantage of reducing the amount of processing and maintenance needed: instead of having many identical dimensions and measure groups to maintain and keep synchronized, all of which need processing separately, you can have a single object which only needs to be changed and processed once. At least that's the theory; in practice, linked objects are not as widely used as they could be because there are a number of limitations in their use:

- Linked objects represent a static snapshot of the metadata of the source object, and any changes to the source object are not passed through to the linked object. So for example, if you create a linked dimension and then add an attribute to the source dimension, you then have to delete and recreate the linked dimension; there's no option to refresh a linked object.
- You can also import the calculations defined in the MDX Script of the source cube using the wizard. However, you can only import the entire script and this may include references to objects present in the source cube that aren't in the target cube, and which may need to be deleted to prevent errors. The calculations that remain will also need to be updated manually when those in the source cube are changed, and if there are a lot, this can add an unwelcome maintenance overhead.
- A linked measure group can only be used with dimensions from the same database as the source measure group. This isn't a problem when you're sharing measure groups between cubes in the same database, but could be if you wanted to share measure groups across databases.
- As you would expect, when you query a linked measure group, your query is redirected to the source measure group. If the source measure group is on a different server, this may introduce some latency and hurt query performance. Analysis Services does try to mitigate this by doing some caching on the linked measure group's database, though. By default, it will cache data on a per-query basis, but if you change the RefreshPolicy property from ByQuery to ByInterval you can specify a time limit for data to be held in cache.

Linked objects can be useful when cube development is split between multiple development teams, or when you need to create multiple cubes containing some shared data, but, in general, we recommend against using them widely because of these limitations.

Role-playing dimensions

It's also possible to add the same dimension to a cube more than once, and give each instance a different relationship to the same measure group. For example, in our Sales fact table, we might have several different foreign key columns that join to our Time dimension table: one which holds the date an order was placed on, one which holds the date it was shipped from the warehouse, and one which holds the date the order should arrive with the customer. In Analysis Services, we can create a single physical Time dimension in our database, which is referred to as a database dimension, and then add it three times to the cube to create three 'cube dimensions', renaming each cube dimension to something like Order Date, Ship Date and Due Date. These three cube dimensions are referred to as role-playing dimensions: the same dimension is playing three different roles in the same cube.

Role-playing dimensions are a very useful feature. They reduce maintenance overheads because you only need to edit one dimension, and unlike linked dimensions, any changes made to the underlying database dimension are propagated to all of the cube dimensions that are based on it. They also reduce processing time because you only need to process the database dimension once.
However, there is one frustrating limitation with role-playing dimensions and that is that while you can override certain properties of the database dimension on a per-cube dimension basis, you can't change the name of any of the attributes or hierarchies of a cube dimension. So if you have a user hierarchy called 'Calendar' on your database dimension, all of your cube dimensions will also have a user hierarchy called 'Calendar', and your users might find it difficult to tell which hierarchy is which in certain client tools (Excel 2003 is particularly bad in this respect) or in reports. Unfortunately, we have seen numerous cases where this problem alone meant role-playing dimensions couldn't be used. Dimension/measure group relationships So far we've seen dimensions either having no relationship with a measure group or having a regular relationship, but that's not the whole story: there are many different types of relationships that a dimension can have with a measure group. Here's the complete list: No relationship Regular Fact Referenced Many-to-Many Data Mining Fact relationships Fact or degenerate dimensions are dimensions that are built directly from columns in a fact table, not from a separate dimension table. From an Analysis Services dimension point of view, they are no different from any other kind of dimension, except that there is a special fact relationship type that a dimension can have with a measure group. There are in fact very few differences between a fact relationship and a regular relationship, and they are: A fact relationship will result in marginally more efficient SQL being generated when the fact dimension is used in ROLAP drillthrough. Fact relationships are visible to client tools in the cube's metadata, so client tools may choose to display fact dimensions differently. A fact relationship can only be defined on dimensions and measure groups that are based on the same table in the DSV. 
A measure group can only have a fact relationship with one database dimension. It can have more than one fact relationship, but all of them have to be with cube dimensions based on the same database dimension. It still makes sense though to define relationships as fact relationships when you can. Apart from the reasons given above, the functionality might change in future versions of Analysis Services and fact relationship types might be further optimized in some way. Referenced relationships A referenced relationship is where a dimension joins to a measure group through another dimension. For example, you might have a Customer dimension that includes geographic attributes up to and including a customer's country; also, your organization might divide the world up into international regions such as North America, Europe, Middle East and Africa (EMEA), Latin America (LATAM) and Asia-Pacific and so on for financial reporting, and you might build a dimension for this too. If your sales fact table only contained a foreign key for the Customer dimension, but you wanted to analyze sales by international region, you would be able to create a referenced relationship from the Region dimension through the Customer dimension to the Sales measure group. When setting up a referenced relationship in the Define Relationship dialog in the Dimension Usage tab, you're asked to first choose the dimension that you wish to join through and then which attribute on the reference dimension joins to which attribute on the intermediate dimension: When the join is made between the attributes you've chosen on the reference dimension, once again it's the values in the columns that are defined in the KeyColumns property of each attribute that you're in fact joining on. The Materialize checkbox is automatically checked, and this ensures maximum query performance by resolving the join between the dimensions at processing time, which can lead to a significant decrease in processing performance. 
Unchecking this box means that no penalty is paid at processing time but query performance may be worse. The question you may well be asking yourself at this stage is: why bother to use referenced relationships at all? It is in fact a good question to ask, because, in general, it's better to include all of the attributes you need in a single Analysis Services dimension built from multiple tables rather than use a referenced relationship. The single dimension approach will perform better and is more user-friendly: for example, you can't define user hierarchies that span a reference dimension and its intermediate dimension. That said, there are situations where referenced relationships are useful because it's simply not feasible to add all of the attributes you need to a dimension. You might have a Customer dimension, for instance, that has a number of attributes representing dates—the date of a customer's first purchase, the date of a customer's tenth purchase, the date of a customer's last purchase and so on. If you had created these attributes with keys that matched the surrogate keys of your Time dimension, you could create multiple, referenced (but not materialized) role-playing Time dimensions joined to each of these attributes that would give you the ability to analyze each of these dates. You certainly wouldn't want to duplicate all of the attributes from your Time dimension for each of these dates in your Customer dimension. Another good use for referenced relationships is when you want to create multiple parent/child hierarchies from the same dimension table Data mining relationships The data mining functionality of Analysis Services is outside the scope of this article, so we won't spend much time on the data mining relationship type. 
Suffice to say that when you create an Analysis Services mining structure from data sourced from a cube, you have the option of using that mining structure as the source for a special type of dimension, called a data mining dimension. The wizard will also create a new cube containing linked copies of all of the dimensions and measure groups in the source cube, plus the new data mining dimension, which then has a data mining relationship with the measure groups.

Summary

In this part, we focused on how to create new measure groups and handle the problems of different dimensionality and granularity, and looked at the different types of relationships that are possible between dimensions and measure groups.

Sugandha Lahoti
09 Mar 2018
11 min read

How to perform data partitioning in PostgreSQL 10

Partitioning refers to splitting; logically, it means breaking one large table into smaller physical pieces. PostgreSQL supports basic table partitioning. With the default block size of 8k, a single table can store up to 32 TB of data. In fact, if we compile PostgreSQL with 32k blocks, we can even put up to 128 TB into a single table. However, tables this large are not necessarily convenient, and it makes sense to partition them to make processing easier and, in some cases, a bit faster. With PostgreSQL 10.0, partitioning has improved and is significantly easier for end users to handle. In this article, we will talk about both the classic way to partition data and the new features available in PostgreSQL 10.0.

Creating partitions

First, we will learn the old method to partition data. Before digging deeper into the advantages of partitioning, I want to show how partitions can be created. The entire thing starts with a parent table:

    test=# CREATE TABLE t_data (id serial, t date, payload text);
    CREATE TABLE

In this example, the parent table has three columns. The date column will be used for partitioning, but more on that a bit later. Now that the parent table is in place, the child tables can be created. This is how it works:

    test=# CREATE TABLE t_data_2016 () INHERITS (t_data);
    CREATE TABLE
    test=# \d t_data_2016
                              Table "public.t_data_2016"
     Column  |  Type   |                      Modifiers
    ---------+---------+-----------------------------------------------------
     id      | integer | not null default nextval('t_data_id_seq'::regclass)
     t       | date    |
     payload | text    |
    Inherits: t_data

The table is called t_data_2016 and inherits from t_data. The empty parentheses, (), mean that no extra columns are added to the child table. As you can see, inheritance means that all columns from the parent are available in the child table.
Also note that the id column will inherit the sequence from the parent so that all children can share the very same numbering. Let's create more tables: test=# CREATE TABLE t_data_2015 () INHERITS (t_data); CREATE TABLE test=# CREATE TABLE t_data_2014 () INHERITS (t_data); CREATE TABLE So far, all the tables are identical and just inherit from the parent. However, there is more: child tables can actually have more columns than parents. Adding more fields is simple: test=# CREATE TABLE t_data_2013 (special text) INHERITS (t_data); CREATE TABLE In this case, a special column has been added. It has no impact on the parent, but just enriches the children and makes them capable of holding more data. After creating a handful of tables, a row can be added: test=# INSERT INTO t_data_2015 (t, payload) VALUES ('2015-05-04', 'some data'); INSERT 0 1 The most important thing now is that the parent table can be used to find all the data in the child tables: test=# SELECT * FROM t_data; id |   t   | payload ----+------------+----------- 1 | 2015-05-04 | some data (1 row) Querying the parent allows you to gain access to everything below the parent in a simple and efficient manner. To understand how PostgreSQL does partitioning, it makes sense to take a look at the plan: test=# EXPLAIN SELECT * FROM t_data; QUERY PLAN ----------------------------------------------------------------- Append (cost=0.00..84.10 rows=4411 width=40) -> Seq Scan on t_data (cost=0.00..0.00 rows=1 width=40) -> Seq Scan on t_data_2016 (cost=0.00..22.00 rows=1200 width=40) -> Seq Scan on t_data_2015 (cost=0.00..22.00 rows=1200 width=40) -> Seq Scan on t_data_2014 (cost=0.00..22.00 rows=1200 width=40) -> Seq Scan on t_data_2013 (cost=0.00..18.10 rows=810 width=40) (6 rows) Actually, the process is quite simple. PostgreSQL will simply unify all tables and show us all the content from all the tables inside and below the partition we are looking at. 
Note that all tables are independent and are just connected logically through the system catalog. Applying table constraints What happens if filters are applied? test=# EXPLAIN SELECT * FROM t_data WHERE t = '2016-01-04'; QUERY PLAN ----------------------------------------------------------------- Append (cost=0.00..95.12 rows=23 width=40) -> Seq Scan on t_data (cost=0.00..0.00 rows=1 width=40) Filter: (t = '2016-01-04'::date) -> Seq Scan on t_data_2016 (cost=0.00..25.00 rows=6 width=40) Filter: (t = '2016-01-04'::date) -> Seq Scan on t_data_2015 (cost=0.00..25.00 rows=6 width=40) Filter: (t = '2016-01-04'::date) -> Seq Scan on t_data_2014 (cost=0.00..25.00 rows=6 width=40) Filter: (t = '2016-01-04'::date) -> Seq Scan on t_data_2013 (cost=0.00..20.12 rows=4 width=40) Filter: (t = '2016-01-04'::date) (11 rows) PostgreSQL will apply the filter to all the partitions in the structure. It does not know that the table name is somehow related to the content of the tables. To the database, names are just names and have nothing to do with what you are looking for. This makes sense, of course, as there is no mathematical justification for doing anything else. The point now is: how can we teach the database that the 2016 table only contains 2016 data, the 2015 table only contains 2015 data, and so on? Table constraints are here to do exactly that. They teach PostgreSQL about the content of those tables and therefore allow the planner to make smarter decisions than before. The feature is called constraint exclusion and helps dramatically to speed up queries in many cases. 
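To make the idea of constraint exclusion concrete, here is a toy Python model of the decision the planner has to make. It is only a sketch and not PostgreSQL's actual implementation: the CONSTRAINTS dictionary and the tables_to_scan function are hypothetical names, and the yearly bounds are assumed for illustration (one child table per year, lower bound inclusive, upper bound exclusive).

```python
from datetime import date

# Hypothetical yearly bounds [lower, upper) for each child table.
# None means "no bound on this side". These values are assumptions made
# for illustration; they are not read from a real system catalog.
CONSTRAINTS = {
    "t_data_2013": (None, date(2014, 1, 1)),
    "t_data_2014": (date(2014, 1, 1), date(2015, 1, 1)),
    "t_data_2015": (date(2015, 1, 1), date(2016, 1, 1)),
    "t_data_2016": (date(2016, 1, 1), date(2017, 1, 1)),
}


def tables_to_scan(wanted, parent="t_data"):
    """Return the tables that could contain rows with t = wanted."""
    # The parent carries no constraint, so it can never be ruled out.
    survivors = [parent]
    for table, (lower, upper) in CONSTRAINTS.items():
        if lower is not None and wanted < lower:
            continue  # the value lies before this table's range
        if upper is not None and wanted >= upper:
            continue  # the value lies at or after this table's upper bound
        survivors.append(table)
    return survivors


print(tables_to_scan(date(2016, 1, 4)))
```

Without any declared bounds, every table would have to stay in the list, which is exactly what the plan above shows. With bounds, only the parent and the one matching child remain.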
The following listing shows how table constraints can be created: test=# ALTER TABLE t_data_2013 ADD CHECK (t < '2014-01-01'); ALTER TABLE test=# ALTER TABLE t_data_2014 ADD CHECK (t >= '2014-01-01' AND t < '2015-01-01'); ALTER TABLE test=# ALTER TABLE t_data_2015 ADD CHECK (t >= '2015-01-01' AND t < '2016-01-01'); ALTER TABLE test=# ALTER TABLE t_data_2016 ADD CHECK (t >= '2016-01-01' AND t < '2017-01-01'); ALTER TABLE For each table, a CHECK constraint can be added. PostgreSQL will only create the constraint if all the data in those tables is perfectly correct and if every single row satisfies the constraint. In contrast to MySQL, constraints in PostgreSQL are taken seriously and honored under any circumstances. In PostgreSQL, those constraints can overlap--this is not forbidden and can make sense in some cases. However, it is usually better to have non-overlapping constraints because PostgreSQL has the option to prune more tables. Here is what happens after adding those table constraints: test=# EXPLAIN SELECT * FROM t_data WHERE t = '2016-01-04'; QUERY PLAN ----------------------------------------------------------------- Append (cost=0.00..25.00 rows=7 width=40) -> Seq Scan on t_data (cost=0.00..0.00 rows=1 width=40) Filter: (t = '2016-01-04'::date) -> Seq Scan on t_data_2016 (cost=0.00..25.00 rows=6 width=40) Filter: (t = '2016-01-04'::date) (5 rows) The planner will be able to remove many of the tables from the query and only keep those which potentially contain the data. The query can greatly benefit from a shorter and more efficient plan. In particular, if those tables are really large, removing them can boost speed considerably. Modifying inherited structures Once in a while, data structures have to be modified. The ALTER  TABLE clause is here to do exactly that. The question is: how can partitioned tables be modified? Basically, all you have to do is tackle the parent table and add or remove columns. 
PostgreSQL will automatically propagate those changes through to the child tables and ensure that changes are made to all the relations, as follows:

    test=# ALTER TABLE t_data ADD COLUMN x int;
    ALTER TABLE
    test=# \d t_data_2016
                              Table "public.t_data_2016"
     Column  |  Type   |                      Modifiers
    ---------+---------+-----------------------------------------------------
     id      | integer | not null default nextval('t_data_id_seq'::regclass)
     t       | date    |
     payload | text    |
     x       | integer |
    Check constraints:
        "t_data_2016_t_check" CHECK (t >= '2016-01-01'::date AND t < '2017-01-01'::date)
    Inherits: t_data

As you can see, the column is added to the parent and automatically added to the child table here. Note that this propagation applies to column changes. Indexes are a totally different story. In an inherited structure, every table has to be indexed separately. If you add an index to the parent table, it will only be present on the parent; it won't be deployed on the child tables. Indexing all those columns in all those tables is your task, and PostgreSQL is not going to make those decisions for you. Of course, this can be seen as a feature or as a limitation. On the upside, you could say that PostgreSQL gives you all the flexibility to index things separately and therefore potentially more efficiently. However, people might also argue that deploying all those indexes one by one is a lot more work.

Moving tables in and out of partitioned structures

Suppose you have an inherited structure. Data is partitioned by date and you want to provide the most recent years to the end user. At some point, you might want to remove some data from the scope of the user without actually touching it, for example to move it into some sort of archive. PostgreSQL provides a simple means to achieve exactly that. First, a new parent can be created:

    test=# CREATE TABLE t_history (LIKE t_data);
    CREATE TABLE

The LIKE keyword allows you to create a table which has exactly the same layout as the t_data table.
If you have forgotten which columns the t_data table actually has, this might come in handy as it saves you a lot of work. It is also possible to include indexes, constraints, and defaults. Then, the table can be moved away from the old parent table and put below the new one. Here is how it works: test=# ALTER TABLE t_data_2013 NO INHERIT t_data; ALTER TABLE test=# ALTER TABLE t_data_2013 INHERIT t_history; ALTER TABLE The entire process can of course be done in a single transaction to assure that the operation stays atomic. Cleaning up data One advantage of partitioned tables is the ability to clean data up quickly. Suppose that we want to delete an entire year. If data is partitioned accordingly, a simple DROP  TABLE clause can do the job: test=# DROP TABLE t_data_2014; DROP TABLE As you can see, dropping a child table is easy. But what about the parent table? There are depending objects and therefore PostgreSQL naturally errors out to make sure that nothing unexpected happens: test=# DROP TABLE t_data; ERROR: cannot drop table t_data because other objects depend on it DETAIL: default for table t_data_2013 column id depends on sequence t_data_id_seq table t_data_2016 depends on table t_data table t_data_2015 depends on table t_data HINT: Use DROP ... CASCADE to drop the dependent objects too. The DROP  TABLE clause will warn us that there are depending objects and refuses to drop those tables. The CASCADE clause is needed to force PostgreSQL to actually remove those objects, along with the parent table: test=# DROP TABLE t_data CASCADE; NOTICE:   drop   cascades to 3 other objects DETAIL:   drop   cascades to default for table    t_data_2013 column id drop cascades to table      t_data_2016 drop   cascades to table t_data_2015 DROP TABLE Understanding PostgreSQL 10.0 partitioning For many years, the PostgreSQL community has been working on built-in partitioning. 
Finally, PostgreSQL 10.0 offers the first implementation of in-core partitioning, which will be covered in this section. For now, the partitioning functionality is still pretty basic. However, a lot of infrastructure for future improvements is already in place. To show you how partitioning works, I have compiled a simple example featuring range partitioning:

    CREATE TABLE data (
        payload integer
    ) PARTITION BY RANGE (payload);

    CREATE TABLE negatives PARTITION OF data
        FOR VALUES FROM (MINVALUE) TO (0);

    CREATE TABLE positives PARTITION OF data
        FOR VALUES FROM (0) TO (MAXVALUE);

In this example, one partition will hold all negative values while the other one will take care of zero and all positive values. While creating the parent table, you can simply specify which way you want to partition data. In PostgreSQL 10.0, there is range partitioning and list partitioning. Support for hash partitioning and the like might be available as soon as PostgreSQL 11.0. Once the parent table has been created, it is already time to create the partitions. To do that, the PARTITION OF clause has been added. At this point, there are still some limitations. The most important one is that a tuple (= a row) cannot move from one partition to the other, for example:

    UPDATE data SET payload = -10 WHERE payload = 5;

If there were rows satisfying this condition, PostgreSQL would simply error out and refuse to change the value. However, in a good design, it is a bad idea to change the partitioning key anyway. Also, keep in mind that you have to think about indexing each partition.

We learnt both the old way of data partitioning and the new data partitioning features introduced in PostgreSQL 10.0.

You read an excerpt from the book Mastering PostgreSQL 10, written by Hans-Jürgen Schönig. To learn about query optimization, stored procedures, and other techniques in PostgreSQL 10.0, you may check out the book.
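As a recap of the range partitioning example, the following Python sketch models how a value is routed to a partition and why an UPDATE that changes the partitioning key is a problem. It is a simplified illustration under stated assumptions, not how PostgreSQL is implemented: PARTITIONS, route, and update_moves_row are hypothetical names, MINVALUE and MAXVALUE are represented by infinities, and the lower bound is treated as inclusive and the upper bound as exclusive.

```python
import math

# Partition bounds modelled on the negatives/positives example.
# Lower bound inclusive, upper bound exclusive; infinities stand in
# for MINVALUE and MAXVALUE. All names here are illustrative.
PARTITIONS = [
    ("negatives", -math.inf, 0),
    ("positives", 0, math.inf),
]


def route(payload):
    """Return the name of the partition that would hold this payload."""
    for name, lower, upper in PARTITIONS:
        if lower <= payload < upper:
            return name
    raise ValueError(f"no partition accepts {payload!r}")


def update_moves_row(old_payload, new_payload):
    """True if changing the key would require moving the row elsewhere."""
    return route(old_payload) != route(new_payload)


print(route(-10), route(0), route(5))
print(update_moves_row(5, -10))
```

An update from 5 to -10 would need the row to leave positives and enter negatives, which is the case described above as not supported in PostgreSQL 10.0. An update from 5 to 7 stays inside one partition and is unaffected.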

Bhagyashree R
21 Jan 2019
5 min read

Conversational AI in 2018: An arms race of new products, acquisitions, and more

Conversational AI is one of the most interesting applications of artificial intelligence in recent years. While the trend isn’t yet ubiquitous in the way that recommendation systems are (perhaps unsurprising), it has been successfully productized by a number of tech giants, in the form of Google Home and Amazon Echo (which is ‘powered by’ Alexa). The conversational AI arms race Arguably, 2018 has seen a bit of an arms race in conversational AI. As well as Google and Amazon, the likes of IBM, Microsoft, and Apple have wanted a piece of the action. Here are some of the new conversational AI tools and products these companies introduced this year: Google Google worked towards enhancing its conversational interface development platform, Dialogflow. In July, at the Google Cloud Next event, it announced several improvements and new capabilities to Dialogflow including Text to Speech via DeepMind's WaveNet and Dialogflow Phone Gateway for telephony integration. It also launched a new product called Contact Center AI that comes with Dialogflow Enterprise Edition and additional capabilities to assist live agents and perform analytics. Google Assistant became better in having a back-and-forth conversation with the help of Continued Conversation, which was unveiled at the Google I/O conference. The assistant became multilingual in August, which means users can speak to it in more than one language at a time, without having to adjust their language settings. Users can enable this multilingual functionality by selecting two of the supported languages. Following the footsteps of Amazon, Google also launched its own smart display named Google Home Hub at the ‘Made by Google’ event held in October. Microsoft Microsoft in 2018 introduced and improved various bot-building tools for developers. In May, at the Build conference, Microsoft announced major updates in their conversational AI tools: Azure Bot Service, Microsoft Cognitive Services Language Understanding, and QnAMaker. 
To enable intelligent bots to learn from example interactions and handle common small talk, it launched new experimental projects named Conversation Learner and Personality Chat. At Microsoft Ignite, Bot Framework SDK V4.0 was made generally available. Later in November, Microsoft announced the general availability of the Bot Framework Emulator V4 and Web Chat control. In May, to drive more research and development in its conversational AI products, Microsoft acquired Semantic Machines and established a conversational AI center of excellence in Berkeley. In November, the organization's acquisition of Austin-based bot startup XOXCO was a clear indication that it wants to get serious about using artificial intelligence for conversational bots. Producing guidelines on developing ‘responsible’ conversational AI further confirmed that Microsoft wants to play a big part in the future evolution of the area. Microsoft was also chosen as the tech partner by UK-based conversational AI startup ICS.ai. The team at ICS is using Azure and LUIS from Microsoft in its public sector AI chatbots, aimed at higher education, healthcare trusts, and county councils.

Amazon

Aiming to improve Alexa’s capabilities, Amazon released the Alexa Skills Kit (ASK), which consists of APIs, tools, documentation, and code samples with which developers can build new skills for Alexa. In September, it announced a preview of a new design language named Alexa Presentation Language (APL). With APL, developers can build visual skills that include graphics, images, slideshows, and video, and customize them for different device types. Amazon’s smart speaker Echo Dot saw great success, becoming the best seller in the smart speaker category on Amazon. At its 2018 hardware event in Seattle, Amazon announced the launch of a redesigned Echo Dot and a new addition to its Alexa-powered A/V devices called Echo Plus.
As well as the continuing success of Alexa and the Amazon Echo, Amazon’s decision to launch the Alexa Fellowship at a number of leading academic institutions also highlights that, for the biggest companies, conversational AI is as much about research and exploration as it is about products. Like Microsoft, Amazon appears well aware that conversational AI is an area still in its infancy: as much as great products, it requires clear thinking and cutting-edge insight to ensure that it develops in a way that is both safe and impactful.

What’s next?

This huge array of products is a result of advances in deep learning research. Conversational AI is no longer limited to small tasks like setting an alarm or searching for the best restaurant. We can have a back-and-forth conversation with a conversational agent. But, needless to say, it still needs more work. Conversational agents are yet to meet user expectations related to sensing and responding with emotion. In the coming years, we will see these systems understand natural language and do a good job of generating it. They will be able to have reasonably natural conversations with humans in certain domains, grounded in context. Also, the continuous development in IoT will provide AI systems with more context.

Packt
18 Nov 2013
10 min read

Derivatives Pricing

Derivatives are financial instruments which derive their value from (or are dependent on) the value of another product, called the underlying. The three basic types of derivatives are forward and futures contracts, swaps, and options. In this article, we will focus on this latter class and show how basic option pricing models and some related problems can be handled in R. We will start with an overview of how to use the continuous Black-Scholes model and the binomial Cox-Ross-Rubinstein model in R, and then we will discuss the connection between these models. Furthermore, by calculating and plotting the Greeks, we will show how to analyze the most important types of market risks that options involve. Finally, we will discuss what implied volatility means and illustrate this phenomenon by plotting the volatility smile with the help of real market data.

The most important characteristic of options compared to futures or swaps is that you cannot be sure whether the transaction (buying or selling the underlying) will take place or not. This feature makes option pricing more complex and requires all models to make assumptions regarding the future price movements of the underlying product. The two models we are covering here differ in these assumptions: the Black-Scholes model works with a continuous process while the Cox-Ross-Rubinstein model works with a discrete stochastic process. However, the remaining assumptions are very similar and we will see that the results are close too.

The Black-Scholes model

The assumptions of the Black-Scholes model (Black and Scholes, 1973, see also Merton, 1973) are as follows:

The price of the underlying asset (S) follows geometric Brownian motion:

$$dS = \mu S\,dt + \sigma S\,dW$$

Here μ (drift) and σ (volatility) are constant parameters and W is a standard Wiener process.
The market is arbitrage-free.
The underlying is a stock paying no dividends.
Buying and (short) selling the underlying asset is possible in any (even fractional) amount.
There are no transaction costs.
The short-term interest rate (r) is known and constant over time.

The main result of the model is that under these assumptions, the price of a European call option (c) has a closed form:

$$c = S\,N(d_1) - X e^{-r(T-t)}\,N(d_2)$$

$$d_1 = \frac{\ln(S/X) + \left(r + \sigma^2/2\right)(T-t)}{\sigma\sqrt{T-t}}, \qquad d_2 = d_1 - \sigma\sqrt{T-t}$$

Here X is the strike price, T-t is the time to maturity of the option, and N denotes the cumulative distribution function of the standard normal distribution. The equation giving the price of the option is usually referred to as the Black-Scholes formula. It is easy to see from put-call parity that the price of a European put option (p) with the same parameters is given by:

$$p = X e^{-r(T-t)}\,N(-d_2) - S\,N(-d_1)$$

Now consider a call and put option on a Google stock in June 2013 with a maturity of September 2013 (that is, with 3 months of time to maturity). Let us assume that the current price of the underlying stock is USD 900, the strike price is USD 950, the volatility of Google is 22%, and the risk-free rate is 2%. We will calculate the value of the call option with the GBSOption function from the fOptions package. Beyond the parameters already discussed, we also have to set the cost of carry (b); in the original Black-Scholes model (with the underlying paying no dividends), it equals the risk-free rate.

    > library(fOptions)
    > GBSOption(TypeFlag = "c", S = 900, X = 950, Time = 1/4, r = 0.02,
    +   sigma = 0.22, b = 0.02)

    Title:
     Black Scholes Option Valuation

    Call:
     GBSOption(TypeFlag = "c", S = 900, X = 950, Time = 1/4, r = 0.02,
         b = 0.02, sigma = 0.22)

    Parameters:
              Value:
     TypeFlag c
     S        900
     X        950
     Time     0.25
     r        0.02
     b        0.02
     sigma    0.22

    Option Price:
     21.79275

    Description:
     Tue Jun 25 12:54:41 2013

This lengthy output echoes the passed parameters, with the result just below the Option Price label.
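As a cross-check outside R, the closed-form Black-Scholes price can be computed directly in a few lines of Python. This is a minimal sketch using only the standard library: the function names norm_cdf and black_scholes are chosen here for illustration, the underlying is assumed to pay no dividends (so the cost of carry equals r), and the put is obtained from the call through put-call parity.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes(S, X, T, r, sigma):
    """Return (call, put) prices of European options on a non-dividend stock.

    S: spot price, X: strike, T: time to maturity in years,
    r: risk-free rate, sigma: volatility.
    """
    d1 = (log(S / X) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    call = S * norm_cdf(d1) - X * exp(-r * T) * norm_cdf(d2)
    # Put-call parity: p = c - S + X * exp(-r * T)
    put = call - S + X * exp(-r * T)
    return call, put


call, put = black_scholes(S=900, X=950, T=0.25, r=0.02, sigma=0.22)
print(round(call, 5), round(put, 5))
```

With the parameters of the Google example, the call value should agree with the Option Price reported by GBSOption up to rounding.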
Setting the TypeFlag to p would compute the price of the put option. This time we are only interested in the result (found in the price slot; see the str of the object for more details) without the textual output:

    > GBSOption(TypeFlag = "p", S = 900, X = 950, Time = 1/4, r = 0.02,
    +   sigma = 0.22, b = 0.02)@price
    [1] 67.05461

We also have the choice to compute the preceding values with a more user-friendly calculator provided by the GUIDE package. Running the blackscholes() function would trigger a modal window with a form where we can enter the same parameters. Please note that the function uses the dividend yield instead of cost of carry, which is zero in this case.

The Cox-Ross-Rubinstein model

The Cox-Ross-Rubinstein (CRR) model (Cox, Ross and Rubinstein, 1979) assumes that the price of the underlying asset follows a discrete binomial process. The price might go up or down in each period and hence changes according to a binomial tree, where u and d are fixed multipliers measuring the price changes when it goes up and down. The important feature of the CRR model is that u=1/d and the tree is recombining; that is, the price after two periods will be the same if it first goes up and then goes down or vice versa, as shown in the following figure:

[Figure: recombining binomial tree of the underlying price]

To build a binomial tree, first we have to decide how many steps we are modeling (n); that is, how many steps the time to maturity of the option will be divided into. Alternatively, we can determine the length of one time step Δt (measured in years) on the tree:

$$\Delta t = \frac{T-t}{n}$$

If we know the volatility (σ) of the underlying, the parameters u and d are determined according to the following formulas:

$$u = e^{\sigma\sqrt{\Delta t}}$$

And consequently:

$$d = \frac{1}{u} = e^{-\sigma\sqrt{\Delta t}}$$

When pricing an option in a binomial model, we need to determine the tree of the underlying until the maturity of the option.
Then, having all the possible prices at maturity, we can calculate the corresponding possible option values, simply given by the following formulas:

$$c_T = \max(S_T - X,\, 0), \qquad p_T = \max(X - S_T,\, 0)$$

To determine the option price with the binomial model, in each node we have to calculate the expected value of the next two possible option values and then discount it. The problem is that it is not obvious which expected return to use for discounting. The trick is that we calculate the expected value with a hypothetical probability, which enables us to discount with the risk-free rate. This probability is called the risk-neutral probability (p_n) and can be determined as follows:

$$p_n = \frac{e^{r\Delta t} - d}{u - d}$$

The interpretation of the risk-neutral probability is quite plausible: if the probability that the underlying price goes up from any of the nodes was p_n, then the expected return of the underlying would be the risk-free rate. Consequently, an expected value calculated with p_n can be discounted by r, and the price of the option in any node of the tree is determined as:

$$g = e^{-r\Delta t}\left[p_n\, g_u + (1 - p_n)\, g_d\right]$$

In the preceding formula, g is the price of an option in general (it may be a call or a put) in a given node, and g_u and g_d are the values of this derivative in the two possible nodes one period later.

For demonstrating the CRR model in R, we will use the same parameters as in the case of the Black-Scholes formula. Hence, S=900, X=950, σ=22%, r=2%, b=2%, T-t=0.25. We also have to set n, the number of time steps on the binomial tree. For illustrative purposes, we will work with a 3-period model:

> CRRBinomialTreeOption(TypeFlag = "ce", S = 900, X = 950,
+   Time = 1/4, r = 0.02, b = 0.02, sigma = 0.22, n = 3)@price
[1] 20.33618
> CRRBinomialTreeOption(TypeFlag = "pe", S = 900, X = 950,
+   Time = 1/4, r = 0.02, b = 0.02, sigma = 0.22, n = 3)@price
[1] 65.59803

It is worth observing that the option prices obtained from the binomial model are close to (but not exactly the same as) the Black-Scholes prices calculated earlier.
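The node-by-node procedure described above can be traced in a few lines. This is a minimal Python sketch of backward induction on a CRR tree, assuming u = exp(σ√∆t), d = 1/u, and a risk-neutral probability built from the cost of carry b (equal to r here); crr_price is a hypothetical helper, not the fOptions implementation.

```python
from math import exp, sqrt


def crr_price(S, X, T, r, b, sigma, n, kind="call"):
    """European option price on an n-step Cox-Ross-Rubinstein tree."""
    dt = T / n                        # length of one time step in years
    u = exp(sigma * sqrt(dt))         # up multiplier
    d = 1.0 / u                       # down multiplier, so the tree recombines
    p = (exp(b * dt) - d) / (u - d)   # risk-neutral probability of an up move
    disc = exp(-r * dt)               # one-step discount factor
    sign = 1.0 if kind == "call" else -1.0

    # Payoffs at maturity; index j is the number of up moves.
    values = [max(sign * (S * u ** j * d ** (n - j) - X), 0.0)
              for j in range(n + 1)]

    # Step back through the tree: discounted risk-neutral expectation per node.
    for step in range(n, 0, -1):
        values = [disc * (p * values[j + 1] + (1.0 - p) * values[j])
                  for j in range(step)]
    return values[0]


params = dict(S=900, X=950, T=0.25, r=0.02, b=0.02, sigma=0.22, n=3)
crr_call = crr_price(kind="call", **params)
crr_put = crr_price(kind="put", **params)
print(f"call = {crr_call:.5f}, put = {crr_put:.5f}")
```

With n = 3 the two values should line up with the CRRBinomialTreeOption results shown above, up to rounding.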
Apart from the final result, that is, the current price of the option, we might be interested in the whole option tree as well:

> CRRTree <- BinomialTreeOption(TypeFlag = "ce", S = 900, X = 950,
+   Time = 1/4, r = 0.02, b = 0.02, sigma = 0.22, n = 3)
> BinomialTreePlot(CRRTree, dy = 1, xlab = "Time steps",
+   ylab = "Number of up steps", xlim = c(0,4))
> title(main = "Call Option Tree")

Here we first computed a matrix with BinomialTreeOption using the given parameters and saved the result in CRRTree, which was then passed to the plot function with labels for both the x and y axes and the limits of the x axis set from 0 to 4, as shown in the following figure. The y axis (number of up steps) shows how many times the underlying price has gone up in total. Down steps are defined as negative up steps.

The European put option can be shown similarly by changing the TypeFlag to pe in the previous code:

Connection between the two models

After applying the two basic option pricing models, we give some theoretical background to them. We do not aim to give a detailed mathematical derivation, but we intend to emphasize (and then illustrate in R) the similarities of the two approaches. The financial idea behind the continuous and the binomial option pricing is the same: if we manage to hedge the option perfectly by holding the appropriate quantity of the underlying asset, we have created a risk-free portfolio. Since the market is supposed to be arbitrage-free, the yield of a risk-free portfolio must equal the risk-free rate. One important observation is that the correct hedging ratio is holding $\partial g / \partial S$ units of the underlying asset per option. Hence, the ratio is the partial derivative (or its discrete counterpart in the binomial model) of the option value with respect to the underlying price. This partial derivative is called the delta of the option.
Another interesting connection between the two models is that the delta-hedging strategy and the related arbitrage-free argument yield the same pricing principle: the value of the derivative is the risk-neutral expected value of its future possible values, discounted by the risk-free rate. This principle is easy to follow on the binomial tree, where we calculated the discounted expected values node by node; the continuous model follows the same logic, even if the expected value is mathematically more complicated to compute. This is the reason why we gave only the final result of this argument, which was the Black-Scholes formula.

Now we know that the two models have the same pricing principles and ideas (delta-hedging and risk-neutral valuation), but we also observed that their numerical results are not equal. The reason is that the stochastic processes assumed to describe the price movements of the underlying asset are not identical. Nevertheless, they are very similar; if we determine the value of u and d from the volatility parameter as we did in The Cox-Ross-Rubinstein model section, the binomial process approximates the geometric Brownian motion. Consequently, the option price of the binomial model converges to that of the Black-Scholes model if we increase the number of time steps (or equivalently, decrease the length of the steps).

To illustrate this relationship, we will compute the option price in the binomial model with increasing numbers of time steps. In the following figure, we compare the results with the Black-Scholes price of the option:

The plot was generated by a loop running n from 1 to 200 to compute CRRBinomialTreeOption with fixed parameters:

> prices <- sapply(1:200, function(n) {
+   CRRBinomialTreeOption(TypeFlag = "ce", S = 900, X = 950,
+     Time = 1/4, r = 0.02, b = 0.02, sigma = 0.22, n = n)@price
+ })

Now the prices variable holds 200 computed values:

> str(prices)
 num [1:200] 26.9 24.9 20.3 23.9 20.4...
Let us also compute the option price with the generalized Black-Scholes formula:

> price <- GBSOption(TypeFlag = "c", S = 900, X = 950, Time = 1/4, r = 0.02, sigma = 0.22, b = 0.02)@price

And show the prices in a joint plot with the GBS option rendered in red:

> plot(1:200, prices, type='l', xlab = 'Number of steps',
+   ylab = 'Prices')
> abline(h = price, col ='red')
> legend("bottomright", legend = c('CRR-price', 'BS-price'),
+   col = c('black', 'red'), pch = 19)
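The convergence shown in the plot can also be inspected numerically. The sketch below is a Python approximation of the same experiment, assuming no dividends (b = r); instead of stepping back through the tree it sums the discounted payoffs over the terminal nodes, which gives the same value for European options. The function names are illustrative only.

```python
from math import comb, erf, exp, log, sqrt


def bs_call(S, X, T, r, sigma):
    """Black-Scholes price of a European call, no dividends assumed."""
    cdf = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    d1 = (log(S / X) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return S * cdf(d1) - X * exp(-r * T) * cdf(d1 - sigma * sqrt(T))


def crr_call(S, X, T, r, sigma, n):
    """CRR call price as a risk-neutral sum over the n + 1 terminal nodes."""
    dt = T / n
    u = exp(sigma * sqrt(dt))
    d = 1.0 / u
    p = (exp(r * dt) - d) / (u - d)
    expected = sum(
        comb(n, j) * p ** j * (1.0 - p) ** (n - j)
        * max(S * u ** j * d ** (n - j) - X, 0.0)
        for j in range(n + 1)
    )
    return exp(-r * T) * expected


args = dict(S=900, X=950, T=0.25, r=0.02, sigma=0.22)
target = bs_call(**args)
prices = [crr_call(n=n, **args) for n in range(1, 201)]

# Print the binomial price and its gap to Black-Scholes for a few tree sizes.
for n in (1, 3, 10, 50, 200):
    print(n, round(prices[n - 1], 4), round(prices[n - 1] - target, 4))
```

The gap oscillates in sign from one n to the next, which is why the plotted line zigzags around the red Black-Scholes level before settling.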
Packt
03 Aug 2011
10 min read

OpenGL 4.0: Using Uniform Blocks and Uniform Buffer Objects

If your OpenGL/GLSL program involves multiple shader programs that use the same uniform variables, you have to manage those variables separately for each program. Uniform blocks were designed to ease the sharing of uniform data between programs. In this article by David Wolff, author of OpenGL 4.0 Shading Language Cookbook, we will create a buffer object for storing the values of all the uniform variables, and bind the buffer to the uniform block. Then, when changing programs, the same buffer object need only be re-bound to the corresponding block in the new program.

Uniform locations are generated when a program is linked, so the locations of the uniforms may change from one program to the next. The data for those uniforms may have to be re-generated and applied to the new locations. A uniform block is simply a group of uniform variables declared together inside a named block. For example, in this recipe, we'll use the following uniform block:

uniform BlobSettings {
  vec4 InnerColor;
  vec4 OuterColor;
  float RadiusInner;
  float RadiusOuter;
};

This defines a block with the name BlobSettings that contains four uniform variables. With this type of block definition, the variables within the block are still part of the global scope and do not need to be qualified with the block name. The buffer object used to store the data for the uniforms is often referred to as a uniform buffer object. We'll see that a uniform buffer object is just a buffer object that is bound to a certain location.

For this recipe, we'll use a simple example to demonstrate the use of uniform buffer objects and uniform blocks. We'll draw a quad (two triangles) with texture coordinates, and use our fragment shader to fill the quad with a fuzzy circle.
The circle is a solid color in the center, but at its edge, it gradually fades to the background color, as shown in the following image.

Getting ready

Start with an OpenGL program that draws two triangles to form a quad. Provide the position at vertex attribute location 0, and the texture coordinate (0 to 1 in each direction) at vertex attribute location 1 (see: Sending data to a shader using per-vertex attributes and vertex buffer objects). We'll use the following vertex shader:

#version 400

layout (location = 0) in vec3 VertexPosition;
layout (location = 1) in vec3 VertexTexCoord;

out vec3 TexCoord;

void main()
{
  TexCoord = VertexTexCoord;
  gl_Position = vec4(VertexPosition,1.0);
}

The fragment shader contains the uniform block, and is responsible for drawing our fuzzy circle:

#version 400

in vec3 TexCoord;
layout (location = 0) out vec4 FragColor;

uniform BlobSettings {
  vec4 InnerColor;
  vec4 OuterColor;
  float RadiusInner;
  float RadiusOuter;
};

void main() {
  float dx = TexCoord.x - 0.5;
  float dy = TexCoord.y - 0.5;
  float dist = sqrt(dx * dx + dy * dy);
  FragColor = mix( InnerColor, OuterColor,
                   smoothstep( RadiusInner, RadiusOuter, dist ) );
}

The uniform block is named BlobSettings. The variables within this block define the parameters of our fuzzy circle. The variable OuterColor defines the color outside of the circle. InnerColor is the color inside of the circle. RadiusInner is the radius of the part of the circle that is a solid color (inside the fuzzy edge), that is, the distance from the center of the circle to the inner edge of the fuzzy boundary. RadiusOuter is the outer edge of the fuzzy boundary of the circle (where the color becomes equal to OuterColor).

The code within the main function computes the distance of the texture coordinate to the center of the quad located at (0.5, 0.5). It then uses that distance to compute the color by using the smoothstep function.
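To get a feel for what the fragment shader computes per pixel, the same arithmetic can be reproduced on the CPU. The following Python sketch mirrors the GLSL built-ins smoothstep and mix as they are commonly defined (clamped Hermite interpolation and linear blend); the function blob_color and the default radii and colors are taken from this recipe's example values and are otherwise hypothetical.

```python
from math import sqrt


def smoothstep(edge0: float, edge1: float, x: float) -> float:
    """Clamped Hermite interpolation, following the GLSL definition."""
    t = max(0.0, min(1.0, (x - edge0) / (edge1 - edge0)))
    return t * t * (3.0 - 2.0 * t)


def mix(a, b, t: float):
    """Component-wise linear blend: a when t == 0, b when t == 1."""
    return tuple(ai * (1.0 - t) + bi * t for ai, bi in zip(a, b))


INNER_COLOR = (1.0, 1.0, 0.75, 1.0)  # solid color at the center
OUTER_COLOR = (0.0, 0.0, 0.0, 0.0)   # color outside the circle


def blob_color(s: float, t: float, radius_inner=0.25, radius_outer=0.45):
    """Color at texture coordinate (s, t), mimicking the fragment shader."""
    dx, dy = s - 0.5, t - 0.5
    dist = sqrt(dx * dx + dy * dy)
    return mix(INNER_COLOR, OUTER_COLOR,
               smoothstep(radius_inner, radius_outer, dist))


print(blob_color(0.5, 0.5))   # center of the quad
print(blob_color(0.85, 0.5))  # inside the fuzzy edge (dist = 0.35)
print(blob_color(0.0, 0.0))   # corner of the quad
```

Sampling a row of coordinates across the quad this way makes the solid core, the fading band, and the background region easy to see before any GPU code runs.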
The smoothstep function returns a value that smoothly varies between 0.0 and 1.0 when the value of the third argument is between the values of the first two arguments. Otherwise it returns 0.0 or 1.0 depending on whether the third argument is less than the first or greater than the second, respectively. The mix function is then used to linearly interpolate between InnerColor and OuterColor based on the value returned by the smoothstep function.

How to do it...

In the OpenGL program, after linking the shader program, use the following steps to send data to the uniform block in the fragment shader:

Get the index of the uniform block using glGetUniformBlockIndex.

GLuint blockIndex = glGetUniformBlockIndex(programHandle, "BlobSettings");

Allocate space for the buffer to contain the data for the uniform block. We get the size of the block using glGetActiveUniformBlockiv.

GLint blockSize;
glGetActiveUniformBlockiv(programHandle, blockIndex,
                          GL_UNIFORM_BLOCK_DATA_SIZE, &blockSize);
GLubyte * blockBuffer = (GLubyte *) malloc(blockSize);

Query for the offset of each variable within the block. To do so, we first find the index of each variable within the block.

// Query for the offsets of each block variable
const GLchar *names[] = { "InnerColor", "OuterColor",
                          "RadiusInner", "RadiusOuter" };
GLuint indices[4];
glGetUniformIndices(programHandle, 4, names, indices);

GLint offset[4];
glGetActiveUniformsiv(programHandle, 4, indices,
                      GL_UNIFORM_OFFSET, offset);

Place the data into the buffer at the appropriate offsets.

GLfloat outerColor[] = {0.0f, 0.0f, 0.0f, 0.0f};
GLfloat innerColor[] = {1.0f, 1.0f, 0.75f, 1.0f};
GLfloat innerRadius = 0.25f, outerRadius = 0.45f;

memcpy(blockBuffer + offset[0], innerColor, 4 * sizeof(GLfloat));
memcpy(blockBuffer + offset[1], outerColor, 4 * sizeof(GLfloat));
memcpy(blockBuffer + offset[2], &innerRadius, sizeof(GLfloat));
memcpy(blockBuffer + offset[3], &outerRadius, sizeof(GLfloat));

Create the OpenGL buffer object and copy the data into it.
GLuint uboHandle;
glGenBuffers( 1, &uboHandle );
glBindBuffer( GL_UNIFORM_BUFFER, uboHandle );
glBufferData( GL_UNIFORM_BUFFER, blockSize, blockBuffer,
              GL_DYNAMIC_DRAW );

Bind the buffer object to the uniform block.

// The second argument is a binding point index. Passing blockIndex assumes
// the block is assigned to the binding point with that same number
// (the assignment can be set explicitly with glUniformBlockBinding).
glBindBufferBase( GL_UNIFORM_BUFFER, blockIndex, uboHandle );

How it works...

Phew! This seems like a lot of work! However, the real advantage comes when using multiple programs where the same buffer object can be used for each program. Let's take a look at each step individually.

First, we get the index of the uniform block by calling glGetUniformBlockIndex, then we query for the size of the block by calling glGetActiveUniformBlockiv. After getting the size, we allocate a temporary buffer named blockBuffer to hold the data for our block.

The layout of data within a uniform block is implementation dependent, and implementations may use different padding and/or byte alignment. So, in order to lay out our data accurately, we need to query for the offset of each variable within the block. This is done in two steps. First, we query for the index of each variable within the block by calling glGetUniformIndices. This accepts an array of variable names (third argument) and returns the indices of the variables in the array indices (fourth argument). Then we use the indices to query for the offsets by calling glGetActiveUniformsiv. When the fourth argument is GL_UNIFORM_OFFSET, this returns the offset of each variable in the array pointed to by the fifth argument. This function can also be used to query for the size and type; however, in this case we choose not to do so, to keep the code simple (albeit less general).

The next step involves filling our temporary buffer blockBuffer with the data for the uniforms at the appropriate offsets. Here we use the standard library function memcpy to accomplish this.

Now that the temporary buffer is populated with the data in the appropriate layout, we can create our buffer object and copy the data into the buffer object.
We call glGenBuffers to generate a buffer handle, and then bind that buffer to the GL_UNIFORM_BUFFER binding point by calling glBindBuffer. The space is allocated within the buffer object and the data is copied when glBufferData is called. We use GL_DYNAMIC_DRAW as the usage hint here, because uniform data may be changed somewhat often during rendering. Of course, this is entirely dependent on the situation.

Finally, we associate the buffer object with the uniform block by calling glBindBufferBase. This function binds to an index within a buffer binding point. Certain binding points are also so-called "indexed buffer targets". This means that the target is actually an array of targets, and glBindBufferBase allows us to bind to one index within the array.

There's more...

If the data for a uniform block needs to be changed at some later time, one can call glBufferSubData to replace all or part of the data within the buffer. If you do so, don't forget to first bind the buffer to the generic binding point GL_UNIFORM_BUFFER.

Using an instance name with a uniform block

A uniform block can have an optional instance name. For example, with our BlobSettings block, we could have used the instance name Blob, as shown here:

uniform BlobSettings {
  vec4 InnerColor;
  vec4 OuterColor;
  float RadiusInner;
  float RadiusOuter;
} Blob;

In this case, the variables within the block are placed within a namespace qualified by the instance name. Therefore our shader code needs to refer to them prefixed with the instance name. For example:

FragColor = mix( Blob.InnerColor, Blob.OuterColor,
                 smoothstep( Blob.RadiusInner, Blob.RadiusOuter, dist ) );

Additionally, we need to qualify the variable names within the OpenGL code when querying for variable indices. The OpenGL specification says that they must be qualified with the block name (BlobSettings). However, my tests using the ATI Catalyst (10.8) drivers required me to use the instance name (Blob).
Using layout qualifiers with uniform blocks

Since the layout of the data within a uniform buffer object is implementation dependent, we had to query for the variable offsets. However, one can avoid this by asking OpenGL to use the standard layout std140. This is accomplished by using a layout qualifier when declaring the uniform block. For example:

layout( std140 ) uniform BlobSettings {
  ...
};

The std140 layout is described in detail within the OpenGL specification document (available at http://www.opengl.org).

Other options for the layout qualifier that apply to uniform block layouts include packed and shared. The packed qualifier simply states that the implementation is free to optimize memory in whatever way it finds necessary (based on variable usage or other criteria). With the packed qualifier, we still need to query for the offsets of each variable. The shared qualifier guarantees that the layout will be consistent between multiple programs and program stages provided that the uniform block declaration does not change. If you are planning to use the same buffer object between multiple programs and/or program stages, it is a good idea to use the shared option.

There are two other layout qualifiers that are worth mentioning: row_major and column_major. These define the ordering of data within the matrix type variables within the uniform block.

One can use multiple qualifiers for a block. For example, to define a block with both the row_major and shared qualifiers, we would use the following syntax:

layout( row_major, shared ) uniform BlobSettings {
  ...
};

Summary

This article covered the topic of Using Uniform Blocks and Uniform Buffer Objects.
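To make the std140 option more concrete, here is a hedged Python sketch of how the BlobSettings data could be packed on the CPU side without querying offsets. It assumes the block is declared with layout(std140), under which a vec4 is aligned to 16 bytes and a float to 4 bytes, giving offsets 0, 16, 32, and 36 for this particular block. The padded size of 48 bytes and the helper pack_blob_settings are assumptions for illustration; a real program should still query GL_UNIFORM_BLOCK_DATA_SIZE.

```python
import struct

# Offsets implied by the std140 rules for this specific block (assumption:
# the shader declares it as layout(std140) uniform BlobSettings { ... }).
OFFSETS = {
    "InnerColor": 0,    # vec4, 16-byte aligned
    "OuterColor": 16,   # vec4, 16-byte aligned
    "RadiusInner": 32,  # float, 4-byte aligned
    "RadiusOuter": 36,  # float, 4-byte aligned
}
BLOCK_SIZE = 48  # members end at byte 40; rounded up to 16 here as an assumption


def pack_blob_settings(inner_color, outer_color, radius_inner, radius_outer):
    """Return the block contents as bytes, laid out at the std140 offsets."""
    buf = bytearray(BLOCK_SIZE)
    # "=" selects native byte order with standard sizes (float32 = 4 bytes).
    struct.pack_into("=4f", buf, OFFSETS["InnerColor"], *inner_color)
    struct.pack_into("=4f", buf, OFFSETS["OuterColor"], *outer_color)
    struct.pack_into("=f", buf, OFFSETS["RadiusInner"], radius_inner)
    struct.pack_into("=f", buf, OFFSETS["RadiusOuter"], radius_outer)
    return bytes(buf)


data = pack_blob_settings(
    inner_color=(1.0, 1.0, 0.75, 1.0),
    outer_color=(0.0, 0.0, 0.0, 0.0),
    radius_inner=0.25,
    radius_outer=0.45,
)
print(len(data), data[32:40].hex())
```

The resulting bytes play the role of blockBuffer in the recipe: they could be handed to whatever call uploads the buffer data in a Python OpenGL binding.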
Valentina Alto
11 Jun 2023
17 min read

Creating a LangChain Agent: Azure OpenAI & Python with the ReAct Approach

In my latest article, we introduced the concept of Agents powered by Large Language Models and how they overcome one of the current limitations of our beloved LLMs: the inability to take action. An Agent can be seen as a kind of wrapper that uses an LLM as a reasoning engine and can also interact with the tools we provide and take actions through them. Tools can range from access to Wikipedia to the ability to interact with our file system or the command line. If the prompt was an important component when working with LLMs, with agents it becomes key. In fact, agents need to be instructed with a reasoning template, which can follow various techniques. We've already seen an example of the Read-Retrieve-Read technique in my latest article. In this article, we are going to explore the ReAct approach.

What is ReAct?

The ReAct (Reason and Act) approach is a general paradigm that combines reasoning and acting with LLMs. It prompts LLMs to generate verbal reasoning traces and actions for a task. As with the Read-Retrieve-Read approach, the ReAct paradigm also involves interaction with external tools to retrieve additional information.
However, with the ReAct approach, we introduce a greater synergy between the reasoning and acting phases:

The reasoning phase helps the model to set up action plans, track them, and modify them when needed (for example, to handle exceptions);
The action phase allows the model to interact with the external world and retrieve the needed information according to the plan of the previous phase.

In the paper introducing ReAct, the authors show how the approach was able to overcome typical drawbacks of LLMs such as hallucination and error propagation (as occurs in the simple version of the Chain of Thought (CoT) method of prompt engineering).

Let's see how those kinds of agents work in practice.

Implementing the Agent

LangChain makes it easier to build agents thanks to lightweight libraries which provide our LLM with the ReAct-based prompt template that makes the agent capable of both reasoning and acting. To achieve this goal, we need to install the wikipedia package and import the following classes:

!pip install wikipedia
from langchain import Wikipedia
from langchain.llms import AzureOpenAI
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain.agents.react.base import DocstoreExplorer

In this case, as an external tool, we will use Wikipedia. However, you can decide to add further tools, such as search APIs like Bing (you can read about that in my previous article) or the file system of your personal computer.

Next, we can build our agent and the document store that it will navigate in order to retrieve information.
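Before wiring up the real components, it may help to see the Thought / Action / Observation cycle in isolation. The following is a toy Python sketch of such a loop, not LangChain's implementation: the "LLM" is a scripted stand-in that replays canned replies, the document store is a one-entry dictionary, and the names run_react, make_scripted_llm, and fake_search are all hypothetical.

```python
import re

# Matches lines such as "Action: Search[Harry Potter (film series)]".
ACTION_RE = re.compile(r"Action: (\w+)\[(.*)\]")


def fake_search(query: str) -> str:
    """Stand-in for a document store lookup (one hard-coded page)."""
    pages = {
        "Harry Potter (film series)": (
            "The series stars Daniel Radcliffe, Rupert Grint, and Emma Watson "
            "as Harry Potter, Ron Weasley, and Hermione Granger."
        )
    }
    return pages.get(query, f"Could not find [{query}].")


def make_scripted_llm(replies):
    """Return a fake LLM that ignores the prompt and replays fixed replies."""
    it = iter(replies)
    return lambda prompt: next(it)


def run_react(question, llm, tools, max_steps=5):
    """Alternate reasoning and acting until the model emits Finish[...]."""
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(prompt)               # reasoning phase: Thought + Action
        prompt += reply + "\n"
        match = ACTION_RE.search(reply)
        if match is None:
            raise ValueError("reply contained no action")
        name, arg = match.groups()
        if name == "Finish":
            return arg                    # final answer
        observation = tools[name](arg)    # acting phase: call the tool
        prompt += f"Observation: {observation}\n"
    raise RuntimeError("no answer within max_steps")


llm = make_scripted_llm([
    "Thought: I should look at the film series page.\n"
    "Action: Search[Harry Potter (film series)]",
    "Thought: Emma Watson plays Hermione Granger.\n"
    "Action: Finish[Hermione Granger]",
])
answer = run_react("Which character does Emma Watson play?", llm,
                   {"Search": fake_search})
print(answer)
```

The growing prompt string is the key design point: each observation is appended so that the next reasoning step can build on what the tools returned.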
To assist the agent with the exploration of the document store, we will use the previously imported class DocstoreExplorer.

docstore=DocstoreExplorer(Wikipedia())
tools = [
    Tool(
        name="Search",
        func=docstore.search,
        description="useful for when you need to ask with search"
    ),
    Tool(
        name="Lookup",
        func=docstore.lookup,
        description="useful for when you need to ask with lookup"
    )
]

Finally, we need to set up the reasoning engine of our Agent. In our case, we will use a text-davinci-003 model available in the Azure OpenAI service (to set up an Azure OpenAI instance, you can read my former article).

llm = AzureOpenAI(deployment_name="text-davinci-003", openai_api_version="xxx", openai_api_key="xxx", openai_api_base="xxx")

Great, now we have all the ingredients needed to initialize our agent. To test my agents, I will take some questions from BeerQA, a more recent version of the question-answering dataset HotpotQA. In this experiment, I will challenge my agent on some reasoning abilities within the world of Harry Potter:

#initializing our agent
react = initialize_agent(tools, llm, agent=AgentType.REACT_DOCSTORE, verbose=True)

#start with questions
question = "Rupert Grint played the friend of which fictional character from the Harry Potter series?"
react.run(question)

Below you can find the whole reasoning and response:

> Entering new AgentExecutor chain...
Thought: The question simplifies to which fictional character from the Harry Potter series is Rupert Grint's friend. I need to search Rupert Grint and Harry Potter series, and find the character.
Action: Search[Rupert Grint]
Observation: Could not find [Rupert Grint].
Similar: ['Rupert Grint', 'Knock at the Cabin', 'Emma Watson', "Harry Potter and the Philosopher's Stone (film)", 'Harry Potter 20th Anniversary: Return to Hogwarts', 'Snatch (TV series)', 'Harry Potter (film series)', 'Servant (TV series)', 'Sick Note (TV series)', 'Georgia Groome'] Thought: To find the character, I can search Harry Potter (film series). Action: Search[Harry Potter (film series)] Observation: Harry Potter is a film series based on the eponymous novels by J. K. Rowling. The series is produced and distributed by Warner Bros. Pictures and consists of eight fantasy films, beginning with Harry Potter and the Philosopher's Stone (2001) and culminating with Harry Potter and the Deathly Hallows – Part 2 (2011). A spin-off prequel series, planned to consist of five films, started with Fantastic Beasts and Where to Find Them (2016), marking the beginning of the Wizarding World shared media franchise.The series was mainly produced by David Heyman, and stars Daniel Radcliffe, Rupert Grint, and Emma Watson as the three leading characters: Harry Potter, Ron Weasley, and Hermione Granger. Four directors worked on the series: Chris Columbus, Alfonso Cuarón, Mike Newell, and David Yates. Michael Goldenberg wrote the screenplay for Harry Potter and the Order of the Phoenix (2007), while the remaining films' screenplays were written by Steve Kloves. Production took place over ten years, with the main story arc following Harry's quest to overcome his arch-enemy Lord Voldemort.Harry Potter and the Deathly Hallows, the seventh and final novel in the series, was adapted into two feature-length parts. Part 1 was released in November 2010, and Part 2 was released in July 2011.Philosopher's Stone and Deathly Hallows – Part 2 are among the 50 highest-grossing films of all time—at 50th- and 16th-highest, respectively—with both grossing over $1 billion. It is the fourth-highest-grossing film series, with $7.7 billion in worldwide receipts. 
Thought: Rupert Grint played the friend of Harry Potter, Ron Weasley.
Action: Finish[Ron Weasley]
> Finished chain.
'Ron Weasley'

Now, you might think that this was a very easy question to address (and be right). Let's challenge it with some more complex reasoning tasks:

question = "Which of the Harry Potter film series main stars debuted in stage acting first?"
react.run(question)

Below is the response:

> Entering new AgentExecutor chain...
Thought: I need to search the main stars of the Harry Potter film series, find their stage acting debut, then find which debuted first.
Action: Search[Harry Potter film series main stars]
Observation: Harry Potter is a film series based on the eponymous novels by J. K. Rowling. The series is produced and distributed by Warner Bros. Pictures and consists of eight fantasy films, beginning with Harry Potter and the Philosopher's Stone (2001) and culminating with Harry Potter and the Deathly Hallows – Part 2 (2011). A spin-off prequel series, planned to consist of five films, started with Fantastic Beasts and Where to Find Them (2016), marking the beginning of the Wizarding World shared media franchise. The series was mainly produced by David Heyman, and stars Daniel Radcliffe, Rupert Grint, and Emma Watson as the three leading characters: Harry Potter, Ron Weasley, and Hermione Granger. Four directors worked on the series: Chris Columbus, Alfonso Cuarón, Mike Newell, and David Yates. Michael Goldenberg wrote the screenplay for Harry Potter and the Order of the Phoenix (2007), while the remaining films' screenplays were written by Steve Kloves. Production took place over ten years, with the main story arc following Harry's quest to overcome his arch-enemy Lord Voldemort. Harry Potter and the Deathly Hallows, the seventh and final novel in the series, was adapted into two feature-length parts.
Part 1 was released in November 2010, and Part 2 was released in July 2011.Philosopher's Stone and Deathly Hallows – Part 2 are among the 50 highest-grossing films of all time—at 50th- and 16th-highest, respectively—with both grossing over $1 billion. It is the fourth-highest-grossing film series, with $7.7 billion in worldwide receipts. Thought: The main stars of the Harry Potter film series are Daniel Radcliffe, Rupert Grint, and Emma Watson. I need to search each of them and find their stage acting debut. Action: Search[Daniel Radcliffe] Observation: Daniel Jacob Radcliffe (born 23 July 1989) is an English actor. He rose to fame at age twelve, when he began portraying Harry Potter in the film series of the same name. Over his career, Radcliffe has received various awards and nominations. Radcliffe made his acting debut at age 10 in the BBC One television film David Copperfield (1999), followed by his feature film debut in The Tailor of Panama (2001). The same year, he starred as Harry Potter in the film adaptation of the J.K. Rowling fantasy novel, Harry Potter and the Philosopher's Stone. Over the next decade, he played the eponymous role in seven sequels, culminating with Harry Potter and the Deathly Hallows – Part 2 (2011). During this period, he became one of the world's highest-paid actors and gained worldwide fame, popularity, and critical acclaim. Following the success of Harry Potter, Radcliffe challenged himself acting in a variety of genres starring in What If? (2013), The Woman in Black (2012), Victor Frankenstein (2015), Swiss Army Man (2016), Now You See Me 2 (2016), and The Lost City (2022). He also portrayed Allen Ginsberg in the drama film Kill Your Darlings (2013), FBI agent Nate Foster in the crime drama Imperium (2016), and Weird Al Yankovic in the musical film Weird: The Al Yankovic Story (2022). Since 2019, he has starred in the TBS anthology series Miracle Workers. 
Radcliffe branched out to stage acting in 2007, starring in the West End and Broadway productions of Equus. He returned to Broadway in the musical How to Succeed in Business Without Really Trying (2011), earning a Grammy Award nomination, and Martin McDonagh's The Cripple of Inishmaan (2014), the satirical play The Lifespan of a Fact (2018), and Stephen Sondheim's Merrily We Roll Along (2023). He returned to the West End in revivals of Tom Stoppard's Rosencrantz and Guildenstern Are Dead (2017), and Samuel Becket's Endgame (2020). Radcliffe has contributed to many charities, including Demelza Hospice Care for Children and the Trevor Project; the latter awarded him its Hero Award in 2011 for his advocacy with LGBTQ youth. Thought: Daniel Radcliffe made his stage acting debut in the West End and Broadway productions of Equus in 2007. I need to search Rupert Grint and Emma Watson next and find their stage acting debut. Action: Search[Rupert Grint] Observation: Could not find [Rupert Grint]. Similar: ['Rupert Grint', 'Knock at the Cabin', 'Emma Watson', "Harry Potter and the Philosopher's Stone (film)", 'Harry Potter 20th Anniversary: Return to Hogwarts', 'Snatch (TV series)', 'Harry Potter (film series)', 'Servant (TV series)', 'Sick Note (TV series)', 'Georgia Groome'] Thought: To find Rupert Grint's stage acting debut, I can search Rupert Grint. Action: Search[Rupert Grint] Observation: Could not find [Rupert Grint]. Similar: ['Rupert Grint', 'Knock at the Cabin', 'Emma Watson', "Harry Potter and the Philosopher's Stone (film)", 'Harry Potter 20th Anniversary: Return to Hogwarts', 'Snatch (TV series)', 'Harry Potter (film series)', 'Servant (TV series)', 'Sick Note (TV series)', 'Georgia Groome'] Thought: To find Rupert Grint's stage acting debut, I can search Rupert Grint stage acting. Action: Search[Rupert Grint stage acting] Observation: Rupert Alexander Lloyd Grint (; born 24 August 1988) is an English actor. 
Grint rose to fame for his role as Ron Weasley in the Harry Potter film series, for which he was cast at age eleven, having previously acted only in school plays and his local theatre group. Since then, he continued his work on film, television, and theatre. Beginning in 2002, he began to work outside of the Harry Potter franchise, with a co-leading role in Thunderpants. He starred in the dramedy Driving Lessons (2006) and the drama Cherrybomb (2009) and co-starred in the comedy Wild Target (2010). After the Harry Potter series concluded, he appeared in the films Into the White (2012), Charlie Countryman (2013), CBGB (2013), and Moonwalkers (2015). He made his stage debut in 2013 in Mojo at the Harold Pinter Theatre. He executive-produced and starred in the television series Snatch, based on the film of the same name. He has also starred in the black comedy series Sick Note, the mystery thriller miniseries The ABC Murders, and the psychological horror series Servant. Thought: Rupert Grint made his stage acting debut in Mojo at the Harold Pinter Theatre in 2013. I need to search Emma Watson next and find her stage acting debut. Action: Search[Emma Watson] Observation: Emma Charlotte Duerre Watson (born 15 April 1990) is an English actress, model and activist. Known for her roles in both blockbusters and independent films, as well as for her women's rights work, she has received a selection of accolades, including a Young Artist Award and three MTV Movie Awards. Watson has been ranked among the world's highest-paid actresses by Forbes and Vanity Fair, and was named one of the 100 most influential people in the world by Time magazine in 2015.Watson attended the Dragon School and trained in acting at the Oxford branch of Stagecoach Theatre Arts. As a child, she rose to stardom after landing her first professional acting role as Hermione Granger in the Harry Potter film series, having previously acted only in school plays. 
Watson also starred in the 2007 television adaptation of the novel Ballet Shoes and lent her voice to The Tale of Despereaux (2008). After the final Harry Potter film, she took on a supporting role in My Week with Marilyn (2011), before starring as Sam, a flirtatious, free-spirited student in The Perks of Being a Wallflower (2012), to critical success. Further acclaim came from portraying Alexis Neiers in Sofia Coppola's The Bling Ring (2013) and the titular character's adoptive daughter in Darren Aronofsky's biblical epic Noah (2014). That same year, Watson was honoured by the British Academy of Film and Television Arts, winning British Artist of the Year. She also starred as Belle in the live-action musical romantic fantasy Beauty and the Beast (2017), and Meg March in Greta Gerwig's coming-of-age drama Little Women (2019). From 2011 to 2014, Watson split her time between working on films and continuing her education, graduating from Brown University with a bachelor's degree in English literature in May 2014. That year, she was appointed a UN Women Goodwill Ambassador and helped launch the UN Women campaign HeForShe, which advocates for gender equality. In 2018, she helped launch Time's Up UK as a founding member. Watson was appointed to a G7 advisory body for women's rights in 2019, consulting with leaders on foreign policy. Her modelling work has included campaigns for Burberry and Lancôme. She also lent her name to a clothing line for the sustainable brand People Tree. In 2020, she joined the board of directors of Kering, a luxury brand group, in her capacity as an advocate for sustainable fashion. Thought: Emma Watson made her stage acting debut in the 2007 television adaptation of the novel Ballet Shoes. Daniel Radcliffe made his stage acting debut in 2007 (Equus), while Rupert Grint and Emma Watson made their stage acting debut in 2013 (Mojo and Ballet Shoes, respectively). So Daniel Radcliffe debuted in stage acting first. 
Action: Finish[Daniel Radcliffe]
> Finished chain.
'Daniel Radcliffe'

In this case, you can see that the Agent needs to plan several steps to find the right information. It is impressive to see how it reformulates its query for the tool it has been given until a search returns a usable result. At the end of the chain of thought, it gives the correct answer, even though its final Thought mistakes Emma Watson's 2007 television role in Ballet Shoes for a 2013 stage debut.

For the sake of HP fandom (which I’m part of), let’s also ask something that is only present in the books:

question = "In Harry Potter and the Goblet of Fire, in the third task to get to the Goblet a riddle of which magical creature needs to be solved?"
react.run(question)

Below is the result:

> Entering new AgentExecutor chain...
Thought: I need to search Harry Potter and the Goblet of Fire, find the third task, and find the magical creature.
Action: Search[Harry Potter and the Goblet of Fire]
Observation: Harry Potter and the Goblet of Fire is a fantasy novel written by British author J. K. Rowling and the fourth novel in the Harry Potter series. It follows Harry Potter, a wizard in his fourth year at Hogwarts School of Witchcraft and Wizardry, and the mystery surrounding the entry of Harry's name into the Triwizard Tournament, in which he is forced to compete. The book was published in the United Kingdom by Bloomsbury and in the United States by Scholastic. In both countries, the release date was 8 July 2000. This was the first time a book in the series was published in both countries at the same time. The novel won a Hugo Award, the only Harry Potter novel to do so, in 2001. The book was adapted into a film, released worldwide on 18 November 2005, and a video game by Electronic Arts.
Thought: The paragraph does not mention the third task. I need to look up third task.
Action: Lookup[third task]
Observation: No Results
Thought: I need to instead search Harry Potter and the Goblet of Fire third task.
Action: Search[Harry Potter and the Goblet of Fire third task]
Observation: Harry Potter and the Goblet of Fire is a fantasy novel written by British author J. K. Rowling and the fourth novel in the Harry Potter series. It follows Harry Potter, a wizard in his fourth year at Hogwarts School of Witchcraft and Wizardry, and the mystery surrounding the entry of Harry's name into the Triwizard Tournament, in which he is forced to compete. The book was published in the United Kingdom by Bloomsbury and in the United States by Scholastic. In both countries, the release date was 8 July 2000. This was the first time a book in the series was published in both countries at the same time. The novel won a Hugo Award, the only Harry Potter novel to do so, in 2001. The book was adapted into a film, released worldwide on 18 November 2005, and a video game by Electronic Arts.
Thought: The third task is to get to the Goblet, and the riddle of a magical creature needs to be solved. The magical creature is a sphinx.
Action: Finish[sphinx]
> Finished chain.
'sphinx'

Well, we can say that our Agent did a pretty good job. The next challenge will be navigating through the Marvel multiverse 😎

Stay tuned for the next article!

References
https://medium.com/@valentinaalto/introducing-langchain-agents-e58674b1a657
[2210.03629] ReAct: Synergizing Reasoning and Acting in Language Models (arxiv.org)
Plan and Execute - 🦜🔗 LangChain 0.0.168
Agents - 🦜🔗 LangChain 0.0.168
File System Tools - 🦜🔗 LangChain 0.0.168
Tools - 🦜🔗 LangChain 0.0.168
https://hotpotqa.github.io/
https://nlp.stanford.edu/projects/beerqa/beerqa_train_v1.0.json

Author Bio
Valentina Alto graduated in 2021 in data science. Since 2020, she has been working at Microsoft as an Azure solution specialist, and since 2022, she has been focusing on data and AI workloads within the manufacturing and pharmaceutical industries.
She has been working closely with system integrators on customer projects to deploy cloud architecture with a focus on modern data platforms, data mesh frameworks, IoT and real-time analytics, Azure Machine Learning, Azure Cognitive Services (including Azure OpenAI Service), and Power BI for dashboarding. Since commencing her academic journey, she has been writing tech articles on statistics, machine learning, deep learning, and AI in various publications and has authored a book on the fundamentals of machine learning with Python.

Author of the book: Modern Generative AI with ChatGPT and OpenAI Models

Alan Bernardo Palacio
12 Sep 2023
9 min read

Using LLM Chains in Rust

Introduction
The llm-chain is a Rust library designed to make your experience with large language models (LLMs) smoother and more powerful. In this tutorial, we'll walk you through the steps of installing Rust, setting up a new project, and getting started with the versatile capabilities of LLM-Chain.

This guide will break down the process step by step, using simple language, so you can confidently explore the potential of LLM-Chain in your projects.

Installation
Before we dive into the exciting world of LLM-Chain, let's start with the basics. To begin, you'll need to install Rust on your computer. By using rustup, the official Rust toolchain manager, you can ensure you have the latest version and easily manage your installations. We recommend having Rust version 1.65.0 or higher. If you encounter errors related to unstable features or dependencies requiring a newer Rust version, simply update your Rust version (for example, with rustup update). Just follow the instructions provided on the rustup website to get Rust up and running.

With Rust now installed on your machine, let's set up a new project. This step is essential to create an organized space for your work with LLM-Chain. To do this, you'll use a simple command-line instruction. Open up your terminal and run the following command:

cargo new --bin my-llm-project

By executing this command, a new directory named "my-llm-project" will be created. This directory contains all the necessary files and folders for a Rust project.

Embracing the Power of LLM-Chain
Now that you have your Rust project folder ready, it's time to integrate the capabilities of LLM-Chain. This library simplifies your interaction with LLMs and empowers you to create remarkable applications. Adding LLM-Chain to your project is a breeze.
Navigate to your project directory by using the terminal and run the following commands:

cd my-llm-project
cargo add llm-chain

By running these commands, LLM-Chain will become a part of your project, and the dependency will be recorded in the "Cargo.toml" file.

LLM-Chain offers flexibility by supporting multiple drivers for different LLMs. You can choose between the LLAMA driver, which runs a LLaMA LLM on your machine, and the OpenAI driver, which connects to the OpenAI API. For the purpose of simplicity and a quick start, we'll be using the OpenAI driver in this tutorial.

To choose the OpenAI driver, execute this command:

cargo add llm-chain-openai

In the next section, we'll generate your very first LLM output using the OpenAI driver by exploring sequential chains with Rust.

Exploring Sequential Chains with Rust
In LLM-Chain, sequential chains let you orchestrate a sequence of steps where the output of each step flows into the next. This hands-on section serves as your guide to crafting a sequential chain, expanding its capabilities with additional steps, and gaining insights into best practices and tips that ensure your success.

Let's kick things off by preparing our project environment.

As we delve into creating sequential chains, one crucial prerequisite is the installation of tokio in your project. While this tutorial enables the full feature set of the tokio crate, remember that in production scenarios, it's recommended to be more selective about which features you enable. To set the stage, run the following command in your terminal:

cargo add tokio --features full

This step ensures that your project has the asynchronous runtime that sequential chains need. Before we continue, ensure that you've set your OpenAI API key in the OPENAI_API_KEY environment variable.
Here's how:

export OPENAI_API_KEY="YOUR_OPEN_AI_KEY"

With your environment ready, let’s look at the full implementation code. In this case, we will use a chain to generate recommendations of places to visit in a city, format them, and summarize the results over a series of steps:

use llm_chain::parameters;
use llm_chain::step::Step;
use llm_chain::traits::Executor as ExecutorTrait;
use llm_chain::{chains::sequential::Chain, prompt};
use llm_chain_openai::chatgpt::Executor;

#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a new ChatGPT executor with default settings
    let exec = Executor::new()?;

    // Create a chain of three steps
    let chain: Chain = Chain::new(vec![
        // First step: ask for places to visit in the given city and country
        Step::for_prompt_template(
            prompt!("You are a bot for travel assistance research",
                "Find good places to visit in this city {{city}} in this country {{country}}. Include their name")
        ),
        // Second step: format the recommendations as bullet points.
        // Notably, the text parameter takes the output of the previous prompt.
        Step::for_prompt_template(
            prompt!(
                "You are an assistant for managing social media accounts for a travel company",
                "Format the information into 5 bullet points for the most relevant places. \n--\n{{text}}")
        ),
        // Third step: summarize the bullet points into a LinkedIn post for the
        // company page, and sprinkle in some emojis for flair.
        Step::for_prompt_template(
            prompt!(
                "You are an assistant for managing social media accounts for a travel company",
                "Summarize these recommendations into a LinkedIn post for the company page, and feel free to use emojis! \n--\n{{text}}")
        )
    ]);

    // Execute the chain with provided parameters
    let result = chain
        .run(
            // Create a Parameters object with key-value pairs for the placeholders
            parameters!("city" => "Rome", "country" => "Italy"),
            &exec,
        )
        .await
        .unwrap();

    // Display the result on the console
    println!("{}", result.to_immediate().await?.as_content());
    Ok(())
}

The provided code initiates a multi-step process using the llm_chain and llm_chain_openai libraries. First, it sets up a ChatGPT executor with default configurations. Next, it creates a chain of sequential steps, each designed to produce a specific text output. The first step asks for travel recommendations: places to visit in a particular city and country, which are supplied through the {{city}} and {{country}} placeholders. The second step formats that output into five bullet points, receiving the text output of the previous step through the {{text}} placeholder. Lastly, the third step summarizes the bullet points into a LinkedIn post for a travel company's page, adding emojis for extra appeal.

The chain is executed with a Parameters object containing key-value pairs for the placeholders: "city" (set to "Rome") and "country" (set to "Italy"). The generated content is then displayed on the console. This code represents a structured workflow for generating travel-related content using ChatGPT.

Running the Code
Now it's time to compile and run the code.
Execute the following command in your terminal:

cargo run

As the code executes, the sequential chain orchestrates the different prompts, generating content that flows through each step. We can see the results of the model as a bulleted list of travel recommendations.

Conclusion
The llm-chain Rust library serves as your gateway to accessing large language models (LLMs) within the Rust programming language. This tutorial has been your guide to the fundamental steps necessary to use LLM-Chain.

We began with the foundational elements, guiding you through the process of installing Rust and integrating llm-chain into your project using Cargo. We then delved into the practical application of LLM-Chain by configuring it with the OpenAI driver, emphasizing the use of sequential chains. This approach lets you construct sequences of steps, where each step's output feeds into the next. As a practical example, we demonstrated how to create a travel recommendation chain that generates a concise post for a destination, suitable for sharing on LinkedIn.

It's important to note that LLM-Chain offers even more possibilities for exploration. You can extend its capabilities by running local models such as LLaMA through the LLAMA driver, or you can venture into the realm of map-reduce chains. Feel free to continue your exploration and unlock the full potential of LLM-Chain in your projects. See you in the next article.

Author Bio
Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries.
He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.
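As a supplement to the sequential chain described in this article, the data flow between steps can be sketched without any LLM at all. The following is a minimal illustration in Python, not Rust, and it does not use llm-chain: the names render and run_chain and the stand-in model function are hypothetical, and the only idea taken from the article is that each step's output becomes the {{text}} parameter of the next step.

```python
import re
from typing import Callable, Dict, List


def render(template: str, params: Dict[str, str]) -> str:
    """Replace {{name}} placeholders with the matching values from params."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: params[m.group(1)], template)


def run_chain(templates: List[str], params: Dict[str, str],
              model: Callable[[str], str]) -> str:
    """Run the prompts in order; each output becomes {{text}} for the next step."""
    current = dict(params)  # copy so the caller's parameters are not modified
    output = ""
    for template in templates:
        output = model(render(template, current))
        current["text"] = output
    return output


if __name__ == "__main__":
    # Stand-in for a real LLM call (an assumption): it only wraps the prompt.
    def fake_model(prompt: str) -> str:
        return f"<{prompt}>"

    result = run_chain(
        ["Places in {{city}}, {{country}}", "Bullets: {{text}}", "Post: {{text}}"],
        {"city": "Rome", "country": "Italy"},
        fake_model,
    )
    print(result)
```

Swapping fake_model for a function that calls a real model would give the same step-to-step flow as the chain shown in the article.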

Savia Lobo
29 Sep 2018
5 min read

Working with Azure container service cluster [Tutorial]

Azure Container Service (ACS) is a new variant of the classic Azure IaaS offering and uses virtual machines as its technological base. This tutorial is an excerpt taken from the book Implementing Azure Solutions, written by Florian Klaffenbach, Jan-Henrik Damaschke, and Oliver Michalski. In this book, you will learn how to secure a newly deployed Azure Active Directory and how Azure Active Directory synchronization can be implemented. Today, you will learn how to work with an Azure Container Service (ACS) cluster.

As an important prerequisite for working with the cluster, you need the Master FQDN (a URL). The Master FQDN can be found in the Essentials section of the dashboard of your container service. Now that we know the important prerequisites, we can go further.

Each of the three available orchestrators provides you with a work surface. To work with these UIs, you must first create an SSH tunnel. You don't know how to create an SSH tunnel? Then let me walk you through the procedure by way of an example. I assume that you have followed my earlier advice and have installed the PuTTY toolset. I also assume that your SSH key pair is available. Everything okay? Then let's start:

1. Search for the PuTTY tool and open it.
2. On the first page, fill in the Host Name field. The hostname is composed of the admin username (from your cluster) and the Master FQDN and has the format adminusername@masterfqdn. Change the Port to 2200.
3. Switch to the Connection | SSH | Auth page. Here you enter the path to your SSH private key.
4. Move to the Connection | SSH | Tunnels page. In the Add new forwarded port section, type 80 in the Source port field and localhost:80 in the Destination field. Finally, press the Add button.
5. Go back to the first page. Press the Open button and the SSH tunnel is established.

Have you created your SSH tunnel now? If yes, we will now take a look at the work surfaces.
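For readers who prefer a command line to the PuTTY dialogs, the tunnel settings above can also be expressed as an OpenSSH command. The Python sketch below only assembles that command; it is an illustration under assumptions, not part of the book's procedure. The helper name build_tunnel_command and the example user name, FQDN, and key path are hypothetical, an OpenSSH client is assumed to be installed, and the private key must be in OpenSSH format rather than PuTTY's .ppk format.

```python
import shlex


def build_tunnel_command(admin_user, master_fqdn, key_path,
                         ssh_port=2200, local_port=80, remote_port=80):
    """Assemble an OpenSSH command that mirrors the PuTTY settings.

    Hypothetical helper for illustration. The defaults follow the tutorial:
    SSH port 2200, source port 80, destination localhost:80.
    """
    return [
        "ssh",
        "-i", key_path,                                  # private key (PuTTY: Connection | SSH | Auth)
        "-p", str(ssh_port),                             # SSH port of the master
        "-L", f"{local_port}:localhost:{remote_port}",   # forwarded port (PuTTY: Connection | SSH | Tunnels)
        "-N",                                            # forward ports only, do not open a remote shell
        f"{admin_user}@{master_fqdn}",                   # adminusername@masterfqdn
    ]


if __name__ == "__main__":
    # Placeholder values (assumptions); replace them with your own cluster data.
    command = build_tunnel_command("azureuser", "mycluster.example.com", "~/.ssh/id_rsa")
    print(shlex.join(command))
```

On Linux and macOS, binding local port 80 usually requires elevated privileges, so a higher local port such as 8080 may be more convenient there; the UI URLs then change accordingly.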
Remember, since we have created our cluster based on DC/OS, these are the UIs of DC/OS. The UIs of the other orchestrators are very similar.

Let's start with the DC/OS dashboard. To reach this UI, enter the following URL into the browser of your choice:

http://localhost:80/

With the DC/OS dashboard, you can monitor the performance indicators of your cluster or display the health status of individual components. If you want to see the health status of individual components, click the View all 35 Components button in the Component Health tile. A detailed list with the corresponding status information will open.

In the DC/OS dashboard, you will also find another interesting application. Simply press the Universe button in the navigation area. This starts Mesosphere Universe, a package repository that contains services like Spark, Cassandra, Jenkins, and many others that can be easily deployed onto your cluster. In addition to the solutions provided by Mesosphere, there are also solutions from the community. Just scroll down the page.

The next area is the MARATHON orchestration platform. To reach this UI, enter the following URL into the browser of your choice:

http://localhost:80/marathon/

With this UI, you can start a new container and other types of application in the cluster. In addition, the UI provides information on running containers and applications and is continuously involved in scheduling tasks.

Two short examples: The first is the dialog for creating a group. A group in DC/OS is a collection of apps (services, and so on) that are related to each other (for example, through the organization). The second is the dialog for creating a New Application. An application is a long-running service that may have one or more instances that map one to one with a task. Internally, the user creates an application by providing an application definition (a JSON file).
Marathon then schedules one or more application instances as tasks, depending on how many the definition specifies.

In the original concept of DC/OS, there is still another part, the pods. A pod is a special case of an application. A pod is also a long-running service that may have one or more instances, but these map one to many with collocated tasks. Pod instances may include one or more tasks that share resources (for example, IPs, ports, or volumes). Pods are currently not supported by ACS.

The last area is Mesos. Mesos is a web UI for viewing the cluster state and, above all, tasks. To reach this UI, enter the following URL into the browser of your choice:

http://localhost:80/mesos/

The UI has views for Tasks, Frameworks, Agents, and Offers. A framework running in Mesos consists of two components: a scheduler that registers with the master to be offered resources, and an executor process that is launched on agent nodes to run the tasks. The Agents view shows the agents and their conditions. An offer in Mesos is simply a resource (for example, a container) that is assigned to a framework for processing.

In this post, we learned how to work with an Azure Container Service cluster. If you've enjoyed this post, head over to the book Implementing Azure Solutions to learn more about how to manage, access, and secure your confidential data and how to implement storage solutions.

Savia Lobo
04 Oct 2018
10 min read

Creating a Continuous Integration commit pipeline using Docker [Tutorial]

The most basic Continuous Integration process is called a commit pipeline. This classic phase, as its name says, starts with a commit (or push in Git) to the main repository and results in a report about the build success or failure. Since it runs after each change in the code, the build should take no more than 5 minutes and should consume a reasonable amount of resources.

This tutorial is an excerpt taken from the book Continuous Delivery with Docker and Jenkins, written by Rafał Leszko. This book provides steps to build applications on Docker files and integrate them with Jenkins using continuous delivery processes such as continuous integration, automated acceptance testing, and configuration management. In this article, you will learn how to create a Continuous Integration commit pipeline using Docker.

The commit phase is always the starting point of the Continuous Delivery process, and it provides the most important feedback cycle in the development process: constant information on whether the code is in a healthy state. A developer checks in the code to the repository, the Continuous Integration server detects the change, and the build starts.

The most fundamental commit pipeline contains three stages:
Checkout: This stage downloads the source code from the repository
Compile: This stage compiles the source code
Unit test: This stage runs a suite of unit tests

Let's create a sample project and see how to implement the commit pipeline. This is an example of a pipeline for a project that uses technologies such as Git, Java, Gradle, and Spring Boot. Nevertheless, the same principles apply to any other technology.

Checkout
Checking out code from the repository is always the first operation in any pipeline. In order to see this, we need to have a repository. Then, we will be able to create a pipeline.

Creating a GitHub repository
Creating a repository on the GitHub server takes just a few steps:
Go to the https://github.com/ page.
Create an account if you don't have one yet. Click on New repository. Give it a name, calculator. Tick Initialize this repository with a README. Click on Create repository. Now, you should see the address of the repository, for example, https://github.com/leszko/calculator.git. Creating a checkout stage We can create a new pipeline called calculator and, as Pipeline script, put the code with a stage called Checkout: pipeline { agent any stages { stage("Checkout") { steps { git url: 'https://github.com/leszko/calculator.git' } } } } The pipeline can be executed on any of the agents, and its only step does nothing more than downloading code from the repository. We can click on Build Now and see if it was executed successfully. Note that the Git toolkit needs to be installed on the node where the build is executed. When we have the checkout, we're ready for the second stage. Compile In order to compile a project, we need to: Create a project with the source code. Push it to the repository. Add the Compile stage to the pipeline. Creating a Java Spring Boot project Let's create a very simple Java project using the Spring Boot framework built by Gradle. Spring Boot is a Java framework that simplifies building enterprise applications. Gradle is a build automation system that is based on the concepts of Apache Maven. The simplest way to create a Spring Boot project is to perform the following steps: Go to the http://start.spring.io/ page. Select Gradle project instead of Maven project (you can also leave Maven if you prefer it to Gradle). Fill Group and Artifact (for example, com.leszko and calculator). Add Web to Dependencies. Click on Generate Project. The generated skeleton project should be downloaded (the calculator.zip file). 
The following screenshot presents the http://start.spring.io/ page: Pushing code to GitHub We will use the Git tool to perform the commit and push operations: In order to run the git command, you need to have the Git toolkit installed (it can be downloaded from https://git-scm.com/downloads). Let's first clone the repository to the filesystem: $ git clone https://github.com/leszko/calculator.git Extract the project downloaded from http://start.spring.io/ into the directory created by Git. If you prefer, you can import the project into IntelliJ, Eclipse, or your favorite IDE tool. As a result, the calculator directory should have the following files: $ ls -a . .. build.gradle .git .gitignore gradle gradlew gradlew.bat README.md src In order to perform the Gradle operations locally, you need to have Java JDK installed (in Ubuntu, you can do it by executing sudo apt-get install -y default-jdk). We can compile the project locally using the following code: $ ./gradlew compileJava In the case of Maven, you can run ./mvnw compile. Both Gradle and Maven compile the Java classes located in the src directory. You can find all possible Gradle instructions (for the Java project) at https://docs.gradle.org/current/userguide/java_plugin.html. Now, we can commit and push to the GitHub repository: $ git add . $ git commit -m "Add Spring Boot skeleton" $ git push -u origin master After running the git push command, you will be prompted to enter the GitHub credentials (username and password). The code is already in the GitHub repository. If you want to check it, you can go to the GitHub page and see the files. Creating a compile stage We can add a Compile stage to the pipeline using the following code: stage("Compile") { steps { sh "./gradlew compileJava" } } Note that we used exactly the same command locally and in the Jenkins pipeline, which is a very good sign because the local development process is consistent with the Continuous Integration environment. 
After running the build, you should see two green boxes. You can also check that the project was compiled correctly in the console log. Unit test It's time to add the last stage that is Unit test, which checks if our code does what we expect it to do. We have to: Add the source code for the calculator logic Write unit test for the code Add a stage to execute the unit test Creating business logic The first version of the calculator will be able to add two numbers. Let's add the business logic as a class in the src/main/java/com/leszko/calculator/Calculator.java file: package com.leszko.calculator; import org.springframework.stereotype.Service; @Service public class Calculator { int sum(int a, int b) { return a + b; } } To execute the business logic, we also need to add the web service controller in a separate file src/main/java/com/leszko/calculator/CalculatorController.java: package com.leszko.calculator; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.web.bind.annotation.RequestMapping; import org.springframework.web.bind.annotation.RequestParam; import org.springframework.web.bind.annotation.RestController; @RestController class CalculatorController { @Autowired private Calculator calculator; @RequestMapping("/sum") String sum(@RequestParam("a") Integer a, @RequestParam("b") Integer b) { return String.valueOf(calculator.sum(a, b)); } } This class exposes the business logic as a web service. We can run the application and see how it works: $ ./gradlew bootRun It should start our web service and we can check that it works by navigating to the browser and opening the page http://localhost:8080/sum?a=1&b=2. This should sum two numbers ( 1 and 2) and show 3 in the browser. Writing a unit test We already have the working application. How can we ensure that the logic works as expected? We have tried it once, but in order to know constantly, we need a unit test. 
In our case, it will be trivial, maybe even unnecessary; however, in real projects, unit tests can save from bugs and system failures. Let's create a unit test in the file src/test/java/com/leszko/calculator/CalculatorTest.java: package com.leszko.calculator; import org.junit.Test; import static org.junit.Assert.assertEquals; public class CalculatorTest { private Calculator calculator = new Calculator(); @Test public void testSum() { assertEquals(5, calculator.sum(2, 3)); } } We can run the test locally using the ./gradlew test command. Then, let's commit the code and push it to the repository: $ git add . $ git commit -m "Add sum logic, controller and unit test" $ git push Creating a unit test stage Now, we can add a Unit test stage to the pipeline: stage("Unit test") { steps { sh "./gradlew test" } } In the case of Maven, we would have to use ./mvnw test. When we build the pipeline again, we should see three boxes, which means that we've completed the Continuous Integration pipeline: Placing the pipeline definition inside Jenkinsfile All the time, so far, we created the pipeline code directly in Jenkins. This is, however, not the only option. We can also put the pipeline definition inside a file called Jenkinsfile and commit it to the repository together with the source code. This method is even more consistent because the way your pipeline looks is strictly related to the project itself. For example, if you don't need the code compilation because your programming language is interpreted (and not compiled), then you won't have the Compile stage. The tools you use also differ depending on the environment. We used Gradle/Maven because we've built the Java project; however, in the case of a project written in Python, you could use PyBuilder. It leads to the idea that the pipelines should be created by the same people who write the code, developers. Also, the pipeline definition should be put together with the code, in the repository. 
This approach brings immediate benefits, as follows:
In case of a Jenkins failure, the pipeline definition is not lost (because it's stored in the code repository, not in Jenkins)
The history of the pipeline changes is stored
Pipeline changes go through the standard code development process (for example, they are subjected to code reviews)
Access to the pipeline changes is restricted in exactly the same way as access to the source code

Creating Jenkinsfile
We can create the Jenkinsfile and push it to our GitHub repository. Its content is almost the same as the commit pipeline we wrote. The only difference is that the checkout stage becomes redundant because Jenkins has to check out the code (together with Jenkinsfile) first and then read the pipeline structure (from Jenkinsfile). This is why Jenkins needs to know the repository address before it reads Jenkinsfile. Let's create a file called Jenkinsfile in the root directory of our project:

pipeline {
    agent any
    stages {
        stage("Compile") {
            steps {
                sh "./gradlew compileJava"
            }
        }
        stage("Unit test") {
            steps {
                sh "./gradlew test"
            }
        }
    }
}

We can now commit the added files and push to the GitHub repository:

$ git add .
$ git commit -m "Add sum Jenkinsfile"
$ git push

Running pipeline from Jenkinsfile
When Jenkinsfile is in the repository, all we have to do is open the pipeline configuration and, in the Pipeline section:
Change Definition from Pipeline script to Pipeline script from SCM
Select Git in SCM
Put https://github.com/leszko/calculator.git in Repository URL

After saving, the build will always run from the current version of Jenkinsfile in the repository. We have successfully created the first complete commit pipeline. It can be treated as a minimum viable product, and actually, in many cases, it's sufficient as the Continuous Integration process. In the next sections, we will see what improvements can be made to make the commit pipeline even better.
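The stage-by-stage behaviour of a commit pipeline can also be illustrated outside Jenkins. The following Python sketch is a simplified model under stated assumptions, not how Jenkins is implemented: run_pipeline and the status strings are hypothetical names, each stage is modelled as a callable that returns True on success, and stages after the first failure are reported as skipped.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Run stages in order and build a report; stop executing after a failure."""
    report = []
    failed = False
    for name, action in stages:
        if failed:
            # An earlier stage failed, so this one is not executed.
            report.append((name, "SKIPPED"))
            continue
        try:
            ok = action()
        except Exception:
            # A stage that raises is treated the same as one that returns False.
            ok = False
        if ok:
            report.append((name, "SUCCESS"))
        else:
            report.append((name, "FAILURE"))
            failed = True
    return report


if __name__ == "__main__":
    # Placeholder stages (assumptions) standing in for real build commands.
    stages = [
        ("Compile", lambda: True),
        ("Unit test", lambda: 2 + 3 == 5),
    ]
    for name, status in run_pipeline(stages):
        print(f"{name}: {status}")
```

In a real setting, a stage callable could wrap a build command, for example by checking that subprocess.run(["./gradlew", "test"]).returncode is 0.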
To summarize, we covered some aspects of the Continuous Integration pipeline, which is always the first step for Continuous Delivery. If you've enjoyed reading this post, do check out the book, Continuous Delivery with Docker and Jenkins, to learn more about how to deploy applications using Docker images and test them with Jenkins.
Packt
03 Jan 2017
6 min read

Cooking cupcakes towers

In this article by Francesco Sapio, author of the book Getting Started with Unity 2D Game Development - Second Edition, we will see how to create our towers. This is not an easy task, but at the end we will acquire a lot of scripting skills.

What a cupcake Tower does

First of all, it's useful to write down what we want to achieve and define what exactly a cupcake tower is supposed to do. The best way is to write down a list, to have a clear idea of what we are trying to achieve:

  • A cupcake tower is able to detect pandas within a certain range.
  • A cupcake tower shoots a different kind of projectile, according to its typology, against the pandas within a certain range. Furthermore, within this range, it uses a policy to decide which panda to shoot.
  • There is a reload time before the cupcake tower is able to shoot again.
  • The cupcake tower can be upgraded (into a bigger cupcake!), increasing its stats and therefore changing its appearance.

Scripting the cupcake tower

There are a lot of things to implement. Let's start by creating a new script and naming it CupcakeTowerScript. As we already mentioned for the Projectile Script, in this article we implement the main logic, but of course there is always room to improve.

Shooting to pandas

Even if we don't have enemies yet, we can already start to program the behavior of the cupcake towers to shoot at the enemies. In this article we will learn a bit about using Physics to detect objects within a range. Let's start by defining four variables. The first three are public, so we can set them in the Inspector; the last one is private, since we only need it to check how much time has elapsed. In particular, the first three variables store the parameters of our tower: the projectile prefab, its range, and its reload time.
We can write the following:

```csharp
public float rangeRadius; //Maximum distance that the Cupcake Tower can shoot
public float reloadTime; //Time before the Cupcake Tower is able to shoot again
public GameObject projectilePrefab; //Projectile type that is fired from the Cupcake Tower
private float elapsedTime; //Time elapsed from the last time the Cupcake Tower has shot
```

Now, in the Update() function we need to check if enough time has elapsed in order to shoot. This can be easily done by using an if statement. In any case, at the end, the time elapsed should be increased:

```csharp
void Update () {
    if (elapsedTime >= reloadTime) {
        //Rest of the code
    }
    elapsedTime += Time.deltaTime;
}
```

Within the if statement, we need to reset the elapsed time, so that the tower is able to shoot the next time. Then, we need to check whether there are any game objects within its range:

```csharp
if (elapsedTime >= reloadTime) {
    //Reset elapsed Time
    elapsedTime = 0;
    //Find all the gameObjects with a collider within the range of the Cupcake Tower
    Collider2D[] hitColliders = Physics2D.OverlapCircleAll(transform.position, rangeRadius);
    //Check if there is at least one gameObject found
    if (hitColliders.Length != 0) {
        //Rest of the code
    }
}
```

If there are enemies within range, we need to decide on a policy for which enemy the tower should target. There are different ways to do this and different strategies that the tower itself could choose. Here, we are going to implement one where the nearest enemy to the tower will be the one targeted. To implement this policy, we need to loop over all the game objects that we have found in range, check if they actually are enemies, and, using distances, pick the nearest one.
To achieve this, write the following code inside the previous if statement:

```csharp
if (hitColliders.Length != 0) {
    //Loop over all the gameObjects to identify the closest to the Cupcake Tower
    float min = float.MaxValue;
    int index = -1;
    for (int i = 0; i < hitColliders.Length; i++) {
        if (hitColliders[i].tag == "Enemy") {
            float distance = Vector2.Distance(hitColliders[i].transform.position, transform.position);
            if (distance < min) {
                index = i;
                min = distance;
            }
        }
    }
    //No enemy found among the colliders in range
    if (index == -1)
        return;
    //Rest of the code
}
```

Once we have the target, we need to get the direction that the tower will use to throw the projectile. So, let's write this:

```csharp
//Get the direction of the target
Transform target = hitColliders[index].transform;
Vector2 direction = (target.position - transform.position).normalized;
```

Finally, we need to instantiate a new Projectile and assign to it the direction of the enemy, as follows:

```csharp
//Create the Projectile
GameObject projectile = GameObject.Instantiate(projectilePrefab, transform.position, Quaternion.identity) as GameObject;
projectile.GetComponent<ProjectileScript>().direction = direction;
```

Instantiating game objects is usually slow and should be avoided where possible. However, for learning purposes we can live with that. And that is it for shooting at the enemies.

Upgrading the cupcake tower, making it even tastier

In order to create a function to upgrade the tower, we first need to define a variable to store the current level of the tower:

```csharp
public int upgradeLevel; //Level of the Cupcake Tower
```

Then, we need an array with all the Sprites for the different upgrades, like the following:

```csharp
public Sprite[] upgradeSprites; //Different sprites for the different levels of the Cupcake Tower
```

Finally, we can create our Upgrade function. We need to upgrade the graphics and increase the stats. Feel free to tweak these values as you prefer. However, don't forget to increase the level of the tower as well as to assign the new sprite.
At the end, you should have something like the following:

```csharp
public void Upgrade() {
    rangeRadius += 1f;
    reloadTime -= 0.5f;
    upgradeLevel++;
    GetComponent<SpriteRenderer>().sprite = upgradeSprites[upgradeLevel];
}
```

Save the script; for now, we are done with it.

A pre-cooked cupcake tower through Prefabs

As we have done with the Sprinkle, we need to do something similar for the cupcake Tower. In the Prefabs folder in the Project Panel, create a new Prefab by right clicking and then navigating to Create | Prefab. Name it SprinklesCupcakeTower. Now, drag and drop the Sprinkles_Cupcake_Tower_0 from the Graphics/towers folder (within the cupcake_tower_sheet-01 file) into the Scene View. Attach the CupcakeTowerScript to the object by navigating to Add Component | Script | CupcakeTowerScript. The Inspector should look like the following:

We need to assign the Pink_Sprinkle_Projectile_Prefab to the Projectile Prefab variable. Then, we need to assign the different Sprites for the upgrades. In particular, we can use Sprinkles_Cupcake_Tower_* (replacing the * with the level of the cupcake tower) from the same sheet as before. Don't worry too much about the other parameters of the tower, like the range radius or the reload time, since we will see how to balance the game later on. At the end, this is what we should see:

The last step is to drag this game object inside the prefab. As a result, our cupcake tower is ready.

Summary

In this article we covered the topic of creating a cupcake tower and scripting it.

Packt Editorial Staff
30 Jul 2018
11 min read

What leaders at successful agile Enterprises share in common

Adopting agile ways of working is easier said than done. Firms like Barclays, C.H.Robinson, Ericsson, Microsoft, and Spotify are considered agile enterprises and operate entrepreneurially on a large scale. Do you think the leadership of these firms has something in common? Let us take a look in this article.

The leadership of a firm has a very high bearing on the extent of Enterprise Agility which the company can achieve. Leaders are in a position to influence just about every aspect of a business, including vision, mission, strategy, structure, governance, processes, and more importantly, the culture of the enterprise and the mindset of the employees.

This article is an extract from the book Enterprise Agility, written by Sunil Mundra. In this article we’ll explore the personal traits of leaders that are critical for Enterprise Agility. Personal traits are by definition intrinsic in nature. They enable the personal development of an individual and are also enablers for certain behaviors. We explore the various personal traits in detail.

#1 Willingness to expand mental models

Essentially, a mental model is an individual's perception of reality and how something works in that reality. A mental model represents one way of approaching a situation and is a form of deeply-held belief. The critical point is that a mental model represents an individual's view, which may not necessarily be true. Leaders must also consciously let go of mental models that are no longer relevant today. This is especially important for those leaders who have spent a significant part of their career leading enterprises based on mechanistic modelling, as these models will create impediments for Agility in "living" businesses. For example, using monetary rewards as a primary motivator may work for physical work, which is repetitive in nature.
However, it does not work as a primary motivator for knowledge workers, for whom intrinsic motivators, namely, autonomy, mastery, and purpose, are generally more important than money. Examining the values and assumptions underlying a mental model can help in ascertaining the relevance of that model.

#2 Self-awareness

Self-awareness helps leaders to become cognizant of their strengths and weaknesses. This will enable the leaders to consciously focus on utilizing their strengths and leveraging the strengths of their peers and teams in areas where they are not strong. Leaders should validate their view of strengths and weaknesses by seeking feedback regularly from people that they work with.

According to a survey of senior executives by Cornell's School of Industrial and Labor Relations: "Leadership searches give short shrift to 'self-awareness,' which should actually be a top criterion. Interestingly, a high self-awareness score was the strongest predictor of overall success. This is not altogether surprising as executives who are aware of their weaknesses are often better able to hire subordinates who perform well in categories in which the leader lacks acumen. These leaders are also more able to entertain the idea that someone on their team may have an idea that is even better than their own."

Self-awareness, a mostly underrated trait, is a huge enabler for enhancing other personal traits.

#3 Creativity

Since emergence is a primary property of complexity, leaders will often be challenged to deal with unprecedented circumstances emerging from within the enterprise and also in the external environment. This implies that what may have worked in the past is less likely to work in the new circumstances, and new approaches will be needed to deal with them. Hence, the ability to think creatively, that is, "out of the box," for coming up with innovative approaches and solutions is critical.
The creativity of an individual will have its limitations, and hence leaders must harness the creativity of a broader group of people in the enterprise. A leader can be a huge enabler of this by ideating jointly with a group of people and also by facilitating discussions, challenging the status quo, and spurring the teams to suggest improvements. Leaders can also encourage innovation through experimentation.

With the fast pace of change in the external environment, and consequently the continuous evolution of businesses, leaders will often find themselves out of their comfort zone. Leaders will therefore have to get comfortable with being uncomfortable. It will be easier for leaders to think more creatively once they accept this new reality.

#4 Emotional intelligence

Emotional intelligence (EI), also known as emotional quotient (EQ), is defined by Wikipedia as "the capability of individuals to recognize their own emotions and those of others, discern between different feelings and label them appropriately, use emotional information to guide thinking and behavior, and manage and/or adjust emotions to adapt to environments or achieve one's goal/s". [iii]

EI is made up of four core skills:

  • Self-awareness
  • Social awareness
  • Self-management
  • Relationship management

The importance of EI in people-centric enterprises, especially for leaders, cannot be overstated. While people in a company may be bound by purpose and by being a part of a team, people are inherently different from each other in terms of personality types and emotions. This can have a significant bearing on how people in a business deal with and react to circumstances, especially adverse ones. Having high EI enables leaders to understand people "from the inside." This helps leaders to build better rapport with people, thereby enabling them to bring out the best in employees and support them as needed.
#5 Courage

An innovative approach to dealing with an unprecedented circumstance will, by definition, carry some risk. The hypothesis about the appropriateness of that approach can only be validated by putting it to the test against reality. Leaders will therefore need to be courageous as they take calculated, risky bets, strike hard, and own the outcome of those bets.

According to Guo Xiao, the President and CEO of ThoughtWorks, "There are many threats, and opportunities, facing businesses in this age of digital transformation: industry disruption from nimble startups, economic pressure from massive digital platforms, evolving security threats, and emerging technologies. Today's era, in which all things are possible, demands a distinct style of leadership. It calls for bold individuals who set their company's vision and charge ahead in a time of uncertainty, ambiguity, and boundless opportunity. It demands courage."

Taking risks does not mean being reckless. Rather, leaders need to take calculated risks, after giving due consideration to intuition, facts, and opinions. Despite best efforts and intentions, some decisions will inevitably go wrong. Leaders must have the courage and humility to admit that the decision went wrong and own the outcomes of that decision, and not let these failures deter them from taking risks in the future.

#6 Passion for learning

Learnability is the ability to upskill, reskill, and deskill. In today's highly dynamic era, it is not what one knows, or what skills one has, that matters as much as the ability to quickly adapt to a different skill set. It is about understanding what is needed to optimize success and what skills and abilities are necessary, from a leadership perspective, to make the enterprise as a whole successful. Leaders need to shed inhibitions about being seen as "novices" while they acquire and practice new skills.
The fact that leaders are willing to acquire new skills can be hugely impactful in terms of encouraging others in the enterprise to do the same. This is especially important in terms of bringing in and encouraging the culture of learnability across the business.

#7 Awareness of cognitive biases

Cognitive biases are flaws in thinking that can lead to suboptimal decisions. Leaders need to become aware of these biases so that they can objectively assess whether their decisions are being influenced by any biases. Cognitive biases lead to shortcuts in decision-making. Essentially, these biases are an attempt by the brain to simplify information processing. Leaders today are challenged with an overload of information and also the need to make decisions quickly. These factors can contribute to decisions and judgements being influenced by cognitive biases. Over decades, psychologists have discovered a huge number of biases. However, the following biases are the more important ones from a decision-making perspective:

Confirmation bias

This is the tendency of selectively seeking and holding onto information to reaffirm what you already believe to be true. For example, a leader believes that a recently launched product is doing well, based on the initial positive response. He has developed a bias that this product is successful. However, although the product is succeeding in attracting new customers, it is also losing existing customers. The confirmation bias is making the leader focus only on data pertaining to new customers, so he is ignoring data related to the loss of existing customers.

Bandwagon effect bias

Bandwagon effect bias, also known as "herd mentality," encourages doing something because others are doing it. The bias creates a feeling of not wanting to be left behind and hence can lead to irrational or badly-thought-through decisions.
Enterprises launching an Agile transformation initiative without understanding the implications of the long and difficult journey ahead are an example of this bias.

"Guru" bias

Guru bias leads to blindly relying on an expert's advice. This can be detrimental, as the expert could be wrong in their assessment and therefore the advice could also be wrong. Also, the expert might give advice which primarily furthers his or her interests over the interests of the enterprise.

Projection bias

Projection bias leads the person to believe that other people have understood and are aligned with their thinking, while in reality this may not be true. This bias is more prevalent in enterprises where employees are fearful of admitting that they have not understood what their "bosses" have said, of asking questions to clarify, or of expressing disagreement.

Stability bias

Stability bias, also known as "status quo" bias, leads to a belief that change will lead to unfavorable outcomes, that is, the risk of loss is greater than the possibility of benefit. It makes a person believe that stability and predictability lead to safety. For decades, the mandate for leaders was to strive for stability and hence, many older leaders are susceptible to this bias.

Leaders must encourage others in the enterprise to challenge biases, which can uncover "blind spots" arising from them. Once decisions are made, attention should be paid to information coming from feedback.

#8 Resilience

Resilience is the capacity to quickly recover from difficulties. Given the turbulent business environment, rapidly changing priorities, and the need to take calculated risks, leaders are likely to encounter difficult and challenging situations quite often. Under such circumstances, having resilience will help the leader to "take knocks on the chin" and keep moving forward.
Resilience is also about maintaining composure when something fails, analyzing the failure with the team in an objective manner, and learning from that failure. The actions of leaders are watched by the people in the enterprise even more closely in periods of crisis and difficulty, and hence leaders showing resilience goes a long way in increasing resilience across the company.

#9 Responsiveness

Responsiveness, from the perspective of leadership, is the ability to quickly grasp and respond to both challenges and opportunities. Leaders must listen to feedback coming from customers and the marketplace, learn from it, and adapt accordingly. Leaders must be ready to enable the morphing of the enterprise's offerings in order to stay relevant for customers and also to exploit opportunities. This implies that leaders must be willing to adjust the "pivot" of their offerings based on feedback. One example is the journey of Amazon Web Services, which was an internal system but has now grown into a highly successful business. Other prominent examples are Twitter, which was an offshoot of Odeo, a website focused on sound and podcasting, and PayPal's move from transferring money via PalmPilots to becoming a highly robust online payment service.

We discovered that leaders are the primary catalysts for any enterprise aspiring to enhance its Agility. Leaders need specific capabilities, which are over and above the standard leadership capabilities, in order to take the business on the path of enhanced Enterprise Agility. These capabilities comprise personal traits and behaviors that are intrinsic in nature and enable leadership Agility, which is the foundation of Enterprise Agility. Want to know more about how an enterprise can thrive in a dynamic business environment? Check out the book Enterprise Agility.

Fatema Patrawala
20 Sep 2018
8 min read

Build a Neural Network to recognize handwritten numbers in Keras and MNIST

A neural network is made up of many artificial neurons. Is it a representation of the brain or is it a mathematical representation of some knowledge? Here, we will simply try to understand how a neural network is used in practice. A convolutional neural network (CNN) is a very special kind of multi-layer neural network. A CNN is designed to recognize visual patterns directly from images with minimal processing. A graphical representation of this network is produced in the following image.

The field of neural networks was originally inspired by the goal of modeling biological neural systems, but since then it has branched in different directions and has become a matter of engineering and attaining good results in machine learning tasks. In this article we will look at the building blocks of neural networks and build a neural network in Keras that recognizes handwritten numbers from 0-9 in the MNIST dataset.

This article is an excerpt taken from the book Practical Convolutional Neural Networks, written by Mohit Sewak, Md Rezaul Karim and Pradeep Pujari and published by Packt Publishing.

An artificial neuron is a function that takes an input and produces an output. The number of neurons that are used depends on the task at hand. It could be as low as two or as many as several thousands. There are numerous ways of connecting artificial neurons together to create a CNN. One such topology that is commonly used is known as a feed-forward network:

Each neuron receives inputs from other neurons. The effect of each input line on the neuron is controlled by the weight. The weight can be positive or negative. By adjusting these weights, the entire neural network learns to perform useful computations, such as recognizing objects. Now, we can connect those neurons into a network known as a feed-forward network. This means that the neurons in each layer feed their output forward to the next layer until we get a final output.
For a single neuron with inputs $x_i$, weights $w_i$, and bias $b$, this can be written as follows:

$$y = \sigma\Big(\sum_i w_i x_i + b\Big), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}$$

The preceding forward-propagating neuron can be implemented as follows:

```python
import numpy as np
import math


class Neuron(object):
    def __init__(self):
        self.weights = np.array([1.0, 2.0])
        self.bias = 0.0

    def forward(self, inputs):
        """Assuming that inputs and weights are 1-D numpy arrays and the bias is a number."""
        a_cell_sum = np.sum(inputs * self.weights) + self.bias
        # This is the sigmoid activation function
        result = 1.0 / (1.0 + math.exp(-a_cell_sum))
        return result


neuron = Neuron()
output = neuron.forward(np.array([1, 1]))
print(output)
```

Now that we have understood the building blocks of neural networks, let us build a neural network that will recognize handwritten numbers from 0-9.

Handwritten number recognition with Keras and MNIST

A typical neural network for a digit recognizer may have 784 input pixels connected to 1,000 neurons in the hidden layer, which in turn connects to 10 output targets, one for each digit. Each layer is fully connected to the layer above. A graphical representation of this network is shown as follows, where x are the inputs, h are the hidden neurons, and y are the output class variables:

In this notebook, we will build a neural network that will recognize handwritten numbers from 0-9. The type of neural network that we are building is used in a number of real-world applications, such as recognizing phone numbers and sorting postal mail by address. To build this network, we will use the MNIST dataset.
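To get a feel for the size of the fully connected network described above, the short sketch below counts its trainable parameters. It is only an illustration in plain Python: the layer sizes (784, 1,000, 10) come from the text, while the helper name count_parameters is a hypothetical name introduced here and is not part of Keras.

```python
# Layer sizes taken from the text: 784 inputs, 1,000 hidden neurons, 10 outputs.
layer_sizes = [784, 1000, 10]


def count_parameters(sizes):
    """Count weights and biases of a fully connected feed-forward network.

    Hypothetical helper for illustration only; not a Keras function.
    """
    total = 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        total += n_in * n_out  # one weight per input-to-output connection
        total += n_out         # one bias per neuron in the receiving layer
    return total


total_parameters = count_parameters(layer_sizes)
print(total_parameters)
```

The hidden layer contributes 784 x 1,000 weights plus 1,000 biases, and the output layer 1,000 x 10 weights plus 10 biases, which is why even a small dense network already has several hundred thousand parameters.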
We will begin, as shown in the following code, by importing all the required modules, after which the data will be loaded, and then finally we will build the network:

```python
# Import Numpy, keras and MNIST data
import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import mnist
from keras.models import Sequential
# Older Keras releases exposed these as keras.layers.core and keras.utils.np_utils
from keras.layers import Dense, Dropout, Activation
from keras.utils import to_categorical
```

Retrieving training and test data

The MNIST dataset already comprises both training and test data. There are 60,000 data points of training data and 10,000 points of test data. If you do not have the data file locally at the '~/.keras/datasets/' + path, it will be downloaded to this location.

Each MNIST data point has:

  • An image of a handwritten digit
  • A corresponding label that is a number from 0-9 to help identify the image

The images will be called X and will be the input to our neural network; their corresponding labels are y. We want our labels as one-hot vectors. One-hot vectors are vectors of many zeros and a single one. It's easiest to see this in an example. As a one-hot vector, the number 0 is represented as [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], and 4 is represented as [0, 0, 0, 0, 1, 0, 0, 0, 0, 0].

Flattened data

We will use flattened data in this example, that is, a representation of MNIST images in one dimension rather than two. Thus, each 28 x 28 pixel image of a number will be represented as a 784-pixel, one-dimensional array. By flattening the data, information about the 2D structure of the image is thrown away; however, our data is simplified. As a result, all our training data can be contained in one array of shape (60,000, 784), wherein the first dimension represents the number of training images and the second depicts the number of pixels in each image.
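To make one-hot labels and flattening concrete before touching Keras, here is a minimal sketch in plain Python. The helper names one_hot and flatten and the dummy all-black image are assumptions made for illustration; the article itself relies on Keras and NumPy for these steps.

```python
def one_hot(label, num_classes=10):
    """Return a list of zeros with a single 1 at position `label` (illustrative helper)."""
    vector = [0] * num_classes
    vector[label] = 1
    return vector


def flatten(image):
    """Turn a 2D image (list of rows) into a 1D list of pixels, row by row (illustrative helper)."""
    return [pixel for row in image for pixel in row]


# A dummy 28 x 28 image: all black except one bright pixel at row 14, column 14.
image = [[0] * 28 for _ in range(28)]
image[14][14] = 255

flat = flatten(image)
print(one_hot(0))
print(one_hot(4))
print(len(flat))
```

After flattening, the pixel at row r and column c ends up at index r * 28 + c, so neighbouring rows are no longer adjacent in the array. This is the 2D structure that the text says is lost.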
Flattened data of this kind is easy to analyze using a simple neural network. First, we retrieve the training and test data:

```python
# Retrieving the training and test data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print('X_train shape:', X_train.shape)
print('X_test shape: ', X_test.shape)
print('y_train shape:', y_train.shape)
print('y_test shape: ', y_test.shape)
```

Visualizing the training data

The following function will help you visualize the MNIST data. By passing in the index of a training example, the display_digit function will display that training image along with its corresponding label in the title:

```python
# Visualize the data
import matplotlib.pyplot as plt
# Jupyter notebook magic; remove this line when running as a plain script
%matplotlib inline

# Displaying a training image by its index in the MNIST set
def display_digit(index):
    # Labels are still plain integers here (one-hot encoding happens later)
    label = y_train[index]
    image = X_train[index]
    plt.title('Training data, index: %d,  Label: %d' % (index, label))
    plt.imshow(image, cmap='gray_r')
    plt.show()

# Displaying the first (index 0) training image
display_digit(0)
```

Next, we flatten the images and scale the pixel values to the range 0-1:

```python
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print("Train the matrix shape", X_train.shape)
print("Test the matrix shape", X_test.shape)
```

```python
# One Hot encoding of labels.
# Older Keras releases exposed this as keras.utils.np_utils.to_categorical
from keras.utils import to_categorical
print(y_train.shape)
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
print(y_train.shape)
```

Building the network

For this example, you'll define the following:

  • The input layer, which tells the network the number of inputs to expect for each piece of MNIST data
  • Hidden layers, as they recognize patterns in data and also connect the input layer to the output layer
  • The output layer, as it defines how the network learns and gives a label as the output for a given image, as follows:

```python
# Defining the neural network
def build_model():
    model = Sequential()
    model.add(Dense(512, input_shape=(784,)))
    # An "activation" is just a non-linear function that is applied to the output
    # of the above layer. In this case, with a "rectified linear unit",
    # we clamp all values below 0 to 0.
    model.add(Activation('relu'))
    # Dropout helps protect the model from memorizing or "overfitting" the training data
    model.add(Dropout(0.2))
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dropout(0.2))
    model.add(Dense(10))
    # The "softmax" activation ensures that the output is a valid probability
    # distribution, meaning that the values obtained are all non-negative and sum up to 1.
    model.add(Activation('softmax'))
    return model

# Building the model
model = build_model()
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
```

Training the network

Now that we've constructed the network, we feed it with data and train it, as follows:

```python
# Training
# `epochs` is the Keras 2+ name of this argument; Keras 1.x called it `nb_epoch`
model.fit(X_train, y_train, batch_size=128, epochs=4, verbose=1, validation_data=(X_test, y_test))
```

Testing

After you're satisfied with the training output and accuracy, you can run the network on the test dataset to measure its performance! A good result will obtain an accuracy higher than 95%. Some simple models have been known to achieve even up to 99.7% accuracy! We can test the model, as shown here:

```python
# Comparing the labels predicted by our model with the actual labels
score = model.evaluate(X_test, y_test, batch_size=32, verbose=1, sample_weight=None)
# Printing the result
print('Test score:', score[0])
print('Test accuracy:', score[1])
```

To summarize, we got to know the building blocks of neural networks, and we successfully built a neural network in Keras that recognizes handwritten numbers using the MNIST dataset. To implement award-winning and cutting-edge CNN architectures, check out this one-stop guide published by Packt, Practical Convolutional Neural Networks.
Kunal Chaudhari
10 May 2018
7 min read

Anatomy of an automated machine learning algorithm (AutoML)

Machine learning has always been dependent on the selection of the right features within a given model, and even on the selection of the right algorithm. But deep learning changed this. The selection process is now built into the models themselves. Researchers and engineers are now shifting their focus from feature engineering to network engineering. Out of this, AutoML, or meta learning, has become an increasingly important part of deep learning.

AutoML is an emerging research topic which aims at auto-selecting the most efficient neural network for a given learning task. In other words, AutoML represents a set of methodologies for learning how to learn efficiently. Consider for instance the tasks of machine translation, image recognition, or game playing. Typically, the models are manually designed by a team of engineers, data scientists, and domain experts. If you consider that a typical 10-layer network can have ~10^10 candidate networks, you understand how expensive, error-prone, and ultimately sub-optimal the process can be.

This article is an excerpt from a book written by Antonio Gulli and Amita Kapoor titled TensorFlow 1.x Deep Learning Cookbook. This book is an easy-to-follow guide that lets you explore reinforcement learning, GANs, autoencoders, multilayer perceptrons and more.

AutoML with recurrent networks and with reinforcement learning

The key idea to tackle this problem is to have a controller network which, given a particular network as input, proposes a child model architecture with probability p. The child is trained and evaluated for the particular task to be solved (say, for instance, that the child gets accuracy R). This evaluation R is passed back to the controller which, in turn, uses R to improve the next candidate architecture. Given this framework, it is possible to model the feedback from the candidate child to the controller as the task of computing the gradient of p and then scaling this gradient by R.
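The feedback loop described above can be sketched as a tiny policy-gradient (REINFORCE-style) update. This is a toy illustration under loud assumptions, not the method from the paper: the "controller" is just a softmax over three candidate architectures instead of an RNN, the table CHILD_ACCURACY stands in for actually training each child, and the update scales the gradient of log p (the usual REINFORCE form) by the reward R. All names here are hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert controller scores into sampling probabilities p."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, action, reward, learning_rate=0.1):
    """One policy-gradient update: scale the gradient of log p(action) by the reward."""
    probs = softmax(logits)
    # For a softmax policy, d log p(action) / d logit_i = 1[i == action] - p_i
    return [
        logit + learning_rate * reward * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


# Hypothetical validation accuracies R of three candidate child architectures.
# In real AutoML, each value would come from training and evaluating a child network.
CHILD_ACCURACY = [0.60, 0.75, 0.90]


def train_controller(steps=3000, seed=0):
    """Repeatedly sample a child, observe its reward, and update the controller."""
    rng = random.Random(seed)
    logits = [0.0] * len(CHILD_ACCURACY)
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = CHILD_ACCURACY[action]
        logits = reinforce_step(logits, action, reward)
    return softmax(logits)


final_probs = train_controller()
print(final_probs)
```

Each update raises the probability of the sampled architecture in proportion to its reward, so architectures with higher accuracy are reinforced more strongly over many iterations. Practical systems typically also subtract a baseline from R to reduce variance, which this sketch omits for brevity.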
The controller can be implemented as a Recurrent Neural Network (see the following figure). In doing so, the controller will tend, iteration after iteration, to favor candidate areas of the architecture space that achieve better R and to assign a lower probability to candidate areas that do not score so well. For instance, a controller recurrent neural network can sample a convolutional network. The controller can predict many hyper-parameters such as filter height, filter width, stride height, stride width, and the number of filters for one layer, and then repeat for the next layer. Every prediction can be carried out by a softmax classifier and then fed into the next RNN time step as input. This is well expressed by the following images taken from Neural Architecture Search with Reinforcement Learning, Barret Zoph, Quoc V. Le:

Predicting hyperparameters is not enough, as it would be optimal to define a set of actions to create new layers in the network. This is particularly difficult because the reward function that describes the new layers is most likely not differentiable, which makes it impossible to optimize with standard techniques such as SGD. The solution comes from reinforcement learning: it consists of adopting a policy gradient network.

Besides that, parallelism can be used for optimizing the parameters of the controller RNN. Quoc Le and Barret Zoph proposed a parameter-server scheme with a parameter server of S shards that store the shared parameters for K controller replicas. Each controller replica samples m different child architectures that are trained in parallel, as illustrated in the following images, taken from Neural Architecture Search with Reinforcement Learning, Barret Zoph, Quoc V. Le:

Quoc and Barret applied AutoML techniques for Neural Architecture Search to the Penn Treebank dataset, a well-known benchmark for language modeling. Their results improve on the manually designed networks currently considered the state of the art.
In particular, they achieve a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model. Similarly, on the CIFAR-10 dataset, starting from scratch, the method can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. The proposed CIFAR-10 model achieves a test error rate of 3.65 percent, which is 0.09 percent better and 1.05x faster than the previous state-of-the-art model that used a similar architectural scheme.

Meta-learning blocks

In Learning Transferable Architectures for Scalable Image Recognition (2017), Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V. Le propose to learn an architectural building block on a small dataset that can be transferred to a large dataset. The authors propose to search for the best convolutional layer (or cell) on the CIFAR-10 dataset and then apply this learned cell to the ImageNet dataset by stacking together more copies of this cell, each with its own parameters. Precisely, in this approach all convolutional networks are made of convolutional layers (or cells) with identical structures but different weights. Searching for the best convolutional architectures is therefore reduced to searching for the best cell structures, which is faster and more likely to generalize to other problems. Although the cell is not learned directly on ImageNet, an architecture constructed from the best learned cell achieves, among the published work, state-of-the-art accuracy of 82.7 percent top-1 and 96.2 percent top-5 on ImageNet. The model is 1.2 percent better in top-1 accuracy than the best human-invented architectures while having 9 billion fewer FLOPS, a reduction of 28% from the previous state-of-the-art model. It is also important to notice that the model learned with RNN+RL (Recurrent Neural Networks + Reinforcement Learning) beats the baseline represented by Random Search (RS), as shown in the figure taken from the paper.
In the mean performance of the top-5 and top-25 models identified by RL versus RS, RL always wins:

AutoML and learning new tasks

Meta-learning systems can be trained to achieve a large number of tasks and are then tested for their ability to learn new tasks. A famous example of this kind of meta-learning is transfer learning, where networks can successfully learn new image-based tasks from relatively small datasets. However, there is no analogous pre-training scheme for non-vision domains such as speech, language, and text. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, Chelsea Finn, Pieter Abbeel, Sergey Levine, 2017, proposes a model-agnostic approach named MAML, compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. The meta-learner aims at finding an initialization that adapts to various problems quickly (in a small number of steps) and efficiently (using only a few examples).

The model is represented by a parametrized function $f_\theta$ with parameters $\theta$. When adapting to a new task $T_i$, the model's parameters $\theta$ become $\theta_i'$. In MAML, the updated parameter vector $\theta_i'$ is computed using one or more gradient descent updates on task $T_i$. For example, when using one gradient update,

$$\theta_i' = \theta - \alpha \nabla_\theta \mathcal{L}_{T_i}(f_\theta)$$

where $\mathcal{L}_{T_i}$ is the loss function for the task $T_i$ and $\alpha$ is a meta-learning parameter (the step size). The MAML algorithm is reported in this figure:

MAML was able to substantially outperform a number of existing approaches on popular few-shot image classification benchmarks. Few-shot image classification is quite a challenging problem that aims at learning new concepts from one or a few instances of that concept.
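To make the one-gradient-update rule given earlier for MAML concrete, here is a minimal sketch on a deliberately tiny problem. Everything in it is an assumption chosen for illustration: the model is a single scalar theta, each task T_i has the quadratic loss (theta - target_i)^2, the outer step size beta is our own name, and the meta-gradient is written out by hand with the chain rule instead of using an automatic differentiation library.

```python
def inner_update(theta, target, alpha):
    """One gradient step on the task loss L_T(theta) = (theta - target) ** 2."""
    grad = 2.0 * (theta - target)
    return theta - alpha * grad


def meta_gradient(theta, target, alpha):
    """Gradient of L_T(theta') with respect to theta, where theta' = inner_update(theta)."""
    theta_adapted = inner_update(theta, target, alpha)
    # Chain rule: dL/dtheta' times dtheta'/dtheta, with dtheta'/dtheta = 1 - 2 * alpha.
    return 2.0 * (theta_adapted - target) * (1.0 - 2.0 * alpha)


def maml(task_targets, alpha=0.1, beta=0.05, meta_steps=500):
    """Find an initialization theta that does well after one inner step on every task."""
    theta = 0.0
    for _ in range(meta_steps):
        grads = [meta_gradient(theta, t, alpha) for t in task_targets]
        theta -= beta * sum(grads) / len(grads)  # outer (meta) update
    return theta


if __name__ == "__main__":
    print(maml([1.0, 2.0, 3.0, 6.0]))
```

For these symmetric quadratic losses the learned initialization settles at the mean of the task targets, the point from which a single inner step helps every task equally.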
As an example of few-shot learning, Human-level concept learning through probabilistic program induction, Brenden M. Lake, Ruslan Salakhutdinov, Joshua B. Tenenbaum, 2015, suggested that humans can learn to identify novel two-wheel vehicles from a single picture such as the one contained in the box as follows:

If you enjoyed this excerpt, check out the book TensorFlow 1.x Deep Learning Cookbook to skill up and implement tricky neural networks using Google's TensorFlow 1.x.

AmoebaNets: Google's new evolutionary AutoML
AutoML: Developments and where is it heading to
What is Automated Machine Learning (AutoML)?

Fatema Patrawala
14 Aug 2018
9 min read

MongoDB Sharding: Sharding clusters and choosing the right shard key [Tutorial]

Sharding is one of the features that MongoDB has offered from an early stage, since version 1.6 was released in August 2010. Sharding is the ability to horizontally scale out our database by partitioning our datasets across different servers, the shards. Foursquare and Bitly are two of the most famous early customers of MongoDB that used sharding from its inception all the way to the general availability release. In this article we will learn how to design a sharded cluster and how to make the single most important decision around it: choosing the shard key.

This article is a MongoDB shard tutorial taken from the book Mastering MongoDB 3.x by Alex Giamas.

Sharding setup in MongoDB

Sharding is performed at the collection level. We may have collections that we don't want or need to shard, for several reasons, and we can leave these collections unsharded. They will be stored in the primary shard. Each database in MongoDB has its own primary shard, which MongoDB selects automatically when we create a new database in a sharded environment: it picks the shard that has the least data stored at the moment of creation. If we want to change the primary shard at any other point, we can issue the following command:

> db.runCommand( { movePrimary : "mongo_books", to : "UK_based" } )

We thus move the database named mongo_books to the shard named UK_based.

Choosing the shard key

Choosing our shard key is the most important decision we need to make. The reason is that once we shard our data and deploy our cluster, it becomes very difficult to change the shard key. First, we will go through the process of changing the shard key.

Changing the shard key

There is no command or simple procedure to change the shard key in MongoDB. The only way to change the shard key involves backing up and restoring all of our data, something that may range from being extremely difficult to impossible in high-load production environments.
If we want to change our shard key, the steps are as follows:

  1. Export all data from MongoDB.
  2. Drop the original sharded collection.
  3. Configure sharding with the new key.
  4. Presplit the new shard key range.
  5. Restore our data back into MongoDB.

Of these steps, step 4 is the one that needs some more explanation. MongoDB uses chunks to split data in a sharded collection. If we bootstrap a MongoDB sharded cluster from scratch, chunks will be calculated automatically by MongoDB. MongoDB will then distribute the chunks across different shards to ensure that there is an equal number of chunks in each shard. The only case in which we cannot really rely on this is when we want to load data into a newly sharded collection. The reasons are threefold:

  • MongoDB creates splits only after an insert operation.
  • Chunk migration will copy all of the data in that chunk from one shard to another.
  • Only floor(n/2) chunk migrations can happen at any given time, where n is the number of shards we have. Even with three shards, this is only floor(1.5)=1 chunk migration at a time.

These three limitations combined mean that letting MongoDB figure it out on its own will definitely take much longer and may result in an eventual failure. This is why we want to presplit our data and give MongoDB some guidance on where our chunks should go. Considering our example of the mongo_books database and the books collection, this would be:

> db.runCommand( { split : "mongo_books.books", middle : { id : 50 } } )

The middle command parameter will split our key space into documents that have id<50 and documents that have id>=50, because a chunk includes its lower bound and excludes its upper bound. There is no need for a document to exist in our collection with id=50, as this will only serve as the guidance value for our partitions. In this example, we chose 50 assuming that our keys follow a uniform distribution (that is, the same count of keys for each value) in the range of values from 0 to 100.
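The single split at 50 generalizes to several evenly spaced split points when the keys are assumed to be uniformly distributed. The helper names below (presplit_points, split_commands, concurrent_migrations) are hypothetical and written only for this illustration. The sketch just builds command documents shaped like the split command above and does not connect to a server; how they would be sent through a driver is left out.

```python
def presplit_points(low, high, chunks):
    """Return chunks - 1 evenly spaced split values strictly between low and high."""
    if chunks < 2:
        return []
    width = (high - low) / chunks
    return [low + width * i for i in range(1, chunks)]


def split_commands(namespace, key, points):
    """Build one split command document per split value (nothing is sent to a server)."""
    return [{"split": namespace, "middle": {key: point}} for point in points]


def concurrent_migrations(shards):
    """Upper bound on simultaneous chunk migrations quoted in the text: floor(n/2)."""
    return shards // 2


if __name__ == "__main__":
    points = presplit_points(0, 100, 4)
    for command in split_commands("mongo_books.books", "id", points):
        print(command)
    print(concurrent_migrations(3))
```

Even spacing only makes sense under the uniform-distribution assumption stated in the text; for skewed keys the split values would have to follow the actual distribution of the data.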
We should aim to create at least 20-30 chunks to grant MongoDB flexibility in potential migrations. We can also use bounds and find instead of middle if we want to manually define the partition key, but both parameters need data to exist in our collection before applying them.

Choosing the correct shard key

After the previous section, it should be evident that we need to consider the choice of our shard key very carefully, as it is something that we have to stick with. A great shard key has three characteristics:

  • High cardinality
  • Low frequency
  • Non-monotonically changing in value

We will go over the definitions of these three properties first to understand what they mean.

High cardinality means that the shard key must have as many distinct values as possible. A Boolean can take only values of true/false, and so it is a bad shard key choice. A 64-bit long value field that can take any value from −(2^63) to 2^63 − 1 is a good example in terms of cardinality.

Low frequency directly relates to the argument about high cardinality. A low-frequency shard key will have a distribution of values as close as possible to a perfectly uniform distribution. Using the example of our 64-bit long value, it is of little use to us if we have a field that can take values ranging from −(2^63) to 2^63 − 1 only to end up observing the values of 0 and 1 all the time. In fact, it is as bad as using a Boolean field, which can also take only two values after all. If we have a shard key with high-frequency values, we will end up with chunks that are indivisible. These chunks cannot be further divided and will grow in size, negatively affecting the performance of the shard that contains them.

Non-monotonically changing values mean that our shard key should not be, for example, an integer that always increases with every new insert.
If we choose a monotonically increasing value as our shard key, this will result in all writes ending up in the last of all of our shards, limiting our write performance. If we want to use a monotonically changing value as the shard key, we should consider using hash-based sharding. In the next section, we will describe different sharding strategies and their advantages and disadvantages.

Range-based sharding

The default and the most widely used sharding strategy is range-based sharding. This strategy will split our collection's data into chunks, grouping documents with nearby values in the same shard. For our example database and collection, mongo_books and books respectively, we have:

> sh.shardCollection("mongo_books.books", { id: 1 } )

This creates a range-based shard key on id with ascending direction. The direction of our shard key will determine which documents will end up in the first shard and which ones in the subsequent ones. This is a good strategy if we plan to have range-based queries, as these will be directed to the shard that holds the result set instead of having to query all shards.

Hash-based sharding

If we don't have a shard key (or can't create one) that achieves the three goals mentioned previously, we can use the alternative strategy of hash-based sharding. In this case, we are trading query isolation for better data distribution. Hash-based sharding will take the values of our shard key and hash them in a way that guarantees close to uniform distribution. This way we can be sure that our data will be evenly distributed across shards. The downside is that only exact match queries will get routed to the exact shard that holds the value. Any range query will have to go out and fetch data from all shards. For our example database and collection (mongo_books and books respectively), we have:

> sh.shardCollection("mongo_books.books", { id: "hashed" } )

Similar to the preceding example, we are now using the id field as our hashed shard key.
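The difference between the two strategies for a monotonically increasing key can be seen in a small simulation. This is a simplified model, not MongoDB's routing code: range_shard and hashed_shard are hypothetical helpers, chunks are mapped one-to-one onto shards, and MD5 is used only as a convenient stand-in for whatever hash function the server applies.

```python
import bisect
import hashlib
from collections import Counter


def range_shard(key, boundaries):
    """Index of the range chunk holding key; boundaries are sorted split values."""
    return bisect.bisect_right(boundaries, key)


def hashed_shard(key, shard_count):
    """Route by a hash of the key (MD5 is a stand-in, not MongoDB's exact scheme)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def write_distribution(keys, route):
    """Count how many writes each shard receives for the given routing function."""
    return Counter(route(key) for key in keys)


if __name__ == "__main__":
    boundaries = [250, 500, 750]   # chunks chosen when ids ranged from 0 to 999
    new_ids = range(1000, 2000)    # later inserts with ever-increasing ids
    print(write_distribution(new_ids, lambda k: range_shard(k, boundaries)))
    print(write_distribution(new_ids, lambda k: hashed_shard(k, 4)))
```

With range-based routing every new id is larger than the last boundary, so all writes land in the final chunk, whereas the hashed routing spreads the same ids over all four shards at the price of losing range-query locality.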
Suppose we use fields with float values for hash-based sharding. Then we will end up with collisions if the precision of our floats is more than 2^53. These fields should be avoided where possible.

Coming up with our own key

Range-based sharding does not need to be confined to a single key. In fact, in most cases, we would like to combine multiple keys to achieve high cardinality and low frequency. A common pattern is to combine a low-cardinality first part (which should still have more than twice as many distinct values as the number of shards we have) with a high-cardinality key as its second field. This achieves both read and write distribution from the first part of the sharding key and then cardinality and read locality from the second part. On the other hand, if we don't have range queries, we can get away with using hash-based sharding on a primary key, as this will exactly target the shard and document that we are going after.

To make things more complicated, these considerations may change depending on our workload. A workload that consists almost exclusively (say 99.5%) of reads won't care about write distribution. We can use the built-in _id field as our shard key and this will only add 0.5% load to the last shard. Our reads will still be distributed across shards. Unfortunately, in most cases, this is not so simple.

Location-based data

Due to government regulations and the desire to have our data as close to our users as possible, there is often a need to keep data in a specific data center. By placing different shards in different data centers, we can satisfy this requirement.

To summarize, we learned about MongoDB sharding and got to know techniques to choose the correct shard key. Get the expert guide Mastering MongoDB 3.x today to build fault-tolerant MongoDB applications.
MongoDB 4.0 now generally available with support for multi-platform, mobile, ACID transactions and more
MongoDB going relational with 4.0 release
Indexing, Replicating, and Sharding in MongoDB [Tutorial]