Sewable LEDs in Clothing

Packt
14 Aug 2017
16 min read
In this article by Jonathan Witts, author of the book Wearable Projects with Raspberry Pi Zero, we will use sewable LEDs and conductive thread to transform an item of clothing into a sparkling piece of wearable tech, controlled by a Pi Zero hidden in the clothing. We will incorporate a Pi Zero and battery into a hidden pocket in the garment and connect our sewable LEDs to the Pi's GPIO pins so that we can write Python code to control the order and timing of the LEDs.

To deactivate the software running automatically, connect to your Pi Zero over SSH and issue the following command:

    sudo systemctl disable scrollBadge.service

Once that command completes, you can shut down your Pi Zero by pressing your off-switch for three seconds. Now let's look at what we are going to cover in this article.

What We Will Cover

In this article we will cover the following topics:

- What we will need to complete the project in this article
- How we will modify our item of clothing
- Writing a Python program to control the electronics in our modified garment
- Making our program run automatically

Let's jump straight in and look at what parts we will need to complete the project.

Bill of parts

We will make use of the following things in this project:

- A Pi Zero W
- An official Pi Zero case
- A portable battery pack
- An item of clothing to modify, e.g. a top or t-shirt
- 10 sewable LEDs
- Conductive thread
- Some fabric the same color as the clothing
- Thread the same color as the clothing
- A sewing needle
- Pins
- 6 metal poppers
- Some black and yellow cable
- Solder and a soldering iron

Modifying our item of clothing

So let's take a look at what we need to do to modify our item of clothing to accommodate our Pi Zero, battery pack, and sewable LEDs. We will start by creating a hidden pocket for the Pi Zero and the batteries, followed by how we will sew our LEDs into the top and design our sewable circuit.
We will then solve the problem of connecting our conductive thread back to the GPIO holes on our Pi Zero.

Our hidden pocket

You will need a piece of fabric large enough to house your Pi Zero case and battery pack alongside each other, with enough spare to hem the pocket all the way round. If you have access to a sewing machine, this section of the project will be much quicker; otherwise you will need to do the stitching by hand. The piece of fabric I am using is 18 x 22 cm, allowing for a 1 cm hem all around.

Fold and pin the 1 cm hem, then either stitch by hand or run it through a sewing machine to secure your hem. When you have finished, remove the pins. Turn your garment inside out and decide where you are going to position your hidden pocket. As I am using a t-shirt for my garment, I am going to position my pocket just inside the bottom of the garment, wrapping around the front and back so that it sits across the wearer's hip. Pin your pocket in place and then stitch it along the bottom, left, and right sides, leaving the top open. Make sure that you stitch this nice and firmly, as it has to hold the weight of your battery pack. The picture below shows my pocket after being stitched in place. When you have finished, you can remove the pins and turn your garment the correct way round again.

Adding our sewable LEDs

We are now going to plan our circuit of sewable LEDs and add them to the garment. I am going to run a line of 10 LEDs around the bottom of my top. These will be wired up in pairs so that we can control a pair of LEDs at any one time with our Pi Zero. You can position your LEDs however you want, but it is important that the circuits of conductive thread do not cross one another. Turn your garment inside out again and mark with a washable pen where you want your LEDs to be sewn.
As the fabric of my garment is quite light, I am going to stitch the LEDs inside and let them shine through the material. If your material is heavier, you will have to make a small cut where each LED will sit and then button-hole the cut so the LED can push through.

Start with the LED furthest from your hidden pocket. Once you have the LED in position, take a length of your conductive thread and over-sew the negative hole of the LED to your garment, keeping your stitches as small and as neat as possible to minimize how much they can be seen from the front of the garment. You now need to stitch small running stitches from the first LED to the second; make sure that you use the same piece of conductive thread and do not cut it! When you reach the position of the second LED, again over-sew its negative hole, ensuring that the LED is face down so that it shows through the fabric. As I am stitching my LEDs quite close to the hem of my t-shirt, I have made use of the hem to run the conductive thread in when connecting the negative holes of the LEDs, as shown in the following image.

Continue connecting each negative point of your LEDs with the single length of conductive thread. Your final LED will be the one closest to your hidden pocket; continue with a running stitch until you are on the hidden pocket itself. Now take the male half of one of your metal poppers and over-sew it in place through one of its holes. You can now cut the conductive thread, as we have completed the common ground circuit for all the LEDs. Stitch the three remaining holes of the popper with standard thread, as shown in the picture.

When cutting the conductive thread, be sure to leave a very short end. If two pieces of thread were to touch while we were powering the LEDs, we could cause a short circuit.

We now need to sew our positive connections to the garment. Start with a new length of conductive thread and attach it to the LED second closest to your hidden pocket.
Again, over-sew it to the fabric, keeping your stitches as small as possible so they are not very visible from the front of the garment. Now that you have secured this LED, sew a small running stitch to the positive connection of the LED closest to the hidden pocket. After securing that LED, stitch a running stitch so that it stops alongside the popper you previously secured to your pocket, and this time attach a female half of a metal popper in the same way as before, as shown in the picture.

Secure your remaining eight LEDs in the same fashion, working in pairs away from the pocket, so that you are left with one male metal popper and five female metal poppers in a line on your hidden pocket. Ensure that the six different conductive threads do not cross at any point, that the six poppers do not touch, and that you have the positive and negative connections the right way round! The picture below shows the completed circuit stitched into the t-shirt, terminating at the poppers on the pocket.

Connecting our Pi Zero

Now that we have our electrical circuit, we need to find a way to attach our Pi Zero to each pair of LEDs and to the common ground we have sewn into our garment. We are going to make use of the poppers we stitched onto our hidden pocket. You have probably noticed that the only piece of conductive thread we attached a male popper to was the common ground thread for our LEDs. This is so that, when we construct our method of attaching the Pi Zero GPIO pins to the conductive thread, it will be impossible to connect the positive and negative threads the wrong way round! Another reason for using the poppers to attach our Pi Zero to the conductive thread is that the LEDs and thread I am using are both rated as OK for hand washing; your Pi Zero is not!

Take your remaining female popper and solder a length of black cable to it; about two and a half times the height of your hidden pocket should do the job.
You can feed the cable through one of the holes in the popper, as shown in the picture, to ensure you get a good connection. For the five remaining male poppers, solder the same length of yellow cable to each popper. The picture below shows two of my soldered poppers.

Now connect all of your poppers to their other halves on your garment and carefully bend the cables so that they all run in the same direction, up towards the top of your hidden pocket. Trim all the cables to the same length and then mark the top and bottom of each yellow cable with a permanent marker so that you know which cable is attached to which pair of LEDs. I am marking the bottom yellow cable as number 1 and the top as number 5. We can now cut a length of heat shrink and cover the loose lengths of cable, leaving about 4 cm free to strip, tin, and solder onto the Pi. You can now heat the heat shrink with a hair dryer to shrink it around your cables.

We are now going to stitch together a small piece of fabric to attach the poppers to. We want a piece of fabric large enough to sew all six poppers to and to stitch the uncovered cables down to. This will be used to detach the Pi Zero from our garment when we need to wash it, or just to remove the Pi Zero for another project. To strengthen the fabric, I am doubling it over and hemming one long and one short side to make a pocket; this can then be turned inside out and the remaining short side stitched over. You now need to position the poppers on your piece of fabric so that they are aligned with the poppers you have sewn into your garment. Once you are happy with their placement, stitch them to the piece of fabric using standard thread, ensuring that they are really firmly attached. If you like, you can also put a few stitches over each cable to ensure it stays in place. Using a fine permanent marker, number both ends of the five yellow cables 1 through 5 so that you can identify each cable.
Now push all six cables through about 12 cm of heat shrink and apply some heat from a hair dryer until it shrinks around your cables. You can then strip and tin the ends of the six cables ready to solder them to your Pi Zero. Insert the black cable into the ground hole below GPIO 11 on your Pi Zero, then insert the five yellow cables, sequentially 1 through 5, into the GPIO holes 11, 9, 10, 7, and 8 as shown in the diagram, again from the rear of the Pi Zero. When you are happy that all the cables are in the correct place, solder them to your Pi Zero and clip off any extra length from the front of the Pi Zero with wire snips. You should now be able to connect your Pi Zero to your LEDs by pressing all six poppers together.

To ensure that the wearer's body does not cause a short circuit with the conductive thread on the inside of the garment, you may want to take another piece of fabric and stitch it over all of the conductive thread lines. I would recommend doing this after you have tested all your LEDs with the program in the next section. We have now carried out all the modifications needed to our garment, so let's move on to writing our Python program to control our LEDs.

Writing Our Python Program

To start with, we will write a simple, short piece of Python just to check that all ten of our LEDs are working and that we know which GPIO pin controls which pair of LEDs. Power on your Pi Zero and connect to it via SSH.

Testing Our LEDs

To check that our LEDs are all correctly wired up and that we can control them using Python, we will write this short program to test them.
First move into our project directory by typing:

    cd WearableTech

Now make a new directory for this article by typing:

    mkdir Chapter3

Now move into our new directory:

    cd Chapter3

Next we create our test program, by typing:

    nano testLED.py

Then we enter the following code into Nano:

    #!/usr/bin/python3
    from gpiozero import LED
    from time import sleep

    pair1 = LED(11)
    pair2 = LED(9)
    pair3 = LED(10)
    pair4 = LED(7)
    pair5 = LED(8)

    for i in range(4):
        pair1.on()
        sleep(2)
        pair1.off()
        pair2.on()
        sleep(2)
        pair2.off()
        pair3.on()
        sleep(2)
        pair3.off()
        pair4.on()
        sleep(2)
        pair4.off()
        pair5.on()
        sleep(2)
        pair5.off()

Press Ctrl + O followed by Enter to save the file, and then Ctrl + X to exit Nano. We can then run our file by typing:

    python3 ./testLED.py

All being well, we should see each pair of LEDs light up for two seconds in turn, and the whole loop should repeat four times.

Our final LED program

We will now write the Python program that will control the LEDs in our t-shirt. This will be the program we configure to run automatically when we power up our Pi Zero. So let's begin.
Create a new file by typing:

    nano tShirtLED.py

Now type the following Python program into Nano:

    #!/usr/bin/python3
    from gpiozero import LEDBoard
    from time import sleep
    from random import randint

    leds = LEDBoard(11, 9, 10, 7, 8)

    while True:
        for i in range(5):
            wait = randint(5, 10) / 10
            leds.on()
            sleep(wait)
            leds.off()
            sleep(wait)
        for i in range(5):
            wait = randint(5, 10) / 10
            leds.value = (1, 0, 1, 0, 1)
            sleep(wait)
            leds.value = (0, 1, 0, 1, 0)
            sleep(wait)
        for i in range(5):
            wait = randint(1, 5) / 10
            leds.value = (1, 0, 0, 0, 0)
            sleep(wait)
            leds.value = (1, 1, 0, 0, 0)
            sleep(wait)
            leds.value = (1, 1, 1, 0, 0)
            sleep(wait)
            leds.value = (1, 1, 1, 1, 0)
            sleep(wait)
            leds.value = (1, 1, 1, 1, 1)
            sleep(wait)
            leds.value = (1, 1, 1, 1, 0)
            sleep(wait)
            leds.value = (1, 1, 1, 0, 0)
            sleep(wait)
            leds.value = (1, 1, 0, 0, 0)
            sleep(wait)
            leds.value = (1, 0, 0, 0, 0)
            sleep(wait)

Now save your file by pressing Ctrl + O followed by Enter, then exit Nano by pressing Ctrl + X. Test that your program is working correctly by typing:

    python3 ./tShirtLED.py

If any errors are displayed, go back and check your program in Nano. Once your program has gone through the three different display patterns, press Ctrl + C to stop it running.

We have introduced a number of new things in this program. First, we imported a new GPIO Zero library item called LEDBoard. LEDBoard lets us define a list of GPIO pins with LEDs attached and perform actions on the whole list, rather than having to operate on the LEDs one at a time. It also lets us assign a value to the LEDBoard object indicating whether to turn each individual member of the board on or off. We also imported randint from the random library; randint returns a random integer between the start and stop values we pass it. We then define three different loop patterns and set each of them inside a for loop that repeats five times.
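Before moving on, it is worth noticing that the nine leds.value tuples in the third loop follow a simple "fill up, then drain" shape. Here is a small sketch that generates those frames programmatically (the function name chase_frames is my own, not from the book, and gpiozero is not needed to run it):

```python
def chase_frames(n=5):
    """Yield on/off tuples for an n-LED 'fill up, then drain' chase.

    The frame count ramps 1..n and back down to 1, reproducing the
    nine hand-written leds.value tuples in the third display loop.
    """
    counts = list(range(1, n + 1)) + list(range(n - 1, 0, -1))
    for k in counts:
        # Light the first k LEDs, leave the rest off.
        yield tuple(1 if i < k else 0 for i in range(n))

frames = list(chase_frames(5))
```

Each generated tuple could be assigned to leds.value in turn inside the sleep loop, replacing the hand-written sequence.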
Making our program start automatically

We now need to make our tShirtLED.py program run automatically when we switch our Pi Zero on.

First we must make the Python program we just wrote executable:

    chmod +x ./tShirtLED.py

Now we will create our service definition file:

    sudo nano /lib/systemd/system/tShirtLED.service

Now type the definition into it:

    [Unit]
    Description=tShirt LED Service
    After=multi-user.target

    [Service]
    Type=idle
    ExecStart=/home/pi/WearableTech/Chapter3/tShirtLED.py

    [Install]
    WantedBy=multi-user.target

Save and exit Nano by typing Ctrl + O followed by Enter and then Ctrl + X. Now change the file permissions, reload the systemd daemon, and enable our service by typing:

    sudo chmod 644 /lib/systemd/system/tShirtLED.service
    sudo systemctl daemon-reload
    sudo systemctl enable tShirtLED.service

Now we need to test whether this is working, so reboot your Pi by typing sudo reboot. When your Pi Zero restarts, you should see your LED pattern start to display automatically. Once you are happy that it is all working correctly, press and hold your power-off button for three seconds to shut your Pi Zero down. You can now turn your garment the right way round and install the Pi Zero and battery pack into your hidden pocket. As soon as you plug your battery pack into your Pi Zero, your LEDs will start to display their patterns, and you can safely turn it all off using your power-off button.

Summary

In this article we looked at making use of stitchable electronics and how we could combine them with our Pi Zero. We made our first stitchable circuit, found a way to connect our Pi Zero to this circuit, and controlled the electronic devices using the GPIO Zero Python library.
Writing Modules

Packt
14 Aug 2017
15 min read
In this article by David Mark Clements, the author of the book Node.js Cookbook, we will cover the following points to introduce you to working with Node.js modules:

- Node's module system
- Initializing a module
- Writing a module
- Tooling around modules
- Publishing modules
- Setting up a private module repository
- Best practices

In idiomatic Node, the module is the fundamental unit of logic. Any typical application or system consists of generic code and application code. As a best practice, generic shareable code should be held in discrete modules, which can be composed together at the application level with minimal amounts of domain-specific logic. In this article, we'll learn how Node's module system works, how to create modules for various scenarios, and how we can reuse and share our code.

Scaffolding a module

Let's begin our exploration by setting up a typical file and directory structure for a Node module. At the same time, we'll learn how to automatically generate a package.json file (we refer to this throughout as initializing a folder as a package) and how to configure npm (Node's package-managing tool) with some defaults, which can then be used as part of the package generation process. In this recipe, we'll create the initial scaffolding for a full Node module.

Getting ready

Installing Node

If we don't already have Node installed, we can go to https://nodejs.org to pick up the latest version for our operating system. If Node is on our system, then so is the npm executable; npm is the default package manager for Node. It's useful for creating, managing, installing, and publishing modules.

Before we run any commands, let's tweak the npm configuration a little:

    npm config set init.author.name "<name here>"

This will speed up module creation and ensure that each package we create has a consistent author name, thus avoiding typos and variations of our name.

npm stands for...
Contrary to popular belief, npm is not an acronym for Node Package Manager; in fact, it stands for "npm is Not An Acronym", which is why it's not called NINAA.

How to do it…

Let's say we want to create a module that converts HSL (hue, saturation, luminosity) values into a hex-based RGB representation, such as would be used in CSS (for example, #fb4a45). The name hsl-to-hex seems good, so let's make a new folder for our module and cd into it:

    mkdir hsl-to-hex
    cd hsl-to-hex

Every Node module must have a package.json file, which holds metadata about the module. Instead of manually creating a package.json file, we can simply execute the following command in our newly created module folder:

    npm init

This will ask a series of questions. We can hit enter for every question without supplying an answer. Note how the default module name corresponds to the current working directory, and the default author is the init.author.name value we set earlier. Upon completion, we should have a package.json file that looks something like the following:

    {
      "name": "hsl-to-hex",
      "version": "1.0.0",
      "description": "",
      "main": "index.js",
      "scripts": {
        "test": "echo \"Error: no test specified\" && exit 1"
      },
      "author": "David Mark Clements",
      "license": "MIT"
    }

How it works…

When Node is installed on our system, npm comes bundled with it. The npm executable is written in JavaScript and runs on Node. The npm config command can be used to permanently alter settings. In our case, we changed the init.author.name setting so that npm init would reference it for the default during a module's initialization. We can list all the current configuration settings with npm config ls.

Config docs

Refer to https://docs.npmjs.com/misc/config for all possible npm configuration settings.

When we run npm init, the answers to prompts are stored in an object, serialized as JSON, and then saved to a newly created package.json file in the current directory.
There's more…

Let's find out some more ways to automatically manage the content of the package.json file via the npm command.

Reinitializing

Sometimes additional metadata becomes available after we've created a module. A typical scenario arises when we initialize our module as a git repository and add a remote endpoint after creating the module.

Git and GitHub

If we've not used the git tool and GitHub before, we can refer to http://help.github.com to get started. If we don't have a GitHub account, we can head to http://github.com to get a free account.

To demonstrate, let's create a GitHub repository for our module. Head to GitHub and click on the plus symbol in the top-right, then select New repository. Specify the name as hsl-to-hex and click on Create Repository. Back in the Terminal, inside our module folder, we can now run this:

    echo -e "node_modules\n*.log" > .gitignore
    git init
    git add .
    git commit -m '1st'
    git remote add origin http://github.com/<username>/hsl-to-hex
    git push -u origin master

Now here comes the magic part; let's initialize again (simply press enter for every question):

    npm init

This time the Git remote we just added was detected and became the default answer for the git repository question. Accepting this default answer meant that the repository, bugs, and homepage fields were added to package.json. A repository field in package.json is an important addition when it comes to publishing open source modules, since it will be rendered as a link on the module's information page at http://npmjs.com. A repository link enables potential users to peruse the code prior to installation. Modules that can't be viewed before use are far less likely to be considered viable.

Versioning

The npm tool supplies other functionality to help with module creation and management workflow. For instance, the npm version command allows us to manage our module's version number according to SemVer semantics.
SemVer

SemVer is a versioning standard. A version consists of three numbers separated by dots, for example, 2.4.16. The position of a number denotes specific information about the version in comparison to other versions. The three positions are known as MAJOR.MINOR.PATCH. The PATCH number is increased when changes have been made that don't break existing functionality or add any new functionality; for instance, a bug fix is considered a patch. The MINOR number should be increased when new backward-compatible functionality is added, for instance the addition of a method. The MAJOR number increases when backward-incompatible changes are made. Refer to http://semver.org/ for more information.

If we were to fix a bug, we would want to increase the PATCH number. We can either manually edit the version field in package.json, setting it to 1.0.1, or we can execute the following:

    npm version patch

This will increase the version field in one command. Additionally, if our module is a Git repository, it will add a commit based on the version (in our case, v1.0.1), which we can then immediately push. When we ran the command, npm output the new version number. However, we can double-check the version number of our module without opening package.json:

    npm version

This will output something similar to the following:

    { 'hsl-to-hex': '1.0.1',
      npm: '2.14.17',
      ares: '1.10.1-DEV',
      http_parser: '2.6.2',
      icu: '56.1',
      modules: '47',
      node: '5.7.0',
      openssl: '1.0.2f',
      uv: '1.8.0',
      v8: '4.6.85.31',
      zlib: '1.2.8' }

The first field is our module along with its version number. If we added new backward-compatible functionality, we can run this:

    npm version minor

Now our version is 1.1.0. Finally, we can run the following for a major version bump:

    npm version major

This sets our module's version to 2.0.0. Since we're just experimenting and didn't make any changes, we should set our version back to 1.0.0.
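As an aside, the MAJOR.MINOR.PATCH bump rules described above can be sketched as a tiny helper. This is a Python illustration of the bump logic only (the function name bump is my own; npm's actual implementation is unrelated):

```python
def bump(version, part):
    """Return a SemVer string bumped according to the rules above.

    'patch' increments PATCH; 'minor' increments MINOR and resets
    PATCH; 'major' increments MAJOR and resets MINOR and PATCH.
    """
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "major":
        return f"{major + 1}.0.0"
    raise ValueError(f"unknown part: {part!r}")
```

The transitions match the recipe's walkthrough: 1.0.0 becomes 1.0.1 after a patch, 1.1.0 after a minor, and 2.0.0 after a major bump.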
We can do this via the npm command as well:

    npm version 1.0.0

See also

Refer to the following recipes: Writing module code; Publishing a module.

Installing dependencies

In most cases, it's wisest to compose a module out of other modules. In this recipe, we will install a dependency.

Getting ready

For this recipe, all we need is a Command Prompt open in the hsl-to-hex folder from the Scaffolding a module recipe.

How to do it…

Our hsl-to-hex module can be implemented in two steps:

1. Convert the hue degrees, saturation percentage, and luminosity percentage to corresponding red, green, and blue numbers between 0 and 255.
2. Convert the RGB values to hex.

Before we tear into writing an HSL-to-RGB algorithm, we should check whether this problem has already been solved. The easiest way to check is to head to http://npmjs.com and perform a search. Oh, look! Somebody already solved this. After some research, we decide that the hsl-to-rgb-for-reals module is the best fit. Ensuring that we are in the hsl-to-hex folder, we can now install our dependency with the following:

    npm install --save hsl-to-rgb-for-reals

Now let's take a look at the bottom of package.json:

    tail package.json # linux/osx
    type package.json # windows

Tail output should give us this:

      "bugs": {
        "url": "https://github.com/davidmarkclements/hsl-to-hex/issues"
      },
      "homepage": "https://github.com/davidmarkclements/hsl-to-hex#readme",
      "description": "",
      "dependencies": {
        "hsl-to-rgb-for-reals": "^1.1.0"
      }
    }

We can see that the dependency we installed has been added to a dependencies object in the package.json file.

How it works…

The top two results of the npm search are hsl-to-rgb and hsl-to-rgb-for-reals. The first result is unusable because the author of the package forgot to export it and is unresponsive to fixing it. The hsl-to-rgb-for-reals module is a fixed version of hsl-to-rgb. This situation serves to illustrate the nature of the npm ecosystem.
On the one hand, there are over 200,000 modules and counting; on the other, many of these modules are of low value. Nevertheless, the system is also self-healing, in that if a module is broken and not fixed by the original maintainer, a second developer often assumes responsibility and publishes a fixed version of the module.

When we run npm install in a folder with a package.json file, a node_modules folder is created (if it doesn't already exist). Then the package is downloaded from the npm registry and saved into a subdirectory of node_modules (for example, node_modules/hsl-to-rgb-for-reals).

npm 2 vs npm 3

Our installed module doesn't have any dependencies of its own. However, if it did, the sub-dependencies would be installed differently depending on whether we're using version 2 or version 3 of npm. Essentially, npm 2 installs dependencies in a tree structure, for instance node_modules/dep/node_modules/sub-dep-of-dep/node_modules/sub-dep-of-sub-dep. Conversely, npm 3 follows a maximally flat strategy, where sub-dependencies are installed in the top-level node_modules folder when possible, for example node_modules/dep, node_modules/sub-dep-of-dep, and node_modules/sub-dep-of-sub-dep. This results in fewer downloads and less disk space usage; npm 3 resorts to a tree structure in cases where there are two versions of a sub-dependency, which is why it's called a maximally flat strategy. Typically, if we've installed Node 4 or above, we'll be using npm version 3.

There's more…

Let's explore development dependencies, creating module management scripts, and installing global modules without requiring root access.

Installing development dependencies

We usually need some tooling to assist with the development and maintenance of a module or application. The ecosystem is full of programming support modules, from linting to testing to browser bundling to transpilation. In general, we don't want consumers of our module to download dependencies they don't need.
Similarly, if we're deploying a system built in Node, we don't want to burden the continuous integration and deployment processes with superfluous, pointless work. So, we separate our dependencies into production and development categories. When we use npm install --save <dep>, we're installing a production module. To install a development dependency, we use --save-dev. Let's go ahead and install a linter.

JavaScript Standard Style

standard is a JavaScript linter that enforces an unconfigurable ruleset. The premise of this approach is that we should stop using up precious time bikeshedding about syntax.

All the code in this article uses the standard linter, so we'll install that:

    npm install --save-dev standard

If the absence of semicolons is abhorrent, we can choose to install semistandard instead of standard at this point. The lint rules match those of standard, with the obvious exception of requiring semicolons. Further, any code written using standard can be reformatted to semistandard using the semistandard-format command-line tool; simply run npm -g i semistandard-format to get started with it.

Now, let's take a look at the package.json file:

    {
      "name": "hsl-to-hex",
      "version": "1.0.0",
      "main": "index.js",
      "scripts": {
        "test": "echo \"Error: no test specified\" && exit 1"
      },
      "author": "David Mark Clements",
      "license": "MIT",
      "repository": {
        "type": "git",
        "url": "git+ssh://git@github.com/davidmarkclements/hsl-to-hex.git"
      },
      "bugs": {
        "url": "https://github.com/davidmarkclements/hsl-to-hex/issues"
      },
      "homepage": "https://github.com/davidmarkclements/hsl-to-hex#readme",
      "description": "",
      "dependencies": {
        "hsl-to-rgb-for-reals": "^1.1.0"
      },
      "devDependencies": {
        "standard": "^6.0.8"
      }
    }

We now have a devDependencies field alongside the dependencies field.
When our module is installed as a sub-dependency of another package, only the hsl-to-rgb-for-reals module will be installed, while the standard module will be ignored, since it's irrelevant to our module's actual implementation. If this package.json file represented a production system, we could run the install step with the --production flag, as shown:

    npm install --production

Alternatively, this can be set in the production environment with the following command:

    npm config set production true

Currently, we can run our linter using the executable installed in the node_modules/.bin folder. Consider this example:

    ./node_modules/.bin/standard

This is ugly and not at all ideal. Refer to Using npm run scripts for a more elegant approach.

Using npm run scripts

Our package.json file currently has a scripts property that looks like this:

    "scripts": {
      "test": "echo \"Error: no test specified\" && exit 1"
    },

Let's edit the package.json file and add another field, called lint, as follows:

    "scripts": {
      "test": "echo \"Error: no test specified\" && exit 1",
      "lint": "standard"
    },

Now, as long as we have standard installed as a development dependency of our module (refer to Installing development dependencies), we can run the following command to run a lint check on our code:

    npm run-script lint

This can be shortened to the following:

    npm run lint

When we run an npm script, the current directory's node_modules/.bin folder is added to the execution context's PATH environment variable. This means that even if we don't have the standard executable in our usual system PATH, we can reference it in an npm script as if it were in our PATH. Some consider lint checks to be a precursor to tests. Let's alter the scripts.test field, as illustrated:

    "scripts": {
      "test": "npm run lint",
      "lint": "standard"
    },

Chaining commands

Later, we can append other commands to the test script using the double ampersand (&&) to run a chain of checks. For instance, "test": "npm run lint && tap test".
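The PATH manipulation behind npm run can be illustrated with a short sketch. This is Python, purely illustrative (the function name npm_run_path is my own, and this is not how npm itself is implemented): the project's node_modules/.bin directory goes at the front of PATH, so locally installed executables like standard resolve before system-wide ones.

```python
import os

def npm_run_path(project_dir, env_path):
    """Sketch of the PATH an npm script sees: node_modules/.bin
    from the project directory is placed ahead of the existing PATH."""
    bin_dir = os.path.join(project_dir, "node_modules", ".bin")
    return bin_dir + os.pathsep + env_path

# Example: a script run inside /home/dev/hsl-to-hex would find
# ./node_modules/.bin/standard before any system-wide `standard`.
example = npm_run_path("/home/dev/hsl-to-hex", "/usr/local/bin:/usr/bin")
```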
Now, let's run the test script:

npm run test

Since the test script is special, we can simply run this:

npm test

Eliminating the need for sudo

The npm executable can install both local and global modules. Global modules are mostly installed to allow command-line utilities to be used system-wide. On OS X and Linux, the default npm setup requires sudo access to install a module. For example, the following will fail on a typical OS X or Linux system with the default npm setup:

npm -g install cute-stack # <-- oh oh needs sudo

This is unsuitable for several reasons. Forgetting to use sudo becomes frustrating; we're trusting npm with root access, and accidentally using sudo for a local install causes permission problems (particularly with the npm local cache). The prefix setting stores the location for globally installed modules; we can view this with the following:

npm config get prefix

Usually, the output will be /usr/local. To avoid the use of sudo, all we have to do is set ownership permissions on any subfolders in /usr/local used by npm:

sudo chown -R $(whoami) $(npm config get prefix)/{lib/node_modules,bin,share}

Now we can install global modules without root access:

npm -g install cute-stack # <-- now works without sudo

If changing ownership of system folders isn't feasible, we can use a second approach, which involves changing the prefix setting to a folder in our home path:

mkdir ~/npm-global
npm config set prefix ~/npm-global

We'll also need to set our PATH:

export PATH=$PATH:~/npm-global/bin
source ~/.profile

The source command essentially refreshes the Terminal environment to reflect the changes we've made.

See also

Scaffolding a module
Writing module code
Publishing a module
Packt
14 Aug 2017
10 min read

Introduction to the Latest Social Media Landscape and Importance

In this article by Siddhartha Chatterjee and Michal Krystyanczuk, the authors of the book Python Social Media Analytics, we start with a question to you: Have you seen the movie The Social Network? If you have not, it could be a good idea to see it before you read this. If you have, you may have seen the success story of Mark Zuckerberg and his company Facebook. This was possible due to the power of the platform in connecting, enabling, sharing, and impacting the lives of almost two billion people on this planet. The earliest social networks existed as far back as 1995, such as Yahoo (GeoCities), theglobe.com, and tripod.com. These platforms were mainly there to facilitate interaction among people through chat rooms. It was only at the end of the 90s that user profiles became the in thing in social networking platforms, allowing information about people to be discoverable and, therefore, providing a choice to make friends or not. Those embracing this new methodology were Makeoutclub, Friendster, SixDegrees.com, and so on. MySpace, LinkedIn, and Orkut were created thereafter, and social networks were on the verge of becoming mainstream. However, the biggest impact happened with the creation of Facebook in 2004; a total game changer for people's lives, business, and the world. The sophistication and ease of use of the platform made it into a mainstream medium for individuals and companies to advertise and sell their ideas and products. Hence, we are in the age of social media, which has changed the way the world functions. Over the last few years, there have been new entrants in social media with essentially different interaction models compared to Facebook, LinkedIn, or Twitter. These are Pinterest, Instagram, Tinder, and others. An interesting example is Pinterest, which, unlike Facebook, is not centered around people but around interests and/or topics. It's essentially able to structure people based on their interests around these topics.
The CEO of Pinterest describes it as a catalog of ideas. Forums, which are not considered regular social networks like Facebook or Twitter, are also very important social platforms. Unlike on Twitter or Facebook, forum users are often anonymous in nature, which enables them to have in-depth conversations with communities. Other non-typical social networks are video sharing platforms, such as YouTube and Dailymotion. They are non-typical because they are centered around user-generated content, and the social nature is generated by the sharing of this content on various social networks and also by the discussion it generates around the user commentaries. Social media is gradually changing from being platform-centric to being about experiences and features. In the future, we'll see more and more traditional content providers and services becoming social in nature through sharing and conversations. The term social media today includes not just social networks but every service that's social in nature with a wide audience.

Delving into Social Data

The data acquired from social media is called social data. Social data exists in many forms. Types of social media data can be information around the users of social networks, like name, city, interests, and so on. These types of data, which are numeric or quantifiable, are known as structured data. However, since social media are platforms for expression, a lot of the data is in the form of texts, images, videos, and such. These sources are rich in information, but not as straightforward to analyze as the structured data described earlier. These types of data are known as unstructured data. The process of applying rigorous methods to make sense of social data is called social data analytics. We will go into great depth in social data analytics to demonstrate how we can extract valuable sense and information from these really interesting sources of social data.
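To make the structured versus unstructured distinction concrete, here is a minimal sketch (the post data is hypothetical, not from the book): the numeric fields can be aggregated directly, while the free text has to be processed, here with a simple hashtag extraction, before it becomes analyzable.

```python
import re

# A hypothetical social media post: a mix of structured and unstructured data.
post = {
    "user": "traveller_jane",   # structured: directly queryable
    "likes": 42,                # structured: numeric, ready for aggregation
    "text": "Loving the beaches in #Goa this winter! #travel",  # unstructured
}

# Structured fields need no preparation...
total_likes = post["likes"]

# ...whereas unstructured text must be processed first,
# for example by extracting hashtags with a regular expression.
hashtags = re.findall(r"#(\w+)", post["text"])

print(total_likes)   # 42
print(hashtags)      # ['Goa', 'travel']
```

Real analyses apply far richer processing to the text (language detection, sentiment, topic extraction), but the shape of the work is the same: turn unstructured content into structured variables.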
Since there are almost no restrictions on social media, there are a lot of meaningless accounts, content, and interactions. So, the data coming out of these streams is quite noisy and polluted. Hence, a lot of effort is required to separate the information from the noise. Once the data is cleaned and we are focused on the most important and interesting aspects, we then require various statistical and algorithmic methods to make sense of the filtered data and draw meaningful conclusions.

Understanding the process

Once you are familiar with the topic of social media data, let us proceed to the next phase. The first step is to understand the process involved in the exploitation of data present on social networks. A proper execution of the process, with attention to small details, is the key to good results. In many computer science domains, a small error in code will lead to a visible or at least correctable dysfunction, but in data science, it will produce entirely wrong results, which in turn will lead to incorrect conclusions. The very first step of data analysis is always problem definition. Understanding the problem is crucial for choosing the right data sources and the methods of analysis. It also helps to realize what kind of information and conclusions we can infer from the data and what is impossible to derive. This part is very often underestimated, while it is key to successful data analysis. Any question that we try to answer in a data science project has to be very precise. Some people tend to ask very generic questions, such as I want to find trends on Twitter. This is not a correct problem definition, and an analysis based on such a statement can fail to find relevant trends. With a naïve analysis, we can get repeating Twitter ads and content generated by bots. Moreover, it raises more questions than it answers. In order to approach the problem correctly, we have to ask in the first step: what is a trend? what is an interesting trend for us?
and what is the time scope? Once we answer these questions, we can break up the problem into multiple subproblems: I'm looking for the most frequent consumer reactions about my brand on Twitter in English over the last week, and I want to know if they were positive or negative. Such a problem definition will lead to a relevant, valuable analysis with insightful conclusions. The next part of the process consists of getting the right data according to the defined problem. Many social media platforms allow users to collect a lot of information in an automated way via APIs (Application Programming Interfaces), which is the easiest way to complete the task. Once the data is stored in a database, we perform the cleaning. This step requires a precise understanding of the project's goals. In many cases, it will involve very basic tasks such as duplicate removal, for example, retweets on Twitter, or more sophisticated ones such as spam detection to remove irrelevant comments, language detection to perform linguistic analysis, or other statistical or machine learning approaches that can help to produce a clean dataset. When the data is ready to be analyzed, we have to choose what kind of analysis to perform and structure the data accordingly. If our goal is to understand the sense of the conversations, then it only requires a simple list of verbatims (textual data), but if we aim to perform analysis on different variables, like number of likes, dates, number of shares, and so on, the data should be combined in a structure such as a data frame, where each row corresponds to an observation and each column to a variable. The choice of the analysis method depends on the objectives of the study and the type of data. It may require a statistical or machine learning approach, or a specific approach to time series. Different approaches will be explained using examples of Facebook, Twitter, YouTube, GitHub, Pinterest, and forum data. Once the analysis is done, it's time to infer conclusions.
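As a small illustration of the cleaning and structuring steps just described, here is a hedged Python sketch; the sample posts are made up, and a real project would acquire them through an API and likely load the cleaned rows into a data frame rather than a plain list of dictionaries.

```python
# Hypothetical raw social posts; real data would come from an API.
raw_posts = [
    {"text": "Great phone, love the camera", "likes": 12},
    {"text": "RT Great phone, love the camera", "likes": 3},   # retweet-style duplicate
    {"text": "Battery life is disappointing", "likes": 7},
    {"text": "Great phone, love the camera", "likes": 12},     # exact duplicate
]

def clean(posts):
    """Strip retweet markers and drop duplicate verbatims."""
    seen = set()
    cleaned = []
    for post in posts:
        text = post["text"]
        if text.startswith("RT "):
            text = text[3:]          # remove the retweet marker
        text = text.strip()
        if text not in seen:
            seen.add(text)
            # Each row is one observation; each key is one variable.
            cleaned.append({"text": text, "likes": post["likes"]})
    return cleaned

rows = clean(raw_posts)
print(len(rows))  # 2 unique verbatims remain
```

The same row-per-observation shape carries straight over to a pandas DataFrame when quantitative analysis on likes, shares, or dates is needed.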
We can derive conclusions based on the outputs from the models, but one of the most useful tools is visualization techniques. Data and output can be presented in many different ways, starting from charts, plots, and diagrams, through more complex 2D charts, to multidimensional visualizations.

Project planning

Analysis of content on social media can get very confusing due to the difficulty of working on a large amount of data and also trying to make sense out of it. For this reason, it's extremely important to ask the right questions in the beginning to get the right answers. Even though this is an exploratory approach, and getting exact answers may be difficult, the right questions allow you to define the scope, process, and time. The main questions that we will be working on are the following: What does Google post on Facebook? How do people react to Google posts? (likes, shares, and comments) What do Google's audience say about Google and its ecosystem? What are the emotions expressed by Google's audience? With the preceding questions in mind, we will proceed to the next steps.

Scope and process

The analysis will consist of analyzing the feed of posts and comments on the official Facebook page of Google. The process of information extraction is organized in a data flow. It starts with data extraction from the API, followed by data preprocessing and wrangling, and then a series of different analyses. The analysis becomes actionable only after the last step of results interpretation. In order to retrieve the above information, we need to do the following: Extract all the posts of Google permitted by the Facebook API. Extract the metadata for each post: timestamp, number of likes, number of shares, number of comments.
Extract the user comments under each post and their metadata. Process the posts to retrieve the most common keywords, bi-grams, and hashtags. Process the user comments using the Alchemy API to retrieve the emotions. Analyse the above information to derive conclusions.

Data type

The main part of information extraction comes from an analysis of textual data (posts and comments). However, in order to add quantitative and temporal dimensions, we process numbers (likes, shares) and dates (date of creation).

Summary

The avalanche of social network data is a result of the communication platforms developed over the last two decades. These are the platforms that evolved from chat rooms to personal information sharing and, finally, social and professional networks. Among many, Facebook, Twitter, Instagram, Pinterest, and LinkedIn have emerged as the modern-day social media. These platforms collectively have a reach of a billion or more individuals across the world sharing their activities and interactions with each other. The sharing of data by these media through APIs and other technologies has given rise to a new field called social media analytics. This has multiple applications, such as marketing, personalized recommendations, research, and societal studies. Modern data science techniques such as machine learning and text mining are widely used for these applications. Python is one of the most widely used programming languages for these techniques. However, manipulating the unstructured data from social networks requires a lot of precise processing and preparation before coming to the most interesting bits.
Packt
14 Aug 2017
12 min read

The Cloud and the DevOps Revolution

Cloud and DevOps are two of the most important trends to emerge in technology. The reasons are clear: it's all about the amount of data that needs to be processed and managed in the applications and websites we use every day. The amount of data being processed and handled is huge. Every day, over a billion people visit Facebook; every hour, 18,000 hours of video are uploaded to YouTube; every second, Google processes 40,000 search queries. Being able to handle such a staggering scale isn't easy. Through the use of Amazon Web Services (AWS), you will be able to build out the key components needed to succeed at minimum cost and effort. This is an extract from Effective DevOps on AWS.

Thinking in terms of cloud and not infrastructure

The day I discovered that noise can damage hard drives: December 2011, sometime between Christmas and New Year's Eve, I started to receive dozens of alerts from our monitoring system. Apparently, we had just lost connectivity to our European datacenter in Luxembourg. I rushed into the network operating center (NOC) hoping that it was only a small glitch in our monitoring system, maybe just a joke; after all, with so much redundancy, how could everything go offline? Unfortunately, when I got into the room, the big monitoring monitors were all red, not a good sign. This was just the beginning of a very long nightmare. An electrician working in our datacenter had mistakenly triggered the fire alarm; within seconds, the fire suppression system set off and released its argonite on top of our server racks. Unfortunately, this kind of fire suppression system makes so much noise when it releases its gas that the sound wave instantly killed hundreds and hundreds of hard drives, effectively shutting down our only European facility. It took months for us to be back on our feet. Where is the cloud when you need it! As Charles Phillips said it best: "Friends don't let friends build a datacenter."
Deploying your own hardware versus in the cloud

It wasn't long ago that tech companies small and large had to have a proper technical operations organization able to build out infrastructures. The process went a little bit like this: Fly to the location you want to put your infrastructure in, and go tour the different datacenters and their facilities. Look at the floor considerations, power considerations, HVAC, fire prevention systems, physical security, and so on. Shop for an internet provider; ultimately, even though you are talking about servers and a lot more bandwidth, the process is the same: you want to get internet connectivity for your servers. Once that's done, it's time to get your hardware. Make the right decisions, because you are probably going to spend a big portion of your company's money on buying servers, switches, routers, firewalls, storage, UPS (for when you have a power outage), KVM, network cables, the labeler (dear to every system administrator's heart), and a bunch of spare parts: hard drives, RAID controllers, memory, power cables, you name it. At that point, once the hardware is bought and shipped to the datacenter location, you can rack everything, wire all the servers, and power everything. Your network team can then kick in and start establishing connectivity to the new datacenter using various links, configuring the edge routers, switches, top-of-the-rack switches, KVM, and firewalls (sometimes). Your storage team is next and is going to provide the much-needed NAS or SAN; next comes your sysops team, who will image the servers, sometimes upgrade the BIOS, configure hardware RAID, and finally put an OS on those servers. Not only is this a full-time job for a big team, it also takes lots of time and money to even get there. Getting new servers up and running with AWS will take us minutes. In fact, more than just providing a server within minutes, we will soon see how to deploy and run a service in minutes and just when you need it.
Cost analysis

From a cost standpoint, deploying in a cloud infrastructure such as AWS usually ends up being a lot cheaper than buying your own hardware. If you want to deploy your own hardware, you have to pay upfront for all the hardware (servers, network equipment) and sometimes license software as well. In a cloud environment, you pay as you go. You can add and remove servers in no time. Also, if you take advantage of PaaS and SaaS applications, you usually end up saving even more money by lowering your operating costs, as you don't need as much staff to administrate your database, storage, and so on. Most cloud providers, AWS included, also offer tiered pricing and volume discounts. As your service gets bigger and bigger, you end up paying less for each unit of storage, bandwidth, and so on.

Just-in-time infrastructure

As we just saw, when deploying in the cloud, you pay as you go. Most cloud companies use that to their advantage to scale their infrastructure up and down as the traffic to their sites changes. This ability to add and remove new servers and services in no time and on demand is one of the main differentiators of an effective cloud infrastructure. Here is a diagram from a presentation from 2015 that shows the annual traffic going to https://www.amazon.com/ (the online store): © 2016, Amazon Web Services, Inc. or its affiliates. All rights reserved. As you can see, with the holidays, the end of the year is a busy time for https://www.amazon.com/; their traffic triples. If they were hosting their service in an "old-fashioned" way, they would have only 24% of their infrastructure used on average every year, but thanks to being able to scale dynamically, they are able to provision only what they really need. © 2016, Amazon Web Services, Inc. or its affiliates. All rights reserved. Here at Medium, we also see on a very regular basis the benefits of having fast auto scaling capabilities.
Very often, stories will go viral, and the amount of traffic going to Medium can drastically change. On January 21st, 2015, to our surprise, the White House posted the transcript of the State of the Union minutes before President Obama started his speech: https://medium.com/@WhiteHouse/ As you can see in the following graph, thanks to being in the cloud and having auto scaling capabilities, our platform was able to absorb the 5x instant spike of traffic that the announcement created by doubling the number of servers our front service uses. Later, as the traffic started to naturally drain, we automatically removed some hosts from our fleet.

The different layers of building a cloud

Cloud computing is often broken up into three different types of service: Infrastructure as a Service (IaaS): It is the fundamental block on top of which everything cloud is built. It is usually a computing resource in a virtualized environment. It offers a combination of processing power, memory, storage, and network. The most common IaaS entities you will find are virtual machines (VMs), network equipment like load balancers or virtual Ethernet interfaces, and storage like block devices. This layer is very close to the hardware and gives you the full flexibility that you would get deploying your software outside of a cloud. If you have any physical datacenter knowledge, it will mostly also apply to this layer. Platform as a Service (PaaS): It is where things start to get really interesting with the cloud. When building an application, you will likely need a certain number of common components, such as a data store, a queue, and so on. The PaaS layer provides a number of ready-to-use applications to help you build your own services without worrying about administrating and operating those third-party services, such as a database server. Software as a Service (SaaS): It is the icing on the cake.
Similarly to the PaaS layer, you get access to managed services, but this time those services are complete solutions dedicated to certain purposes, such as management or monitoring tools. When building an application, relying on those services makes a big difference when compared to more traditional environments outside of a cloud. Another key element to succeeding when deploying or migrating to a new infrastructure is to adopt a DevOps mindset.

Deploying in AWS

AWS is at the forefront of the cloud providers. Launched in 2006 with SQS and EC2, Amazon quickly became the biggest IaaS provider. They have the biggest infrastructure and the biggest ecosystem, and they constantly add new features and release new services. In 2015, they passed the cap of 1 million active customers. Over the last few years, they managed to change people's mindset about the cloud, and now deploying new services to the cloud is the new normal. Using the AWS managed tools and services is a drastic way to improve your productivity and keep your team lean. Amazon is continually listening to its customers' feedback and looking at market trends; therefore, as the DevOps movement started to get established, Amazon released a number of new services tailored toward implementing some of the DevOps best practices. We will also see how those services synergize with the DevOps culture.

How to take advantage of the AWS ecosystem

When you talk to application architects, there are usually two trains of thought. The first one is to stay as platform agnostic as possible. The idea behind that is that if you aren't happy with AWS anymore, you can easily switch cloud providers or even build your own private cloud. The second train of thought is the complete opposite; the idea is that you are going to stick to AWS no matter what. It feels a bit extreme to think of it that way, but the reward is worth the risk, and more and more companies agree with that. That's also where I stand.
When you build a product nowadays, the scarcity is always time and people. If you can outsource what is not your core business to a company that provides a similar service or technology with support expertise, and that you can just pay for on a SaaS model, then do so. If, like me, you agree that using managed services is the way to go, then being a cloud architect is like playing with Lego. With Lego, you have lots of pieces of different shapes, sizes, and colors, and you assemble them to build your own MOC (My Own Creation). Amazon services are like those Lego pieces. If you can picture your final product, then you can explore the different services and start combining them to build the supporting stack needed to quickly and efficiently build your product. Of course, in this case, the "if" is a big if, and unlike with Lego, understanding what each piece can do is a lot less visual and colorful.

How AWS synergizes with the DevOps culture

Having a DevOps culture is about rethinking how engineering teams work together by breaking out of those developer and operations silos and bringing in a new set of tools to implement some best practices. AWS helps accomplish that in many different ways: For some developers, the world of operations can be scary and confusing, but if you want better cooperation between engineers, it is important to expose every aspect of running a service to the entire engineering organization. As an operations engineer, you can't have a gatekeeper mentality toward developers; instead, it's better to make them comfortable accessing production and working on the different components of the platform. A good way to get started with that is the AWS console. While a bit overwhelming, it is still a much better experience for people not familiar with this world to navigate that web interface than referring to constantly out-of-date documentation, or using SSH and random exploration to discover the topology and configuration of the service.
Of course, as your expertise grows, your application becomes more and more complex, and the need to operate it faster increases, the web interface starts to show some weaknesses. To get around that issue, AWS provides a very DevOps-friendly alternative: an API. Accessible through a command-line tool and a number of SDKs (which include Java, JavaScript, Python, .NET, PHP, Ruby, Go, and C++), the SDKs let you administrate and use the managed services. Finally, AWS offers a number of DevOps tools. AWS has a source control service similar to GitHub called CodeCommit. For automation, in addition to allowing you to control everything via SDKs, AWS provides the ability to create templates of your infrastructure via CloudFormation, but also a configuration management system called OpsWorks. It also knows how to scale fleets of servers up and down using Auto Scaling groups. For continuous delivery, AWS provides a service called CodePipeline, and for continuous deployment, a service called CodeDeploy. With regards to measuring everything, we will rely on CloudWatch and later Elasticsearch/Kibana to visualize metrics and logs. Finally, we will see how to use Docker via ECS, which will let us create containers to improve the server density (we will be able to reduce VM consumption, as we will be able to colocate services together in one VM while still keeping fairly good isolation), improve the developer environment, as we will now be able to run something closer to the production environment, and improve testing time, as starting containers is a lot faster than starting virtual machines.
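To give a flavor of the CloudFormation templating and Auto Scaling groups mentioned above, here is a minimal, hypothetical template fragment; the resource names and AMI ID are placeholders, not taken from the book.

```yaml
# Hypothetical CloudFormation fragment: a web fleet that can scale
# between 2 and 10 instances.
Resources:
  WebLaunchConfig:
    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
      ImageId: ami-0123456789abcdef0   # placeholder AMI ID
      InstanceType: t2.micro
  WebAutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      LaunchConfigurationName: !Ref WebLaunchConfig
      MinSize: "2"
      MaxSize: "10"
      AvailabilityZones: !GetAZs ""
```

Deploying such a template through the CloudFormation console, CLI, or SDKs creates and manages both resources as a single versionable stack, which is what makes infrastructure-as-code so complementary to a DevOps workflow.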
Packt
10 Aug 2017
14 min read

Understanding SAP Analytics Cloud

In this article, Riaz Ahmed, the author of the book Learning SAP Analytics Cloud, provides an overview of this unique cloud-based business intelligence platform. You will learn about the following SAP Analytics Cloud segments: models and data sources, visualization, collaboration, presentation, and administration and security.

What is SAP Analytics Cloud?

SAP Analytics Cloud is a new-generation cloud-based application, which helps you explore your data, perform visualization and analysis, create financial plans, and produce predictive forecasts. It is a one-stop-shop solution to cope with your analytic needs and comprises business intelligence, planning, predictive analytics, and governance and risk. The application is built on the SAP Cloud Platform and delivers great performance and scalability. In addition to the on-premise SAP HANA, SAP BW, and S/4HANA sources, you can work with data from a wide range of non-SAP sources, including Google Drive, Salesforce, SQL Server, Concur, and CSV, to name a few. SAP Analytics Cloud allows you to make secure connections to these cloud and on-premise data sources.

Anatomy of SAP Analytics Cloud

The following figure depicts the anatomy of SAP Analytics Cloud:

Data sources and models

Before commencing your analytical tasks in SAP Analytics Cloud, you need to create models. Models are the basis for all of your analysis in SAP Analytics Cloud to evaluate the performance of your organization. A model is a high-level design that exposes the analytic requirements of end users. You can create planning and analytics models based on cloud or on-premise data sources. Analytics models are simpler and more flexible, while planning models are full-featured models in which business analysts and finance professionals can quickly and easily build connected models to analyze data and then collaborate with each other to attain better business performance.
Preconfigured with dimensions for time and categories, planning models support multi-currency and security features at both the model and dimension levels. After creating these models, you can share them with other users in your organization. Before sharing, you can set up model access privileges for users according to their level of authorization and can also enable data auditing. With the help of SAP Analytics Cloud's analytical capabilities, users can discover hidden traits in the data and can predict likely outcomes. It equips them with the ability to uncover potential risks and hidden opportunities. To determine what content to include in your model, you must first identify the columns from the source data on which users need to query. The columns you need in your model reside in some sort of data source. SAP Analytics Cloud supports three types of data sources: files (such as CSV or Excel files) that usually reside on your computer, live data connections from a connected remote system, and cloud apps. In addition to the files on your computer, you can use on-premise data sources such as SAP Business Warehouse, SAP ERP, SAP Universe, SQL databases, and more to acquire data for your models. In the cloud, you can get data from apps like Concur, Google Drive, SAP Business ByDesign, SAP Hybris Cloud, OData services, and SuccessFactors. The following figure depicts these data sources. The cloud app data sources you can use with SAP Analytics Cloud are displayed above the firewall mark, while those in your local network are shown under the firewall. As you can see in this figure, there are over twenty data sources currently supported by SAP Analytics Cloud. The methods of connecting to these data sources also vary from each other.

Create a direct live connection to SAP HANA

You can connect to an on-premise SAP HANA system to use live data in SAP Analytics Cloud.
Live data means that you can get up-to-the-minute data when you open a story in SAP Analytics Cloud. In this case, any changes made to the data in the source system are reflected immediately. Usually, there are two ways to establish a connection to a data source: use the Connection option from the main menu, or specify the data source during the process of creating a model. However, live data connections must be established via the Connection menu option prior to creating the corresponding model.

Connect remote systems to import data

In addition to creating live connections, you can also create connections that allow you to import data into SAP Analytics Cloud. In these types of connections that you make to access remote systems, data is imported (copied) to SAP Analytics Cloud. Any changes users make in the source data do not affect the imported data. To establish connections with these remote systems, you need to install some additional components. For example, you must install the SAP HANA Cloud Connector to access SAP Business Planning and Consolidation (BPC) for NetWeaver. Similarly, the SAP Analytics Cloud Agent should be installed for SAP Business Warehouse (BW), SQL Server, SAP ERP, and others.

Connect to a cloud app to import data

In addition to creating a live connection to create a model on live data and importing data from remote systems, you can set up connections to acquire data from cloud apps, such as Google Drive, SuccessFactors, OData, Concur, Fieldglass, Google BigQuery, and more.

Refreshing imported data

SAP Analytics Cloud allows you to refresh your imported data. With this option, you can reimport the data on demand to get the latest values. You can perform this refresh operation either manually, or create an import schedule to refresh the data at a specific date and time or on a recurring basis.
Visualization Once you have created a model and set up appropriate security for it, you can create stories in which the underlying model data can be explored and visualized with the help of different types of charts, geo maps, and tables. There is a wide range of charts you can add to your story pages to address different scenarios. You can create multiple pages in your story to present your model data using charts, geo maps, tables, text, shapes, and images. On your story pages, you can link dimensions to present data from multiple sources. Adding reference lines and thresholds, applying filters, and drilling down into data can be done on the fly. The ability to interactively drag and drop page objects is the most useful feature of this application. Charts: SAP Analytics Cloud comes with a variety of charts to present your analysis according to your specific needs. You can add multiple types of charts to a single story page. Geo Map: The models that include latitude and longitude information can be used in stories to visualize data in geo maps. By adding multiple layers of different types of data in geo maps, you can show different geographic features and points of interest, enabling you to perform sophisticated geographic analysis. Table: A table is a spreadsheet-like object that can be used to view and analyze text data. You can add this object to either canvas or grid pages in stories. Static and Dynamic Text: You can add static and dynamic text to your story pages. Static text is normally used to display page titles, while dynamic text automatically updates page headings based on the values from the source input control or filter. Images and Shapes: You can add images (such as your company logo) to your story page by uploading them from your computer. In addition to images, you can also add shapes such as lines, squares, or circles to your page. 
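Behind each of these chart types is an aggregation of the model's measures over one or more dimensions. As a tool-agnostic sketch of the kind of grouping a bar chart performs (the Region members and sales figures below are made up purely for illustration):

```python
# Hypothetical model rows: (Region dimension member, Sales measure)
rows = [
    ("EMEA", 1200), ("APAC", 800), ("EMEA", 300),
    ("Americas", 950), ("APAC", 400),
]

# A bar chart over the Region dimension sums the measure per member:
totals = {}
for region, sales in rows:
    totals[region] = totals.get(region, 0) + sales

print(totals)  # {'EMEA': 1500, 'APAC': 1200, 'Americas': 950}
```

Drilling down simply repeats the same aggregation over a finer dimension (for example, Region and then Country), while a filter removes rows before the totals are computed.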
Collaboration Collaboration, alert, and notification features of SAP Analytics Cloud keep business users in touch with each other while executing their tasks. During the process of creating models and stories, you need input from other people. For example, you might ask a colleague to update a model by providing first quarter sales data, or request Sheela to enter her comments on one of your story pages. In SAP Analytics Cloud, these interactions come under the collaboration features of the application. Using these features, users can discuss business content and share information, which consequently smooths the decision-making process. Here is a list of available collaboration features in the application that allow group members to discuss stories and other business content. Create a workflow using events and tasks: The events and tasks features in SAP Analytics Cloud are the two major sources that help you collaborate with other group members and manage your planning and analytic activities. After creating an event and assigning tasks to relevant group members, you can monitor the task progress in the Events interface. Here is the workflow to utilize these two features: Create events based on categories and processes within categories Create a task, assign it to users, and set a due date for its submission Monitor the task progress Commenting on a chart's data point: Using this feature, you can add annotations or additional information to individual data points in a chart. Commenting on a story page: In addition to adding comments to an individual chart, you have the option to add comments on an entire story page to provide some vital information to other users. When you add a comment, other users can see and reply to it. Produce a story as a PDF: You can save your story in a PDF file to share it with other users and for offline access. You can save all story pages or a specific page as a PDF file. 
Sharing a story with colleagues: Once you complete a story, you can share it with other members in your organization. You are provided with three options (Public, Teams, and Private) when you save a story. When you save your story in the Public folder, it can be accessed by anyone. The Teams option lets you select specific teams to share your story with. In the Private option, you have to manually select users with whom you want to share your story. Collaborate via discussions: You can collaborate with colleagues using the discussions feature of SAP Analytics Cloud. The discussions feature enables you to connect with other members in real time. Sharing files and other objects: The Files option under the Browse menu allows you to access the SAP Analytics Cloud repository, where the stories you created and the files you uploaded are stored. After accessing the Files page, you can share its objects with other users. On the Files page, you can manage files and folders, upload files, share files, and change share settings. Presentation In the past, board meetings were held in which the participants used to deliver their reports through their laptops and a bunch of papers. With different versions of reality, it was very difficult for decision makers to arrive at a good decision. With the advent of SAP Digital Boardroom, board meetings have been revolutionized. It is a next-generation presentation platform, which helps you visualize your data and plan ahead in a real-time environment. It runs on multiple screens simultaneously, displaying information from different sources. Due to this capability, more people can work together using a single dataset, which consequently creates one version of reality to make the best decision. SAP Digital Boardroom is changing the landscape of board meetings. In addition to supporting traditional presentation methods, it goes beyond the corporate boardroom and allows remote members to join and participate in the meeting online. 
These remote participants can actively engage in the meeting and can work with live data using their own devices. SAP Digital Boardroom is a visualization and presentation platform that enormously assists in the decision-making process. It transforms executive meetings by replacing static and stale presentations with interactive discussions based on live data, which allows executives to make fact-based decisions to drive their business. Here are the main benefits of SAP Digital Boardroom: Collaborate with others in remote locations and on other devices in an interactive meeting room. Answer ad hoc questions on the fly. Visualize, recognize, experiment, and decide by jumping on and off script at any point. Find answers to the questions that matter to you by exploring directly on live data and focusing on relevant aspects by drilling into details. Discover opportunities or reveal hidden threats. Simulate various decisions and project the results. Weigh and share the pros and cons of your findings. The Digital Boardroom is interactive, so you can retrieve real-time data, make changes to the schedule, and even run through what-if scenarios. It presents a live picture of your organization across three interlinked touch screens to make faster, better executive decisions. There are two aspects of Digital Boardroom with which you can share existing stories with executives and decision-makers to reveal business performance. First, you have to design the agenda for your boardroom presentation by adding meeting information and agenda items, and linking stories as pages in a navigation structure. Once you have created a digital boardroom agenda, you can schedule a meeting to discuss it. In the Digital Boardroom interface, you can organize a meeting in which members can interact with live data during a boardroom presentation. Administration and security An application that is accessed by multiple users is useless without a proper system administration module. 
The existence of this module ensures that things are under control and secured. It also covers the upkeep, configuration, and reliable operation of the application. This module is usually assigned to a person called the system administrator, whose responsibility is to watch the uptime, performance, resources, and security of the application. Being a multi-user application, SAP Analytics Cloud also comes with this vital module, which allows a system administrator to take care of the following segments: Creating users and setting their passwords Importing users from another data source Exporting user profiles for other apps Deleting unwanted user accounts from the system Creating roles and assigning them to users Setting permissions for roles Forming teams Setting security for models Monitoring users' activities Monitoring data changes Monitoring system performance System deployment via export and import Signing up for the trial version If you want to get your feet wet by exploring this exciting cloud application, then sign up for a free 30-day trial version. Note that the trial version doesn't allow you to access all the features of SAP Analytics Cloud. For example, you cannot create a planning model in the trial version, nor can you access its security and administration features. Execute the following steps to get access to the free SAP Analytics Cloud trial: Put the following URL in your browser's address bar and press Enter: http://discover.sapanalytics.cloud/trialrequest-auto/ Enter and confirm your business e-mail address in the relevant boxes. Select No from the Is your company an SAP Partner? list. Click the Submit button. After a short while, you will get an e-mail with a link to connect to SAP Analytics Cloud. Click the Activate Account button in the e-mail. This will open the Activate Your Account page, where you will have to set a strong password. 
The password must be at least 8 characters long and should also include uppercase and lowercase letters, numbers, and symbols. After entering and confirming your password, click the Save button to complete the activation process. A confirmation page appears, telling you that your account has been successfully activated. Click Continue. You will be taken to the SAP Analytics Cloud site. The e-mail you receive carries a link under the SAP Analytics Cloud System section that you can use to access the application any time. Your username (your e-mail address) is also mentioned in the same e-mail, along with a log on button to access the application. Summary SAP Analytics Cloud is a next-generation cloud-based analytic application, which provides an end-to-end cloud analytics experience. SAP Analytics Cloud can help transform how you discover, plan, predict, collaborate, visualize, and extend, all in one solution. In addition to on-premise data sources, you can fetch data from a variety of other cloud apps, and even from Excel and text files, to build your data models and then create stories based on these models. The ultimate purpose of this amazing and easy-to-use application is to enable you to make the right decision. SAP Analytics Cloud is more than visualization of data: it is insight turned into action; it is the realization of success. Resources for Article: Further resources on this subject: Working with User Defined Values in SAP Business One [article] Understanding Text Search and Hierarchies in SAP HANA [article] Meeting SAP Lumira [article]

Starting Out

Packt
10 Aug 2017
21 min read
In this article by Chris Simmonds, author of the book Mastering Embedded Linux Programming – Second Edition, you are about to begin working on your next project, and this time it is going to be running Linux. What should you think about before you put finger to keyboard? Let's begin with a high-level look at embedded Linux and see why it is popular, what the implications of open source licenses are, and what kind of hardware you will need to run Linux. (For more resources related to this topic, see here.) Linux first became a viable choice for embedded devices around 1999. That was when Axis (https://www.axis.com) released their first Linux-powered network camera and TiVo (https://business.tivo.com/) their first Digital Video Recorder (DVR). Since 1999, Linux has become ever more popular, to the point that today it is the operating system of choice for many classes of product. At the time of writing, in 2017, there are about two billion devices running Linux. That includes a large number of smartphones running Android, which uses a Linux kernel, and hundreds of millions of set-top-boxes, smart TVs, and Wi-Fi routers, not to mention a very diverse range of devices such as vehicle diagnostics, weighing scales, industrial devices, and medical monitoring units that ship in smaller volumes. So, why does your TV run Linux? At first glance, the function of a TV is simple: it has to display a stream of video on a screen. Why is a complex Unix-like operating system like Linux necessary? The simple answer is Moore's Law: Gordon Moore, co-founder of Intel, observed in 1965 that the density of components on a chip would double approximately every two years. That applies to the devices that we design and use in our everyday lives just as much as it does to desktops, laptops, and servers. At the heart of most embedded devices is a highly integrated chip that contains one or more processor cores and interfaces with main memory, mass storage, and peripherals of many types. 
This is referred to as a System on Chip, or SoC, and SoCs are increasing in complexity in accordance with Moore's Law. A typical SoC has a technical reference manual that stretches to thousands of pages. Your TV is not simply displaying a video stream as the old analog sets used to do. The stream is digital, possibly encrypted, and it needs processing to create an image. Your TV is (or soon will be) connected to the Internet. It can receive content from smartphones, tablets, and home media servers. It can be (or soon will be) used to play games. And so on and so on. You need a full operating system to manage this degree of complexity. Here are some points that drive the adoption of Linux: Linux has the necessary functionality. It has a good scheduler, a good network stack, support for USB, Wi-Fi, Bluetooth, many kinds of storage media, good support for multimedia devices, and so on. It ticks all the boxes. Linux has been ported to a wide range of processor architectures, including some that are very commonly found in SoC designs—ARM, MIPS, x86, and PowerPC. Linux is open source, so you have the freedom to get the source code and modify it to meet your needs. You, or someone working on your behalf, can create a board support package for your particular SoC board or device. You can add protocols, features, and technologies that may be missing from the mainline source code. You can remove features that you don't need to reduce memory and storage requirements. Linux is flexible. Linux has an active community; in the case of the Linux kernel, very active. There is a new release of the kernel every 8 to 10 weeks, and each release contains code from more than 1,000 developers. An active community means that Linux is up to date and supports current hardware, protocols, and standards. Open source licenses guarantee that you have access to the source code. There is no vendor tie-in. For these reasons, Linux is an ideal choice for complex devices. 
But there are a few caveats I should mention here. Complexity makes it harder to understand. Coupled with the fast-moving development process and the decentralized structures of open source, you have to put some effort into learning how to use it and to keep on re-learning as it changes. Selecting the right operating system Is Linux suitable for your project? Linux works well where the problem being solved justifies the complexity. It is especially good where connectivity, robustness, and complex user interfaces are required. However, it cannot solve every problem, so here are some things to consider before you jump in: Is your hardware up to the job? Compared to a traditional real-time operating system (RTOS) such as VxWorks, Linux requires a lot more resources. It needs at least a 32-bit processor and lots more memory. I will go into more detail in the section on typical hardware requirements. Do you have the right skill set? The early parts of a project, such as board bring-up, require detailed knowledge of Linux and how it relates to your hardware. Likewise, when debugging and tuning your application, you will need to be able to interpret the results. If you don't have the skills in-house, you may want to outsource some of the work. Is your system real-time? Linux can handle many real-time activities so long as you pay attention to certain details. Consider these points carefully. Probably the best indicator of success is to look around for similar products that run Linux and see how they have done it; follow best practice. The players Where does open source software come from? Who writes it? In particular, how does this relate to the key components of embedded development—the toolchain, bootloader, kernel, and basic utilities found in the root filesystem? The main players are: The open source community: This, after all, is the engine that generates the software you are going to be using. 
The community is a loose alliance of developers, many of whom are funded in some way, perhaps by a not-for-profit organization, an academic institution, or a commercial company. They work together to further the aims of the various projects. There are many of them—some small, some large. CPU architects: These are the organizations that design the CPUs we use. The important ones here are ARM/Linaro (ARM-based SoCs), Intel (x86 and x86_64), Imagination Technologies (MIPS), and IBM (PowerPC). They implement or, at the very least, influence support for the basic CPU architecture. SoC vendors (Atmel, Broadcom, Intel, Qualcomm, TI, and many others): They take the kernel and toolchain from the CPU architects and modify them to support their chips. They also create reference boards: designs that are used by the next level down to create development boards and working products. Board vendors and OEMs: These people take the reference designs from SoC vendors and build them into specific products, for instance, set-top-boxes or cameras, or create more general purpose development boards, such as those from Advantech and Kontron. An important category is the cheap development boards such as BeagleBoard/BeagleBone and Raspberry Pi that have created their own ecosystems of software and hardware add-ons. These form a chain, with your project usually at the end, which means that you do not have a free choice of components. You cannot simply take the latest kernel from https://www.kernel.org/, except in a few rare cases, because it does not have support for the chip or board that you are using. This is an ongoing problem with embedded development. Ideally, the developers at each link in the chain would push their changes upstream, but they don't. It is not uncommon to find a kernel which has many thousands of patches that are not merged. 
In addition, SoC vendors tend to actively develop open source components only for their latest chips, meaning that support for any chip more than a couple of years old will be frozen and not receive any updates. The consequence is that most embedded designs are based on old versions of software. They do not receive security fixes, performance enhancements, or features that are in newer versions. Problems such as Heartbleed (a bug in the OpenSSL libraries) and ShellShock (a bug in the bash shell) go unfixed. What can you do about it? First, ask questions of your vendors: what is their update policy, how often do they revise kernel versions, what is the current kernel version, what was the one before that, and what is their policy for merging changes upstream? Some vendors are making great strides in this way. You should prefer their chips. Secondly, you can take steps to make yourself more self-sufficient. The article explains the dependencies in more detail and shows you where you can help yourself. Don't just take the package offered to you by the SoC or board vendor and use it blindly without considering the alternatives. The four elements of embedded Linux Every project begins by obtaining, customizing, and deploying these four elements: the toolchain, the bootloader, the kernel, and the root filesystem: Toolchain: The compiler and other tools needed to create code for your target device. Everything else depends on the toolchain. Bootloader: The program that initializes the board and loads the Linux kernel. Kernel: This is the heart of the system, managing system resources and interfacing with hardware. Root filesystem: Contains the libraries and programs that are run once the kernel has completed its initialization. Of course, there is also a fifth element, not mentioned here. 
That is the collection of programs specific to your embedded application which make the device do whatever it is supposed to do, be it weigh groceries, display movies, control a robot, or fly a drone. Typically, you will be offered some or all of these elements as a package when you buy your SoC or board. But, for the reasons mentioned in the preceding paragraph, they may not be the best choices for you. Open source The components of embedded Linux are open source, so now is a good time to consider what that means, why open source works the way it does, and how this affects the often proprietary embedded device you will be creating from it. Licenses When talking about open source, the word free is often used. People new to the subject often take it to mean nothing to pay, and open source software licenses do indeed guarantee that you can use the software to develop and deploy systems for no charge. However, the more important meaning here is freedom, since you are free to obtain the source code, modify it in any way you see fit, and redeploy it in other systems. These licenses give you this right. Compare that with shareware licenses, which allow you to copy the binaries for no cost but do not give you the source code, or other licenses that allow you to use the software for free under certain circumstances, for example, for personal use but not commercial. These are not open source. I will provide the following comments in the interest of helping you understand the implications of working with open source licenses, but I would like to point out that I am an engineer and not a lawyer. What follows is my understanding of the licenses and the way they are interpreted. Open source licenses fall broadly into two categories: the copyleft licenses, such as the General Public License (GPL), and the permissive licenses, such as those from the Berkeley Software Distribution (BSD) and others. 
The permissive licenses say, in essence, that you may modify the source code and use it in systems of your own choosing so long as you do not modify the terms of the license in any way. In other words, with that one restriction, you can do with it what you want, including building it into possibly proprietary systems. The GPL licenses are similar, but have clauses which compel you to pass the rights to obtain and modify the software on to your end users. In other words, you share your source code. One option is to make it completely public by putting it onto a public server. Another is to offer it only to your end users by means of a written offer to provide the code when requested. The GPL goes further to say that you cannot incorporate GPL code into proprietary programs. Any attempt to do so would make the GPL apply to the whole. In other words, you cannot combine GPL and proprietary code in one program. So, what about libraries? If they are licensed with the GPL, any program linked with them becomes GPL also. However, most libraries are licensed under the Lesser General Public License (LGPL). If this is the case, you are allowed to link with them from a proprietary program. All of the preceding description relates specifically to GPL v2 and LGPL v2.1. I should mention the latest versions: GPL v3 and LGPL v3. These are controversial, and I will admit that I don't fully understand the implications. However, the intention is to ensure that the GPL v3 and LGPL v3 components in any system can be replaced by the end user, which is in the spirit of open source software for everyone. It does pose some problems though. Some Linux devices are used to gain access to information according to a subscription level or another restriction, and replacing critical parts of the software may compromise that. Set-top-boxes fit into this category. There are also issues with security. If the owner of a device has access to the system code, then so might an unwelcome intruder. 
Often the defense is to have kernel images that are signed by an authority, the vendor, so that unauthorized updates are not possible. Is that an infringement of my right to modify my device? Opinions differ. The TiVo set-top-box is an important part of this debate. It uses a Linux kernel, which is licensed under GPL v2. TiVo have released the source code of their version of the kernel and so comply with the license. TiVo also has a bootloader that will only load a kernel binary that is signed by them. Consequently, you can build a modified kernel for a TiVo box, but you cannot load it on the hardware. The Free Software Foundation (FSF) takes the position that this is not in the spirit of open source software and refers to this procedure as Tivoization. The GPL v3 and LGPL v3 were written to explicitly prevent this happening. Some projects, the Linux kernel in particular, have been reluctant to adopt the version three licenses because of the restrictions they would place on device manufacturers. Hardware for embedded Linux If you are designing or selecting hardware for an embedded Linux project, what do you look out for? Firstly, a CPU architecture that is supported by the kernel—unless you plan to add a new architecture yourself, of course! Looking at the source code for Linux 4.9, there are 31 architectures, each represented by a sub-directory in the arch/ directory. They are all 32- or 64-bit architectures, most with a memory management unit (MMU), but some without. The ones most often found in embedded devices are ARM, MIPS, PowerPC, and x86, each in 32- and 64-bit variants, and all of which have memory management units. A processor that doesn't have an MMU can run a subset of Linux known as microcontroller Linux, or uClinux. These processor architectures include ARC, Blackfin, MicroBlaze, and Nios. I will mention uClinux from time to time, but I will not go into detail because it is a rather specialized topic. Secondly, you will need a reasonable amount of RAM. 
16 MiB is a good minimum, although it is quite possible to run Linux using half that. It is even possible to run Linux with 4 MiB if you are prepared to go to the trouble of optimizing every part of the system. It may even be possible to get lower, but there comes a point at which it is no longer Linux. Thirdly, there is non-volatile storage, usually flash memory. 8 MiB is enough for a simple device such as a webcam or a simple router. As with RAM, you can create a workable Linux system with less storage if you really want to, but the lower you go, the harder it becomes. Linux has extensive support for flash storage devices, including raw NOR and NAND flash chips, and managed flash in the form of SD cards, eMMC chips, USB flash memory, and so on. Fourthly, a debug port is very useful, most commonly an RS-232 serial port. It does not have to be fitted on production boards, but makes board bring-up, debugging, and development much easier. Fifthly, you need some means of loading software when starting from scratch. A few years ago, boards would have been fitted with a Joint Test Action Group (JTAG) interface for this purpose, but modern SoCs have the ability to load boot code directly from removable media, especially SD and micro SD cards, or serial interfaces such as RS-232 or USB. In addition to these basics, there are interfaces to the specific bits of hardware your device needs to get its job done. Mainline Linux comes with open source drivers for many thousands of different devices, and there are drivers (of variable quality) from the SoC manufacturer and from the OEMs of third-party chips that may be included in the design, but remember my comments on the commitment and ability of some manufacturers. As a developer of embedded devices, you will find that you spend quite a lot of time evaluating and adapting third-party code, if you have it, or liaising with the manufacturer if you don't. 
Finally, you will have to write the device support for interfaces that are unique to the device, or find someone to do it for you. Hardware The worked examples are intended to be generic, but to make them relevant and easy to follow, I have had to choose specific hardware. I have chosen two exemplar devices: the BeagleBone Black and QEMU. The first is a widely-available and cheap development board which can be used in serious embedded hardware. The second is a machine emulator that can be used to create a range of systems that are typical of embedded hardware. It was tempting to use QEMU exclusively, but, like all emulations, it is not quite the same as the real thing. Using a BeagleBone Black, you have the satisfaction of interacting with real hardware and seeing real LEDs flash. I could have selected a board that is more up-to-date than the BeagleBone Black, which is several years old now, but I believe that its popularity gives it a degree of longevity and it means that it will continue to be available for some years yet. In any case, I encourage you to try out as many of the examples as you can, using either of these two platforms, or indeed any embedded hardware you may have to hand. The BeagleBone Black The BeagleBone and the later BeagleBone Black are open hardware designs for a small, credit card sized development board produced by CircuitCo LLC. The main repository of information is at https://beagleboard.org/. 
The main points of the specifications are:

TI AM335x 1 GHz ARM® Cortex-A8 Sitara SoC
512 MiB DDR3 RAM
2 or 4 GiB 8-bit eMMC on-board flash storage
Serial port for debug and development
MicroSD connector, which can be used as the boot device
Mini USB OTG client/host port that can also be used to power the board
Full size USB 2.0 host port
10/100 Ethernet port
HDMI for video and audio output

In addition, there are two 46-pin expansion headers for which there is a great variety of daughter boards, known as capes, which allow you to adapt the board to do many different things. However, you do not need to fit any capes in the examples. In addition to the board itself, you will need:

A mini USB to full-size USB cable (supplied with the board) to provide power, unless you have the last item on this list.
An RS-232 cable that can interface with the 6-pin 3.3V TTL level signals provided by the board. The BeagleBoard website has links to compatible cables.
A microSD card and a means of writing to it from your development PC or laptop, which will be needed to load software onto the board.
An Ethernet cable, as some of the examples require network connectivity.
Optional, but recommended, a 5V power supply capable of delivering 1 A or more.
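Writing a bootable image to the microSD card is commonly done with dd on the development PC. In the sketch below, /dev/sdX is a placeholder for your card's device node (check with lsblk first; dd overwrites whatever is on the target device), and the runnable part copies to an ordinary file instead, which exercises the same invocation safely:

```shell
# Identify the card first; /dev/sdX below is a placeholder for your device:
#   lsblk
# Then write the image (DESTRUCTIVE on the chosen device):
#   sudo dd if=bbb-image.img of=/dev/sdX bs=1M status=progress && sync

# The same invocation demonstrated safely against a plain file:
dd if=/dev/urandom of=bbb-image.img bs=1K count=64 2>/dev/null  # stand-in image
dd if=bbb-image.img of=card.img bs=1M 2>/dev/null               # the "write" step
cmp bbb-image.img card.img && echo "copy verified"
```

The trailing sync in the real command matters: it flushes buffered writes so the card can be removed safely once dd returns.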
Here is a concrete example:

$ qemu-system-arm -machine vexpress-a9 -m 256M \
  -drive file=rootfs.ext4,sd \
  -kernel zImage -dtb vexpress-v2p-ca9.dtb \
  -append "console=ttyAMA0,115200 root=/dev/mmcblk0" \
  -serial stdio \
  -net nic,model=lan9118 -net tap,ifname=tap0

The options used in the preceding command line are:
-machine vexpress-a9: Creates an emulation of an ARM Versatile Express development board with a Cortex A-9 processor
-m 256M: Populates it with 256 MiB of RAM
-drive file=rootfs.ext4,sd: Connects the SD interface to the local file rootfs.ext4 (which contains a filesystem image)
-kernel zImage: Loads the Linux kernel from the local file named zImage
-dtb vexpress-v2p-ca9.dtb: Loads the device tree from the local file vexpress-v2p-ca9.dtb
-append "...": Supplies this string as the kernel command line
-serial stdio: Connects the serial port to the terminal that launched QEMU, usually so that you can log on to the emulated machine via the serial console
-net nic,model=lan9118: Creates a network interface
-net tap,ifname=tap0: Connects the network interface to the virtual network interface tap0

To configure the host side of the network, you need the tunctl command from the User Mode Linux (UML) project; on Debian and Ubuntu, the package is named uml-utilities:

$ sudo tunctl -u $(whoami) -t tap0

This creates a network interface named tap0 which is connected to the network controller in the emulated QEMU machine. You configure tap0 in exactly the same way as any other interface. I will be using Versatile Express for most of my examples, but it should be easy to use a different machine or architecture. Software I have used only open source software, both for the development tools and the target operating system and applications. I assume that you will be using Linux on your development system.
I tested all the host commands using Ubuntu 14.04 and so there is a slight bias towards that particular version, but any modern Linux distribution is likely to work just fine. Summary Embedded hardware will continue to get more complex, following the trajectory set by Moore's Law. Linux has the power and the flexibility to make use of hardware in an efficient way. Linux is just one component of open source software out of the many that you need to create a working product. The fact that the code is freely available means that people and organizations at many different levels can contribute. However, the sheer variety of embedded platforms and the fast pace of development lead to isolated pools of software which are not shared as efficiently as they should be. In many cases, you will become dependent on this software, especially the Linux kernel that is provided by your SoC or Board vendor, and to a lesser extent, the toolchain. Some SoC manufacturers are getting better at pushing their changes upstream and the maintenance of these changes is getting easier. Fortunately, there are some powerful tools that can help you create and maintain the software for your device. For example, Buildroot is ideal for small systems and the Yocto Project for larger ones. Before I describe these build tools, I will describe the four elements of embedded Linux, which you can apply to all embedded Linux projects, however they are created. Resources for Article: Further resources on this subject: Programming with Linux [article] Embedded Linux and Its Elements [article] Revisiting Linux Network Basics [article]

Creating the First Python Script

Packt
09 Aug 2017
27 min read
In this article by Silas Toms, the author of the book ArcPy and ArcGIS - Second Edition, we will demonstrate how to use ModelBuilder, which ArcGIS professionals are already familiar with, to model their first analysis and then export it out as a script. With the Python environment configured to fit our needs, we can now create and execute ArcPy scripts. To ease into the creation of Python scripts, this article will use ArcGIS ModelBuilder to model a simple analysis, and export it as a Python script. ModelBuilder is very useful for creating Python scripts. It has an operational and a visual component, and all models can be outputted as Python scripts, where they can be further customized. In this article, we will cover the following topics: Modeling a simple analysis using ModelBuilder Exporting the model out to a Python script Windows file paths versus Pythonic file paths String formatting methods (For more resources related to this topic, see here.) Prerequisites The following are the prerequisites for this article: ArcGIS 10x and Python 2.7, with arcpy available as a module. For this article, the accompanying data and scripts should be downloaded from Packt Publishing's website. The completed scripts are available for comparison purposes, and the data will be used for this article's analysis. To run the code and test code examples, use your favorite IDE or open the IDLE (Python GUI) program from the Start Menu/ArcGIS/Python2.7 folder after installing ArcGIS for Desktop. Use the built-in "interpreter" or code entry interface, indicated by the triple chevron >>> and a blinking cursor. ModelBuilder ArcGIS has been in development since the 1970s. Since that time, it has included a variety of programming languages and tools to help GIS users automate analysis and map production.
These include the Avenue scripting language in the ArcGIS 3x series, and the ARC Macro Language (AML) in the ARCInfo Workstation days, as well as VBScript up until ArcGIS 10x, when Python was introduced. Another useful tool introduced in ArcGIS 9x was ModelBuilder, a visual programming environment used for both modeling analysis and creating tools that can be used repeatedly with different input feature classes. A useful feature of ModelBuilder is an export function, which allows modelers to create Python scripts directly from a model. This makes it easier to compare how parameters in a ModelBuilder tool are accepted as compared to how a Python script calls the same tool and supplies its parameters, and how generated feature classes are named and placed within the file structure. ModelBuilder is a helpful tool on its own, and its Python export functionality makes it easy for a GIS analyst to generate and customize ArcPy scripts. Creating a model and exporting to Python This article and the associated scripts depend on the downloadable SanFrancisco.gdb geodatabase, available from Packt. SanFrancisco.gdb contains data downloaded from https://datasf.org/ and the US Census' American Factfinder website at https://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml. All census and geographic data included in the geodatabase is from the 2010 census. The data is contained within a feature dataset called SanFrancisco. The data in this feature dataset is in NAD 83 California State Plane Zone 3, and the linear unit of measure is the US foot. This corresponds to SRID 2227 in the European Petroleum Survey Group (EPSG) format. The analysis, which we will create with the model and eventually export to Python for further refinement, will use bus stops along a specific line in San Francisco. These bus stops will be buffered to create a representative region around each bus stop.
The buffered areas will then be intersected with census blocks to find out how many people live within each representative region around the bus stops. Modeling the Select and Buffer tools Using ModelBuilder, we will model the basic bus stop analysis. Once it has been modeled, it will be exported as an automatically generated Python script. Follow these steps to begin the analysis: Open up ArcCatalog, and create a folder connection to the folder containing SanFrancisco.gdb. I have put the geodatabase in a C drive folder called "Projects" for a resulting file path of C:\Projects\SanFrancisco.gdb. Right-click on the geodatabase, and add a new toolbox called Chapter2Tools. Right-click on the geodatabase; select New, and then Feature Dataset, from the menu. A dialogue will appear that asks for a name; call it Chapter2Results, and push Next. It will ask for a spatial reference system; enter 2227 into the search bar, and push the magnifying glass icon. This will locate the correct spatial reference system: NAD 1983 StatePlane California III FIPS 0403 Feet. Don't select a vertical reference system, as we are not doing any Z value analysis. Push Next, select the default tolerances, and push Finish. Next, open ModelBuilder using the ModelBuilder icon or by right-clicking on the Toolbox, and create a new Model. Save the model in the Chapter2Tools toolbox as Chapter2Model1. Drag in the Bus_Stops feature class and the Select tool from the Analysis/Extract toolset in ArcToolbox. Open up the Select tool, and name the output feature class Inbound71. Make sure that the feature class is written to the Chapter2Results feature dataset. Open up the Expression SQL Query Builder, and create the following SQL expression: NAME = '71 IB' AND BUS_SIGNAG = 'Ferry Plaza'. The next step is to add a Buffer Tool from the Analysis/Proximity toolset. The Buffer tool will be used to create buffers around each bus stop.
The buffered bus stops allow us to intersect with census data in the form of census blocks, creating the representative regions around each bus stop. Connect the output of the Select tool (Inbound71) to the Buffer tool. Open up the Buffer tool, add 400 to the Distance field, and change the units to Feet. Leave the rest of the options blank. Click on OK, and return to the model: Adding in the Intersect tool Now that we have selected the bus line of interest, and buffered the stops to create representative regions, we will need to intersect the regions with the census blocks to find the population of each representative region. This can be done as follows: First, add the CensusBlocks2010 feature class from the SanFrancisco feature dataset to the model. Next, add in the Intersect tool located in the Analysis/Overlay toolset in the ArcToolbox. While we could use a Spatial Join to achieve a similar result, I have used the Intersect tool to capture the area of intersection for use later in the model and script. At this point, our model should look like this: Tallying the analysis results After we have created this simple analysis, the next step is to determine the results for each bus stop. Finding the number of people that live in census blocks touched by the 400-foot buffer of each bus stop involves examining each row of data in the final feature class, and selecting rows that correspond to the bus stop. Once these are selected, a sum of the selected rows would be calculated either using the Field Calculator or the Summarize tool. All of these methods will work, and yet none are perfect. They take too long, and worse, are not repeatable automatically if an assumption in the model is adjusted (if the buffer is adjusted from 400 feet to 500 feet, for instance). This is where the traditional uses of ModelBuilder begin to fail analysts.
It should be easy to instruct the model to select all rows associated with each bus stop, and then generate a summed population figure for each bus stop's representative region. It would be even better to have the model create a spreadsheet to contain the final results of the analysis. It's time to use Python to take this analysis to the next level. Exporting the model and adjusting the script While modeling analysis in ModelBuilder has its drawbacks, there is one fantastic option built into ModelBuilder: the ability to create a model, and then export the model to Python. Along with the ArcGIS Help Documentation, it is the best way to discover the correct Python syntax to use when writing ArcPy scripts. Create a folder that can hold the exported scripts next to the SanFrancisco geodatabase (for example, C:\Projects\Scripts). This will hold both the exported scripts that ArcGIS automatically generates, and the versions that we will build from those generated scripts. Now, perform the following steps: Open up the model called Chapter2Model1. Click on the Model menu in the upper-left side of the screen. Select Export from the menu. Select To Python Script. Save the script as Chapter2Model1.py. Note that there is also the option to export the model as a graphic. Creating a graphic of the model is a good way to share what the model is doing with other analysts without the need to share the model and the data, and can also be useful when sharing Python scripts as well. The Automatically generated script Open the automatically generated script in an IDE.
It should look like this:

# -*- coding: utf-8 -*-
# ---------------------------------------------------------------------------
# Chapter2Model1.py
# Created on: 2017-01-26 04:26:31.00000
#   (generated by ArcGIS/ModelBuilder)
# Description:
# ---------------------------------------------------------------------------

# Import arcpy module
import arcpy

# Local variables:
Bus_Stops = "C:\\Projects\\SanFrancisco.gdb\\SanFrancisco\\Bus_Stops"
Inbound71 = "C:\\Projects\\SanFrancisco.gdb\\Chapter2Results\\Inbound71"
Inbound71_400ft_buffer = "C:\\Projects\\SanFrancisco.gdb\\Chapter2Results\\Inbound71_400ft_Buffer"
CensusBlocks2010 = "C:\\Projects\\SanFrancisco.gdb\\SanFrancisco\\CensusBlocks2010"
Intersect71Census = "C:\\Projects\\SanFrancisco.gdb\\Chapter2Results\\Intersect71Census"

# Process: Select
arcpy.Select_analysis(Bus_Stops, Inbound71, "NAME = '71 IB' AND BUS_SIGNAG = 'Ferry Plaza'")

# Process: Buffer
arcpy.Buffer_analysis(Inbound71, Inbound71_400ft_buffer, "400 Feet", "FULL", "ROUND", "NONE", "")

# Process: Intersect
arcpy.Intersect_analysis("C:\\Projects\\SanFrancisco.gdb\\Chapter2Results\\Inbound71_400ft_Buffer #;C:\\Projects\\SanFrancisco.gdb\\SanFrancisco\\CensusBlocks2010 #", Intersect71Census, "ALL", "", "INPUT")

Let's examine this script line by line. The first line is preceded by a pound sign ("#"), which again means that this line is a comment; however, unlike an ordinary comment, it is not simply ignored by the Python interpreter: it is used to help Python interpret the encoding of the script, as described here: http://legacy.python.org/dev/peps/pep-0263. The second commented line and the third line are included for decorative purposes. The next four lines, all commented, are used for providing readers information about the script: what it is called and when it was created, along with a description, which is pulled from the model's properties. Another decorative line is included to visually separate out the informative header from the body of the script.
While the commented information section is nice to include in a script for other users of the script, it is not necessary. The body of the script, or the executable portion of the script, starts with the import arcpy line. Import statements are, by convention, included at the top of the body of the script. In this instance, the only module that is being imported is ArcPy. ModelBuilder's export function creates not only an executable script, but also comments each section to help mark the different sections of the script. The comments let the user know where the variables are located, and where the ArcToolbox tools are being executed. After the import statements come the variables. In this case, the variables represent the file paths to the input and output feature classes. The variable names are derived from the names of the feature classes (the base names of the file paths). The file paths are assigned to the variables using the assignment operator ("="), and the parts of the file paths are separated by two backslashes. File paths in Python To store and retrieve data, it is important to understand how file paths are used in Python as compared to how they are represented in Windows. In Python, file paths are strings, and strings in Python have special characters used to represent tabs ("\t"), newlines ("\n"), or carriage returns ("\r"), among many others. These special characters all incorporate single backslashes, making it very hard to create a file path that uses single backslashes. File paths in Windows Explorer all use single backslashes. Windows Explorer: C:\Projects\SanFrancisco.gdb\Chapter2Results\Intersect71Census Python was developed within the Linux environment, where file paths have forward slashes. There are a number of methods used to avoid this issue. The first is using file paths with forward slashes.
The Python interpreter will understand file paths with forward slashes, as seen in this code: Python version: "C:/Projects/SanFrancisco.gdb/Chapter2Results/Intersect71Census" Within a Python script, the Python file path with the forward slashes will definitely work, while the Windows Explorer version might cause the script to throw an exception, as Python strings can have special characters like the newline character "\n", or the tab "\t", that will cause the string file path to be read incorrectly by the Python interpreter. Another method used to avoid the issue with special characters is the one employed by ModelBuilder when it automatically creates the Python scripts from a model. In this case, the backslashes are "escaped" using a second backslash. The preceding script uses this second method to produce the following results: Python escaped version: "C:\\Projects\\SanFrancisco.gdb\\Chapter2Results\\Intersect71Census" The third method, which I use when copying file paths from ArcCatalog or Windows Explorer into scripts, is to create what is known as a "raw" string. This is the same as a regular string, but it includes an "r" before the string begins. This "r" alerts the Python interpreter that the following string does not contain any special characters or escape characters. Here is an example of how it is used: Python raw string: r"C:\Projects\SanFrancisco.gdb\SanFrancisco\Bus_Stops" Using raw strings makes it easier to grab a file path from Windows Explorer, and add it to a string inside a script. It also makes it easier to avoid accidentally forgetting to include a set of double backslashes in a file path, which happens all the time and is the cause of many script bugs. String manipulation There are three major methods for inserting variables into strings. Each has different advantages and disadvantages of a technical nature. It's good to know about all three, as they have uses beyond our needs here, so let's review them.
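Before moving on to those methods, the path-spelling rules described above can be verified in a short, standalone sketch. The path used here is made up for illustration; it is not the article's geodatabase:

```python
# "\t" inside an ordinary string literal becomes a tab character,
# silently corrupting a path such as C:\temp
bad = "C:\temp"    # 6 characters: 'C', ':', TAB, 'e', 'm', 'p'
raw = r"C:\temp"   # 7 characters: the backslash survives

# Three safe spellings of a Windows path (hypothetical path):
forward = "C:/Projects/Data"     # forward slashes
escaped = "C:\\Projects\\Data"   # escaped backslashes
raw_str = r"C:\Projects\Data"    # raw string

# The escaped and raw spellings produce the identical string in memory
same = (escaped == raw_str)
```

Printing `bad` would show a literal tab in the middle of the path, which is exactly the kind of bug the escaped and raw forms avoid.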
String manipulation method 1: string addition String addition seems like an odd concept at first, as it would not seem possible to "add" strings together, unlike integers or floats, which are numbers. However, within Python and other programming languages, this is a normal step. Using the plus sign ("+"), strings are "added" together to make longer strings, or to allow variables to be added into the middle of existing strings. Here are some examples of this process:

>>> aString = "This is a string"
>>> bString = " and this is another string"
>>> cString = aString + bString
>>> cString

The output is as follows:

'This is a string and this is another string'

Two or more strings can be "added" together, and the result can be assigned to a third variable for use later in the script. This process can be useful for data processing and formatting. Another similar offshoot of string addition is string multiplication, where strings are multiplied by an integer to produce repeating versions of the string, like this:

>>> "string" * 3
'stringstringstring'

String manipulation method 2: string formatting #1 The second method of string manipulation, known as string formatting, involves adding placeholders into the string, which accept specific kinds of data. This means that these special strings can accept other strings as well as integers and float values. These placeholders use the modulo "%" and a key letter to indicate the type of data to expect. Strings are represented using %s, floats using %f, and integers using %d. The floats can also be adjusted to limit the digits included by adding a modifying number after the modulo. If there is more than one placeholder in a string, the values are passed to the string in a tuple. This method has become less popular, since the third method discussed next was introduced in Python 2.6, but it is still valuable to know, as many older scripts use it.
Here is an example of this method:

>>> origString = "This string has as a placeholder %s"
>>> newString = origString % "and this text was added"
>>> print newString

The output is as follows:

This string has as a placeholder and this text was added

Here is an example when using a float placeholder:

>>> floatString1 = "This string has a float here: %f"
>>> newString = floatString1 % 1.0
>>> print newString

The output is as follows:

This string has a float here: 1.000000

Here is another example when using a float placeholder:

>>> floatString2 = "This string has a float here: %.1f"
>>> newString2 = floatString2 % 1.0
>>> print newString2

The output is as follows:

This string has a float here: 1.0

Here is an example using an integer placeholder:

>>> intString = "Here is an integer: %d"
>>> newString = intString % 1
>>> print newString

The output is as follows:

Here is an integer: 1

String manipulation method 3: string formatting #2 The final method is known as string formatting. It is similar to the first string formatting method, with the added benefit of not requiring a specific data type of placeholder. The placeholders, or tokens as they are also known, only require that their values are supplied in order. The format function is built into strings; by adding .format to the string, and passing in parameters, the string accepts the values, as seen in the following example:

>>> formatString = "This string has 3 tokens: {0}, {1}, {2}"
>>> newString = formatString.format("String", 2.5, 4)
>>> print newString
This string has 3 tokens: String, 2.5, 4

The tokens don't have to be in order within the string, and can even be repeated by adding a token wherever it is needed within the template. The order of the values applied to the template is derived from the parameters supplied to the .format function, which passes the values to the string.
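The out-of-order and repeated tokens described above, and the type flexibility of .format() compared with the modulo placeholders, can be sketched as follows (standalone examples, not from the original article):

```python
# Tokens are matched to .format() arguments by index, so they can
# appear out of order and can repeat within the template
template = "{1} then {0}, and {0} again"
result = template.format("A", "B")    # "B then A, and A again"

# .format() accepts any type for any token, whereas a %d placeholder
# raises a TypeError if it is handed a string instead of a number
mixed = "{0}, {1}, {2}".format("String", 2.5, 4)
try:
    "%d" % "not a number"
    modulo_raised = False
except TypeError:
    modulo_raised = True
```

Because the tokens are resolved by index rather than by declared type, refactoring a template rarely requires touching the values passed to .format().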
The third method has become my go-to method for string manipulation because of the ability to add the values repeatedly, and because it makes it possible to avoid supplying the wrong type of data to a specific placeholder, unlike the second method. The ArcPy tools After the import statements and the variable definitions, the next section of the script is where the analysis is executed. The same tools that we created in the model--the Select, Buffer, and Intersect tools--are included in this section. The same parameters that we supplied in the model are also included here: the inputs and outputs, plus the SQL statement in the Select tool, and the buffer distance in the Buffer tool. The tool parameters are supplied to the tools in the script in the same order as they appear in the tool interfaces in the model. Here is the Select tool in the script:

arcpy.Select_analysis(Bus_Stops, Inbound71, "NAME = '71 IB' AND BUS_SIGNAG = 'Ferry Plaza'")

It works like this: the arcpy module has a "method", or tool, called Select_analysis. This method, when called, requires three parameters: the input feature class (or shapefile), the output feature class, and the SQL statement. In this example, the input is represented by the variable Bus_Stops, and the output feature class is represented by the variable Inbound71, both of which are defined in the variable section. The SQL statement is included as the third parameter. Note that it could also be represented by a variable if the variable was defined prior to this line; the SQL statement, as a string, could be assigned to a variable, and the variable could replace the SQL statement as the third parameter.
Here is an example of parameter replacement using a variable:

sqlStatement = "NAME = '71 IB' AND BUS_SIGNAG = 'Ferry Plaza'"
arcpy.Select_analysis(Bus_Stops, Inbound71, sqlStatement)

While ModelBuilder is good for assigning input and output feature classes to variables, it does not assign variables to every portion of the parameters. This will be an important thing to correct when we adjust and build our own scripts. The Buffer tool accepts a similar set of parameters as the Select tool. There is an input feature class represented by a variable, an output feature class variable, and the distance that we provided (400 feet in this case), along with a series of parameters that were supplied by default. Note that the parameters rely on keywords, and these keywords can be adjusted within the text of the script to adjust the resulting buffer output. For instance, "Feet" could be adjusted to "Meters", and the buffer would be much larger. Check the help section of the tool to understand better how the other parameters will affect the buffer, and to find the keyword arguments that are accepted by the Buffer tool in ArcPy. Also, as noted earlier, all of the parameters could be assigned to variables, which can save time if the same parameters are used repeatedly throughout a script. Sometimes, the supplied parameter is merely an empty string, as in this case here with the last parameter:

arcpy.Buffer_analysis(Inbound71, Inbound71_400ft_buffer, "400 Feet", "FULL", "ROUND", "NONE", "")

The empty string for the last parameter, which, in this case, signifies that there is no dissolve field for this buffer, is found quite frequently within ArcPy. It could also be represented by two single quotes, but ModelBuilder has been built to use double quotes to encase strings. The Intersect tool The last tool, the Intersect tool, uses a different method to represent the files that need to be intersected together when the tool is executed.
Because the tool accepts multiple files in the input section (meaning, there is no limit to the number of files that can be intersected together in one operation), it stores all of the file paths within one string. This string can be manipulated using one of the string manipulation methods discussed earlier, or it can be reorganized as a Python list that contains the file paths (or variables representing file paths), passed as the first parameter in any order. The Intersect tool will find the intersection of all of the inputs. Adjusting the script Now is the time to take the automatically generated script, and adjust it to fit our needs. We want the script to both produce the output data, and to have it analyze the data and tally the results into a spreadsheet. This spreadsheet will hold an averaged population value for each bus stop. The average will be derived from each census block that the buffered representative region surrounding the stops intersected. Save the original script as "Chapter2Model1Modified.py". Adding the CSV module to the script For this script, we will use the csv module, a useful module for creating Comma-Separated Value spreadsheets. Its simple syntax will make it a useful tool for creating script outputs. ArcGIS for Desktop also installs the xlrd and xlwt modules, used to read or generate Excel spreadsheets respectively, when it is installed. These modules are also great for data analysis output. After the import arcpy line, add import csv. This will allow us to use the csv module for creating the spreadsheet.

# Import arcpy module
import arcpy
import csv

The next adjustment is made to the Intersect tool. Notice that the two paths included in the input string are also defined as variables in the variable section.
Remove the file paths from the input strings, and replace them with a list containing the variable names of the input datasets, as follows:

# Process: Intersect
arcpy.Intersect_analysis([Inbound71_400ft_buffer, CensusBlocks2010], Intersect71Census, "ALL", "", "INPUT")

Accessing the data: using a cursor Now that the script is in place to generate the raw data we need, we need a way to access the data held in the output feature class from the Intersect tool. This access will allow us to aggregate the rows of data representing each bus stop. We also need a data container to hold the aggregated data in memory before it is written to the spreadsheet. To accomplish the second part, we will use a Python dictionary. To accomplish the first part, we will use a method built into the ArcPy module: the Data Access SearchCursor. The Python dictionary will be added after the Intersect tool. A dictionary in Python is created using curly brackets {}. Add the following line to the script, below the analysis section:

dataDictionary = {}

This script will use the bus stop IDs as keys for the dictionary. The values will be lists, which will hold all of the population values associated with each busStopID. Add the following lines to generate a Search Cursor:

with arcpy.da.SearchCursor(Intersect71Census, ["STOPID", "POP10"]) as cursor:
    for row in cursor:
        busStopID = row[0]
        pop10 = row[1]
        if busStopID not in dataDictionary.keys():
            dataDictionary[busStopID] = [pop10]
        else:
            dataDictionary[busStopID].append(pop10)

This iteration combines a few ideas in Python and ArcPy. The with...as statement is used to create a variable (cursor), which represents the arcpy.da.SearchCursor object. It could also be written like this:

cursor = arcpy.da.SearchCursor(Intersect71Census, ["STOPID", "POP10"])

The advantage of the with...as structure is that the cursor object is erased from memory when the iteration is completed, which eliminates locks on the feature classes being evaluated.
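The resource-release behavior of with...as can be illustrated without ArcPy by a minimal, hypothetical context manager; nothing here is part of the arcpy API:

```python
# A tiny context manager that records when it has been released,
# mimicking how the SearchCursor drops its lock when the block ends
class TrackedResource(object):
    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.closed = True   # runs automatically on leaving the block
        return False         # do not suppress exceptions

with TrackedResource() as resource:
    open_inside = resource.closed   # still False inside the block

open_after = resource.closed        # True: __exit__ ran on the way out
```

The __exit__ method runs even if the block raises an exception, which is why the with...as form is safer than creating the cursor directly and hoping it gets cleaned up.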
The arcpy.da.SearchCursor function requires an input feature class, and a list of fields to be returned. Optionally, an SQL statement can limit the number of rows returned. The next line, for row in cursor:, is the iteration through the data. It is not a normal Pythonic iteration, a distinction that will have ramifications in certain instances. For instance, one cannot pass index parameters to the cursor object to only evaluate specific rows within the cursor object, as one can do with a list.  When using a Search Cursor, each row of data is returned as a tuple, which cannot be modified. The data can be accessed using indexes. The if...else condition allows the data to be sorted. As noted earlier, the bus stop ID, which is the first member of the data included in the tuple, will be used as a key. The conditional evaluates if the bus stop ID is included in the dictionary's existing keys (which are contained in a list, and accessed using the dictionary.keys() method). If it is not, it is added to the keys, and assigned a value that is a list that contains (at first) one piece of data, the population value contained in that row. If it does exist in the keys, the list is appended with the next population value associated with that bus stop ID. With this code, we have now sorted each census block population according to the bus stop with which it is associated. Next we need to add code to create the spreadsheet. This code will use the same with...as structure, and will generate an average population value by using two built-in Python functions: sum, which creates a sum from a list of numbers, and len, which will get the length of a list, tuple, or string. 
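Because the grouping and averaging logic is plain Python, it can be rehearsed with invented (busStopID, POP10) tuples standing in for the cursor rows. The stop IDs and populations below are made up, and the sketch uses Python 3 division, which returns a float; the article's Python 2 script would truncate the same division to an integer unless one operand is converted with float():

```python
import csv
import io

# Made-up rows standing in for the SearchCursor output
rows = [(1001, 100), (1001, 200), (1002, 50), (1002, 150), (1002, 100)]

# Group the population values by bus stop ID, as in the script
dataDictionary = {}
for busStopID, pop10 in rows:
    if busStopID not in dataDictionary.keys():
        dataDictionary[busStopID] = [pop10]
    else:
        dataDictionary[busStopID].append(pop10)

# Average each stop's list with sum() and len(), writing one CSV row
# per stop to an in-memory buffer instead of a file on disk
output = io.StringIO()
csvwriter = csv.writer(output)
for busStopID in sorted(dataDictionary.keys()):
    popList = dataDictionary[busStopID]
    averagePop = sum(popList) / len(popList)
    csvwriter.writerow([busStopID, averagePop])
```

Swapping the invented rows for the real cursor output, and the StringIO buffer for a file opened with open(), gives exactly the structure used in the script that follows.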
with open(r'C:\Projects\Averages.csv', 'wb') as csvfile:
    csvwriter = csv.writer(csvfile, delimiter=',')
    for busStopID in dataDictionary.keys():
        popList = dataDictionary[busStopID]
        averagePop = sum(popList) / len(popList)
        data = [busStopID, averagePop]
        csvwriter.writerow(data)

The population list is retrieved from the dictionary using the busStopID key, and the average is then computed and assigned to the variable averagePop. The two pieces of data, the busStopID and the averagePop variable, are then added to a list. This list is supplied to a csvwriter object, which knows how to accept the data and write it out to a file located at the file path supplied to the built-in Python function open, used to create simple files. The script is complete, although it is nice to add one more line to the end to give us visual confirmation that the script has run:

print "Data Analysis Complete"

This last line will create an output indicating that the script has run. Once it is done, go to the location of the output CSV file and open it using Excel or Notepad, and see the results of the analysis. Our first script is complete! Exceptions and tracebacks During the process of writing and testing scripts, there will be errors that cause the code to break and throw exceptions. In Python, these are reported as a "traceback", which shows the last few lines of code executed before an exception occurred. To best understand the message, read it from the last line up. It will tell you the type of exception that occurred, and above that will be the code that failed, with a line number that should allow you to find and fix the code. It's not perfect, but it works. Overwriting files One common issue is that ArcGIS for Desktop does not allow you to overwrite files without turning on an environment variable. To avoid this issue, you can add a line after the import statements that will make overwriting files possible. Be aware that the original data will be unrecoverable once it is overwritten.
It uses the env module to access the ArcGIS environment:

import arcpy
arcpy.env.overwriteOutput = True

The final script

Here is how the script should look in the end:

# Chapter2Model1Modified.py
# Import arcpy module
import arcpy
import csv

# Local variables:
Bus_Stops = r"C:\Projects\SanFrancisco.gdb\SanFrancisco\Bus_Stops"
CensusBlocks2010 = r"C:\Projects\SanFrancisco.gdb\SanFrancisco\CensusBlocks2010"
Inbound71 = r"C:\Projects\SanFrancisco.gdb\Chapter2Results\Inbound71"
Inbound71_400ft_buffer = r"C:\Projects\SanFrancisco.gdb\Chapter2Results\Inbound71_400ft_buffer"
Intersect71Census = r"C:\Projects\SanFrancisco.gdb\Chapter2Results\Intersect71Census"

# Process: Select
arcpy.Select_analysis(Bus_Stops, Inbound71, "NAME = '71 IB' AND BUS_SIGNAG = 'Ferry Plaza'")

# Process: Buffer
arcpy.Buffer_analysis(Inbound71, Inbound71_400ft_buffer, "400 Feet", "FULL", "ROUND", "NONE", "")

# Process: Intersect
arcpy.Intersect_analysis([Inbound71_400ft_buffer, CensusBlocks2010], Intersect71Census, "ALL", "", "INPUT")

dataDictionary = {}
with arcpy.da.SearchCursor(Intersect71Census, ["STOPID", "POP10"]) as cursor:
    for row in cursor:
        busStopID = row[0]
        pop10 = row[1]
        if busStopID not in dataDictionary.keys():
            dataDictionary[busStopID] = [pop10]
        else:
            dataDictionary[busStopID].append(pop10)

with open(r'C:\Projects\Averages.csv', 'wb') as csvfile:
    csvwriter = csv.writer(csvfile, delimiter=',')
    for busStopID in dataDictionary.keys():
        popList = dataDictionary[busStopID]
        averagePop = sum(popList)/len(popList)
        data = [busStopID, averagePop]
        csvwriter.writerow(data)

print "Data Analysis Complete"

Summary

In this article, you learned how to craft a model of an analysis and export it out to a script. In particular, you learned how to use ModelBuilder to create an analysis and export it out as a script, and how to adjust the script to be more "Pythonic". After explaining the auto-generated script, we adjusted it to include a results analysis and summation, which was output to a CSV file.
We also briefly touched on the use of Search Cursors. Also, we saw how built-in modules such as the csv module can be used along with ArcPy to capture analysis output in formatted spreadsheets.

Resources for Article: Further resources on this subject: Using the ArcPy Data Access Module with Feature Classes and Tables [article] Measuring Geographic Distributions with ArcGIS Tool [article] Learning to Create and Edit Data in ArcGIS [article]

Packt
09 Aug 2017
3 min read

Games and Exercises

In this article by Shishira Bhat and Ravi Wray, authors of the book Learn Java in 7 days, we will study the following concepts:

Making an object the return type for a method
Making an object the parameter for a method

(For more resources related to this topic, see here.)

Let's start this article by revisiting the reference variables and custom data types:

In the preceding program, p is a variable of datatype Pen. Yes! Pen is a class, but it is also a datatype, a custom datatype. The p variable stores the address of the Pen object, which is in heap memory. The p variable is a reference that refers to a Pen object. Now, let's get more comfortable by understanding and working with examples.

How to return an Object from a method?

In this section, let's understand return types. In the following code, methods return the inbuilt data types (int and String), and the reason is explained after each method, as follows:

int add ()
{
    int res = (20+50);
    return res;
}

The add method returns the res (70) variable, which is of the int type. Hence, the return type must be int:

String sendsms ()
{
    String msg = "hello";
    return msg;
}

The sendsms method returns a variable by the name of msg, which is of the String type. Hence, the return type is String. The data type of the returning value and the return type must be the same. In the following code snippet, the return type of the givePen method is not an inbuilt data type. However, the return type is a class (Pen). Let's understand the following code:

The givePen () method returns a variable (reference variable) by the name of p, which is of the Pen type. Hence, the return type is Pen:

In the preceding program, tk is a variable of the Ticket type. The method returns tk; hence, the return type of the method is Ticket.

A method accepting an object (parameter)

After seeing how a method can return an object/reference, let's understand how a method can take an object/reference as the input, that is, a parameter.
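Both ideas, returning an object from a method and accepting one as a parameter, can be sketched together in a single self-contained example. The Pen, Shop, and Kid classes below are minimal stand-ins for the classes pictured in the book's figures:

```java
class Pen {
    String color;

    Pen(String color) {
        this.color = color;
    }
}

class Shop {
    // The return type is Pen, so the method must return a Pen reference.
    static Pen givePen() {
        Pen p = new Pen("blue");
        return p;
    }
}

class Kid {
    // The parameter type is Pen, so the argument must be a Pen object.
    void write(Pen p) {
        System.out.println("Writing with a " + p.color + " pen");
    }
}

public class Main {
    public static void main(String[] args) {
        Pen p = Shop.givePen();          // an object returned from a method
        new Kid().write(p);              // an object passed as an argument
        new Kid().write(new Pen("red")); // a new object created at the call site
    }
}
```

Running main prints "Writing with a blue pen" and then "Writing with a red pen", showing both directions of object flow at once.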
We already understood that if a method takes parameter(s), then we need to pass argument(s).

Example

In the preceding program, the method takes two parameters, i and k. So, while calling/invoking the method, we need to pass two arguments, which are 20.5 and 15. The parameter type and the argument type must be the same. Remember that when a class is the datatype, then an object is the data. Consider the following example with respect to a non-primitive/class data type and the object as its data:

In the preceding code, the Kid class has the eat method, which takes ch as a parameter of the Chocolate type, that is, the data type of ch is Chocolate, which is a class. When a class is the data type, then an object of that class is the actual data or argument. Hence, new Chocolate() is passed as an argument to the eat method. Let's see one more example:

The drink method takes wtr as a parameter of the type Water, which is a class/non-primitive type; hence, the argument must be an object of the Water class.

Summary

In this article we learned what to return when a class is the return type for a method, and what to pass as an argument for a method when a class is a parameter for the method.

Resources for Article: Further resources on this subject: Saying Hello to Java EE [article] Getting Started with Sorting Algorithms in Java [article] Debugging Java Programs using JDB [article]

Packt
08 Aug 2017
13 min read

Developing Games Using AI

In this article by Micael DaGraça, the author of the book Practical Game AI Programming, we will cover the following points to introduce you to AI programming for games:

A brief history of and solutions to game AI
Enemy AI in video games
From simple to smart and human-like AI
Visual and Audio Awareness

A brief history of and solutions to game AI

To better understand how to overcome the problems that game developers currently face, we need to dig a little into the history of video game development and look at the problems and solutions that were so important at the time. Some of them were so avant-garde that they actually changed the history of video game design itself, and we still use the same methods today to create unique and enjoyable games.

One of the first milestones always worth mentioning when talking about game AI is the chess computer programmed to compete against humans. Chess was the perfect game to start experimenting with artificial intelligence, because it requires a lot of thought and planning ahead, something a computer couldn't do at the time because human features seemed necessary to successfully play and win the game. So the first step was to make the computer able to process the game rules and think for itself, in order to make a good judgment about the next move it should make to reach the final goal: winning by checkmate. The problem is that chess has many possibilities, so even if the computer had a perfect strategy to beat the game, it was necessary to recalculate that strategy, adapting it, changing it, or even creating a new one every time something went wrong with the first strategy. Humans can play differently every time, which makes it a huge task for programmers to input all the possibility data into the computer to win the game.
So writing out every possibility that could exist wasn't a viable solution, and because of that programmers needed to think about the problem again. Eventually they came across a better solution: making the computer decide for itself every turn, choosing the most plausible option for each turn; that way the computer could adapt to any possibility of the game. Yet this involved another problem: the computer would only think in the short term, not creating any plans to defeat the human in future moves, so it was easy to play against it, but at least we started to have something going on. It would take decades until someone defined the term "Artificial Intelligence" by solving the first problem many researchers had faced when trying to create a computer capable of defeating a human player. Arthur Samuel is the person responsible for creating a computer that could learn for itself and memorize all the possible combinations. That way, no human intervention was necessary, and the computer could actually think on its own; that was a huge step, one that is still impressive even by today's standards.

Enemy AI in video games

Now let's move to the video game industry and analyze how the first enemies and game obstacles were programmed. Was it that different from what we are doing now? Let's find out. Single-player games with AI enemies started to appear in the 70's, and soon some games started to elevate the quality and expectations of what defines a video game AI. Some of those examples were released for arcade machines, like Speed Race from Taito (a racing video game) and Qwak (duck hunting using a light gun) or Pursuit (aircraft fighter), both from Atari. Other notable examples are the text-based games released for the first personal computers, like Hunt the Wumpus and Star Trek, which also had enemies.
What made those games so enjoyable was precisely the AI enemies, which didn't react like any before them because they mixed random elements with the traditional stored patterns, making them unpredictable and a unique experience every time you played the game. That was only possible due to the incorporation of microprocessors, which expanded the capabilities of a programmer at that time. Space Invaders brought movement patterns, Galaxian improved on them and added more variety, making the AI even more complex, and Pac-Man later brought movement patterns to the maze genre. The influence of the AI design in Pac-Man is just as significant as the influence of the game itself. This classic arcade game makes the player believe that the enemies in the game are chasing him, but not in a crude manner. The ghosts chase (or evade) the player in different ways, as if they had individual personalities. This gives people the illusion that they are actually playing against four or five individual ghosts rather than copies of the same computer enemy. After that, Karate Champ introduced the first AI fighting character, Dragon Quest introduced a tactical system for the RPG genre, and over the years the list of games that explored artificial intelligence and used it to create unique game concepts kept expanding. All of that came from a single question: how can we make a computer capable of beating a human at a game? All the games mentioned above belong to different genres and are unique in their style, but all of them used the same method for their AI, called the Finite State Machine. Here the programmer inputs all the behaviors necessary for the computer to challenge the player, just like the first computer that played chess.
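A Finite State Machine of the kind described here can be sketched in a few lines. The states and transition rules below are invented for illustration, loosely inspired by ghost-like enemy behavior, and are not taken from any real game:

```python
# A minimal Finite State Machine for a ghost-like enemy.
# States and transition rules are illustrative, not from any real game.
TRANSITIONS = {
    ("patrol", "player_seen"):   "chase",
    ("chase", "player_lost"):    "patrol",
    ("chase", "power_pellet"):   "flee",
    ("flee",  "pellet_expired"): "chase",
}

def step(state, event):
    """Return the next state, or stay in the current state if no rule matches."""
    return TRANSITIONS.get((state, event), state)

state = "patrol"
for event in ["player_seen", "power_pellet", "pellet_expired", "player_lost"]:
    state = step(state, event)
    print(event, "->", state)
```

The whole behavior lives in the transition table: the programmer decides in advance exactly how the character reacts to each event, which is both the strength and the limitation of the approach discussed in this article.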
The programmer defines exactly how the computer should behave on different occasions in order to move, avoid, attack, or perform any other behavior that challenges the player, and that method is used even in the latest big-budget games of today.

From simple to smart and human-like AI

Programmers face many challenges while developing an AI character, but one of the greatest is adapting the AI's movement and behavior in relation to what the player is currently doing, or will do in future actions. The difficulty exists because the AI is programmed with predetermined states, using probability or possibility maps in order to adapt its movement and behavior according to the player. This technique can become very complex if the programmer extends the possibilities of the AI's decisions, just like the chess machine that holds all the possible situations that may occur in the game. It's a huge task for the programmer, because it's necessary to determine what the player can do and how the AI will react to each action of the player, and that takes a lot of CPU power. To overcome that challenge, programmers started to mix possibility maps with probabilities and other techniques that let the AI decide for itself how it should react according to the player's actions. These factors are important to consider while developing an AI that elevates the game's quality, as we are about to discover. Games kept evolving and players got even more demanding, not only about visual quality but also about the capabilities of AI enemies and allied characters. To deliver new games that took player expectations into consideration, programmers started to write even more states for each character: creating new possibilities, more engaging enemies, important allied characters, more things for the player to do, and a lot more features that helped redefine different genres and create new ones.
Of course, this was also possible because the technology kept improving, allowing developers to explore artificial intelligence in video games even further. A great example worth mentioning is Metal Gear Solid. The game brought a new genre to the video game industry by implementing stealth elements instead of the popular straightforward action and shooting, but those elements couldn't be fully explored as Hideo Kojima intended because of the hardware limitations at the time. Jumping forward from the 3rd to the 5th generation of consoles, Konami and Hideo Kojima presented the same title, but this time with a lot more interactions, possibilities, and behaviors from the AI elements of the game, making it so successful and important in video game history that it's easy to see its influence in a large number of games that came after Metal Gear Solid.

Metal Gear Solid - Sony PlayStation 1

Visual and Audio Awareness

The game in the above screenshot implemented visual and audio awareness for the enemy. This feature established the genre that we know today as the stealth game. The game uses pathfinding and Finite State Machines, features that date from the beginning of the video game industry, but in order to create something new the developers also added features such as interaction with the environment, navigation behavior, visual/audio awareness, and AI interaction. These were things that didn't exist at the time but that are widely used today, even in different game genres such as sports, racing, fighting, or FPS games. After that huge step for game design, developers still faced other problems; or should I say, these new possibilities brought even more problems, because the result was not perfect. The AI still didn't react like a real person, and many other elements needed to be implemented, not only in stealth games but in all other genres, and one genre in particular needed to improve its AI to make the game feel realistic.
We are talking about sports games, especially those that tried to simulate real-world team behaviors, such as basketball or football. If we think about it, the interaction with the player is not the only thing we need to care about; we left chess behind a long time ago, where it was 1 vs 1. Now we want more, and, watching other games get realistic AI behaviors, sports fanatics started to ask for those same features in their favorite games. After all, those games were based on real-world events, and for that reason the AI should react as realistically as possible. At this point, developers and game designers started to take into consideration the AI's interaction with itself, and just like the enemies in Pac-Man, the player should get the impression that each character in the game thinks for itself and reacts differently from the others. If we analyze it closely, the AI present in a sports game is structured much like that of an FPS or RTS game, using different animation states, general movements, interactions, individual decisions, and finally tactical and collective decisions. So it shouldn't be a surprise that sports games could reach the same level of realism as the other genres that greatly evolved in terms of AI development. However, there were a few problems that only sports games had at the time, chief among them how to make so many characters on the same screen react differently while working together to achieve the same objective. With this problem in mind, developers started to improve the individual behaviors of each character, not only for the AI playing against the player, but also for the AI playing alongside the player. Once again, Finite State Machines formed a crucial part of the artificial intelligence, but the special touch that helped create a realistic approach in the sports genre was the anticipation and awareness used in stealth games.
The computer needed to calculate what the player was doing and where the ball was going, and coordinate all of that while giving the impression of a team mindset working towards the same plan. Combining the new features used in the stealth genre with a vast number of characters on the same screen, it was possible to innovate the sports genre by creating the sports-simulation type of game that has gained so much popularity over the years. This helps us understand that we can use almost the same methods for any type of game, even if the games look completely different; the core principles that we saw in the computer that played chess are still valuable in a sports game released 30 years later. Let's move on to our last example, which also has great value in terms of how an AI character should behave to feel more realistic: the game F.E.A.R., developed by Monolith Productions. What made this game so special in terms of artificial intelligence was the dialogue between the enemy characters. While it wasn't an improvement from a technical point of view, it was definitely something that helped showcase all of the development work that was put into the characters' AI, and this is crucial because, if the AI doesn't say it, it didn't happen. This is an important factor to take into consideration while creating a realistic AI character: giving the illusion that it's real, the impression that the computer reacts like a human; humans interact, so the AI should do the same. Not only does the dialogue help to create a human-like atmosphere, it also helps to surface all of the development put into the character that the player otherwise wouldn't notice was there. When the AI detects the player for the first time, it shouts that it found him; when the AI loses sight of the player, it also expresses that.
When a squad of AIs is trying to find the player or ambush him, they talk about it, leaving the player imagining that the enemy is really capable of thinking and planning against him. Why is this so important? Because if the characters were only numbers and mathematical equations, they would react that way, without any human features, just math. To make an AI look more human, it's necessary to put mistakes, errors, and dialogue into the character's AI, just to distract the player from the fact that he's playing against a machine. The history of video game artificial intelligence is still far from perfect, and it's possible that it will take decades to improve even a little on what we have achieved from the early 50's until the present day, so don't be afraid of exploring what you are about to learn. Combine, change, or delete some of the things to find different results, because great games did it in the past and had a lot of success with it.

Summary

In this article we learned about the impact of AI in video game history: how everything started from a simple idea to have a computer compete against humans in traditional games, and how that naturally evolved into the world of video games. We also learned about the challenges and difficulties that have been present since day one, and how, coincidentally, programmers kept facing and still face the same problems.

Packt
21 Jul 2017
8 min read

Creating and Calling Subroutines

In this article by James Kent Lewis, author of the book Linux Shell Scripting Bootcamp, we will learn how to create and call subroutines in a script. The topics covered in this article are as follows:

Show subroutines that take parameters
Mention return codes again and how they work in scripts
How to use subroutines

First, let's start with a selection of simple but powerful scripts. These are mainly shown to give the reader an idea of just what can be done quickly with a script.

(For more resources related to this topic, see here.)

Clearing the screen

The tput clear terminal command can be used to clear the current command-line session. You could type tput clear all the time, but wouldn't just cls be nicer? Here's a simple script that clears the current screen:

Chapter 4 - Script 1

#!/bin/sh
#
# 5/9/2017
#
tput clear

Notice that this was so simple that I didn't even bother to include a Usage message or return code. Remember, to make this a command on your system, do this:

cd $HOME/bin
Create/edit a file named cls
Copy and paste the preceding code into this file
Save the file
Run chmod 755 cls

You can now type cls from any terminal (under that user) and your screen will clear. Try it.

File redirection

At this point, we need to go over file redirection. This is the ability to have the output from a command or script be copied into a file instead of going to the screen. This is done using the redirection operator, which is really just the greater-than sign. These commands were run on my system and here is the output:

Command piping

Now, let's look at command piping, which is the ability to run a command and have its output serve as the input to another command. Suppose a program or script named loop1 is running on your system and you want to know its PID. You could run the ps auxw command to a file, and then grep the file for loop1. Alternatively, you could do it in one step using a pipe as follows:

Pretty cool, right?
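The pipe just described can be sketched in one line; loop1 is the hypothetical program name from the text, and the second pipeline is a stand-in you can run on any system:

```shell
# Find the PID of a process named loop1 in one step, no temporary file.
# (grep -v grep filters out the grep process itself.)
ps auxw | grep loop1 | grep -v grep | awk '{print $2}'

# The same idea with commands that work anywhere:
# split a line into words, then count them (prints 3).
echo "one two three" | tr ' ' '\n' | wc -l
```

Each `|` hands the previous command's standard output to the next command's standard input, which is exactly the behavior the redirection operator provides for files.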
This is a very powerful feature in a Linux system and is used extensively. We will be seeing a lot more of this soon. The next section shows another very short script using some command piping. This clears the screen and then shows only the first 10 lines from dmesg.

Chapter 4 - Script 2

#!/bin/sh
#
# 5/9/2017
#
tput clear
dmesg | head

The next section shows file redirection.

Chapter 4 - Script 3

#!/bin/sh
#
# 5/9/2017
#
FN=/tmp/dmesg.txt
dmesg > $FN
echo "File $FN created."

Try it on your system! This shows how easy it is to create a script to perform commands that you would normally type on the command line. Also, notice the use of the FN variable. If you want to use a different filename later, you only have to make the change in one place.

Subroutines

Now, let's really get into subroutines. To do this, we will use more of the tput commands:

tput cup <row> <col>        # puts the cursor at row, col
tput cup 0 0                # puts the cursor at the upper left hand side of the terminal
tput cup $LINES $COLUMNS    # puts the cursor at the bottom right hand side
tput clear                  # clears the terminal screen
tput smso                   # bolds the text that follows
tput rmso                   # un-bolds the text that follows

The script is shown in the next section. This was mainly written to show the concept of subroutines; however, it can also be used as a guide on writing interactive tools.
Chapter 4 - Script 4

#!/bin/sh
#
# 5/9/2017
#
echo "script4 - Linux Scripting Book"

# Subroutines
cls()
{
 tput clear
 return 0
}

home()
{
 tput cup 0 0
 return 0
}

end()
{
 let x=$COLUMNS-1
 tput cup $LINES $x
 echo -n "X"          # no newline or else will scroll
}

bold()
{
 tput smso
}

unbold()
{
 tput rmso
}

underline()
{
 tput smul
}

normalline()
{
 tput rmul
}

# Code starts here
rc=0                  # return code
if [ $# -ne 1 ] ; then
 echo "Usage: script4 parameter"
 echo "Where parameter can be: "
 echo " home      - put an X at the home position"
 echo " cls       - clear the terminal screen"
 echo " end       - put an X at the last screen position"
 echo " bold      - bold the following output"
 echo " underline - underline the following output"
 exit 255
fi

parm=$1               # main parameter 1

if [ "$parm" = "home" ] ; then
 echo "Calling subroutine home."
 home
 echo -n "X"
elif [ "$parm" = "cls" ] ; then
 cls
elif [ "$parm" = "end" ] ; then
 echo "Calling subroutine end."
 end
elif [ "$parm" = "bold" ] ; then
 echo "Calling subroutine bold."
 bold
 echo "After calling subroutine bold."
 unbold
 echo "After calling subroutine unbold."
elif [ "$parm" = "underline" ] ; then
 echo "Calling subroutine underline."
 underline
 echo "After subroutine underline."
 normalline
 echo "After subroutine normalline."
else
 echo "Unknown parameter: $parm"
 rc=1
fi
exit $rc

The following is the output:

Try this on your system. If you run it with the home parameter, it might look a little strange to you. The code puts a capital X at the home position (0,0), and this causes the prompt to print one character over. Nothing is wrong here; it just looks a little weird. Don't worry if this still doesn't make sense to you, just go ahead and look at Script 5.

Using parameters

Okay, let's add some routines to this script to show how to use parameters with a subroutine. In order to make the output look better, the cls routine is called first to clear the screen, which is shown in the next section.
Chapter 4 - Script 5

#!/bin/sh
#
# 5/9/2017
#
echo "script5 - Linux Scripting Book"

# Subroutines
cls()
{
 tput clear
 return 0
}

home()
{
 tput cup 0 0
 return 0
}

end()
{
 let x=$COLUMNS-1
 tput cup $LINES $x
 echo -n "X"          # no newline or else will scroll
}

bold()
{
 tput smso
}

unbold()
{
 tput rmso
}

underline()
{
 tput smul
}

normalline()
{
 tput rmul
}

move()                # move cursor to row, col
{
 tput cup $1 $2
}

movestr()             # move cursor to row, col
{
 tput cup $1 $2
 echo $3
}

# Code starts here
cls                   # clear the screen to make the output look better
rc=0                  # return code
if [ $# -ne 1 ] ; then
 echo "Usage: script5 parameter"
 echo "Where parameter can be: "
 echo " home      - put an X at the home position"
 echo " cls       - clear the terminal screen"
 echo " end       - put an X at the last screen position"
 echo " bold      - bold the following output"
 echo " underline - underline the following output"
 echo " move      - move cursor to row,col"
 echo " movestr   - move cursor to row,col and output string"
 exit 255
fi

parm=$1               # main parameter 1

if [ "$parm" = "home" ] ; then
 home
 echo -n "X"
elif [ "$parm" = "cls" ] ; then
 cls
elif [ "$parm" = "end" ] ; then
 move 0 0
 echo "Calling subroutine end."
 end
elif [ "$parm" = "bold" ] ; then
 echo "Calling subroutine bold."
 bold
 echo "After calling subroutine bold."
 unbold
 echo "After calling subroutine unbold."
elif [ "$parm" = "underline" ] ; then
 echo "Calling subroutine underline."
 underline
 echo "After subroutine underline."
 normalline
 echo "After subroutine normalline."
elif [ "$parm" = "move" ] ; then
 move 10 20
 echo "This line started at row 10 col 20"
elif [ "$parm" = "movestr" ] ; then
 movestr 30 40 "This line started at 30 40"
else
 echo "Unknown parameter: $parm"
 rc=1
fi
exit $rc

Since this script only has two extra functions, you can just run them. This will be shown one command at a time as follows:

guest1 $ script5
guest1 $ script5 move
guest1 $ script5 movestr

Since we are now placing the cursor at a specific location, the output should make more sense to you.
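The way move and movestr receive $1, $2, and $3 can be tried without tput at all. The function below is a made-up example that copies its positional parameters into named variables before using them:

```shell
# A subroutine that takes two parameters.
# Copying $1 and $2 into named variables keeps them from
# getting mixed up with the main script's own $1 and $2.
greet()
{
 name=$1              # parameter 1
 times=$2             # parameter 2
 i=0
 while [ $i -lt $times ] ; do
  echo "Hello, $name"
  i=$((i+1))
 done
 return 0
}

greet World 2         # the arguments become $1 and $2 inside the function
echo "greet returned $?"
```

Running this prints "Hello, World" twice and then "greet returned 0", showing that return codes work inside subroutines just as they do for whole scripts.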
Notice how the command-line prompt reappears where the last cursor position was. You probably noticed that the parameters to a subroutine work just like those of a script. Parameter 1 is $1, parameter 2 is $2, and so on. This is good and bad: good because you don't have to learn anything radically different, but bad in that it is very easy to get the $1, $2, and other variables mixed up if you are not careful. A possible solution, and the one I use, is to assign the $1, $2, and so on variables in the main script to variables with good, meaningful names. For example, in these example scripts, I set parm1 equal to $1 (parm1=$1), and so on.

Summary

We started with some very simple scripts and then proceeded to show some simple subroutines. We then showed some subroutines that take parameters. Return codes were mentioned again to show how they work in subroutines. We included several scripts to show the concepts, and also included a special bonus script at no extra charge.

Resources for Article: Further resources on this subject: Linux Shell Scripting [article] Linux Shell Scripting – various recipes to help you [article] GLSL 4.0: Using Subroutines to Select Shader Functionality [article]
Packt
21 Jul 2017
13 min read

Windows Drive Acquisition

In this article, by Oleg Skulkin and Scar de Courcier, authors of Windows Forensics Cookbook, we will cover drive acquisition in E01 format with FTK Imager, drive acquisition in RAW format with DC3DD, and mounting forensic images with Arsenal Image Mounter.

(For more resources related to this topic, see here.)

Before you can begin analysing evidence from a source, it first needs to be imaged. This describes a forensic process in which an exact copy of a drive is taken. This is an important step, especially if evidence needs to be taken to court, because forensic investigators must be able to demonstrate that they have not altered the evidence in any way.

The term forensic image can refer to either a physical or a logical image. Physical images are precise replicas of the drive they reference, whereas a logical image is a copy of a certain volume within that drive. In general, logical images show what the machine's user will have seen and dealt with, whereas physical images give a more comprehensive overview of how the device works at a lower level.

A hash value is generated to verify the authenticity of the acquired image. Hash values are essentially cryptographic digital fingerprints which show whether a particular item is an exact copy of another. Altering even the smallest bit of data will generate a completely new hash value, thus demonstrating that the two items are not the same. When a forensic investigator images a drive, they should generate a hash value for both the original drive and the acquired image. Some pieces of forensic software will do this for you.

There are a number of tools available for imaging hard drives, some of which are free and open source. However, the most popular way for forensic analysts to image hard drives is by using one of the more well-known forensic software vendors' solutions.
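The hash-comparison idea can be illustrated with a few lines of Python using the standard library's hashlib module; the "drive contents" below are made-up byte strings, not real evidence:

```python
import hashlib

def md5_of(data):
    """Return the MD5 hex digest of a bytes object, as imaging tools do per drive/image."""
    return hashlib.md5(data).hexdigest()

original = b"contents of the suspect drive"
image = b"contents of the suspect drive"      # a faithful bit-for-bit copy
tampered = b"contents of the suspect drivE"   # a single changed byte

print(md5_of(original) == md5_of(image))      # True: image verified
print(md5_of(original) == md5_of(tampered))   # False: evidence was altered
```

Even a one-byte change produces a completely different digest, which is exactly the property verification tools like FTK Imager rely on when they report whether the source and image hashes matched.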
This is because it is imperative to be able to explain how the image was acquired and to demonstrate its integrity, especially if you are working on a case that will be taken to court. Once you have your image, you will then be able to analyse the digital evidence from a device without directly interfering with the device itself. In this chapter, we will be looking at various tools that can help you to image a Windows drive, and taking you through the process of acquisition.

Drive acquisition in E01 format with FTK Imager

FTK Imager is an imaging and data preview tool by AccessData, which allows an examiner not only to create forensic images in different formats, including RAW, SMART, E01, and AFF, but also to preview data sources in a forensically sound manner. In the first recipe of this article, we will show you how to create a forensic image of a hard drive from a Windows system in E01 format. E01, or EnCase's Evidence File, is a standard format for forensic images in law enforcement. Such images consist of a header with case info, including acquisition date and time, examiner's name, acquisition notes, and an optional password; a bit-by-bit copy of the acquired drive, made up of data blocks, each verified with its own CRC (Cyclic Redundancy Check); and a footer with an MD5 hash for the bitstream.

Getting ready

First of all, let's download FTK Imager from the AccessData website. To do it, go to the SOLUTIONS tab, and then to Product Downloads. Now choose DIGITAL FORENSICS, and then FTK Imager. At the time of this writing, the most up-to-date version is 3.4.3, so click the green DOWNLOAD PAGE button on the right. OK, now you should be at the download page. Click on the DOWNLOAD NOW button and fill in the form; after this you'll get the download link at the email address you provided. The installation process is quite straightforward; all you need to do is click Next a few times, so we won't cover it in the recipe.

How to do it...
There are two ways of initiating the drive imaging process: Using the Create Disk Image button from the Toolbar, as shown in the following figure: Create Disk Image button on the Toolbar Using the Create Disk Image... option from the File menu, as shown in the following figure: Create Disk Image... option in the File Menu You can choose either option. The first window you see is Select Source. Here you have five options: Physical Drive: This allows you to choose a physical drive as the source, with all partitions and unallocated space Logical Drive: This allows you to choose a logical drive as the source, for example, the E: drive Image File: This allows you to choose an image file as the source, for example, if you need to convert your forensic image from one format to another Contents of a Folder: This allows you to choose a folder as the source; note that no deleted files or unallocated space will be included Fernico Device: This allows you to restore images from multiple CDs/DVDs Of course, we want to image the whole drive to be able to work with deleted data and unallocated space, so: Let's choose the Physical Drive option. The evidence source mustn't be altered in any way, so make sure you are using a hardware write blocker; you can use one from Tableau, for example. These devices allow acquisition of drive contents without creating the possibility of modifying the data.  FTK Imager Select Source window Click Next and you'll see the next window, Select Drive. Now you should choose the source drive from the drop-down menu; in our case it's \\.\PHYSICALDRIVE2. FTK Imager Select Drive window Once the source drive is chosen, click Finish. The next window is Create Image. We'll get back to this window soon, but for now, just click Add... It's time to choose the destination image type. As we decided to create our image in EnCase's Evidence File format, let's choose E01. FTK Imager Select Image Type window Click Next and you'll see the Evidence Item Information window. 
Here we have five fields to fill in: Case Number, Evidence Number, Unique Description, Examiner and Notes. All fields are optional. FTK Imager Evidence Item Information window Whether you fill in the fields or not, click Next. Now choose the image destination. You can use the Browse button for this. You should also fill in the image filename. If you want your forensic image to be split, choose a fragment size (in megabytes). The E01 format supports compression, so if you want to reduce the image size, you can use this feature; as you can see in the following figure, we have chosen compression level 6. And if you want the data in the image to be secured, use the AD Encryption feature. AD Encryption is whole image encryption, so not only is the raw data encrypted, but also any metadata. Each segment or file of the image is encrypted with a randomly generated image key using AES-256. FTK Imager Select Image Destination window We are almost done. Click Finish and you'll see the Create Image window again. Now, look at the three options at the bottom of the window. The verification process is very important, so make sure the Verify images after they are created option is ticked; it helps you to be sure that the source and the image are identical. The Precalculate Progress Statistics option is also very useful: it will show you the estimated time remaining during the imaging process. The last option will create directory listings of all files in the image for you, but of course, this takes time, so use it only if you need it.  FTK Imager Create Image window All you need to do now is click Start. Great, the imaging process has started! When the image is created, the verification process starts. Finally, you'll get the Drive/Image Verify Results window, like the one in the following figure: FTK Imager Drive/Image Verify Results window As you can see, in our case the source and the image are identical: both hashes matched. 
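The chunked hashing that underpins this verification step is easy to reproduce independently of any particular tool. The sketch below is a minimal illustration, not part of FTK Imager; the image path in the usage comment is hypothetical. Note that for a RAW image the file hash equals the drive hash, whereas hashing an E01 file checks only the container segment's integrity, since the drive hash is stored inside the format:

```python
import hashlib

def image_digests(path, chunk_size=1024 * 1024):
    """Compute MD5 and SHA1 of a (possibly very large) image file in
    fixed-size chunks, so the whole image never has to fit in memory."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

# Hypothetical usage against an acquired RAW image:
#   md5_hex, sha1_hex = image_digests(r"X:\147-2017.dd")
```

Comparing the values computed this way against those recorded at acquisition time demonstrates that the copy has not been altered.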
In the folder with the image, you will also find an info file with valuable information such as drive model, serial number, source data size, sector count, MD5 and SHA1 checksums, and so on. How it works... FTK Imager uses the physical drive of your choice as the source and creates a bit-by-bit image of it in EnCase's Evidence File format. During the verification process, the MD5 and SHA1 hashes of the image and the source are compared. See more FTK Imager download page: http://accessdata.com/product-download/digital-forensics/ftk-imager-version-3.4.3 FTK Imager User Guide: https://ad-pdf.s3.amazonaws.com/Imager/3_4_3/FTKImager_UG.pdf Drive acquisition in RAW format with DC3DD DC3DD is a patched (by Jesse Kornblum) version of the classic GNU DD utility with some computer forensics features, for example, on-the-fly hashing with a number of algorithms, such as MD5, SHA-1, SHA-256, and SHA-512, and showing the progress of the acquisition process. Getting ready You can find a compiled standalone 64-bit version of DC3DD for Windows at SourceForge. Just download the ZIP or 7z archive, unpack it, and you are ready to go. How to do it... Open the Windows Command Prompt and change directory (you can use the cd command to do it) to the one with dc3dd.exe, and type the following command: dc3dd.exe if=\\.\PHYSICALDRIVE2 of=X:\147-2017.dd hash=sha256 log=X:\147-2017.log Press Enter and the acquisition process will start. Of course, your command will be a bit different, so let's find out what each part of it means: if: It stands for input file; DD is originally a Linux utility, and in Linux everything is a file. As you can see in our command, we put physical drive 2 here (this is the drive we wanted to image, but in your case it can be another drive, depending on the number of drives connected to your workstation). 
of: It stands for output file; here you should type the destination of your image which, as you remember, is in RAW format. In our case it's the X: drive and the 147-2017.dd file. hash: As already mentioned, DC3DD supports four hashing algorithms: MD5, SHA-1, SHA-256, and SHA-512. We chose SHA-256, but you can choose the one you like. log: Here you should type the destination for the logs; you will find the image version, image hash, and so on in this file after the acquisition is completed. How it works... DC3DD creates a bit-by-bit image of the source drive in RAW format, so the size of the image will be the same as the source, and calculates the image hash using the algorithm of the examiner's choice, in our case SHA-256. See also DC3DD download page: https://sourceforge.net/projects/dc3dd/files/dc3dd/7.2%20-%20Windows/ Mounting forensic images with Arsenal Image Mounter Arsenal Image Mounter is an open source tool developed by Arsenal Recon. It can help a digital forensic examiner to mount a forensic image or virtual machine disk in Windows. It supports both E01 (and Ex01) and RAW forensic images, so you can use it with any of the images we created in the previous recipes. It's very important to note that Arsenal Image Mounter mounts the contents of disk images as complete disks. The tool supports all the file systems you can find on Windows drives: NTFS, ReFS, FAT32 and exFAT. Also, it has temporary write support for images, which is a very useful feature if, for example, you want to boot the system from the image you are examining. Getting ready Go to the Arsenal Image Mounter web page at the Arsenal Recon website and click on the Download button to download the ZIP archive. At the time of this writing the latest version of the tool is 2.0.010, so in our case the archive has the name Arsenal_Image_Mounter_v2.0.010.0_x64.zip. Extract it to a location of your choice and you are ready to go; no installation is needed. How to do it... 
There are two ways to choose an image for mounting in Arsenal Image Mounter: You can use the File menu and choose Mount image. Use the Mount image button as shown in the following figure:  Arsenal Image Mounter main window When you choose the Mount image option from the File menu or click on the Mount image button, an Open window will pop up; here you should choose an image for mounting. The next window you will see is Mount options, like the one in the following figure:  Arsenal Image Mounter Mount options window As you can see, there are a few options here: Read only: If you choose this option, the image is mounted in read-only mode, so no write operations are allowed (do you still remember that you mustn't alter the evidence in any way? Of course, it's already an image, not the original drive, but nevertheless). Fake disk signatures: If an all-zero disk signature is found on the image, Arsenal Image Mounter reports a random disk signature to Windows, so it's mounted properly. Write temporary: If you choose this option, the image is mounted in read-write mode, but all modifications are written not to the original image file, but to a temporary differential file. Write original: Again, this option mounts the image in read-write mode, but this time the original image file will be modified. Sector size: This option allows you to choose the sector size. Create "removable" disk device: This option emulates the attachment of a USB thumb drive.   Choose the options you think you need and click OK. We decided to mount our image as read only. Now you can see a hard drive icon on the main window of the tool: the image is mounted. If you mounted only one image and want to unmount it, select the image and click on the Remove selected button. If you have a few mounted images and want to unmount all of them, click on the Remove all button. How it works... Arsenal Image Mounter mounts forensic images or virtual machine disks as complete disks in read-only or read-write mode. 
Later, a digital forensics examiner can access their contents even with Windows Explorer. See also Arsenal Image Mounter page at Arsenal Recon website: https://arsenalrecon.com/apps/image-mounter/ Summary In this article, the authors have explained the process and importance of drive acquisition using imaging tools such as FTK Imager and DC3DD, which are available alongside well-known forensic software vendors' solutions. Drive acquisition, being the first step in the analysis of digital evidence, needs to be carried out with the utmost care, which in turn will make the analysis process smooth. Resources for Article: Further resources on this subject: Forensics Recovery [article] Digital and Mobile Forensics [article] Mobile Forensics and Its Challanges [article]
Packt
21 Jul 2017
16 min read

Getting Started with Android Things

In this article, we will cover the following topics: Internet of things overview Internet of Things components Android Things overview Things support library Android Things board compatibility How to install Android Things on Raspberry Pi Creating the first Android Things project (For more resources related to this topic, see here.) Internet of things overview Internet of things, or briefly IoT, is one of the most promising trends in technology. According to many analysts, Internet of things could be the most disruptive technology of the upcoming decade. It will have a huge impact on our lives and it promises to modify our habits. IoT is, and will be in the future, a pervasive technology that will span its effects across many sectors: Industry Healthcare Transportation Manufacturing Agriculture Retail Smart cities All these areas will benefit from using IoT. Before diving into IoT projects, it is important to know what Internet of things means. There are several definitions of Internet of things, addressing different aspects and considering different areas of application. Anyway, it is important to underline that IoT is much more than a network of smartphones, tablets, and PCs connected to each other. Briefly, IoT is an ecosystem where objects are interconnected and, at the same time, connect to the internet. Internet of things includes every object that can potentially connect to the internet and exchange data and information. These objects are always connected, anytime, anywhere, and they exchange data. The concept of connected objects is not new and it has been developed over the years. The level of circuit miniaturization and the increasing power of CPUs with lower consumption make it possible to imagine a future where there are millions of "things" that talk to each other. The first time that Internet of things was officially recognized was in 2005. 
The International Telecommunication Union (ITU), in a report titled The Internet of Things (https://www.itu.int/osg/spu/publications/internetofthings/InternetofThings_summary.pdf), gave the first definition: "A new dimension has been added to the world of information and communication technologies (ICTs): from anytime, any place connectivity for anyone, we will now have connectivity for anything…. Connections will multiply and create an entirely new dynamic network of networks – an Internet of Things" In other words, IoT is a network of smart objects (or things) that can receive and send data and that we can control remotely. Internet of Things components There are several elements that contribute to creating the IoT ecosystem, and it is important to know clearly the role they play in order to have a clear picture of IoT. This will be useful to better understand the projects we will build using Android Things. The basic building block of IoT is a smart object. It is a device that connects to the internet and is capable of exchanging data. It can be a simple sensor that measures a quantity such as pressure or temperature, or a complex system. Extending this concept, our oven, our coffee machine, and our washing machine are examples of smart objects once they connect to the internet. All of these smart objects contribute to developing the Internet of things network. Anyway, household appliances are not the only examples of smart objects: cars, buildings, actuators, and so on can be included in the list. We can reference these objects, once connected, using a unique identifier and start talking to them. At the low level, these devices exchange data using a network layer. The most important and best-known protocols at the base of Internet of things are: Wi-Fi Bluetooth Zigbee Cellular Network NB-IoT LoRa From an application point of view, there are several application protocols widely used in the internet of things. 
Some protocols derive from different contexts (such as the web), while others are IoT-specific. To name a few, we can mention: HTTP MQTT CoAP AMQP REST XMPP STOMP By now, these could be just names or empty boxes, but throughout this article we will explore how to use these protocols with Android Things. Prototyping boards play an important role in Internet of things and they help to grow the number of connected objects. Using prototyping boards, we can experiment with IoT projects, and in this article we will explore how to build and test IoT projects using boards compatible with Android Things. As you may already know, there are several prototyping boards available on the market, each one having specific features. Just to name a few of them, we can list: Arduino (in different flavors) Raspberry Pi (in different flavors) Intel Edison ESP8266 NXP We will focus our attention on Raspberry Pi 3 and Intel Edison because Android Things officially supports these two boards. We will also use other development boards so that you can understand how to integrate them. Android Things overview Android Things is the new operating system developed by Google to build IoT projects. It helps you to develop professional applications using trusted platforms and Android. Yes, Android, because Android Things is a modified version of Android and we can reuse our Android knowledge to implement smart Internet of things projects. This OS has great potential because Android developers can smoothly move to IoT and start developing and building projects in a few days. Before diving into Android Things, it is important to have an overview. Android Things OS has the layer structure shown in the following diagram: This structure is slightly different from Android OS because it is much more compact, so that apps for Android Things have fewer layers beneath them and are closer to drivers and peripherals than normal Android apps. 
Even if Android Things derives from Android, some APIs available in Android are not supported in Android Things. We will now briefly describe the similarities and the differences. Let us start with content providers, widely used in Android but not present in the Android Things SDK; therefore, we should pay attention to this when we develop an Android Things app. To have more information about the content providers that are not supported, please refer to the official Android Things website https://developer.android.com/things/sdk/index.html. Moreover, like a normal Android app, an Android Things app can have a User Interface (UI), even if this is optional and depends on the type of application we are developing. A user can interact with the UI to trigger events, as happens in an Android app. From this point of view, as we will see later, the process of developing a UI is the same as that used in Android. This is an interesting feature because we can develop an IoT UI easily and quickly, reusing our Android knowledge. It is worth noting that Android Things fits perfectly into the Google services ecosystem. Almost all cloud services implemented by Google are available in Android Things, with some exceptions. Android Things does not support Google services strictly connected to the mobile world and those that require user input or authentication. Do not forget that the user interface for an Android Things app is optional. To have a detailed list of Google services available in Android Things, refer to the official page (https://developer.android.com/things/sdk/index.html). An important aspect of Android is permission management. An Android app runs in a sandbox with limited access to resources. When an app needs to access a specific resource outside the sandbox, it has to request the permission. In an Android app, this happens in the Manifest.xml file. This is still true in Android Things, and all the permissions requested by the app are granted at installation time. 
Android 6 (API level 23) introduced a new way to request a permission. An app can request a permission not only at installation time (using the Manifest.xml file), but at runtime too. Android Things does not support this new feature, so we have to request all the permissions in the Manifest file. The last thing to notice concerns notifications. As we will see later, the Android Things UI does not support the notification status bar, so we cannot trigger notifications from our Android Things apps. To make things simpler, you should remember that all the services related to the user interface, or that require a user interface to accomplish their task, are not guaranteed to work in Android Things. Things support library The Things support library is the new library developed by Google to handle the communication with peripherals and drivers. This is a completely new library, not present in the Android SDK, and it is one of the most important features of Android Things. It exposes a set of Java interfaces and classes (APIs) that we can use to connect and exchange data with external devices such as sensors, actuators, and so on. This library hides the inner communication details, supporting several industry standard protocols such as: GPIO I2C PWM SPI UART Moreover, this library exposes a set of APIs to create and register new device drivers called user drivers. These drivers are custom components deployed with the Android Things app that extend the Android Things framework. In other words, they are custom libraries that enable an app to communicate with other device types not supported by Android Things natively. This article will guide you, step by step, through building real-life projects using Android Things. You will explore the new Android Things APIs and how to use them. In the next sections, you will learn how to install Android Things on Raspberry Pi 3 and Intel Edison. Android Things board compatibility Android Things is a new operating system specifically built for IoT. 
At the time of writing, Android Things supported four different development boards: Raspberry Pi 3 Model B Intel Edison NXP Pico i.MX6UL Intel Joule 570x In the near future, more boards will be added to the list. Google has already announced that it will support this new board: NXP Argon i.MX6UL This article will focus on using the first two boards: Raspberry Pi 3 and Intel Edison. Anyway, you can develop for any of the compatible boards. This is the power of Android Things: it abstracts the underlying hardware, providing a common way to interact with peripherals and devices. The paradigm that made Java famous, Write Once and Run Anywhere (WORA), applies to Android Things too. This is a winning feature of Android Things because we can develop an Android Things app without worrying about the underlying board. Anyway, when we develop an IoT app there are some minor aspects we should consider so that our app will be portable to other compatible boards. How to install Android Things on Raspberry Pi Raspberry Pi 3 is the latest board developed by the Raspberry Pi Foundation. It is an upgrade of the Raspberry Pi 2 Model B and, compared to its predecessor, it has some great features: Quad-core ARMv8 CPU at 1.2GHz Wireless LAN 802.11n Bluetooth 4.0 The screenshot below shows a Raspberry Pi 3 Model B: In this section, you will learn how to install Android Things on Raspberry Pi 3 using a Windows PC or Mac OS X. Before starting the installation process you must have: A Raspberry Pi 3 Model B An SD card of at least 8GB A USB cable to connect the Raspberry to your PC An HDMI cable to connect the Raspberry to a TV/monitor (optional) If you do not have an HDMI cable you can use a screen mirroring tool. This is useful to see the result of the installation process and when we develop the Android Things UIs. The installation steps differ depending on whether you are using Windows, OS X or Linux. 
How to install Android Things using Windows To begin with, we will cover how to install Android Things on Raspberry Pi 3 using a Windows PC: Download the Android Things image from this link: https://developer.android.com/things/preview/download.html. Select the right image; in this case, you have to choose the Raspberry Pi image. Accept the license and wait until the download is completed. Once the download is complete, extract the zip file. To install the image on the SD card, there is a great application called Win32 Disk Imager that works perfectly. It is free and you can download it from SourceForge using this link: https://sourceforge.net/projects/win32diskimager/. At the time of writing, the application version is 0.9.5. After you have downloaded it, run the installation executable as Administrator. Now you are ready to burn the image onto the SD card. Insert the SD card into your PC. Select the image you unzipped at step 3 and be sure to select the right disk name (your SD card). At the end click on Write. You are done! The image is installed on the SD card and we can now start the Raspberry Pi. How to install Android Things using OS X If you have a Mac running OS X, the steps to install Android Things are slightly different. There are several options to flash this OS to the SD card; you will learn the fastest and easiest one. These are the steps to follow: Format your SD card using FAT32. Insert your SD card into your Mac and run Disk Utility. You should see something like this: Download the Android Things OS image using this link: https://developer.android.com/things/preview/download.html. Unzip the file you have downloaded. Insert the SD card into your Mac. Now it is time to copy the image to the SD card. Open a terminal window and write: sudo dd bs=1m if=path_of_your_image.img of=/dev/rdiskn where path_of_your_image is the path to the file with the img extension that you downloaded earlier. 
In order to find out the rdiskn, you have to select Preferences and then System Report. The result is shown in the picture below: The BSD name is the disk name we are looking for. In this case, we have to write: sudo dd bs=1m if=path_of_your_image.img of=/dev/rdisk1 That's all. You have to wait until the image is copied onto the SD card. Do not forget that the copying process could take a while, so be patient! Creating the first Android Things project Considering that Android Things derives from Android, the development process and the app structure are the same as those we use in a common Android app. For this reason, the development tool to use for Android Things is Android Studio. If you have already used Android Studio in the past, reading this paragraph you will discover the main differences between an Android Things app and an Android app. Otherwise, if you are new to Android development, this paragraph will guide you step by step to create your first Android Things app. Android Studio is the official development environment for developing Android Things apps; therefore, before starting, it is necessary to have it installed. If not, go to https://developer.android.com/studio/index.html, then download and install it. The development environment must adhere to these prerequisites: SDK tools version 24 or higher Update the SDK with Android 7 (API level 24) Android Studio 2.2 or higher If your environment does not meet the previous conditions, you have to update your Android Studio using the Update manager. Now there are two alternatives for starting a new project: Cloning a template project from GitHub and importing it into Android Studio Creating a new Android project in Android Studio To better understand the main differences between Android and Android Things, you should follow option number 2, at least the first time. 
Cloning the template project This is the fastest path because with a few steps you are ready to develop an Android Things app: go to https://github.com/androidthings/new-project-template and clone the repository. Open a terminal and write: git clone https://github.com/androidthings/new-project-template.git Now you have to import the cloned project into Android Studio. Create the project manually This path is quite a bit longer than the previous option, but it is useful for getting to know the main differences between these two worlds. Create a new Android project. Do not forget to set the Minimum SDK to API Level 24: By now, you should have created a project with an empty activity. Confirm and create the new project. There are some steps you have to follow before your Android app project turns into an Android Things app project: Open the Gradle scripts folder, modify build.gradle (app-level) and replace the dependency directive with the following lines: dependencies { provided 'com.google.android.things:androidthings:0.2-devpreview' } Open the res folder and remove all the files under it except strings.xml. Open Manifest.xml and remove the android:theme attribute in the application tag. In the Manifest.xml, add the following line inside the application tag: <uses-library android:name="com.google.android.things"/> In the layout folder, open all the layout files created automatically and remove the references to values. In the Activity created by default (MainActivity.java), remove this line: import android.support.v7.app.AppCompatActivity; Replace AppCompatActivity with Activity. Under the java folder, remove all the folders except the one with your package name. That's all. You have now transformed an Android app project into an Android Things app project. Compiling the code, you will have no errors. Next time, you can simply clone the repository holding the project template and start coding. 
Differences between Android and Android Things As you can see, an Android Things project is very similar to an Android project. We still have Activities, layouts, Gradle files, and so on. At the same time, there are some differences: Android Things does not use multiple layouts to support different screen sizes, so when we develop an Android Things app we create only one layout Android Things does not support themes and styles Android support libraries are not available in Android Things Summary In this article we have covered the Internet of things overview and components, an Android Things overview, the Things support library, and Android Things board compatibility, and we have also learned how to create our first Android Things project. Resources for Article: Further resources on this subject: Saying Hello to Unity and Android [article] Getting started with Android Development [article] Android Game Development with Unity3D [article]
Packt
21 Jul 2017
11 min read

Vulnerability Assessment

"Finding a risk is learning, Ability to identify risk exposure is a skill and exploiting it is merely a choice" In this article by Vijay Kumar Velu, the author of the book Mastering Kali Linux for Advanced Penetration Testing - Second Edition, we will learn about vulnerability assessment. The goal of passive and active reconnaissance is to identify the exploitable target and vulnerability assessment is to find the security flaws that are most likely to support the tester's or attacker's objective (denial of service, theft, or modification of data). The vulnerability assessment during the exploit phase of the kill chain focuses on creating the access to achieve the objective—mapping of the vulnerabilities to line up the exploits to maintain the persistent access to the target. Thousands of exploitable vulnerabilities have been identified, and most are associated with at least one proof-of-concept code or technique to allow the system to be compromised. Nevertheless, the underlying principles that govern success are the same across networks, operating systems, and applications. In this article, you will learn: Using online and local vulnerability resources Vulnerability scanning with nmap Vulnerability nomenclature Vulnerability scanning employs automated processes and applications to identify vulnerabilities in a network, system, operating system, or application that may be exploitable. When performed correctly, a vulnerability scan delivers an inventory of devices (both authorized and rogue devices); known vulnerabilities that have been actively scanned for, and usually a confirmation of how compliant the devices are with various policies and regulations. Unfortunately, vulnerability scans are loud—they deliver multiple packets that are easily detected by most network controls and make stealth almost impossible to achieve. 
They also suffer from the following additional limitations: For the most part, vulnerability scanners are signature based—they can only detect known vulnerabilities, and only if there is an existing recognition signature that the scanner can apply to the target. To a penetration tester, the most effective scanners are open source, as they allow the tester to rapidly modify code to detect new vulnerabilities. Scanners produce large volumes of output, frequently containing false-positive results that can lead a tester astray; in particular, networks with different operating systems can produce false-positives at a rate as high as 70 percent. Scanners may have a negative impact on the network—they can create network latency or cause the failure of some devices (refer to the Network Scanning Watch List at www.digininja.org for devices known to fail as a result of vulnerability testing). In certain jurisdictions, scanning is considered hacking, and may constitute an illegal act. There are multiple commercial and open source products that perform vulnerability scans. Local and online vulnerability databases Together, passive and active reconnaissance identify the attack surface of the target, that is, the total number of points that can be assessed for vulnerabilities. A server with just an operating system installed can only be exploited if there are vulnerabilities in that particular operating system; however, the number of potential vulnerabilities increases with each application that is installed. Penetration testers and attackers must find the particular exploits that will compromise known and suspected vulnerabilities. The first place to start the search is at vendor sites; most hardware and application vendors release information about vulnerabilities when they release patches and upgrades. If an exploit for a particular weakness is known, most vendors will highlight this to their customers. 
Although their intent is to allow customers to test for the presence of the vulnerability themselves, attackers and penetration testers will take advantage of this information as well. Other online sites that collect, analyze, and share information about vulnerabilities are as follows:

- The National Vulnerability Database, which consolidates all public vulnerability data released by the US Government, available at http://web.nvd.nist.gov/view/vuln/search
- Secunia, available at http://secunia.com/community/
- The Open Source Vulnerability Database Project (OSVDB), available at http://www.osvdb.org/search/advsearch
- Packetstorm Security, available at http://packetstormsecurity.com/
- SecurityFocus, available at http://www.securityfocus.com/vulnerabilities
- Inj3ct0r, available at http://1337day.com/
- The Exploit Database maintained by Offensive Security, available at http://www.db-exploit.com

The Exploit Database is also copied locally to Kali, and it can be found in the /usr/share/exploitdb directory. Before using it, make sure that it has been updated using the following commands:

cd /usr/share/exploitdb
wget http://www.exploit-db.com/archive.tar.bz2
tar -xvjf archive.tar.bz2
rm archive.tar.bz2

To search the local copy of the Exploit Database, open a Terminal window and enter searchsploit and the desired search term(s) at the Command Prompt. This will invoke a script that searches a database file (.csv) containing a list of all exploits. The search will return a description of known vulnerabilities as well as the path to a relevant exploit. The exploit can be extracted, compiled, and run against specific vulnerabilities. Take a look at the following screenshot, which shows the description of the vulnerabilities:

The search script scans each line in the CSV file from left to right, so the order of the search terms is important: a search for Oracle 10g will return several exploits, but 10g Oracle will not return any.
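The left-to-right behavior of the search script can be reproduced in a few lines of Python. The two-row sample below is invented (the real database is the .csv file under /usr/share/exploitdb), but it demonstrates why Oracle 10g and 10g Oracle give different results:

```python
import csv
import io

# Invented two-row sample shaped like the Exploit Database .csv file;
# the real file lives under /usr/share/exploitdb.
SAMPLE = """id,file,description
1,platforms/windows/remote/76.c,Microsoft RPC DCOM Remote Exploit
2,platforms/linux/remote/99.pl,Oracle 10g Remote Overflow
"""

def search_exploits(terms, csv_text):
    """Return descriptions containing every term, matched left to right."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        desc = row["description"].lower()
        pos = 0
        for term in terms:
            pos = desc.find(term.lower(), pos)
            if pos == -1:
                break  # a term was missing (or appeared out of order)
        else:
            hits.append(row["description"])
    return hits

print(search_exploits(["oracle", "10g"], SAMPLE))  # in-order terms match
print(search_exploits(["10g", "oracle"], SAMPLE))  # reversed order finds nothing
```

Because each term is only searched for to the right of the previous match, reversing the terms makes the second lookup fail, which mirrors the searchsploit behavior described above.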
Also, the script is case sensitive; although you are instructed to use lower case characters in the search term, searches for the same term that differ only in case or spacing can return different results, with one variant of a bulletproof FTP search returning seven hits while the others return none. More effective searches of the CSV file can be conducted using the grep command or a search tool such as KWrite (apt-get install kwrite). A search of the local database may identify several possible exploits with a description and a path listing; however, these will have to be customized to your environment and then compiled prior to use. Copy the exploit to the /tmp directory (the given path does not take into account that the /windows/remote directory resides in the /platforms directory). Exploits presented as scripts, such as Perl, Ruby, and PHP, are relatively easy to implement. For example, if the target is a Microsoft IIS 6.0 server that may be vulnerable to a WebDAV remote authentication bypass, copy the exploit to the root directory and then execute it as a standard Perl script, as shown in the following screenshot: Many of the exploits are available as source code that must be compiled before use. For example, a search for RPC-specific vulnerabilities identifies several possible exploits. An excerpt is shown in the following screenshot: The RPC DCOM vulnerability identified as 76.c is known from practice to be relatively stable, so we will use it as an example. To compile this exploit, copy it from the storage directory to the /tmp directory.
In that location, compile it using GCC with the following command:

root@kali:~# gcc 76.c -o 76.exe

This will use the GNU Compiler Collection to compile 76.c to a file with the output (-o) name of 76.exe, as shown in the following screenshot: When you invoke the application against the target, you must call the executable (now stored in the /tmp directory) using a relative path as follows:

root@kali:~# ./76.exe

The source code for this exploit is well documented, and the required parameters are made clear at execution, as shown in the following screenshot: Unfortunately, not all exploits from the Exploit Database and other public sources compile as readily as 76.c. There are several issues that make the use of such exploits problematic, even dangerous, for penetration testers:

- Deliberate errors or incomplete source code are commonly encountered, as experienced developers attempt to keep exploits away from inexperienced users, especially beginners who are trying to compromise systems without knowing the risks that go with their actions.
- Exploits are not always sufficiently documented; after all, there is no standard that governs the creation and use of code intended to compromise a data system. As a result, they can be difficult to use, particularly for testers who lack expertise in application development.
- Inconsistent behaviors due to changing environments (new patches applied to the target system and language variations in the target application) may require significant alterations to the source code; again, this may require a skilled developer.
- There is always the risk of freely available code containing malicious functionality. A penetration tester may think that they are conducting a proof of concept (POC) exercise and be unaware that the exploit has also created a backdoor in the application being tested that could be used by the developer.
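The copy, compile, and run sequence described above can be wrapped in a small helper. This is only a sketch (the file names are placeholders, and nothing is compiled unless gcc is present and you call compile_exploit yourself); it simply assembles the same gcc invocation shown earlier:

```python
import shutil
import subprocess

def gcc_command(source, output):
    """Assemble the compile command shown above, e.g. gcc 76.c -o 76.exe."""
    return ["gcc", source, "-o", output]

def compile_exploit(source, output):
    """Compile an exploit source file with GCC; raises if gcc is missing."""
    if shutil.which("gcc") is None:
        raise RuntimeError("gcc not found on PATH")
    subprocess.run(gcc_command(source, output), check=True)

# Only the command construction is exercised here; nothing is compiled.
print(gcc_command("76.c", "76.exe"))
```

Passing the argument list to subprocess.run (rather than a shell string) avoids shell quoting problems when exploit file names contain unusual characters.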
To ensure consistent results and create a community of coders who follow consistent practices, several exploit frameworks have been developed. The most popular exploitation framework is the Metasploit framework.

Vulnerability scanning with nmap

There are no security-focused operating system distributions without nmap. So far we have discussed how to utilize nmap during active reconnaissance, but attackers don't just use nmap to find open ports and services; they also engage nmap to perform vulnerability assessment. As of 10 March 2017, the latest version of nmap is 7.40, and it ships with 500+ NSE (nmap scripting engine) scripts, as shown in the following screenshot: Penetration testers utilize nmap's most powerful and flexible features, which allow them to write their own scripts and automate them to ease exploitation. Primarily, the NSE was developed for the following reasons:

- Network discovery: The primary purpose for which attackers utilize nmap.
- More sophisticated version detection of a service: There are thousands of services, with multiple version details for the same service.
- Vulnerability detection: To automatically identify vulnerabilities across a vast network range; note, however, that nmap cannot be a full vulnerability scanner in itself.
- Backdoor detection: Some of the scripts are written to identify the patterns of worm infections on a network; this makes it easier for attackers to narrow down their focus and take over the machine remotely.
- Vulnerability exploitation: Attackers can also potentially utilize nmap to perform exploitation in combination with other tools, such as Metasploit, or write a custom reverse shell and combine it with nmap's exploitation capability.
Before using nmap to perform the vulnerability scan, penetration testers must update the nmap script database to see whether any new scripts have been added, so that they don't miss vulnerability identification:

nmap --script-updatedb

Use the following to run all the scripts against the target host:

nmap -T4 -A -sV -v3 -d -oA Targetoutput --script all --script-args vulns.showall target.com

Introduction to LUA scripting

LUA is a lightweight, embeddable scripting language built on top of the C programming language; it was created in Brazil in 1993 and is still actively developed. It is a powerful and fast programming language mostly used in gaming applications and image processing. The complete source code, manual, plus binaries for some platforms do not exceed 1.44 MB (less than a floppy disk). Some of the security tools developed in LUA are nmap, Wireshark, and Snort 3.0. One of the reasons LUA was chosen as the scripting language in information security is its compactness, its freedom from buffer overflow and format string vulnerabilities, and the fact that it can be interpreted. LUA can be installed directly on Kali Linux by issuing the apt-get install lua5.1 command in the Terminal. The following code extract is a sample script that reads a file and prints the first line:

#!/usr/bin/lua
local file = io.open("/etc/passwd", "r")
contents = file:read()
file:close()
print (contents)

LUA is similar to other scripting languages such as Bash and Perl. The preceding script should produce the output shown in the following screenshot:

Customizing NSE scripts

In order to achieve maximum effectiveness, customizing scripts helps penetration testers find the right vulnerabilities within the given span of time. (Attackers, however, do not have a time limit.)
The following code extract is a LUA NSE script to identify a specific file location, which we will search for across the entire subnet using nmap:

local http = require 'http'

description = [[
This is my custom discovery on the network
]]

categories = {"safe","discovery"}

function portrule(host, port)
  return port.number == 80
end

function action(host, port)
  local response
  response = http.get(host, port, "/test.txt")
  if response.status and response.status ~= 404 then
    return "successful"
  end
end

Save the file in the /usr/share/nmap/scripts/ folder. Your script is now ready to be tested, as shown in the following screenshot; you should be able to run your own NSE script without any problems: To fully understand the preceding NSE script, here is a description of what is in the code:

- local http = require 'http': Calls the right library from LUA; this line loads the HTTP script module and makes it available as a local variable.
- description: Where testers/researchers can enter the description of the script.
- categories: This typically has two variables, one of which declares whether the script is safe or intrusive.
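The portrule/action split above can be prototyped in plain Python before committing to LUA. In the sketch below the HTTP layer is stubbed out (FakeResponse is invented for illustration, and no requests are sent), so only the decision logic of the script is exercised:

```python
class FakeResponse:
    """Invented stand-in for the response table that http.get returns in NSE."""
    def __init__(self, status):
        self.status = status

def portrule(port_number):
    # Mirrors the LUA rule: return port.number == 80
    return port_number == 80

def action(response):
    # Mirrors: if response.status and response.status ~= 404 then "successful"
    if response.status and response.status != 404:
        return "successful"
    return None

print(portrule(80), action(FakeResponse(200)))   # script would run and report
print(portrule(443), action(FakeResponse(404)))  # wrong port, file not found
```

Testing the rule and the action separately like this makes it easy to confirm that the script only fires on port 80 and only reports when the file actually exists.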
Up and Running with pandas

Packt
18 Jul 2017
15 min read
In this article by Michael Heydt, author of the book Learning Pandas - Second Edition, we will cover how to install pandas and start using its basic functionality. The content is provided as IPython and Jupyter notebooks, and hence we will also take a quick look at using both of those tools. We will utilize the Anaconda Scientific Python distribution from Continuum. Anaconda is a popular Python distribution with both free and paid components. Anaconda provides cross-platform support, including Windows, Mac, and Linux. The base distribution of Anaconda installs pandas, IPython, and Jupyter Notebook, thereby making it almost trivial to get started. (For more resources related to this topic, see here.)

IPython and Jupyter Notebook

So far we have executed Python from the command line/terminal. This is the default Read-Eval-Print-Loop (REPL) that comes with Python. Let's take a brief look at both.

IPython

IPython is an alternate shell for interactively working with Python. It provides several enhancements to the default REPL provided by Python. If you want to learn about IPython in more detail, check out the documentation at https://ipython.org/ipython-doc/3/interactive/tutorial.html. To start IPython, simply execute the ipython command from the command line/terminal. When started, you will see something like the following: The input prompt shows In [1]:. Each time you enter a statement in the IPython REPL, the number in the prompt will increase. Likewise, output from any particular entry you make will be prefaced with Out [x]:, where x matches the number of the corresponding In [x]:. The following demonstrates: This numbering of in and out statements will be important to the examples, as all examples are prefaced with In [x]: and Out [x]: so that you can follow along. Note that these numberings are purely sequential.
If you are following along with the code in the text and encounter errors in input or enter additional statements, the numbering may get out of sequence (it can be reset by exiting and restarting IPython). Please use the numbers purely as reference.

Jupyter Notebook

Jupyter Notebook is the evolution of IPython Notebook. It is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and markdown. The original IPython Notebook was constrained to Python as the only language. Jupyter Notebook has evolved to allow many programming languages to be used, including Python, R, Julia, Scala, and F#. If you want to take a deeper look at Jupyter Notebook, head to http://jupyter.org/, where you will be presented with a page similar to the following: Jupyter Notebook can be downloaded and used independently of Python. Anaconda installs it by default. To start a Jupyter Notebook, issue the following command at the command line/terminal:

$ jupyter notebook

To demonstrate, let's look at how to run the example code that comes with the text. Download the code from the Packt website and unzip the file to a directory of your choosing. In the directory, you will see the following contents: Now issue the jupyter notebook command. You should see something similar to the following: A browser page will open, displaying the Jupyter Notebook homepage at http://localhost:8888/tree. This will be a directory listing: Clicking on a .ipynb link opens a notebook page like the following: The notebook that is displayed is HTML that was generated by Jupyter and IPython. It consists of a number of cells that can be one of four types: code, markdown, raw nbconvert, or heading. Jupyter runs an IPython kernel for each notebook. Cells that contain Python code are executed within that kernel, and the results are added to the notebook as HTML.
Double-clicking on any of the cells will make the cell editable. When done editing the contents of a cell, press Shift + Enter, at which point Jupyter/IPython will evaluate the contents and display the results. If you want to learn more about the notebook format that underlies the pages, see https://ipython.org/ipython-doc/3/notebook/nbformat.html. The toolbar at the top of a notebook gives you a number of abilities to manipulate the notebook. These include adding, removing, and moving cells up and down in the notebook. Also available are commands to run cells, rerun cells, and restart the underlying IPython kernel. To create a new notebook, go to the File > New Notebook > Python 3 menu item. A new notebook page will be created in a new browser tab. Its name will be Untitled: The notebook consists of a single code cell that is ready to have Python entered. Enter 1+1 in the cell and press Shift + Enter to execute. The cell has been executed and the result shown as Out [2]:. Jupyter also opened a new cell for you to enter more code or markdown. Jupyter Notebook automatically saves your changes every minute, but it's still a good idea to save manually every once in a while. One final point before we look at a little bit of pandas: code in the text will be in the format of command-line IPython. As an example, the cell we just created in our notebook will be shown as follows:

In [1]: 1+1
Out [1]: 2

Introducing the pandas Series and DataFrame

Let's jump into using some pandas with a brief introduction to pandas' two main data structures, the Series and the DataFrame. We will examine the following:

- Importing pandas into your application
- Creating and manipulating a pandas Series
- Creating and manipulating a pandas DataFrame
- Loading data from a file into a DataFrame

The pandas Series

The pandas Series is the base data structure of pandas.
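The notebook cells that follow use Series, DataFrame, and the pd alias without showing the imports. A setup cell along these lines is assumed (the aliases are the usual pandas convention, not shown in the original):

```python
# Conventional imports assumed by the notebook cells in this article;
# the original cells use Series, DataFrame, and pd without showing them.
import pandas as pd
from pandas import Series, DataFrame

s = Series([1, 2, 3, 4])
print(s.sum())  # 10
```

With these names in scope, every cell below can be run verbatim in a fresh notebook.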
A Series is similar to a NumPy array, but it differs by having an index, which allows for much richer lookup of items instead of just a zero-based array index value. The following creates a Series from a Python list:

In [2]: # create a four item Series
        s = Series([1, 2, 3, 4])
        s
Out [2]:
0    1
1    2
2    3
3    4
dtype: int64

The output consists of two columns of information. The first is the index, and the second is the data in the Series. Each row of the output represents the index label (in the first column) and then the value associated with that label. Because this Series was created without specifying an index (something we will do next), pandas automatically creates an integer index with labels starting at 0 and increasing by one for each data item. The values of a Series object can then be accessed using the [] operator, passing the label for the value you require. The following gets the value for the label 1:

In [3]: s[1]
Out [3]: 2

This looks very much like normal array access in many programming languages. But as we will see, the index does not have to start at 0, nor increment by one, and it can be of many other data types than just an integer. This ability to associate flexible indexes with data is one of the great superpowers of pandas. Multiple items can be retrieved by specifying their labels in a Python list. The following retrieves the values at labels 1 and 3:

In [4]: # return a Series with the rows with labels 1 and 3
        s[[1, 3]]
Out [4]:
1    2
3    4
dtype: int64

A Series object can be created with a user-defined index by using the index parameter and specifying the index labels. The following creates a Series with the same values but with an index consisting of string values:

In [5]: # create a series using an explicit index
        s = pd.Series([1, 2, 3, 4],
                      index=['a', 'b', 'c', 'd'])
        s
Out [5]:
a    1
b    2
c    3
d    4
dtype: int64

Data in the Series object can now be accessed by those alphanumeric index labels.
The following retrieves the values at index labels 'a' and 'd':

In [6]: # look up items in the series having index 'a' and 'd'
        s[['a', 'd']]
Out [6]:
a    1
d    4
dtype: int64

It is still possible to refer to the elements of this Series object by their numerical 0-based position:

In [7]: # passing a list of integers to a Series that has
        # non-integer index labels will look up based upon
        # 0-based index like an array
        s[[1, 2]]
Out [7]:
b    2
c    3
dtype: int64

We can examine the index of a Series using the .index property:

In [8]: # get only the index of the Series
        s.index
Out [8]: Index(['a', 'b', 'c', 'd'], dtype='object')

The index is itself actually a pandas object, and this output shows us the values of the index and the data type used for the index. In this case, note that the type of the data in the index (referred to as the dtype) is object and not string. A common usage of a Series in pandas is to represent a time series that associates date/time index labels with values. The following demonstrates by creating a date range using the pandas function pd.date_range():

In [9]: # create a Series whose index is a series of dates
        # between the two specified dates (inclusive)
        dates = pd.date_range('2016-04-01', '2016-04-06')
        dates
Out [9]: DatetimeIndex(['2016-04-01', '2016-04-02', '2016-04-03',
                        '2016-04-04', '2016-04-05', '2016-04-06'],
                       dtype='datetime64[ns]', freq='D')

This has created a special index in pandas referred to as a DatetimeIndex, which is a specialized type of pandas index that is optimized to index data with dates and times. Now let's create a Series using this index.
The data values represent high temperatures on those specific days:

In [10]: # create a Series with values (representing temperatures)
         # for each date in the index
         temps1 = Series([80, 82, 85, 90, 83, 87],
                         index=dates)
         temps1
Out [10]:
2016-04-01    80
2016-04-02    82
2016-04-03    85
2016-04-04    90
2016-04-05    83
2016-04-06    87
Freq: D, dtype: int64

This type of Series with a DatetimeIndex is referred to as a time series. We can look up a temperature on a specific date by using the date as a string:

In [11]: temps1['2016-04-04']
Out [11]: 90

Two Series objects can be combined with an arithmetic operation. The following code creates a second Series and calculates the difference in temperature between the two:

In [12]: # create a second series of values using the same index
         temps2 = Series([70, 75, 69, 83, 79, 77],
                         index=dates)
         # the following aligns the two by their index values
         # and calculates the difference at those matching labels
         temp_diffs = temps1 - temps2
         temp_diffs
Out [12]:
2016-04-01    10
2016-04-02     7
2016-04-03    16
2016-04-04     7
2016-04-05     4
2016-04-06    10
Freq: D, dtype: int64

The result of an arithmetic operation (+, -, /, *, ...) on two Series objects returns another Series object. Since the index is not integer-based, we can also look up values by 0-based position:

In [13]: # and also possible by integer position as if the
         # series was an array
         temp_diffs[2]
Out [13]: 16

Finally, pandas provides many descriptive statistical methods. As an example, the following returns the mean of the temperature differences:

In [14]: # calculate the mean of the values in the Series
         temp_diffs.mean()
Out [14]: 9.0

The pandas DataFrame

A pandas Series can only have a single value associated with each index label. To have multiple values per index label we can use a DataFrame. A DataFrame represents one or more Series objects aligned by index label.
Each Series will be a column in the DataFrame, and each column can have an associated name. In a way, a DataFrame is analogous to a relational database table in that it contains one or more columns of data of heterogeneous types (but a single type for all items in each respective column). The following creates a DataFrame object with two columns, using the temperature Series objects:

In [15]: # create a DataFrame from the two series objects temps1
         # and temps2 and give them column names
         temps_df = DataFrame(
             {'Missoula': temps1,
              'Philadelphia': temps2})
         temps_df
Out [15]:
            Missoula  Philadelphia
2016-04-01        80            70
2016-04-02        82            75
2016-04-03        85            69
2016-04-04        90            83
2016-04-05        83            79
2016-04-06        87            77

The resulting DataFrame has two columns named Missoula and Philadelphia. These columns are new Series objects contained within the DataFrame, with the values copied from the original Series objects. Columns in a DataFrame object can be accessed using an array indexer [] with the name of the column or a list of column names. The following code retrieves the Missoula column:

In [16]: # get the column with the name Missoula
         temps_df['Missoula']
Out [16]:
2016-04-01    80
2016-04-02    82
2016-04-03    85
2016-04-04    90
2016-04-05    83
2016-04-06    87
Freq: D, Name: Missoula, dtype: int64

And the following code retrieves the Philadelphia column:

In [17]: # likewise we can get just the Philadelphia column
         temps_df['Philadelphia']
Out [17]:
2016-04-01    70
2016-04-02    75
2016-04-03    69
2016-04-04    83
2016-04-05    79
2016-04-06    77
Freq: D, Name: Philadelphia, dtype: int64

A Python list of column names can also be used to return multiple columns:

In [18]: # return both columns in a different order
         temps_df[['Philadelphia', 'Missoula']]
Out [18]:
            Philadelphia  Missoula
2016-04-01            70        80
2016-04-02            75        82
2016-04-03            69        85
2016-04-04            83        90
2016-04-05            79        83
2016-04-06            77        87

There is a subtle difference in a DataFrame object as compared to a Series object.
Passing a list to the [] operator of a DataFrame retrieves the specified columns, whereas a Series would return rows. If the name of a column does not have spaces, it can be accessed using property-style syntax:

In [19]: # retrieve the Missoula column through property syntax
         temps_df.Missoula
Out [19]:
2016-04-01    80
2016-04-02    82
2016-04-03    85
2016-04-04    90
2016-04-05    83
2016-04-06    87
Freq: D, Name: Missoula, dtype: int64

Arithmetic operations between columns within a DataFrame are identical in operation to those on multiple Series. To demonstrate, the following code calculates the difference between temperatures using property notation:

In [20]: # calculate the temperature difference between the two
         # cities
         temps_df.Missoula - temps_df.Philadelphia
Out [20]:
2016-04-01    10
2016-04-02     7
2016-04-03    16
2016-04-04     7
2016-04-05     4
2016-04-06    10
Freq: D, dtype: int64

A new column can be added to a DataFrame simply by assigning another Series to a column using the array indexer [] notation. The following adds a new column in the DataFrame with the temperature differences:

In [21]: # add a column to temps_df which contains the difference
         # in temps
         temps_df['Difference'] = temp_diffs
         temps_df
Out [21]:
            Missoula  Philadelphia  Difference
2016-04-01        80            70          10
2016-04-02        82            75           7
2016-04-03        85            69          16
2016-04-04        90            83           7
2016-04-05        83            79           4
2016-04-06        87            77          10

The names of the columns in a DataFrame are accessible via the .columns property:

In [22]: # get the columns, which is also an Index object
         temps_df.columns
Out [22]: Index(['Missoula', 'Philadelphia', 'Difference'], dtype='object')

The DataFrame (and Series) objects can be sliced to retrieve specific rows.
The following slices the second through fourth rows of temperature difference values:

In [23]: # slice the temp differences column for the rows at
         # location 1 through 4 (as though it is an array)
         temps_df.Difference[1:4]
Out [23]:
2016-04-02     7
2016-04-03    16
2016-04-04     7
Freq: D, Name: Difference, dtype: int64

Entire rows from a DataFrame can be retrieved using the .loc and .iloc properties. .loc ensures that the lookup is by index label, whereas .iloc uses the 0-based position. The following retrieves the second row of the DataFrame:

In [24]: # get the row at array position 1
         temps_df.iloc[1]
Out [24]:
Missoula        82
Philadelphia    75
Difference       7
Name: 2016-04-02 00:00:00, dtype: int64

Notice that this result has converted the row into a Series, with the column names of the DataFrame pivoted into the index labels of the resulting Series. The following shows the resulting index of the result:

In [25]: # the names of the columns have become the index
         # they have been 'pivoted'
         temps_df.iloc[1].index
Out [25]: Index(['Missoula', 'Philadelphia', 'Difference'], dtype='object')

Rows can be explicitly accessed via index label using the .loc property. The following code retrieves a row by the index label:

In [26]: # retrieve row by index label using .loc
         temps_df.loc['2016-04-05']
Out [26]:
Missoula        83
Philadelphia    79
Difference       4
Name: 2016-04-05 00:00:00, dtype: int64

Specific rows in a DataFrame object can be selected using a list of integer positions. The following selects the values from the Difference column in rows at integer locations 1, 3, and 5:

In [27]: # get the values in the Differences column in rows 1, 3
         # and 5 using 0-based location
         temps_df.iloc[[1, 3, 5]].Difference
Out [27]:
2016-04-02     7
2016-04-04     7
2016-04-06    10
Freq: 2D, Name: Difference, dtype: int64

Rows of a DataFrame can be selected based upon a logical expression that is applied to the data in each row.
The following shows which values in the Missoula column are greater than 82 degrees:

In [28]: # which values in the Missoula column are > 82?
         temps_df.Missoula > 82
Out [28]:
2016-04-01    False
2016-04-02    False
2016-04-03     True
2016-04-04     True
2016-04-05     True
2016-04-06     True
Freq: D, Name: Missoula, dtype: bool

The results from an expression can then be applied to the [] operator of a DataFrame (and a Series), which results in only the rows where the expression evaluated to True being returned:

In [29]: # return the rows where the temps for Missoula > 82
         temps_df[temps_df.Missoula > 82]
Out [29]:
            Missoula  Philadelphia  Difference
2016-04-03        85            69          16
2016-04-04        90            83           7
2016-04-05        83            79           4
2016-04-06        87            77          10

This technique is referred to as Boolean selection in pandas terminology, and it will form the basis of selecting rows based upon values in specific columns (like a SQL query using a WHERE clause, but, as we will see, much more powerful).

Visualization

We will dive into visualization in quite some depth in Chapter 14, Visualization, but prior to then we will occasionally perform a quick visualization of data in pandas. Creating a visualization of data is quite simple with pandas; all that needs to be done is to call the .plot() method. The following demonstrates by plotting the Close value of the stock data:

In [40]: df[['Close']].plot();

Summary

In this article, we took an introductory look at the pandas Series and DataFrame objects, demonstrating some of their fundamental capabilities. This exposition showed you how to perform a few basic operations that you can use to get up and running with pandas prior to diving in and learning all the details.
Parallelize It

Packt
18 Jul 2017
15 min read
In this article by Elliot Forbes, the author of the book Learning Concurrency in Python, we will explain concurrency and parallelism thoroughly and cover the necessary CPU knowledge related to them. Concurrency and parallelism are two concepts that are commonly confused. The reality, though, is that they are quite different, and if you designed software to be concurrent when you actually needed parallel execution, you could be seriously limiting your software's true performance potential. Because of this, it's vital to know exactly what the two concepts mean so that you can understand the differences. Knowing these differences puts you at a distinct advantage when it comes to designing your own high-performance software in Python. In this article we'll be covering the following topics:

- What is concurrency, and what are the major bottlenecks that impact our applications?
- What is parallelism, and how does it differ from concurrency?

(For more resources related to this topic, see here.)

Understanding concurrency

Concurrency is essentially the practice of doing multiple things at the same time, but not specifically in parallel. It can help us to improve the perceived performance of our applications, and it can also improve the speed at which our applications run. The best way to think of how concurrency works is to imagine one person working on multiple tasks and quickly switching between them. Imagine this one person is working concurrently on a program while also dealing with support requests. They would focus primarily on the writing of their program and quickly context switch to fixing a bug or dealing with a support issue should there be one. Once they complete the support task, they can context switch back to writing their program. However, in computing there are typically two performance bottlenecks that we have to watch out for and guard against when writing our programs.
It's important to know the differences between the two bottlenecks, because applying concurrency to a CPU-based bottleneck can actually decrease performance rather than increase it, and applying parallelism to a task that really requires a concurrent solution can incur the same performance hits.

Properties of concurrent systems

All concurrent systems share a similar set of properties, which can be defined as follows:

- Multiple actors: These represent the different processes and threads all trying to actively make progress on their own tasks. We could have multiple processes that contain multiple threads all trying to run at the same time.
- Shared resources: These represent the memory, the disk, and the other resources that the actors must utilize in order to perform what they need to do.
- Rules: All concurrent systems must follow a strict set of rules that define when actors can and can't acquire locks, access memory, modify state, and so on. These rules are vital for concurrent systems to work; otherwise, our programs would tear themselves apart.

Input/Output bottlenecks

Input/Output bottlenecks, or I/O bottlenecks for short, are bottlenecks where your computer spends more time waiting on various inputs and outputs than it does processing the information. You'll typically find this type of bottleneck when you are working with an I/O-heavy application. We could take your standard web browser as an example of a heavy I/O application. In a browser, we typically spend significantly longer waiting for network requests to finish (for things like style sheets, scripts, or HTML pages) than we do rendering the content on the screen. If the rate at which data is requested is slower than the rate at which it is consumed, then you have yourself an I/O bottleneck.
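The dominance of waiting can be made concrete with a toy script in which time.sleep stands in for network latency. The figures are illustrative only, but they show why overlapping the waits (a concurrent approach) pays off for I/O bound work:

```python
import threading
import time

def fake_request(delay=0.1):
    """Stands in for a network request: all the time is spent waiting."""
    time.sleep(delay)

# Sequential: the waits add up, roughly 4 x 0.1 seconds
t0 = time.time()
for _ in range(4):
    fake_request()
sequential = time.time() - t0

# Concurrent: the waits overlap, roughly 0.1 seconds in total
t0 = time.time()
threads = [threading.Thread(target=fake_request) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
concurrent = time.time() - t0

print("sequential: {:.2f}s concurrent: {:.2f}s".format(sequential, concurrent))
```

Because the threads spend almost all of their time asleep (waiting), running them concurrently costs nearly nothing extra, which is exactly the situation concurrency is designed for.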
One of the main ways to improve the speed of these applications is typically to either improve the speed of the underlying I/O by buying more expensive and faster hardware, or to improve the way in which we handle these I/O requests.

A great example of a program bound by I/O bottlenecks is a web crawler. The main purpose of a web crawler is to traverse the web and index web pages so that they can be taken into consideration when Google runs its search ranking algorithm to decide the top 10 results for a given keyword. We'll start by creating a very simple script that just requests a page and times how long that request takes:

import urllib.request
import time

t0 = time.time()
req = urllib.request.urlopen('http://www.example.com')
pageHtml = req.read()
t1 = time.time()
print("Total Time To Fetch Page: {} Seconds".format(t1-t0))

If we break down this code: first we import the two necessary modules, urllib.request and time. We then record the starting time, request the web page example.com, record the ending time, and print out the time difference.

Now say we wanted to add a bit of complexity and follow any links to other pages so that we could index them in the future. We could use a library such as BeautifulSoup to make our lives a little easier:

import urllib.request
import time
from bs4 import BeautifulSoup

t0 = time.time()
req = urllib.request.urlopen('http://www.example.com')
t1 = time.time()
print("Total Time To Fetch Page: {} Seconds".format(t1-t0))

soup = BeautifulSoup(req.read(), "html.parser")
for link in soup.find_all('a'):
    print(link.get('href'))

t2 = time.time()
print("Total Execution Time: {} Seconds".format(t2-t0))

When I execute the above program, the fetch and total execution times are printed to my terminal. You'll notice from this output that the time to fetch the page is over a quarter of a second.
Now imagine we wanted to run our web crawler against a million different web pages; our total execution time would be roughly a million times longer. The real cause of this enormous execution time is purely the I/O bottleneck we face in our program: we spend a massive amount of time waiting on our network requests, and only a fraction of that time parsing the retrieved page for further links to crawl.

Understanding parallelism

Parallelism is the art of executing two or more actions simultaneously, as opposed to concurrency, in which you make progress on two or more things at the same time. This is an important distinction: in order to achieve true parallelism, we need multiple processors on which to run our code at the same time. A good analogy for parallel processing is a queue for coffee. If you had two queues of 20 people all waiting to use a single coffee machine so that they can get through the rest of the day, that would be an example of concurrency. Now if you were to introduce a second coffee machine into the mix, that would be an example of things happening in parallel. This is exactly how parallel processing works: each coffee machine represents one processing core, able to make progress on tasks simultaneously.

A real-life example that highlights the true power of parallel processing is your computer's graphics card. These cards tend to have hundreds, if not thousands, of individual processing cores that work independently and can compute things at the same time. The reason we are able to run high-end PC games at such smooth frame rates is that we've been able to put so many parallel cores onto these cards.

CPU bound bottleneck

A CPU-bound bottleneck is typically the inverse of an I/O-bound bottleneck. This bottleneck is typically found in applications that do a lot of heavy number crunching, or any other task that is computationally expensive.
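Before moving on, it is worth seeing how concurrency attacks the crawler's I/O bottleneck: because most of the time is spent waiting, the waits can be overlapped. A sketch of my own using ThreadPoolExecutor, with time.sleep standing in for the quarter-second network fetch measured above (the page names and latency are assumptions for illustration, not real requests):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(page):
    # Simulate a network request taking about a quarter of a second,
    # roughly the single-page fetch time measured earlier
    time.sleep(0.25)
    return "html for {}".format(page)

pages = ["page-{}".format(i) for i in range(6)]

t0 = time.time()
serial = [fetch(p) for p in pages]             # six waits, back to back
t1 = time.time()

with ThreadPoolExecutor(max_workers=6) as pool:
    overlapped = list(pool.map(fetch, pages))  # all six waits overlap
t2 = time.time()

print("Serial: {:.2f}s, Overlapped: {:.2f}s".format(t1 - t0, t2 - t1))
```

The serial version takes roughly six times the single-fetch latency, while the overlapped version takes roughly one latency in total, because the threads spend their time waiting concurrently rather than one after another.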
These are programs whose execution rate is bound by the speed of the CPU; if you put a faster CPU in your machine, you should see a direct increase in the speed of these programs. If the rate at which you are processing data far outweighs the rate at which you are requesting data, then you have a CPU-bound bottleneck.

How do they work on a CPU?

Understanding the differences outlined in the previous section between concurrency and parallelism is essential, but it's also very important to understand more about the systems that your software will be running on. An appreciation of the different architecture styles, as well as the low-level mechanics, helps you make the most informed decisions in your software design.

Single core CPUs

Single-core processors only ever execute one thread at any given time, as that is all they are capable of. However, in order to ensure that our applications don't hang and become unresponsive, these processors rapidly switch between multiple threads of execution many thousands of times per second. This switching between threads is what is called a "context switch", and it involves storing all the necessary information for a thread at a specific point in time and then restoring it at a different point further down the line. This mechanism of constantly saving and restoring threads allows us to make progress on quite a number of threads within a given second, and it appears as if the computer is doing multiple things at once. It is in fact doing only one thing at any given time, but doing it at such speed that it's imperceptible to users of the machine.

When writing multithreaded applications in Python, it is important to note that these context switches are computationally quite expensive. There is, unfortunately, no way to get around this, and much of the design of modern operating systems is about optimizing these context switches so that we don't feel the pain quite as much.
Advantages of single-core CPUs:

They do not require any complex communication protocols between multiple cores
Single-core CPUs require less power, which typically makes them better suited to IoT devices

Disadvantages:

They are limited in speed, and larger applications will cause them to struggle and potentially freeze
Heat dissipation issues place a hard limit on how fast a single-core CPU can go

Clock rate

One of the key limitations of a single-core application running on a machine is the clock speed of the CPU. When we talk about clock rate, we are essentially talking about how many clock cycles a CPU can execute every second. For many years, manufacturers were able to keep pace with Moore's law, which was essentially an observation that the number of transistors one was able to place on a piece of silicon doubled roughly every two years. This doubling of transistors paved the way for exponential gains in single-CPU clock rates, and CPUs went from the low MHz to the 4-5 GHz clock speeds we see on Intel's i7 6700k processor. But with transistors getting as small as a few nanometers across, this is inevitably coming to an end. We've started to hit the boundaries of physics, and unfortunately, if we go any smaller, we'll start to be hit by the effects of quantum tunneling. Due to these physical limitations, we need to look at other methods to improve the speeds at which we are able to compute things. This is where Martelli's model of scalability comes into play.

Martelli model of scalability

Alex Martelli, the author of Python Cookbook, came up with a model of scalability, which Raymond Hettinger discussed in his brilliant hour-long talk "Thinking about Concurrency", given at PyCon Russia 2016.
This model represents three different types of problems and programs:

1 core: single-threaded and single-process programs
2-8 cores: multithreaded and multiprocess programs
9+ cores: distributed computing

The first category, single-core and single-threaded, is able to handle a growing number of problems due to the constant improvements in the speed of single-core CPUs, and as a result the second category is being rendered more and more obsolete. We will eventually hit a limit on the speed at which a 2-8 core system can run, and as a result we'll have to start looking at other methods, such as multiple-CPU systems or even distributed computing. If your problem is worth solving quickly and requires a lot of power, the sensible approach is the distributed computing category: spin up multiple machines and multiple instances of your program in order to tackle your problems in a truly parallel manner. Large enterprise systems that handle hundreds of millions of requests are the main inhabitants of this category. You'll typically find that these enterprise systems are deployed on tens, if not hundreds, of high-performance, incredibly powerful servers in various locations across the world.

Time-Sharing - the task scheduler

One of the most important parts of the operating system is the task scheduler. It acts as the maestro of the orchestra and directs everything with impeccable precision and incredible timing and discipline. This maestro has only one real goal: to ensure that every task has a chance to run through to completion. The when and where of a task's execution, however, are non-deterministic. That is to say, if we gave a task scheduler two identical competing processes, one after the other, there is no guarantee that the first process would complete first. This non-deterministic nature is what makes concurrent programming so challenging.
An excellent example that highlights this non-deterministic behavior is the following code:

import threading
import time
import random

counter = 1

def workerA():
    global counter
    while counter < 1000:
        counter += 1
        print("Worker A is incrementing counter to {}".format(counter))
        sleepTime = random.randint(0,1)
        time.sleep(sleepTime)

def workerB():
    global counter
    while counter > -1000:
        counter -= 1
        print("Worker B is decrementing counter to {}".format(counter))
        sleepTime = random.randint(0,1)
        time.sleep(sleepTime)

def main():
    t0 = time.time()
    thread1 = threading.Thread(target=workerA)
    thread2 = threading.Thread(target=workerB)
    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()
    t1 = time.time()
    print("Execution Time {}".format(t1-t0))

if __name__ == '__main__':
    main()

Here we have two competing threads in Python, each trying to accomplish its own goal of either decrementing the counter to -1,000 or, conversely, incrementing it to 1,000. On a single-core processor there is the possibility that worker A manages to complete its task before worker B has a chance to execute, and the same can be said for worker B. However, there is a third possibility: that the task scheduler continues to switch between worker A and worker B for an indefinite number of times and neither ever completes. The above code incidentally also shows one of the dangers of multiple threads accessing shared resources without any form of synchronization. There is no accurate way to determine what will happen to our counter, and as such our program could be considered unreliable.

Multi-core processors

We've now got some idea of how single-core processors work, so it's time to take a look at multicore processors. Multicore processors contain multiple independent processing units, or "cores". Each core contains everything it needs in order to execute a sequence of stored instructions.
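One way to make behavior like this deterministic is to serialize access to the shared counter. Here is a minimal variation of my own on the example above (not code from the book), assuming each worker performs a fixed number of updates under a threading.Lock:

```python
import threading

counter = 0
lock = threading.Lock()

def workerA():
    global counter
    for _ in range(1000):
        with lock:       # only one thread may update counter at a time
            counter += 1

def workerB():
    global counter
    for _ in range(1000):
        with lock:
            counter -= 1

threadA = threading.Thread(target=workerA)
threadB = threading.Thread(target=workerB)
threadA.start()
threadB.start()
threadA.join()
threadB.join()
print(counter)  # the 1,000 increments and 1,000 decrements cancel out: 0
```

The scheduler is still free to interleave the two workers however it likes, but because every read-modify-write of the counter happens inside the lock, the final result is always the same.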
These cores each follow their own cycle:

Fetch: This step involves fetching instructions from program memory. It is dictated by a program counter (PC), which identifies the location of the next instruction to execute.
Decode: The core converts the instruction that it has just fetched into a series of signals that trigger various other parts of the CPU.
Execute: Finally, we perform the execute step. This is where we run the instruction that we have just fetched and decoded, and typically the results of this execution are stored in a CPU register.

Having multiple cores offers us the advantage of being able to work independently on multiple fetch -> decode -> execute cycles. This style of architecture essentially enables us to create higher-performance programs that leverage parallel execution.

Advantages of multicore processors:

We are no longer bound by the same performance limitations as a single-core processor
Applications that are able to take advantage of multiple cores will tend to run faster if well designed

Disadvantages of multicore processors:

They require more power than your typical single-core processor
Cross-core communication is no simple feat; there are multiple different ways of achieving it

Summary

In this article we covered a multitude of topics, including the differences between concurrency and parallelism. We also looked at how they both leverage the CPU in different ways.

Resources for Article:

Further resources on this subject:

Python Data Science Up and Running [article]
Putting the Fun in Functional Python [article]
Basics of Python for Absolute Beginners [article]