How-To Tutorials

article-image-introducing-test-driven-machine-learning

14 Oct 2015

19 min read

Introducing Test-driven Machine Learning

14 Oct 2015

In this article by Justin Bozonier, the author of the book Test Driven Machine Learning, we will see how to develop complex software (sometimes rooted in randomness) in small, controlled steps also it will guide you on how to begin developing solutions to machine learning problems using test-driven development (from here, this will be written as TDD). Mastering TDD is not something the book will achieve. Instead, the book will help you begin your journey and expose you to guiding principles, which you can use to creatively solve challenges as you encounter them. We will answer the following three questions in this article: What are TDD and behavior-driven development (BDD)? How do we apply these concepts to machine learning, and making inferences and predictions? How does this work in practice? (For more resources related to this topic, see here.) After having answers to these questions, we will be ready to move onto tackling real problems. The book is about applying these concepts to solve machine learning problems. This article is the largest theoretical explanation that we will have with the remainder of the theory being described by example. Due to the focus on application, you will learn much more than what you can learn about the theory of TDD and BDD. To read more about the theory and ideals, search the internet for articles written by the following: Kent Beck—The father of TDD Dan North—The father of BDD Martin Fowler—The father of refactoring, he has also created a large knowledge base, on these topics James Shore—one of the author of The Art of Agile Development, has a deep theoretical understanding of TDD, and explains the practical value of it quite well These concepts are incredibly simple and yet can take a lifetime to master. When applied to machine learning, we must find new ways to control and/or measure the random processes inherent in the algorithm. This will come up in this article as well as others. In the next section, we will develop a foundation for TDD and begin to explore its application. Test-driven development Kent Beck wrote in his seminal book on the topic that TDD consists of only two specific rules, which are as follows: Don't write a line of new code unless you first have a failing automated test Eliminate duplication This as he noted fairly quickly leads us to a mantra, really the mantra of TDD: Red, Green, Refactor. If this is a bit abstract, let me restate it that TDD is a software development process that enables a programmer to write code that specifies the intended behavior before writing any software to actually implement the behavior. The key value of TDD is that at each step of the way, you have working software as well as an itemized set of specifications. TDD is a software development process that requires the following: Writing code to detect the intended behavioral change. Rapid iteration cycle that produces working software after each iteration. It clearly defines what a bug is. If a test is not failing but a bug is found, it is not a bug. It is a new feature. Another point that Kent makes is that ultimately, this technique is meant to reduce fear in the development process. Each test is a checkpoint along the way to your goal. If you stray too far from the path and wind up in trouble, you can simply delete any tests that shouldn't apply, and then work your code back to a state where the rest of your tests pass. There's a lot of trial and error inherent in TDD, but the same matter applies to machine learning. The software that you design using TDD will also be modular enough to be able to have different components swapped in and out of your pipeline. You might be thinking that just thinking through test cases is equivalent to TDD. If you are like the most people, what you write is different from what you might verbally say, and very different from what you think. By writing the intent of our code before we write our code, it applies a pressure to our software design that prevents you from writing "just in case" code. By this I mean the code that we write just because we aren't sure if there will be a problem. Using TDD, we think of a test case, prove that it isn't supported currently, and then fix it. If we can't think of a test case, we don't add code. TDD can and does operate at many different levels of the software under development. Tests can be written against functions and methods, entire classes, programs, web services, neural networks, random forests, and whole machine learning pipelines. At each level, the tests are written from the perspective of the prospective client. How does this relate to machine learning? Lets take a step back and reframe what I just said. In the context of machine learning, tests can be written against functions, methods, classes, mathematical implementations, and the entire machine learning algorithms. TDD can even be used to explore technique and methods in a very directed and focused manner, much like you might use a REPL (an interactive shell where you can try out snippets of code) or the interactive (I)Python session. The TDD cycle The TDD cycle consists of writing a small function in the code that attempts to do something that we haven't programmed yet. These small test methods will have three main sections; the first section is where we set up our objects or test data; another section is where we invoke the code that we're testing; and the last section is where we validate that what happened is what we thought would happen. You will write all sorts of lazy code to get your tests to pass. If you are doing it right, then someone who is watching you should be appalled at your laziness and tiny steps. After the test goes green, you have an opportunity to refactor your code to your heart's content. In this context, refactor refers to changing how your code is written, but not changing how it behaves. Lets examine more deeply the three steps of TDD: Red, Green, and Refactor. Red First, create a failing test. Of course, this implies that you know what failure looks like in order to write the test. At the highest level in machine learning, this might be a baseline test where baseline is a better than random test. It might even be predicts random things, or even simpler always predicts the same thing. Is this terrible? Perhaps, it is to some who are enamored with the elegance and artistic beauty of his/her code. Is it a good place to start, though? Absolutely. A common issue that I have seen in machine learning is spending so much time up front, implementing the one true algorithm that hardly anything ever gets done. Getting to outperform pure randomness, though, is a useful change that can start making your business money as soon as it's deployed. Green After you have established a failing test, you can start working to get it green. If you start with a very high-level test, you may find that it helps to conceptually break that test up into multiple failing tests that are the lower-level concerns. I'll dive deeper into this later on in this article but for now, just know that you want to get your test passing as soon as possible; lie, cheat, and steal to get there. I promise that cheating actually makes your software's test suite that much stronger. Resist the urge to write the software in an ideal fashion. Just slap something together. You will be able to fix the issues in the next step. Refactor You got your test to pass through all the manners of hackery. Now, you get to refactor your code. Note that it is not to be interpreted loosely. Refactor specifically means to change your software without affecting its behavior. If you add the if clauses, or any other special handling, you are no longer refactoring. Then you write the software without tests. One way where you will know for sure that you are no longer refactoring is that you've broken previously passing tests. If this happens, we back up our changes until our tests pass again. It may not be obvious but this isn't all that it takes for you to know that you haven't changed behavior. Read Refactoring: Improving the Design of Existing Code, Martin Fowler for you to understand how much you should really care for refactoring. By the way of his illustration in this book, refactoring code becomes a set of forms and movements not unlike karate katas. This is a lot of general theory, but what does a test actually look like? How does this process flow in a real problem? Behavior-driven development BDD is the addition of business concerns to the technical concerns more typical of TDD. This came about as people became more experienced with TDD. They started noticing some patterns in the challenges that they were facing. One especially influential person, Dan North, proposed some specific language and structure to ease some of these issues. Some issues he noticed were the following: People had a hard time understanding what they should test next. Deciding what to name a test could be difficult. How much to test in a single test always seemed arbitrary. Now that we have some context, we can define what exactly BDD is. Simply put, it's about writing our tests in such a way that they will tell us the kind of behavior change they affect. A good litmus test might be asking oneself if the test you are writing would be worth explaining to a business stakeholder. How this solves the previous may not be completely obvious, but it may help to illustrate what this looks like in practice. It follows a structure of given, when, then. Committing to this style completely can require specific frameworks or a lot of testing ceremony. As a result, I loosely follow this in my tests as you will see soon. Here's a concrete example of a test description written in this style Given an empty dataset when the classifier is trained, it should throw an invalid operation exception. This sentence probably seems like a small enough unit of work to tackle, but notice that it's also a piece of work that any business user, who is familiar with the domain that you're working in, would understand and have an opinion on. You can read more about Dan North's point of view in this article on his website at dannorth.net/introducing-bdd/. The BDD adherents tend to use specialized tools to make the language and test result reports be as accessible to business stakeholders as possible. In my experience and from my discussions with others, this extra elegance is typically used so little that it doesn't seem worthwhile. The approach you will learn in the book will take a simplicity first approach to make it as easy as possible for someone with zero background to get up to speed. With this in mind, lets work through an example. Our first test Let's start with an example of what a test looks like in Python. The main reason for using this is that while it is a bit of a pain to install a library, this library, in particular, will make everything that we do much simpler. The default unit test solution in Python requires a heavier set up. On top of this, by using nose, we can always mix in tests that use the built-in solution where we find that we need the extra features. First, install it like this: pip install nose If you have never used pip before, then it is time for you to know that it is a very simple way to install new Python libraries. Now, as a hello world style example, lets pretend that we're building a class that will guess a number using the previous guesses to inform it. This is the first simplest example to get us writing some code. We will use the TDD cycle that we discussed previously, and write our first test in painstaking detail. After we get through our first test and have something concrete to discuss, we will talk about the anatomy of the test that we wrote. First, we must write a failing test. The simplest failing test that I can think of is the following: def given_no_information_when_asked_to_guess_test(): number_guesser = NumberGuesser() result = number_guesser.guess() assert result is None, "Then it should provide no result." The context for assert is in the test name. Reading the test name and then the assert name should do a pretty good job of describing what is being tested. Notice that in my test, I instantiate a NumberGuesser object. You're not missing any steps; this class doesn't exist yet. This seems roughly like how I'd want to use it. So, it's a great place to start with. Since it doesn't exist, wouldn't you expect this test to fail? Lets test this hypothesis. To run the test, first make sure your test file is saved so that it ends in _tests.py. From the directory with the previous code, just run the following: nosetests When I do this, I get the following result: Here's a lot going on here, but the most informative part is near the end. The message is saying that NumberGuesser does not exist yet, which is exactly what I expected since we haven't actually written the code yet. Throughout the book, we'll reduce the detail of the stack traces that we show. For now, we'll keep things detailed to make sure that we're on the same page. At this point, we're in a red state in the TDD cycle. Use the following steps to create our first successful test: Now, create the following class in a file named NumberGuesser.py: class NumberGuesser: """Guesses numbers based on the history of your input"" Import the new class at the top of my test file with a simple import NumberGuesser statement. I rerun nosetests, and get the following: TypeError: 'module' object is not callable Oh whoops! I guess that's not the right way to import the class. This is another very tiny step, but what is important is that we are making forward progress through constant communication with our tests. We are going through extreme detail because I can't stress this point enough; bear with me for the time being. Change the import statement to the following: from NumberGuesser import NumberGuesser Rerun nosetests and you will see the following: AttributeError: NumberGuesser instance has no attribute 'guess' The error message has changed, and is leading to the next thing that needs to be changed. From here, just implement what we think we need for the test to pass: class NumberGuesser: """Guesses numbers based on the history of your input""" def guess(self): return None On rerunning the nosetests, we'll get the following result: That's it! Our first successful test! Some of these steps seem so tiny so as to not being worthwhile. Indeed, overtime, you may decide that you prefer to work on a different level of detail. For the sake of argument, we'll be keeping our steps pretty small if only to illustrate just how much TDD keeps us on track and guides us on what to do next. We all know how to write the code in very large, uncontrolled steps. Learning to code surgically requires intentional practice, and is worth doing explicitly. Lets take a step back and look at what this first round of testing took. Anatomy of a test Starting from a higher level, notice how I had a dialog with Python. I just wrote the test and Python complained that the class that I was testing didn't exist. Next, I created the class, but then Python complained that I didn't import it correctly. So then, I imported it correctly, and Python complained that my guess method didn't exist. In response, I implemented the way that my test expected, and Python stopped complaining. This is the spirit of TDD. You have a conversation between you and your system. You can work in steps as little or as large as you're comfortable with. What I did previously could've been entirely skipped over, and the Python class could have been written and imported correctly the first time. The longer you go without talking to the system, the more likely you are to stray from the path to getting things working as simply as possible. Lets zoom in a little deeper and dissect this simple test to see what makes it tick. Here is the same test, but I've commented it, and broken it into sections that you will see recurring in every test that you write: def given_no_information_when_asked_to_guess_test(): # given number_guesser = NumberGuesser() # when guessed_number = number_guesser.guess() # then assert guessed_number is None, 'there should be no guess.' Given This section sets up the context for the test. In the previous test, you acquired that I didn't provide any prior information to the object. In many of our machine learning tests, this will be the most complex portion of our test. We will be importing certain sets of data, sometimes making a few specific issues in the data and testing our software to handle the details that we would expect. When you think about this section of your tests, try to frame it as Given this scenario… In our test, we might say Given no prior information for NumberGuesser… When This should be one of the simplest aspects of our test. Once you've set up the context, there should be a simple action that triggers the behavior that you want to test. When you think about this section of your tests, try to frame it as When this happens… In our test we might say When NumberGuesser guesses a number… Then This section of our test will check on the state of our variables and any return result if applicable. Again, this section should also be fairly straight-forward, as there should be only a single action that causes a change into your object under the test. The reason for this is that if it takes two actions to form a test, then it is very likely that we will just want to combine the two into a single action that we can describe in terms that are meaningful in our domain. A key example maybe loading the training data from a file and training a classifier. If we find ourselves doing this a lot, then why not just create a method that loads data from a file for us? In the book, you will find examples where we'll have the helper functions help us determine whether our results have changed in certain ways. Typically, we should view these helper functions as code smells. Remember that our tests are the first applications of our software. Anything that we have to build in addition to our code, to understand the results, is something that we should probably (there are exceptions to every rule) just include in the code we are testing. Given, When, Then is not a strong requirement of TDD, because our previous definition of TDD only consisted of two things (all that the code requires is a failing test first and an eliminate duplication). It's a small thing to be passionate about and if it doesn't speak to you, just translate this back into Arrange, act, assert in your head. At the very least, consider it as well as why these specific, very deliberate words are used. Applied to machine learning At this point, you maybe wondering how TDD will be used in machine learning, and whether we use it on regression or classification problems. In every machine learning algorithm, there exists a way to quantify the quality of what you're doing. In the linear regression; it's your adjusted R2 value; in classification problems, it's an ROC curve (and the area beneath it) or a confusion matrix, and more. All of these are testable quantities. Of course, none of these quantities have a built-in way of saying that the algorithm is good enough. We can get around this by starting our work on every problem by first building up a completely naïve and ignorant algorithm. The scores that we get for this will basically represent a plain, old, and random chance. Once we have built an algorithm that can beat our random chance scores, we just start iterating, attempting to beat the next highest score that we achieve. Benchmarking algorithms are an entire field onto their own right that can be delved in more deeply. In the book, we will implement a naïve algorithm to get a random chance score, and we will build up a small test suite that we can then use to pit this model against another. This will allow us to have a conversation with our machine learning models in the same manner as we had with Python earlier. For a professional machine learning developer, it's quite likely that an ideal metric to test is a profitability model that compares risk (monetary exposure) to expected value (profit). This can help us keep a balanced view of how much error and what kind of error we can tolerate. In machine learning, we will never have a perfect model, and we can search for the rest of our lives for the best model. By finding a way to work your financial assumptions into the model, we will have an improved ability to decide between the competing models. Summary In this article, you were introduced to TDD as well as BDD. With these concepts introduced, you have a basic foundation with which to approach machine learning. We saw that the specifying behavior in the form of sentences makes for an easier to ready a set of specifications for your software. Building off of that foundation, we started to delve into testing at a higher level. We did this by establishing concepts that we can use to quantify classifiers: the ROC curve and AUC metric. Now, we've seen that different models can be quantified; it follows that they can be compared. Putting all of this together, we have everything we need to explore machine learning with a test-driven methodology. Resources for Article: Further resources on this subject: Optimization in Python[article] How to do Machine Learning with Python[article] Modeling complex functions with artificial neural networks [article]

0
0
3494

article-image-working-drupal-audio-flash-part-2

Packt

16 Oct 2009

6 min read

Working with Drupal Audio in Flash (part 2)

Packt

16 Oct 2009

6 min read

Although there are a handful of controls that we can add to this custom audio player, this section will demonstrate the concept by adding the most basic control for multimedia, which is the play and pause buttons. Adding a play and pause button To begin, we will need to first move and resize our title field within our Flash application, so that it can hold more text than "Hello World". We can then make room for some new controls that will be used to control the playback of our audio file. Again, the design of each of these components is subjective, but what is important is the MovieClip instance hierarchy, which will be used within our ActionScript code. Before we begin, we will need to create a new layer in our TIMELINE that will be used to place all AudioPlayer objects. We will call this new layer player: Creating a base button MovieClip Our base button will simply be a rounded rectangle, which we will then add some gradients to, so as to give it depth. We can do this by first creating a rounded rectangle with a vertical linear gradient fill as follows: We can now give it some very cool depth by adding a smaller rounded rectangle within this one, and then orient the same gradient horizontally. An easy way to do this is to copy the original shape and paste it as a new shape. Once we have a new copy of our original rounded rectangle, we can navigate to Modify | Shape | Expand fill, where we will then select Inset, change our Distance to 4px, and then click on OK. After doing this, you will realize how such a simple contrast in gradients can really bring out the shape. After we have our new button shape, we will then need to create a new MovieClip, so that we can reuse this button for both the play and pause buttons. To do this, simply select both the rounded rectangle regions, and then choose Modify | Convert to Symbol in the Flash menu. We are going to call this new movie clip mcButton. Now that we have a base button MovieClip, we can now add the play and pause symbols to complete the play and pause buttons. Adding the PlayButton movie clip The first button that we will create is the play button, which simply consists of a sideways triangle (icon) with the button behind it. To do this, we will first create a new movie clip that will hold the button we just created, and the play icon. We can do this by first clicking on the mcButton movie clip, and then creating a new movie clip from that by selecting Modify | Convert to Symbol. We will call our new movie clip mcPlayButton. What we are really doing here is creating a parent movie clip for our mcButton, which will allow us to add new specific elements. For the play button, we simply want to add a play symbol. To do this, we first want to make sure that we are within the mcPlayButton movie clip by double-clicking on this symbol, so that our breadcrumb at the top of the stage looks as follows: Our next task is to modify our timeline within this movie clip so that we can separate the icon from the button. We can do this by creating two new layers within our timeline, called button (which will hold our button) and icon (which we will create in the next section). We are now ready to start drawing the play icon. Drawing a play icon To draw a Play icon, we will need to first select the PolyStar Tool by clicking and holding on the tool until you can select the PolyStar Tool. This tool will allow us to create a triangle, which we will use for the play icon in our play button. But before we can start drawing, we need to first set up the PolyStar Tool so that it will draw a triangle. We can do this by clicking on the Options button within the Properties tab, which will then bring up a dialog, where we can tell it to draw a polygon with three sides (triangle). After we click on OK, we will then need to change the fill color of this triangle, so that it is visible on our button. We will just change the fill color to Black. We can then move our cursor onto the stage where the button is, and then draw our triangle in the shape of a play button icon. Remember, if you do not like the shape of what you made, you can always tweak it using the transform tool. When we are done, we should have something that resembles a play button! Our next task is to create a pause button. Since we have already created the play button, which is similar to the pause button except for the icon, we can use a handy tool in Flash that will let us duplicate our play button, and then modify our duplication for the pause button icon. Creating a pause button from the play button In order to create our pause button, we will first need to duplicate our play button into a new movie clip, where we can change the icon from play to pause. To do this, we will first direct our attention to the library section of our Flash IDE, which should show us all of the movie clips that we have created so far. We can find the LIBRARY by clicking on the button on the right-hand side of our workspace. To create a duplicate, we will now right-click on the mcPlayButton movie clip, and then select the option Duplicate. This will then bring up a dialog very similar to the dialog when we created new symbols, but this time, we are defining a new movie clip name that will serve as a duplicate for the original one. We will call our new movie clip duplicate mcPauseButton. Now that we have created our duplicate movie clip, the next task is to change the icon within the pause button. We can do this by opening up our mcPauseButton movie clip by double-clicking on that name within the Library. At this point, we can now change the icon of our pause button without running any risk of also modifying the play button (since we created a duplicate). When we are done, we should have a complete pause button. We now have play a nd pause buttons that we will use to link to our AudioPlayer class.

0
0
3493

How-To Tutorials

Packt

16 Aug 2017

8 min read

Machine Learning Models

Packt

16 Aug 2017

8 min read

In this article by Pratap Dangeti, the author of the book Statistics for Machine Learning, we will take a look at ridge regression and lasso regression in machine learning. (For more resources related to this topic, see here.) Ridge regression and lasso regression In linear regression only residual sum of squares (RSS) are minimized, whereas in ridge and lasso regression, penalty applied (also known as shrinkage penalty) on coefficient values to regularize the coefficients with the tuning parameter λ. When λ=0 penalty has no impact, ridge/lasso produces the same result as linear regression, whereas λ => ∞ will bring coefficients to zero. Before we go in deeper on ridge and lasso, it is worth to understand some concepts on Lagrangian multipliers. One can show the preceding objective function into the following format, where objective is just RSS subjected to cost constraint (s) of budget. For every value of λ, there is some s such that will provide the equivalent equations as shown as follows for overall objective function with penalty factor: The following graph shows the two different Lagrangian format: Ridge regression works well in situations where the least squares estimates have high variance. Ridge regression has computational advantages over best subset selection which required 2P models. In contrast for any fixed value of λ, ridge regression only fits a single model and model-fitting procedure can be performed very quickly. One disadvantage of ridge regression is, it will include all the predictors and shrinks the weights according with its importance but it does not set the values exactly to zero in order to eliminate unnecessary predictors from models, this issue will be overcome in lasso regression. During the situation of number of predictors are significantly large, using ridge may provide good accuracy but it includes all the variables, which is not desired in compact representation of the model, this issue do not present in lasso as it will set the weights of unnecessary variables to zero. Model generated from lasso are very much like subset selection, hence it is much easier to interpret than those produced by ridge regression. Example of ridge regression machine learning model Ridge regression is machine learning model, in which we do not perform any statistical diagnostics on the independent variables and just utilize the model to fit on test data and check the accuracy of fit. Here we have used scikit-learn package: >>> from sklearn.linear_model import Ridge >>> wine_quality = pd.read_csv( >>> wine_quality.rename(columns=lambda x: x.replace(" ", inplace=True) >>> all_colnms = ['fixed_acidity', 'volatile_acidity', 'citric_acid', 'residual_sugar', 'chlorides', Article_01.png λ, ridge regression only fits a idge exactly to zero in egression. are significantly large, idge variables, which is not model, this issue do not present in lasso as it will e regression machine learning model learning model, in which we do not perform any statistical the model to fit on test data and have used scikit-learn package: csv("winequality-red.csv",sep=';') "_"), in ariables, asso odel earning el 'free_sulfur_dioxide', 'total_sulfur_dioxide', 'density', 'pH', 'sulphates', 'alcohol'] >>> pdx = wine_quality[all_colnms] >>> pdy = wine_quality["quality"] >>> x_train,x_test,y_train,y_test = train_test_split(pdx,pdy,train_size = 0.7,random_state=42) Simple version of grid search from scratch has been described as follows, in which various values of alphas are tried to be tested in grid search to test the model fitness: >>> alphas = [1e-4,1e-3,1e-2,0.1,0.5,1.0,5.0,10.0] Initial values of R-squared are set to zero in order to keep track on its updated value and to print whenever new value is greater than exiting value: >>> initrsq = 0 >>> print ("nRidge Regression: Best Parametersn") >>> for alph in alphas: ... ridge_reg = Ridge(alpha=alph) ... ridge_reg.fit(x_train,y_train) 0 ... tr_rsqrd = ridge_reg.score(x_train,y_train) ... ts_rsqrd = ridge_reg.score(x_test,y_test) The following code always keep track on test R-squared value and prints if new value is greater than existing best value: >>> if ts_rsqrd > initrsq: ... print ("Lambda: ",alph,"Train R-Squared value:",round(tr_rsqrd,5),"Test R-squared value:",round(ts_rsqrd,5)) ... initrsq = ts_rsqrd It is is shown in the following screenshot: By looking into test R-squared (0.3513) value we can conclude that there is no significant relationship between independent and dependent variables. Also, please note that, the test R-squared value generated from ridge regression is similar to value obtained from multiple linear regression (0.3519), but with the no stress on diagnostics of variables, and so on. Hence machine learning models are relatively compact and can be utilized for learning automatically without manual intervention to retrain the model, this is one of the biggest advantages of using ML models for deployment purposes. The R code for ridge regression on wine quality data is shown as follows: # Ridge regression library(glmnet) wine_quality = read.csv("winequality-red.csv",header=TRUE,sep = ";",check.names = FALSE) names(wine_quality) <- gsub(" ", "_", names(wine_quality)) set.seed(123) numrow = nrow(wine_quality) trnind = sample(1:numrow,size = as.integer(0.7*numrow)) train_data = wine_quality[trnind,]; test_data = wine_quality[- trnind,] xvars = c("fixed_acidity","volatile_acidity","citric_acid","residual_sugar ","chlorides","free_sulfur_dioxide", "total_sulfur_dioxide","density","pH","sulphates","alcohol") yvar = "quality" x_train = as.matrix(train_data[,xvars]);y_train = as.double (as.matrix (train_data[,yvar])) x_test = as.matrix(test_data[,xvars]) print(paste("Ridge Regression")) lambdas = c(1e-4,1e-3,1e-2,0.1,0.5,1.0,5.0,10.0) initrsq = 0 for (lmbd in lambdas){ ridge_fit = glmnet(x_train,y_train,alpha = 0,lambda = lmbd) pred_y = predict(ridge_fit,x_test) R2 <- 1 - (sum((test_data[,yvar]-pred_y )^2)/sum((test_data[,yvar]-mean(test_data[,yvar]))^2)) if (R2 > initrsq){ print(paste("Lambda:",lmbd,"Test Adjusted R-squared :",round(R2,4))) initrsq = R2 } } Example of lasso regression model Lasso regression is close cousin of ridge regression, in which absolute values of coefficients are minimized rather than square of values. By doing so, we eliminate some insignificant variables, which are very much compacted representation similar to OLS methods. Following implementation is almost similar to ridge regression apart from penalty application on mod/absolute value of coefficients: >>> from sklearn.linear_model import Lasso >>> alphas = [1e-4,1e-3,1e-2,0.1,0.5,1.0,5.0,10.0] >>> initrsq = 0 >>> print ("nLasso Regression: Best Parametersn") >>> for alph in alphas: ... lasso_reg = Lasso(alpha=alph) ... lasso_reg.fit(x_train,y_train) ... tr_rsqrd = lasso_reg.score(x_train,y_train) ... ts_rsqrd = lasso_reg.score(x_test,y_test) ... if ts_rsqrd > initrsq: ... print ("Lambda: ",alph,"Train R-Squared value:",round(tr_rsqrd,5),"Test R-squared value:",round(ts_rsqrd,5)) ... initrsq = ts_rsqrd It is shown in the following screenshot: Lasso regression produces almost similar results as ridge, but if we check the test R-squared values bit carefully, lasso produces little less values. Reason behind the same could be due to its robustness of reducing coefficients to zero and eliminate them from analysis: >>> ridge_reg = Ridge(alpha=0.001) >>> ridge_reg.fit(x_train,y_train) >>> print ("nRidge Regression coefficient values of Alpha = 0.001n") >>> for i in range(11): ... print (all_colnms[i],": ",ridge_reg.coef_[i]) >>> lasso_reg = Lasso(alpha=0.001) >>> lasso_reg.fit(x_train,y_train) >>> print ("nLasso Regression coefficient values of Alpha = 0.001n") >>> for i in range(11): ... print (all_colnms[i],": ",lasso_reg.coef_[i]) Following results shows the coefficient values of both the methods, coefficient of density has been set to o in lasso regression whereas density value is -5.5672 in ridge regression; also none of the coefficients in ridge regression are zero values: R Code – Lasso Regression on Wine Quality Data # Above Data processing steps are same as Ridge Regression, only below section of the code do change # Lasso Regression print(paste("Lasso Regression")) lambdas = c(1e-4,1e-3,1e-2,0.1,0.5,1.0,5.0,10.0) initrsq = 0 for (lmbd in lambdas){ lasso_fit = glmnet(x_train,y_train,alpha = 1,lambda = lmbd) pred_y = predict(lasso_fit,x_test) R2 <- 1 - (sum((test_data[,yvar]-pred_y )^2)/sum((test_data[,yvar]-mean(test_data[,yvar]))^2)) if (R2 > initrsq){ print(paste("Lambda:",lmbd,"Test Adjusted R-squared :",round(R2,4))) initrsq = R2 } } Regularization parameters in linear regression and ridge/lasso regression Adjusted R-squared in linear regression always penalizes adding extra variables with less significance is one type of regularizing the data in linear regression, but it will adjust to unique fit of the model. Whereas in machine learning many parameters are adjusted to regularizing the overfitting problem, in the example of lasso/ridge regression penalty parameter (λ) to regularization, there are infinite values can be applied to regularize the model infinite ways: In overall there are many similarities between statistical way and machine learning ways of predicting the pattern. Summary We have seen ridge regression and lasso regression with their examples and we have also seen its regularization parameters. Resources for Article: Further resources on this subject: Machine Learning Review [article] Getting Started with Python and Machine Learning [article] Machine learning in practice [article]

0
0
3492

article-image-getting-started-openlayers

Packt

22 Mar 2011

9 min read

Getting Started with OpenLayers

Packt

22 Mar 2011

9 min read

OpenLayers 2.10 Beginner's Guide Create, optimize, and deploy stunning cross-browser web maps with the OpenLayers JavaScript web mapping library What is OpenLayers? OpenLayers is an open source, client side JavaScript library for making interactive web maps, viewable in nearly any web browser. Since it is a client side library, it requires no special server side so beware or settings—you can use it without even downloading anything! Originally developed by Metacarta, as a response, in part, to Google Maps, it has grown into a mature, popular framework with many passionate developers and a very helpful community. Why use OpenLayers? OpenLayers makes creating powerful web-mapping applications easy and fun. It is very powerful but also easy to use—you don't even need to be a programmer to make a great map with it. It's open source, free, and has a strong community behind it. So if you want to dig into the internal code, or even improve it, you're encouraged to do so. Cross browser compatibility is handled for you—it even works in IE6. OpenLayers is not tied to any proprietary technology or company, so you don't have to worry so much about your application breaking (unless you break it). At the time of writing, support for modern mobile and touch devices is in the works (with many proof of concept examples), and should be in the official library in the near future—if they aren't by the time you're reading this. OpenLayers allows you to build entire mapping applications from the ground up, with the ability to customize every aspect of your map—layers, controls, events, etc. You can use a multitude of different map server backends together, including a powerful vector layer. It makes creating map 'mashups' extremely easy. What, technically, is OpenLayers? We said OpenLayers is a client side JavaScript library, but what does this mean? Client side When we say client side we are referring to the user's computer, specifically their web browser. The only thing you need to have to make OpenLayers work is the OpenLayers code itself and a web browser. You can either download it and use it on your computer locally, or download nothing and simply link to the JavaScript file served on the site that hosts the OpenLayers project (http://openlayers.org). OpenLayers works on nearly all browsers and can be served by any web server or your own computer. Using a modern, standard-based browser such as Firefox, Google Chrome, Safari, or Opera is recommended. Library When we say library we mean that OpenLayers is an API (Application Programmer Interface) that provides you with tools to develop your own web maps. Instead of building a mapping application from scratch, you can use OpenLayers for the mapping part, which is maintained and developed by a bunch of brilliant people For example, if you wanted to write a blog you could either write your own blog engine, or use an existing one such as WordPress or Blogger and build on top of it. Similarly, if you wanted to create a web map, you could write your own from scratch, or use so beware that has been developed and tested by a group of developers with a strong community behind it. By choosing to use OpenLayers, you do have to learn how to use the library, but the benefits greatly outweigh the costs. You get to use a rich, highly tested and maintained code base, and all you have to do is learn how to use it. OpenLayers is written in JavaScript, but don't fret if you don't know it very well. All you really need is some knowledge of the basic syntax, and we'll try to keep things as clear as possible in the code examples. If you are unfamiliar with JavaScript, Mozilla provides phenomenal JavaScript documentation at https://developer.mozilla. org/en/javascript. Anatomy of a web-mapping application First off —what is a 'web-mapping application. To put it bluntly, it's some type of Internet application that makes use of a map. This could be a site that displays the latest geo-tagged images from Flickr , a map that shows markers of locations you've traveled to, or an application that tracks invasive plant species and displays them. If it contains a map and it does something, you could argue that it is a web map application. The term can be used in a pretty broad sense. So where exactly does OpenLayers fit in? We know OpenLayers is a client side mapping library, but what does that mean? Let's take a look at the following screenshot: This is called the Client / Server Model and it is, essentially, the core of how all web applications operate. In the case of a web map application, some sort of map client (e.g., OpenLayers) communicates with some sort of web map server (e.g., a WMS server or the Google Maps backend). Web map client OpenLayers lives on the client side. One of the primary tasks the client performs is to get map images from a map server. Essentially, the client has to ask a map server for what you want to look at. Every time you navigate or zoom around on the map, the client has to make new requests to the server—because you're asking to look at something different. OpenLayers handles this all for you, and it is happening via asynchronous JavaScript (AJAX) calls to a map server. To reiterate—the basic concept is that OpenLayers sends requests to a map server for map images every time you interact with the map, then OpenLayers pieces together all the returned map images so it looks like one big, seamless map. Web map server A map server (or map service) provides the map itself. There are a myriad of different map server backends. A small sample includes WMS, Google Maps, Yahoo! Maps, ESRI ArcGIS, WFS, and OpenStreet Maps. If you are unfamiliar with those terms, don't sweat it. The basic principle behind all those services is that they allow you to specify the area of the map you want to look at (by sending a request), and then the map servers send back a response containing the map image. With OpenLayers, you can choose to use as many different backends in any sort of combination as you'd like. OpenLayers is not a web map server; it only consumes data from them. So, you will need to be able to access some type of web map service. Don't worry though. Fortunately, there are a myriad of free and/or open source web map servers available that are remotely hosted or easy to set up yourself, such as MapServer. Throughout this article, we'll often use a freely available web mapping service from OSGeo, so don't worry about having to provide your own. With many web map servers you do not have to do anything to use them—just supplying a URL to them in OpenLayers is enough. OSGeo, OpenStreet Maps, Google, Yahoo!, and Bing Maps, for instance, provide access to their map servers (although, some commercial restrictions may apply with various services in some situations). Relation to Google / Yahoo! / and other mapping APIs The Google, Yahoo!, Bing, and ESRI Mappings API allow you to connect with their map server backend. Their APIs also usually provide a client side interface (at least in the case of Google Maps). The Google Maps API , for instance, is fairly powerful. You have the ability to add markers, plot routes, and use KML data (things you can also do in OpenLayers)—but the main drawback is that your mapping application relies totally on Google. The map client and map server are provided by a third party. This is not inherently a bad thing, and for many projects, Google Maps and the like are a good fit However, there are quite a few drawbacks: You're not in control of the backend You can't really customize the map server backend, and it can change at any time There may be some commercial restrictions, or some costs involved These other APIs also cannot provide you with anything near the amount of flexibility and customization that an open source mapping application framework (i.e., OpenLayers) offers. Layers in OpenLayers So, what's with the Layer in OpenLayers? Well, OpenLayers allows you to have multiple different 'backend' servers that your map can use. To access a web map server, you create a layer object and add it to your map with OpenLayers. For instance, if you wanted to have a Google Maps and a WMS service displayed on your map, you would use OpenLayers to create a GoogleMaps layer object and a WMS layer object, and then add them to your OpenLayers map. We'll soon see an example with a WMS layer, so don't worry if you're a little confused. What is a Layer? Like layers of an onion, each layer is above and will cover up the previous one; the order that you add in the layers is important. With OpenLayers, you can arbitrarily set the overall transparency of any layer, so you are easily able to control how much layers cover each other up, and dynamically change the layer order at any time. For instance, you could have a Google map as your base layer, a layer with satellite imagery that is semi-transparent, and a vector layer all active on your map at once. A vector layer is a powerful layer that lets us add markers and various geometric objects to our maps. Thus, in this example, your map would have three separate layers. The OpenLayers website The website for OpenLayers is located at http://openlayers.org/. To begin, we need to download a copy of OpenLayers (or, we can directly link to the library—but we'll download a local copy). You can download the compressed library as either a .tar.gz or .zip, but both contain the same files. Let's go over the links: Link to the hosted version: If you do not want to actually download OpenLayers, you can instead link to the OpenLayers library by adding this script URL to your site in a <script> tag. 2.10 (Stable) .tar.gz or .zip: This should show the latest stable release (2.10 at the time of writing). You can download it as either a tar.gz or .zip; if you are unsure of which to get, you should download the .zip version. 2.10 Release Notes: This highlights things that have changed, bugs that have been fixed, etc. Class documentation on, more documentation on: These are links to the API documentation, which we will make heavy use of throughout the article. I recommend opening it up and keeping it up while working through the examples.

0
0
3492

article-image-joomla-16-organizing-and-managing-content

Packt

08 Apr 2011

10 min read

Joomla! 1.6: Organizing and Managing Content

Packt

08 Apr 2011

10 min read

Joomla! 1.6 First Look A concise guide to everything that's new in Joomla! 1.6. Anyone who's used to working with the previous versions of Joomla! knows the old section—category—article drill. Articles had to be part of a category and categories had to be part of a section. There were no workarounds for this rigid three-level content organization scheme. Sometimes, this required Joomla! users to adapt their content to the system's limitations (or extend Joomla!'s functionality by using more powerful content management extensions, so-called Content Construction Kits or CCKs). In Joomla! 1.6, the rigid old system has finally been replaced. Sections have gone; there are now only categories, and any category can hold as many levels of subcategories as you need. In the backend, instead of both a Section Manager and a Category Manager, you'll now find only a Category Manager. You can forget the concept of sections altogether; in Joomla!! 1.6 there's no need for them anymore, as they're no longer needed as 'containers' to hold categories. Improvement #1: categories can now be one level deep Sometimes, you'll want to organize articles in just one category. Let's say you want to add a few articles about your organization: who you are, where to reach you, and so on. You don't need any subcategories. You'd need a structure like this: In Joomla! 1.5, this simple setup of a "sectionless" category holding articles wasn't possible. You'd have to organize content in sections and categories—which implied that any group of articles would be stored two levels deep, even if you didn't need this. The only alternative was not to organize content, using uncategorized articles. In Joomla! 1.6, you can put content in just one category if you want to. Just go to Content | Category Manager | Add New Category to create a new category. In the Parent drop down box, select No parent: As this category has "No parent", it becomes a top-level or "parent" category. It's as simple as that; now you can do something that wasn't possible in Joomla! 1.5, by assigning articles directly to this category. Improvement #2: creating multiple category levels Joomla!'s old section—category—article approach didn't allow you to create categories within categories ( "nested categories"). However, on content-rich sites, you might need more than two levels of content organization and use a few subcategories. Here's an example from a site featuring product reviews. It uses several levels to organize the main category of "reviews" in subcategories of product types, brands, and models: A great advantage of being able to create such a structure is that it allows for very specific searches (that is, within categories) and multiple ways of navigation. Another example is if you are creating a catalog that you want to be searchable with multiple filters such as manufacturer, price, general item type, or a specific product name. Creating a set of 'nested' categories Let's find out how you can quickly set up a few nested categories like the ones shown in the illustration above: Go to Content | Category Manager | Add New Category. In the Title field, enter Reviews. In the Parent field, make sure that the default option No parent is selected. The screen should look like this: Click on Save & New. A message appears to confirm your action: Category successfully saved. At the same time, all the fields in the Add New Category are emptied. To create a subcategory, enter the subcategory name Cameras in the Title field. In the Parent drop-down box, select Reviews: Click on Save & New to store the subcategory. Repeat the previous three steps to create more subcategories. For each new category, first enter a title, then select the appropriate parent category and save it by clicking on Save & New. When you're done with creating subcategories, click on Save & Close to view the results in the Category Manager. In the example below, the Cameras category is parent to a subcategory Compact Cameras. The Compact Cameras category is parent to a subcategory called Canon. If you've followed the above example, you'll find the following set of categories in the Category Manager. They are displayed as shown below: The Reviews name isn't indented, as it is a top-level category. Cameras, Compact Cameras, and Canon are displayed indented as they are subcategories. When you create articles, you can now assign them to the new categories. The same category hierarchy as you've just seen in the Category Manager is displayed in the Category drop-down box: Using nested categories in the sample data You've just set up a few categories and subcategories yourself. On a complex site, you can have a far more complex structure. Don't worry, I won't ask you to create dozens of nested categories right now—but it's a good idea to learn from the example set by the Joomla! Developers. Let's have a look at the categories and articles that come with Joomla! when it is installed with sample data. The way things are organized there will give you some idea of how you can deploy nested categories and get the most out of the new system. Exploring the sample data On the frontend, click on the Sample sites link in the This Site menu. On the Sample Sites page, a new menu appears. This menu gives access to both sample sites—Australian Parks and Fruit Shop: Have a look around at both example sites. They appear to be separate websites, but they're not. Here the Joomla! developers have cunningly deployed the possibilities of the new category system and have organized all content for the three sites (the main site and two example sites) within one big website. To find out how this is done, let's have a look at the categories in the backend: Go to Content | Category Manager to see how the sample content is organized. The screenshot below shows an overview: As you can see in the screenshot above, there's one top-level category, Sample Data-Articles. All other articles are contained in the subcategories of this main level category. Apart from the top level category, there are three main categories: The Joomla! category. It has three sublevels. The Park Site category. It has two sublevels. The Fruit Shop category. It has one sublevel. Finally, there's a group of articles that's not in any category; it's a bunch of leftovers all marked as Uncategorized. How can different categories look like different sites? As you click through the example sites, not only the content changes; the menu links to each main category (such as the Parks and Fruit Shop category) have specific templates assigned to them. This way, on the frontend, the look-and-feel of the different main article categories are totally different, whereas in the backend, they're just part of one big site. Applying templates to categories can give visitors the impression of exploring a separate set of websites. Although there's no limit to the number of levels in the category hierarchy, even in this rather complex set of sample site articles, categories don't go further than four levels deep. It is possible to make more subcategories, but keep in mind that this means that your content will be stored 'deeper' in the hierarchy, possibly making it more difficult for visitors (and search engines) to find it. One benefit of placing interrelated content under its own main level category is that you can easily unpublish, delete, or archive any content dealing with a specific subject by unpublishing, deleting, or archiving this main level category. That's why the Joomla! developers have chosen to use one top-level category for all sample data. By unpublishing the top level category (Sample Data-Articles), you can unpublish all of the example content in one go. New category settings: notes and metadata When entering or editing a new category, the New Category or Edit Category screen now offer you an area to type notes about the purpose of the category or related items, as well as a place to add keywords and a description (metadata). The Note field (found in the Basic Options section) can be useful to share some extra information about the category with other backend users. For example, you can enter a short explanation about this category ('subcategory of ...'): Adding category metadata In Joomla! 1.5, there was no way to separately enter metadata for category pages. Now, you can enter specific Meta Description and Meta Keywords in the Metadata Options section when creating or editing a category. Another new item in the Basic Options of a category is the Alternative Layout select box. Alternative layouts are an advanced new feature that enable you to select a customized layout for the current category, provided the selected template (or a third-party component) provides these extra layout options. A template can contain so-called template override files, allowing for customized layouts that replace Joomla!'s default views. Using the Alternative Layout select box, you can now select the template override you want to activate for this particular item. To find out more about this feature, have a look at the "Introduction to Alternative Layouts in Version 1.6" document on the Joomlacode site. You'll find it at http://downloads.joomlacode.org/trackeritem/5/8/6/58619/introtoaltlayoutsinversion1-6v2.pdf. Fresh ways to display category contents on the frontend Joomla! 1.6 provides several additional methods to display category contents. They replace the four classic layouts of Category List, Category Blog, Section List, and Section Blog. When creating a new menu link pointing to a category, you are now presented with a slightly different set of Menu Item Types: These are the category views are available: List All Categories is a new view, described below Category Blog was previously called Category Blog Layout Category List was previously called Category List Layout The Blog and List views are basically the same as they've always been. However, these display types now offer new settings that provide more control over the look and feel of the resulting pages. Along with the new List All Categories menu item type, there are also a few new module types that provide you with new ways to display links to categories and their article contents. Let's have a closer look at the new category views. New category view # 1: List All Categories The new category system rationalizes the organization of content, even in large or complex websites. One advantage of this is that you can more easily give visitors (and search engines!) access to all that well-structured content, just by adding one menu link to a main level category. This will allow visitors to easily drill down the different layers (the category levels) of the site structure. To achieve this, the new List All Categories menu link type allows you to display categories as well as their subcategory contents. You can see an example of this menu organization if you select the Site Map link on the This Site menu in the frontend of the sample Joomla! 1.6 content. As we've previously seen, the sample data that comes with Joomla! 1.6 is organized in a structured way. The Site Map link uses the List All Categories menu item type to show all levels in the category hierarchy.

0
0
3492

article-image-various-components-sencha-touch

Packt

16 Feb 2012

8 min read

The Various Components in Sencha Touch

Packt

16 Feb 2012

8 min read

0
0
3491

article-image-wordpress-buddypress-courseware

Packt

19 Jun 2012

10 min read

Wordpress: Buddypress Courseware

Packt

19 Jun 2012

10 min read

0
0
3491

How-To Tutorials

Packt

26 Aug 2014

14 min read

Using OSGi Services

Packt

26 Aug 2014

14 min read

0
0
3488

article-image-internet-peas-gardening-javascript-part-2

Anna Gerber

23 Nov 2015

6 min read

The Internet of Peas? Gardening with JavaScript Part 2

Anna Gerber

23 Nov 2015

6 min read

In this two-part article series, we're building an internet-connected garden bot using JavaScript. In part one, we set up a Particle Core board, created a Johnny-Five project, and ran a Node.js program to read raw values from a soil moisture sensor. Adding a light sensor Let's connect another sensor. We'll extend our circuit to add a photo resistor to measure the ambient light levels around our plants. Connect one lead of the photo resistor to ground, and the other to analog pin 4, with a 1K pull-down resistor from A4 to the 3.3V pin. The value of the pull-down resistor determines the raw readings from the sensor. We're using a 1K resistor so that the sensor values don't saturate under tropical sun conditions. For plants kept inside a dark room, or in a less sunny climate, a 10K resistor might be a better choice. Read more about how pull-down resistors work with photo resistors at AdaFruit. Now, in our board's ready callback function, we add another sensor instance, this time on pin A4: var lightSensor = new five.Sensor({ pin: "A4", freq: 1000 }); lightSensor.on("data", function() { console.log("Light reading " + this.value); }); For this sensor we are logging the sensor value every second, not just when it changes. We can control how often sensor events are emitted by specifying the number of milliseconds in the freq option when creating the sensor. We can use the threshold config option can be used to control when the change callback occurs. Calibrating the soil sensor The soil sensor uses the electrical resistance between two probes to provide a measure of the moisture content of the soil. We're using a commercial sensor, but you could make your own simply using two pieces of wire spaced about an inch apart (using galvinized wire to avoid rust). Water is a good conductor of electricity, so a low reading means that the soil is moist, while a high amount of resistance indicates that the soil is dry. Because these aren't very sophisticated sensors, the readings will vary from sensor to sensor. In order to do anything meaningful with the readings within our application, we'll need to calibrate our sensor. Calibrate by making a note of the sensor values for very dry soil, wet soil, and in between to get a sense of what the optimal range of values should be. For an imprecise sensor like this, it also helps to map the raw readings onto ranges that can be used to display different messages (e.g. very dry, dry, damp, wet) or trigger different actions. The scale method on the Sensor class can come in handy for this. For example, we could convert the raw readings from 0 - 1023 to a 0 - 5 scale: soilSensor.scale(0, 5).on("change", function() { console.log(this.value); }); However, the raw readings for this sensor range between about 50 (wet) to 500 (fairly dry soil). If we're only interested in when the soil is dry, i.e. when readings are above 300, we could use a conditional statement within our callback function, or use the within method so that the function is only triggered when the values are inside a range of values we care about. soilSensor.within([ 300, 500 ], function() { console.log("Water me!"); }); Our raw soil sensor values will vary depending on the temperature of the soil, so this type of sensor is best for indoor plants that aren't exposed to weather extremes. If you are installing a soil moisture sensor outdoors, consider adding a temperature sensor and then calibrate for values at different temperature ranges. Connecting more sensors We have seven analog and seven digital IO pins on the Particle Core, so we could attach more sensors, perhaps more of the same type to monitor additional planters, or different types of sensors to monitor additional conditions. There are many kinds of environmental sensors available through online marketplaces like AliExpress and ebay. These include sensors for temperature, humidity, dust, gas, water depth, particulate detection etc. Some of these sensors are straightforward analog or digital devices that can be used directly with the Johnny-Five Sensor class, as we have with our soil and light sensors. The Johnny-Five API also includes subclasses like Temperature, with controllers for some widely used sensor components. However, some sensors use protocols like SPI, I2C or OneWire, which are not as well supported by Johnny-Five across all platforms. This is always improving, for example, I2C was added to the Particle-IO plugin in October 2015. Keep an eye on I2C component backpacks which are providing support for additional sensors via secondary microcontrollers. Automation If you are gardening at scale, or going away on extended vacation, you might want more than just monitoring. You might want to automate some basic garden maintenance tasks, like turning on grow lights on overcast days, or controlling a pump to water the plants when the soil moisture level gets low. This can be acheived with relays. For example, we can connect a relay with a daylight bulb to a digital pin, and use it to turn lights on in response to the light readings, but only between certain hours: var five = require("johnny-five"); var Particle = require("particle-io"); var moment = require("moment"); var board = new five.Board({ io: new Particle({ token: process.env.PARTICLE_TOKEN, deviceId: process.env.PARTICLE_DEVICE_ID }) }); board.on("ready", function() { var lightSensor = new five.Sensor("A4"); var lampRelay = new five.Relay(2); lightSensor.scale(0, 5).on("change", function() { console.log("light reading is " + this.value) var now = moment(); var nightCurfew = now.endOf('day').subtract(4,'h'); var morningCurfew = now.startOf('day').add(6,'h'); if (this.value > 4) { if (!lampRelay.isOn && now.isAfter(morningCurfew) && now.isBefore(nightCurfew)) { lampRelay.on(); } } else { lampRelay.off(); } }); }); And beyond... One of the great things about using Node.js with hardware is that we can extend our apps with modules from npm. We could publish an Atom feed of sensor readings over time, push the data to a web UI using socket-io, build an alert system or create a data visualization layer, or we might build an API to control lights or pumps attached via relays to our board. It's never been easier to program your own internet-connected robot helpers and smart devices using JavaScript. Build more exciting robotics projects with servos and motors – click here to find out how. About the author Anna Gerber is a full-stack developer with 15 years’ experience in the university sector, formerly a Technical Project Manager at The University of Queensland ITEE eResearchspecializing in Digital Humanities and Research Scientist at the Distributed System Technology Centre (DSTC). Anna is a JavaScript robotics enthusiast and maker who enjoys tinkering with soft circuits and 3D printers.

0
0
3488

How-To Tutorials

article-image-recording-calls-freepbx-25

Packt

16 Oct 2009

2 min read

Recording Calls in FreePBX 2.5

Packt

16 Oct 2009

2 min read

Asterisk has a wonderful, built-in ability to record calls. No additional software is required to make this happen. When Asterisk records a call, both sides of the call are recorded and written out to a file for playback on a computer. Call recording is often performed in call centers to ensure call quality, or to keep calls for later review, should the need arise. Asterisk provides the ability to record all of the calls, or to selectively record calls. In this article, we will look the following: General recording options Recording calls to extensions Recording calls to queues Recording calls to conferences Maintaining call recordings Before enabling call recording for your PBX, make sure that you are aware of the legalities surrounding call recordings and privacy laws. Call recordings are prohibited in certain places, unless the caller is told that the call will be recorded. For example, in the state of California all of the parties on the call must consent to the call being recorded before it begins. Playing back a message stating that the call is being recorded prior to the call being answered is considered a valid form of consent. Recording formats FreePBX allows calls to be recorded in the following formats: WAV WAV49 ULAW ALAW SLN GSM Each format has its own ratio of file size to recording quality, and certain formats will not play on all of the computers. A comparison between all of the available formats is as follows:

0
0
3484

article-image-eu-approves-labour-protection-laws-for-whistleblowers-and-gig-economy-workers-with-implications-for-tech-companies

Savia Lobo

17 Apr 2019

5 min read

EU approves labour protection laws for ‘Whistleblowers’ and ‘Gig economy’ workers with implications for tech companies

Savia Lobo

17 Apr 2019

5 min read

The European Union approved two new labour protection laws recently. This time, for the two not so hyped sects, the whistleblowers and the ones earning their income via the ‘gig economy’. As for the whistleblowers, with the new law, they receive an increased protection landmark legislation aimed at encouraging reports of wrongdoing. On the other hand, for those working for ‘on-demand’ jobs, thus, termed as the gig economy, the law sets minimum rights and demands increased transparency for such workers. Let’s have a brief look at each of the newly approved by the EU. Whistleblowers’ shield against retaliation On Tuesday, the EU parliament approved a new law for whistleblowers safeguarding them from any retaliation within an organization. The law protects whistleblowers against dismissal, demotion and other forms of punishment. “The law now needs to be approved by EU ministers. Member states will then have two years to comply with the rules”, the EU proposal states. Transparency International calls this as “pathbreaking legislation”, which will also give employees a "greater legal certainty around their rights and obligations". The new law creates a safe channel which allows the whistleblowers to report of an EU law breach both within an organization and to public authorities. “It is the first time whistleblowers have been given EU-wide protection. The law was approved by 591 votes, with 29 votes against and 33 abstentions”, the BBC reports. In cases where no appropriate action is taken by the organization’s authorities even after reporting, whistleblowers are allowed to make public disclosure of the wrongdoing by communicating with the media. European Commission Vice President, Frans Timmermans, says, “potential whistleblowers are often discouraged from reporting their concerns or suspicions for fear of retaliation. We should protect whistleblowers from being punished, sacked, demoted or sued in court for doing the right thing for society.” He further added, “This will help tackle fraud, corruption, corporate tax avoidance and damage to people's health and the environment.” “The European Commission says just 10 members - France, Hungary, Ireland, Italy, Lithuania, Malta, the Netherlands, Slovakia, Sweden, and the UK - had a "comprehensive law" protecting whistleblowers”, the BBC reports. “Attempts by some states to water down the reform earlier this year were blocked at an early stage of the talks with Luxembourg, Ireland, and Hungary seeking to have tax matters excluded. However, a coalition of EU states, including Germany, France, and Italy, eventually prevailed in keeping tax revelations within the proposal”, the Reuters report. “If member states fail to properly implement the law, the European Commission can take formal disciplinary steps against the country and could ultimately refer the case to the European Court of Justice”, BBC reports. To know more about this new law for whistleblowers, read the official proposal. EU grants protection to workers in Gig economy (casual or short-term employment) In a vote on Tuesday, the Members of the European Parliament (MEP) announced minimum rights for workers with on-demand, voucher-based or platform jobs, such as Uber or Deliveroo. However, genuinely self-employed workers would be excluded from the new rules. “The law states that every person who has an employment contract or employment relationship as defined by law, collective agreements or practice in force in each member state should be covered by these new rights”, BBC reports. “This would mean that workers in casual or short-term employment, on-demand workers, intermittent workers, voucher-based workers, platform workers, as well as paid trainees and apprentices, deserve a set of minimum rights, as long as they meet these criteria and pass the threshold of working 3 hours per week and 12 hours per 4 weeks on average”, according to EU’s official website. For this, all workers need to be informed from day one as a general principle, but no later than seven days where justified. Following are the specific set of rights to cover new forms of employment includes: Workers with on-demand contracts or similar forms of employment should benefit from a minimum level of predictability such as predetermined reference hours and reference days. They should also be able to refuse, without consequences, an assignment outside predetermined hours or be compensated if the assignment was not cancelled in time. Member states shall adopt measures to prevent abusive practices, such as limits to the use and duration of the contract. The employer should not prohibit, penalize or hinder workers from taking jobs with other companies if this falls outside the work schedule established with that employer. Enrique Calvet Chambon, the MEP responsible for seeing the law through, said, “This directive is the first big step towards the implementation of the European Pillar of Social Rights, affecting all EU workers. All workers who have been in limbo will now be granted minimum rights thanks to this directive, and the European Court of Justice rulings, from now on no employer will be able to abuse the flexibility in the labour market.” To know more about this new law on Gig economy, visit EU’s official website. 19 nations including The UK and Germany give thumbs-up to EU’s Copyright Directive Facebook discussions with the EU resulted in changes of its terms and services for users The EU commission introduces guidelines for achieving a ‘Trustworthy AI’

0
0
3482

Packt

08 Sep 2010

13 min read

Human Interactions in BPEL

Packt

08 Sep 2010

13 min read

(For more resources on BPEL, SOA and Oracle see here.) Human interactions in business processes The main objective of BPEL has been to standardize the process automation. BPEL business processes make use of services and externalize their functionality as services. BPEL processes are defined as a collection of activities through which services are invoked. BPEL does not make a distinction between services provided by applications and other interactions, such as human interactions, which are particularly important. Real-world business processes namely often integrate not only systems and services, but also humans. Human interactions in business processes can be very simple, such as approval of certain tasks or decisions, or complex, such as delegation, renewal, escalation, nomination, chained execution, and so on. Human interactions are not limited to approvals and can include data entries, process monitoring and management, process initiation, exception handling, and so on. Task approval is the simplest and probably the most common human interaction. In a business process for opening a new account, a human interaction might be required to decide whether the user is allowed to open the account. In a travel approval process, a human might approve the decision from which airline to buy the ticket (as shown in the following figure). If the situation is more complex, a business process might require several users to make approvals, either in sequence or in parallel. In sequential scenarios, the next user often wants to see the decision made by the previous user. Sometimes, particularly in parallel human interactions, users are not allowed to see the decisions taken by other users. This improves the decision potential. Sometimes one user does not even know which other users are involved, or whether any other users are involved at all. A common scenario for involving more than one user is workflow with escalation. Escalation is typically used in situations where an activity does not fulfill a time constraint. In such a case, a notification is sent to one or more users. Escalations can be chained, going first to the first-line employees and advancing to senior staff if the activity is not fulfilled. Sometimes it is difficult or impossible to define in advance which user should perform an interaction. In this case, a supervisor might manually nominate the task to other employees; the nomination can also be made by a group of users or by a decision-support system. In other scenarios, a business process may require a single user to perform several steps that can be defined in advance or during the execution of the process instance. Even more complex processes might require that one workflow is continued with another workflow. Human interactions are not limited to only approvals; they may also include data entries or process management issues, such as process initiation, suspension, and exception management. This is particularly true for long-running business processes, where, for example, user exception handling can prevent costly process termination and related compensation for those activities that have already been successfully completed. As a best practice for human workflows, it is usually not wise to associate human interactions directly to specific users; it is better to connect tasks to roles and then associate those roles with individual users. This gives business processes greater flexibility, allowing any user with a certain role to interact with the process and enabling changes to users and roles to be made dynamically. To achieve this, the process has to gain access to users and roles, stored in the enterprise directory, such as LDAP (Lightweight Directory Access Protocol). Workflow theory has defined several workflow patterns, which specify the abovedescribed scenarios in detail. Examples of workflow patterns include sequential workflow, parallel workflow, workflow with escalation, workflow with nomination, ad-hoc workflow, workflow continuation, and so on. Human Tasks in BPEL So far we have seen that human interaction in business processes can get quite complex. Although BPEL specification does not specifically cover human interactions, BPEL is appropriate for human workflows. BPEL business processes are defined as collections of activities that invoke services. BPEL does not make a distinction between services provided by applications and other interactions, such as human interactions. There are mainly two approaches to support human interactions in BPEL. The first approach is to use a human workflow service. Several vendors today have created workflow services that leverage the rich BPEL support for asynchronous services. In this fashion, people and manual tasks become just another asynchronous service from the perspective of the orchestrating process and the BPEL processes stay 100% standard. The other approach has been to standardize the human interactions and go beyond the service invocations. This approach resulted in the workflow specifications emerging around BPEL with the objective to standardize the explicit inclusion of human tasks in BPEL processes. The BPEL4People specification has emerged, which was originally put forth by IBM and SAP in July 2005. Other companies, such as Oracle, Active Endpoints, and Adobe joined later. Finally, this specification is now being advanced within the OASIS BPEL4People Technical Committee. The BPEL4People specification contains two parts: BPEL4People version 1.0, which introduces BPEL extensions to address human interactions in BPEL as a first-class citizen. It defines a new type of basic activity, which uses human tasks as an implementation, and allows specifying tasks local to a process or use tasks defined outside of the process definition. BPEL4People is based on the WS-HumanTask specification that it uses for the actual specification of human tasks. Web Services Human Task (WS-HumanTask) version 1.0 introduces the definition of human tasks, including their properties, behavior, and a set of operations used to manipulate human tasks. It also introduces a coordination protocol in order to control autonomy and lifecycle of service-enabled human tasks in an interoperable manner. The most important extensions introduced in BPEL4People are people activities and people links. People activity is a new BPEL activity used to define user interactions; in other words, tasks that a user has to perform. For each people activity, the BPEL server must create work items and distribute them to users eligible to execute them. People activities can have input and output variables and can specify deadlines. To specify the implementation of people activities, BPEL4People introduced tasks. Tasks specify actions that users must perform. Tasks can have descriptions, priorities, deadlines, and other properties. To represent tasks to users, we need a client application that provides a user interface and interacts with tasks. It can query available tasks, claim and revoke them, and complete or fail them. To associate people activities and the related tasks with users or groups of users, BPEL4People introduced people links. People links are somewhat similar to partner links; they associate users with one or more people activities. People links are usually associated with generic human roles, such as process initiator, process stakeholders, owners, and administrators. The actual users that are associated with people activities can be determined at design time, deployment time, or runtime. BPEL4People anticipates the use of directories such as LDAP to select users. However, it doesn't define the query language used to select users. Rather, it foresees the use of LDAP filters, SQL, XQuery, or other methods. BPEL4People proposes complex extensions to the BPEL specification. However, so far it is still quite high level and doesn't yet specify the exact syntax of the new activities mentioned above. Until the specification becomes more concrete, we don't expect vendors to implement the proposed extensions. But while BPEL4People is early in the standardization process, it shows a great deal of promise. The BPEL4People proposal raises an important question: Is it necessary to introduce such complex extensions to BPEL to cover user interactions? Some vendor solutions model user interactions as just another web service, with well-defined interfaces for both BPEL processes and client applications. This approach does not require any changes to BPEL. To become portable, it would only need an industry-wide agreement on the two interfaces. And, of course, both interfaces can be specified with WSDL, which gives developers great flexibility and lets them use practically any environment, language, or platform that supports Web Services. Clearly, a single standard approach has not yet been adopted for extending BPEL to include Human Tasks and workflow services. However, this does not mean that developers cannot use BPEL to develop business processes with user interactions. Human Task integration with BPEL To interleave user interactions with service invocations in BPEL processes we can use a workflow service, which interacts with BPEL using standard WSDL interfaces. This way, the BPEL process can assign user tasks and wait for responses by invoking the workflow service using the same syntax as for any other service. The BPEL process can also perform more complex operations such as updating, completing, renewing, routing, and escalating tasks. After the BPEL process has assigned tasks to users, users can act on the tasks by using the appropriate applications. The applications communicate with the workflow service by using WSDL interfaces or another API (such as Java) to acquire the list of tasks for selected users, render appropriate user interfaces, and return results to the workflow service, which forwards them to the BPEL process. User applications can also perform other tasks such as reassign, escalate, route, suspend, resume, and withdraw. Finally, the workflow service may allow other communication channels, such as e-mail and SMS, as shown in the following figure: Oracle Human Workflow concepts Oracle SOA Suite 11g provides the Human Workflow component, which enables including human interaction in BPEL processes in a relatively easy way. The Human Workflow component consists of different services that handle various aspects of human interaction with business process and expose their interfaces through WSDL; therefore, BPEL processes invoke them just like any other service. The following figure shows the overall architecture of the Oracle Workflow services: As we can see in the previous figure, the Workflow consists of the following services: Task Service exposes operations for task state management, such as operations to update a task, complete a task, escalate a task, reassign a task, and so on. When we add a human task to the BPEL process, the corresponding partner link for the Task Service is automatically created. Task Assignment Service provides functionality to route, escalate, reassign tasks, and more. Task Query Service enables retrieving the task list for a user based on a search criterion. Task Metadata Service enables retrieving the task metadata. Identity Service provides authentication and authorization of users and lookup of user properties and privileges. Notification Service enables sending of notifications to users using various channels (e-mail, voice message, IM, SMS, and so on). User Metadata Service manages metadata, related to workflow users, such as user work queues, preferences, and so on. Runtime Configuration Service provides functionality for managing metadata used in the task service runtime environment. Evidence Store Service supports management of digitally-signed workflow tasks. BPEL processes use the Task Service to assign tasks to users. More specifically, tasks can be assigned to: Users: Users are defined in an identity store configured with the SOA infrastructure. Groups: Groups contain individual users, which can claim a task and act upon it. Application roles: Used to logically group users and other roles. These roles are application specific and are not stored in the identity store. Assigning tasks to groups or roles is more flexible, as every user in a certain group (role) can review the task to complete it. Oracle SOA Suite 11g provides three methods for assigning users, groups, and application roles to tasks: Static assignment: Static users, groups, or application roles can be assigned at design time. Dynamic assignment: We can define an XPath expression to determine the task participant at runtime. Rule-based assignment: We can create a list of participants with complex expressions. Once the user has completed the task, the BPEL process receives a callback from the Task Service with the result of the user action. The BPEL process continues to execute. The Oracle Workflow component provides several possibilities regarding how users can review the tasks that have been assigned to them, and take the corresponding actions. The most straightforward approach is to use the Oracle BPM Worklist application. This application comes with Oracle SOA Suite 11g and allows users to review the tasks, to see the task details, and to select the decision to complete the task. If the Oracle BPM Worklist application is not appropriate, we can develop our own user interface in Java (using JSP, JSF, Swing, and so on) or almost any other environment that supports Web Services (such as .NET for example). In this respect, the Workflow service is very flexible and we can use a portal, such as Oracle Portal, a web application, or almost any other application to review the tasks. The third possibility is to use e-mail for task reviews. We use e-mails over the Notification service. Workflow patterns To simplify the development of workflows, Oracle SOA Suite 11g provides a library of workflow patterns (participant types). Workflow patterns define typical scenarios of human interactions with BPEL processes. The following participant types are supported: Single approver: Used when a participant maps to a user, group, or role. Parallel: Used if multiple users have to act in parallel (for example, if multiple users have to provide their opinion or vote). The percentage of required user responses can be specified. Serial: Used if multiple users have to act in a sequence. A management chain or a list of users can be specified. FYI (For Your Information): Used if a user only needs to be notified about a task, but a user response is not required. With these, we can realize various workflow patterns, such as: Simple workflow: Used if a single user action is required, such as confirmation, decision, and so on. A timeout can also be specified. Simple workflow has two extension patterns: Escalation: Provides the ability to escalate the task to another user or role if the original user does not complete the task in the specified amount of time. Renewal: Provides the ability to extend the timeout if the user does not complete the task in the specified time. Sequential workflow: Used if multiple users have to act in a sequence. A management chain or a list of users can be specified. Sequential workflow has one extension pattern: Escalation: Same functionality as above. Parallel workflow: Used if multiple users have to act in parallel (for example, if multiple users have to provide their opinion or vote). The percentage of required user responses can be specified. This pattern has an extension pattern: Final reviewer: Is used when the final review has to act after parallel users have provided feedback. Ad-hoc (dynamic) workflow: Used to assign the task to one user, who can then route the task to other user. The task is completed when the user does not route it forward. FYI workflow: Used if a user only needs to be notified about a task, but a user response is not required. Task continuation: Used to build complex workflow patterns as a chain of simple patterns (those described above).

0
0
3481

Packt

01 Jun 2011

15 min read

HTML5: Generic Containers

Packt

01 Jun 2011

15 min read

0
0
3477

article-image-random-value-generators-elm

Eduard Kyvenko

19 Dec 2016

5 min read

Random Value Generators in Elm

Eduard Kyvenko

19 Dec 2016

5 min read

Purely functional nature of Elm leads to certain implications when used for generating random values. On the other hand, it opens up a completely new dimension for producing values of any desired shape, which is extremely useful in some cases. This article covers the core concepts for working with Random module. JavaScript offers Math.random as a way of producing random numbers; it does not expect a seed unlike traditional Pseudorandom number generator. Even though Elm is compiled to JavaScript, it does not rely on the native implementation for random numbers generation. It gives you more control by offering an API for both producing random values without explicitly specifying the seed and having the option to specify the seed explicitly and preserve its state. Both ways have tradeoffs and should be used in different situations. Random values without a Seed Before digging deeper I recommend that you look into the official Elm Guide Effects / Random, where you will find the most basic example of Random.generate. It is the easiest way to put your hands on random values. There are some significant tradeoffs you should be aware of. It relies on Time.now behind the scenes, which means you cannot guarantee efficient randomness if you run this command multiple times consecutively. In other words, there is a risk of getting the same value from running Random.generate multiple times within a short period of time. A good use case for this kind of command would be generating the seed for future, more efficient, and secure random values. I have written a little seed generator, which can be used for providing a seed for future Random.step calls: seedGenerator : Generator Seed seedGenerator = Random.int Random.minInt Random.maxInt |> Random.map (Random.initialSeed) Current time serves as a seed for Random.generate, and as you might know already, retrieving current time from the JavaScript is a side effect. Every value will arrive with a message. I will go ahead and define it; the generator will return value of the Seed type: type Msg = Update Seed init = ( { seed = Nothing } -- Initial command to create independent Seed. , Random.generate Update seedGenerator ) Storing the seed as a Maybe value makes a lot of sense since it is not present in the model at the very beginning. The initial application state will execute the generator and produce a message with a new seed, which will be accessible inside the update function: update msg model = case msg of Update seed -> -- Save newly created Seed into state. ( { model | seed = Just seed }, Cmd.none ) This concludes the initial setup for using random value generators with a Seed. As I have mentioned already, Random.generate is a not statistically reliable source of random values, therefore you should avoid relying on it too much in situations when you need multiple random values at the time. Random values with a Seed Using Random.step might be a little hard at the start. The type definition annotation for this function suggests that you will get a tuple with your newly generated value and the next seed state for future steps: Generator a -> Seed -> (a, Seed) This example application will put every new random value on a stack and display that in the DOM: I will extend the model with an additional key for saving random integers: type alias Model = { seed : Maybe Seed , stack : List Int } In the new handler for putting random values on the stack, I heavily rely on Maybe.map. It is very convenient when you want to make an impossible state impossible. In this case, I don’t want to generate any new values if the seed is missing for some reason: update msg model = case msg of Update seed -> -- Preserve newly initialized Seed state. ( { model | seed = Just seed }, Cmd.none ) PutRandomNumber -> let {- In case if seed was present, new model will contain the new value and a new state for the seed. -} newModel : Model newModel = model.seed |> Maybe.map (Random.step (Random.int 0 10)) |> Maybe.map (( number, seed ) -> { model | seed = Just seed , stack = number :: model.stack } ) |> Maybe.withDefault model in ( newModel , Cmd.none ) In short, the new branch will generate a random integer and a new seed and update the model with those new values if the seed was present. This concludes the basic example of Random.step usage, but there’s a lot more to learn. Generators You can get pretty far with Generator and define something more complex than just an integer. Let’s define a generator for producing random stats for calculating BMI: type alias Model = { seed : Maybe Seed , stack : List BMI } type alias BMI = { weight : Float , height : Float , bmi : Float } valueGenerator : Generator BMI valueGenerator = Random.map2 (w h -> BMI w h (w / (h * h))) (Random.float 60 150) (Random.float 0.6 1.2) Random.map allows using values from a passed generator and applying a function to the results, which is very convenient for making simple calculations, such as BMI: You can raise the bar with Random.andThen and produce generators based on random values. This is super useful for making combinations without repeats. Check the source of this example application on GitHub elm-examples/random-initial-seed Conclusion Elm offers a powerful abstraction for declarative definition of random value generators. Building values of any complex shape becomes quite simple by combining the power of Random.map. However, it might be a little overwhelming after JavaScript or any other imperative language. Give it a chance, maybe you will need a reliable generator for custom values in your next project! About the author Eduard Kyvenko is a frontend lead at Dinero. He has been working with Elm for over half a year and has built a tax return and financial statements app for Dinero. You can find him on GitHub at @halfzebra.

0
0
3477

How-To Tutorials

article-image-learning-how-classify-real-world-examples

Packt

22 Aug 2013

24 min read

Learning How to Classify with Real-world Examples

Packt

22 Aug 2013

24 min read

(For more resources related to this topic, see here.) The Iris dataset The Iris dataset is a classic dataset from the 1930s; it is one of the first modern examples of statistical classification. The setting is that of Iris flowers, of which there are multiple species that can be identified by their morphology. Today, the species would be defined by their genomic signatures, but in the 1930s, DNA had not even been identified as the carrier of genetic information. The following four attributes of each plant were measured: Sepal length Sepal width Petal length Petal width In general, we will call any measurement from our data as features. Additionally, for each plant, the species was recorded. The question now is: if we saw a new flower out in the field, could we make a good prediction about its species from its measurements? This is the supervised learning or classification problem; given labeled examples, we can design a rule that will eventually be applied to other examples. This is the same setting that is used for spam classification; given the examples of spam and ham (non-spam e-mail) that the user gave the system, can we determine whether a new, incoming message is spam or not? For the moment, the Iris dataset serves our purposes well. It is small (150 examples, 4 features each) and can easily be visualized and manipulated. The first step is visualization Because this dataset is so small, we can easily plot all of the points and all two-dimensional projections on a page. We will thus build intuitions that can then be extended to datasets with many more dimensions and datapoints. Each subplot in the following screenshot shows all the points projected into two of the dimensions. The outlying group (triangles) are the Iris Setosa plants, while Iris Versicolor plants are in the center (circle) and Iris Virginica are indicated with "x" marks. We can see that there are two large groups: one is of Iris Setosa and another is a mixture of Iris Versicolor and Iris Virginica. We are using Matplotlib; it is the most well-known plotting package for Python. We present the code to generate the top-left plot. The code for the other plots is similar to the following code: from matplotlib import pyplot as plt from sklearn.datasets import load_iris import numpy as np # We load the data with load_iris from sklearn data = load_iris() features = data['data'] feature_names = data['feature_names'] target = data['target'] for t,marker,c in zip(xrange(3),">ox","rgb"): # We plot each class on its own to get different colored markers plt.scatter(features[target == t,0], features[target == t,1], marker=marker, c=c) Building our first classification model If the goal is to separate the three types of flower, we can immediately make a few suggestions. For example, the petal length seems to be able to separate Iris Setosa from the other two flower species on its own. We can write a little bit of code to discover where the cutoff is as follows: plength = features[:, 2] # use numpy operations to get setosa features is_setosa = (labels == 'setosa') # This is the important step: max_setosa =plength[is_setosa].max() min_non_setosa = plength[~is_setosa].min() print('Maximum of setosa: {0}.'.format(max_setosa)) print('Minimum of others: {0}.'.format(min_non_setosa)) This prints 1.9 and 3.0. Therefore, we can build a simple model: if the petal length is smaller than two, this is an Iris Setosa flower; otherwise, it is either Iris Virginica or Iris Versicolor. if features[:,2] < 2: print 'Iris Setosa' else: print 'Iris Virginica or Iris Versicolour' This is our first model, and it works very well in that it separates the Iris Setosa flowers from the other two species without making any mistakes. What we had here was a simple structure; a simple threshold on one of the dimensions. Then we searched for the best dimension threshold. We performed this visually and with some calculation; machine learning happens when we write code to perform this for us. The example where we distinguished Iris Setosa from the other two species was very easy. However, we cannot immediately see what the best threshold is for distinguishing Iris Virginica from Iris Versicolor. We can even see that we will never achieve perfect separation. We can, however, try to do it the best possible way. For this, we will perform a little computation. We first select only the non-Setosa features and labels: features = features[~is_setosa] labels = labels[~is_setosa] virginica = (labels == 'virginica') Here we are heavily using NumPy operations on the arrays. is_setosa is a Boolean array, and we use it to select a subset of the other two arrays, features and labels. Finally, we build a new Boolean array, virginica, using an equality comparison on labels. Now, we run a loop over all possible features and thresholds to see which one results in better accuracy. Accuracy is simply the fraction of examples that the model classifies correctly: best_acc = -1.0 for fi in xrange(features.shape[1]): # We are going to generate all possible threshold for this feature thresh = features[:,fi].copy() thresh.sort() # Now test all thresholds: for t in thresh: pred = (features[:,fi] > t) acc = (pred == virginica).mean() if acc > best_acc: best_acc = acc best_fi = fi best_t = t The last few lines select the best model. First we compare the predictions, pred, with the actual labels, virginica. The little trick of computing the mean of the comparisons gives us the fraction of correct results, the accuracy. At the end of the for loop, all possible thresholds for all possible features have been tested, and the best_fi and best_t variables hold our model. To apply it to a new example, we perform the following: if example[best_fi] > t: print 'virginica' else: print 'versicolor' What does this model look like? If we run it on the whole data, the best model that we get is split on the petal length. We can visualize the decision boundary. In the following screenshot, we see two regions: one is white and the other is shaded in grey. Anything that falls in the white region will be called Iris Virginica and anything that falls on the shaded side will be classified as Iris Versicolor: In a threshold model, the decision boundary will always be a line that is parallel to one of the axes. The plot in the preceding screenshot shows the decision boundary and the two regions where the points are classified as either white or grey. It also shows (as a dashed line) an alternative threshold that will achieve exactly the same accuracy. Our method chose the first threshold, but that was an arbitrary choice. Evaluation – holding out data and cross-validation The model discussed in the preceding section is a simple model; it achieves 94 percent accuracy on its training data. However, this evaluation may be overly optimistic. We used the data to define what the threshold would be, and then we used the same data to evaluate the model. Of course, the model will perform better than anything else we have tried on this dataset. The logic is circular. What we really want to do is estimate the ability of the model to generalize to new instances. We should measure its performance in instances that the algorithm has not seen at training. Therefore, we are going to do a more rigorous evaluation and use held-out data. For this, we are going to break up the data into two blocks: on one block, we'll train the model, and on the other—the one we held out of training—we'll test it. The output is as follows: Training error was 96.0%. Testing error was 90.0% (N = 50). The result of the testing data is lower than that of the training error. This may surprise an inexperienced machine learner, but it is expected and typical. To see why, look back at the plot that showed the decision boundary. See if some of the examples close to the boundary were not there or if one of the ones in between the two lines was missing. It is easy to imagine that the boundary would then move a little bit to the right or to the left so as to put them on the "wrong" side of the border. The error on the training data is called a training error and is always an overly optimistic estimate of how well your algorithm is doing. We should always measure and report the testing error; the error on a collection of examples that were not used for training. These concepts will become more and more important as the models become more complex. In this example, the difference between the two errors is not very large. When using a complex model, it is possible to get 100 percent accuracy in training and do no better than random guessing on testing! One possible problem with what we did previously, which was to hold off data from training, is that we only used part of the data (in this case, we used half of it) for training. On the other hand, if we use too little data for testing, the error estimation is performed on a very small number of examples. Ideally, we would like to use all of the data for training and all of the data for testing as well. We can achieve something quite similar by cross-validation. One extreme (but sometimes useful) form of cross-validation is leave-one-out. We will take an example out of the training data, learn a model without this example, and then see if the model classifies this example correctly: error = 0.0 for ei in range(len(features)): # select all but the one at position 'ei': training = np.ones(len(features), bool) training[ei] = False testing = ~training model = learn_model(features[training], virginica[training]) predictions = apply_model(features[testing], virginica[testing], model) error += np.sum(predictions != virginica[testing]) At the end of this loop, we will have tested a series of models on all the examples. However, there is no circularity problem because each example was tested on a model that was built without taking the model into account. Therefore, the overall estimate is a reliable estimate of how well the models would generalize. The major problem with leave-one-out cross-validation is that we are now being forced to perform 100 times more work. In fact, we must learn a whole new model for each and every example, and this will grow as our dataset grows. We can get most of the benefits of leave-one-out at a fraction of the cost by using x-fold cross-validation; here, "x" stands for a small number, say, five. In order to perform five-fold cross-validation, we break up the data in five groups, that is, five folds. Then we learn five models, leaving one fold out of each. The resulting code will be similar to the code given earlier in this section, but here we leave 20 percent of the data out instead of just one element. We test each of these models on the left out fold and average the results: The preceding figure illustrates this process for five blocks; the dataset is split into five pieces. Then for each fold, you hold out one of the blocks for testing and train on the other four. You can use any number of folds you wish. Five or ten fold is typical; it corresponds to training with 80 or 90 percent of your data and should already be close to what you would get from using all the data. In an extreme case, if you have as many folds as datapoints, you can simply perform leave-one-out cross-validation. When generating the folds, you need to be careful to keep them balanced. For example, if all of the examples in one fold come from the same class, the results will not be representative. We will not go into the details of how to do this because the machine learning packages will handle it for you. We have now generated several models instead of just one. So, what final model do we return and use for the new data? The simplest solution is now to use a single overall model on all your training data. The cross-validation loop gives you an estimate of how well this model should generalize. A cross-validation schedule allows you to use all your data to estimate if your methods are doing well. At the end of the cross-validation loop, you can use all your data to train a final model. Although it was not properly recognized when machine learning was starting out, nowadays it is seen as a very bad sign to even discuss the training error of a classification system. This is because the results can be very misleading. We always want to measure and compare either the error on a held-out dataset or the error estimated using a cross-validation schedule. Building more complex classifiers In the previous section, we used a very simple model: a threshold on one of the dimensions. Throughout this article, you will see many other types of models, and we're not even going to cover everything that is out there. What makes up a classification model? We can break it up into three parts: The structure of the model: In this, we use a threshold on a single feature. The search procedure: In this, we try every possible combination of feature and threshold. The loss function: Using the loss function, we decide which of the possibilities is less bad (because we can rarely talk about the perfect solution). We can use the training error or just define this point the other way around and say that we want the best accuracy. Traditionally, people want the loss function to be minimum. We can play around with these parts to get different results. For example, we can attempt to build a threshold that achieves minimal training error, but we will only test three values for each feature: the mean value of the features, the mean plus one standard deviation, and the mean minus one standard deviation. This could make sense if testing each value was very costly in terms of computer time (or if we had millions and millions of datapoints). Then the exhaustive search we used would be infeasible, and we would have to perform an approximation like this. Alternatively, we might have different loss functions. It might be that one type of error is much more costly than another. In a medical setting, false negatives and false positives are not equivalent. A false negative (when the result of a test comes back negative, but that is false) might lead to the patient not receiving treatment for a serious disease. A false positive (when the test comes back positive even though the patient does not actually have that disease) might lead to additional tests for confirmation purposes or unnecessary treatment (which can still have costs, including side effects from the treatment). Therefore, depending on the exact setting, different trade-offs can make sense. At one extreme, if the disease is fatal and treatment is cheap with very few negative side effects, you want to minimize the false negatives as much as you can. With spam filtering, we may face the same problem; incorrectly deleting a non-spam e-mail can be very dangerous for the user, while letting a spam e-mail through is just a minor annoyance. What the cost function should be is always dependent on the exact problem you are working on. When we present a general-purpose algorithm, we often focus on minimizing the number of mistakes (achieving the highest accuracy). However, if some mistakes are more costly than others, it might be better to accept a lower overall accuracy to minimize overall costs. Finally, we can also have other classification structures. A simple threshold rule is very limiting and will only work in the very simplest cases, such as with the Iris dataset. A more complex dataset and a more complex classifier We will now look at a slightly more complex dataset. This will motivate the introduction of a new classification algorithm and a few other ideas. Learning about the Seeds dataset We will now look at another agricultural dataset; it is still small, but now too big to comfortably plot exhaustively as we did with Iris. This is a dataset of the measurements of wheat seeds. Seven features are present, as follows: Area (A) Perimeter (P) Compactness () Length of kernel Width of kernel Asymmetry coefficient Length of kernel groove There are three classes that correspond to three wheat varieties: Canadian, Koma, and Rosa. As before, the goal is to be able to classify the species based on these morphological measurements. Unlike the Iris dataset, which was collected in the 1930s, this is a very recent dataset, and its features were automatically computed from digital images. This is how image pattern recognition can be implemented: you can take images in digital form, compute a few relevant features from them, and use a generic classification system. Later, we will work through the computer vision side of this problem and compute features in images. For the moment, we will work with the features that are given to us. UCI Machine Learning Dataset Repository The University of California at Irvine (UCI) maintains an online repository of machine learning datasets (at the time of writing, they are listing 233 datasets). Both the Iris and Seeds dataset used in this article were taken from there. The repository is available online: http://archive.ics.uci.edu/ml/ Features and feature engineering One interesting aspect of these features is that the compactness feature is not actually a new measurement, but a function of the previous two features, area and perimeter. It is often very useful to derive new combined features. This is a general area normally termed feature engineering; it is sometimes seen as less glamorous than algorithms, but it may matter more for performance (a simple algorithm on well-chosen features will perform better than a fancy algorithm on not-so-good features). In this case, the original researchers computed the "compactness", which is a typical feature for shapes (also called "roundness"). This feature will have the same value for two kernels, one of which is twice as big as the other one, but with the same shape. However, it will have different values for kernels that are very round (when the feature is close to one) as compared to kernels that are elongated (when the feature is close to zero). The goals of a good feature are to simultaneously vary with what matters and be invariant with what does not. For example, compactness does not vary with size but varies with the shape. In practice, it might be hard to achieve both objectives perfectly, but we want to approximate this ideal. You will need to use background knowledge to intuit which will be good features. Fortunately, for many problem domains, there is already a vast literature of possible features and feature types that you can build upon. For images, all of the previously mentioned features are typical, and computer vision libraries will compute them for you. In text-based problems too, there are standard solutions that you can mix and match. Often though, you can use your knowledge of the specific problem to design a specific feature. Even before you have data, you must decide which data is worthwhile to collect. Then, you need to hand all your features to the machine to evaluate and compute the best classifier. A natural question is whether or not we can select good features automatically. This problem is known as feature selection. There are many methods that have been proposed for this problem, but in practice, very simple ideas work best. It does not make sense to use feature selection in these small problems, but if you had thousands of features, throwing out most of them might make the rest of the process much faster. Nearest neighbor classification With this dataset, even if we just try to separate two classes using the previous method, we do not get very good results. Let me introduce, therefore, a new classifier: the nearest neighbor classifier. If we consider that each example is represented by its features (in mathematical terms, as a point in N-dimensional space), we can compute the distance between examples. We can choose different ways of computing the distance, for example: def distance(p0, p1): 'Computes squared euclidean distance' return np.sum( (p0-p1)**2) Now when classifying, we adopt a simple rule: given a new example, we look at the dataset for the point that is closest to it (its nearest neighbor) and look at its label: def nn_classify(training_set, training_labels, new_example): dists = np.array([distance(t, new_example) for t in training_set]) nearest = dists.argmin() return training_labels[nearest] In this case, our model involves saving all of the training data and labels and computing everything at classification time. A better implementation would be to actually index these at learning time to speed up classification, but this implementation is a complex algorithm. Now, note that this model performs perfectly on its training data! For each point, its closest neighbor is itself, and so its label matches perfectly (unless two examples have exactly the same features but different labels, which can happen). Therefore, it is essential to test using a cross-validation protocol. Using ten folds for cross-validation for this dataset with this algorithm, we obtain 88 percent accuracy. As we discussed in the earlier section, the cross-validation accuracy is lower than the training accuracy, but this is a more credible estimate of the performance of the model. We will now examine the decision boundary. For this, we will be forced to simplify and look at only two dimensions (just so that we can plot it on paper). In the preceding screenshot, the Canadian examples are shown as diamonds, Kama seeds as circles, and Rosa seeds as triangles. Their respective areas are shown as white, black, and grey. You might be wondering why the regions are so horizontal, almost weirdly so. The problem is that the x axis (area) ranges from 10 to 22 while the y axis (compactness) ranges from 0.75 to 1.0. This means that a small change in x is actually much larger than a small change in y. So, when we compute the distance according to the preceding function, we are, for the most part, only taking the x axis into account. If you have a physics background, you might have already noticed that we had been summing up lengths, areas, and dimensionless quantities, mixing up our units (which is something you never want to do in a physical system). We need to normalize all of the features to a common scale. There are many solutions to this problem; a simple one is to normalize to Z-scores. The Z-score of a value is how far away from the mean it is in terms of units of standard deviation. It comes down to this simple pair of operations: # subtract the mean for each feature: features -= features.mean(axis=0) # divide each feature by its standard deviation features /= features.std(axis=0) Independent of what the original values were, after Z-scoring, a value of zero is the mean and positive values are above the mean and negative values are below it. Now every feature is in the same unit (technically, every feature is now dimensionless; it has no units) and we can mix dimensions more confidently. In fact, if we now run our nearest neighbor classifier, we obtain 94 percent accuracy! Look at the decision space again in two dimensions; it looks as shown in the following screenshot: The boundaries are now much more complex and there is interaction between the two dimensions. In the full dataset, everything is happening in a seven-dimensional space that is very hard to visualize, but the same principle applies: where before a few dimensions were dominant, now they are all given the same importance. The nearest neighbor classifier is simple, but sometimes good enough. We can generalize it to a k-nearest neighbor classifier by considering not just the closest point but the k closest points. All k neighbors vote to select the label. k is typically a small number, such as 5, but can be larger, particularly if the dataset is very large. Binary and multiclass classification The first classifier we saw, the threshold classifier, was a simple binary classifier (the result is either one class or the other as a point is either above the threshold or it is not). The second classifier we used, the nearest neighbor classifier, was a naturally multiclass classifier (the output can be one of several classes). It is often simpler to define a simple binary method than one that works on multiclass problems. However, we can reduce the multiclass problem to a series of binary decisions. This is what we did earlier in the Iris dataset in a haphazard way; we observed that it was easy to separate one of the initial classes and focused on the other two, reducing the problem to two binary decisions: Is it an Iris Setosa (yes or no)? If no, check whether it is an Iris Virginica (yes or no). Of course, we want to leave this sort of reasoning to the computer. As usual, there are several solutions to this multiclass reduction. The simplest is to use a series of "one classifier versus the rest of the classifiers". For each possible label l, we build a classifier of the type "is this l or something else?". When applying the rule, exactly one of the classifiers would say "yes" and we would have our solution. Unfortunately, this does not always happen, so we have to decide how to deal with either multiple positive answers or no positive answers. Alternatively, we can build a classification tree. Split the possible labels in two and build a classifier that asks "should this example go to the left or the right bin?" We can perform this splitting recursively until we obtain a single label. The preceding diagram depicts the tree of reasoning for the Iris dataset. Each diamond is a single binary classifier. It is easy to imagine we could make this tree larger and encompass more decisions. This means that any classifier that can be used for binary classification can also be adapted to handle any number of classes in a simple way. There are many other possible ways of turning a binary method into a multiclass one. There is no single method that is clearly better in all cases. However, which one you use normally does not make much of a difference to the final result. Most classifiers are binary systems while many real-life problems are naturally multiclass. Several simple protocols reduce a multiclass problem to a series of binary decisions and allow us to apply the binary models to our multiclass problem. Summary In a sense, this was a very theoretical article, as we introduced generic concepts with simple examples. We went over a few operations with a classic dataset. This, by now, is considered a very small problem. However, it has the advantage that we were able to plot it out and see what we were doing in detail. This is something that will be lost when we move on to problems with many dimensions and many thousands of examples. The intuitions we gained here will all still be valid. Classification means generalizing from examples to build a model (that is, a rule that can automatically be applied to new, unclassified objects). It is one of the fundamental tools in machine learning. We also learned that the training error is a misleading, over-optimistic estimate of how well the model does. We must, instead, evaluate it on testing data that was not used for training. In order to not waste too many examples in testing, a cross-validation schedule can get us the best of both worlds (at the cost of more computation). We also had a look at the problem of feature engineering. Features are not something that is predefined for you, but choosing and designing features is an integral part of designing a machine-learning pipeline. In fact, it is often the area where you can get the most improvements in accuracy as better data beats fancier methods. In this article, we wrote all of our own code (except when we used NumPy, of course). We needed to build up intuitions on simple cases to illustrate the basic concepts. Resources for Article: Further resources on this subject: Python Testing: Installing the Robot Framework [Article] Getting Started with Spring Python [Article] Creating Skeleton Apps with Coily in Spring Python [Article]

0
0
3477

Introducing Test-driven Machine Learning

Working with Drupal Audio in Flash (part 2)

Machine Learning Models

Getting Started with OpenLayers

Joomla! 1.6: Organizing and Managing Content

The Various Components in Sencha Touch

Wordpress: Buddypress Courseware

Using OSGi Services

The Internet of Peas? Gardening with JavaScript Part 2

Recording Calls in FreePBX 2.5

Trending Topics

EU approves labour protection laws for ‘Whistleblowers’ and ‘Gig economy’ workers with implications for tech companies

Human Interactions in BPEL

HTML5: Generic Containers

Random Value Generators in Elm

Learning How to Classify with Real-world Examples

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access