How-To Tutorials

article-image-basics-image-histograms-opencv

12 Oct 2016

11 min read

Basics of Image Histograms in OpenCV

12 Oct 2016

In this article by Samyak Datta, author of the book Learning OpenCV 3 Application Development we are going to focus our attention on a different style of processing pixel values. The output of the techniques, which would comprise our study in the current article, will not be images, but other forms of representation for images, namely image histograms. We have seen that a two-dimensional grid of intensity values is one of the default forms of representing images in digital systems for processing as well as storage. However, such representations are not at all easy to scale. So, for an image with a reasonably low spatial resolution, say 512 x 512 pixels, working with a two-dimensional grid might not pose any serious issues. However, as the dimensions increase, the corresponding increase in the size of the grid may start to adversely affect the performance of the algorithms that work with the images. A primary advantage that an image histogram has to offer is that the size of a histogram is a constant that is independent of the dimensions of the image. As a consequence of this, we are guaranteed that irrespective of the spatial resolution of the images that we are dealing with, the algorithms that power our solutions will have to deal with a constant amount of data if they are working with image histograms. (For more resources related to this topic, see here.) Each descriptor captures some particular aspects or features of the image to construct its own form of representation. One of the common pitfalls of using histograms as a form of image representation as compared to its native form of using the entire two-dimensional grid of values is loss of information. A full-fledged image representation using pixel intensity values for all pixel locations naturally consists of all the information that you would need to reconstruct a digital image. However, the same cannot be said about histograms. When we study about image histograms in detail, we'll get to see exactly what information do we stand to lose. And this loss in information is prevalent across all forms of image descriptors. The basics of histograms At the outset, we will briefly explain the concept of a histogram. Most of you might already know this from your lessons on basic statistics. However, we will reiterate this for the sake of completeness. Histogram is a form of data representation technique that relies on an aggregation of data points. The data is aggregated into a set of predefined bins that are represented along the x axis, and the number of data points that fall within each of the bins make up the corresponding counts on the y axis. For example, let's assume that our data looks something like the following: D={2,7,1,5,6,9,14,11,8,10,13} If we define three bins, namely Bin_1 (1 - 5), Bin_2 (6 - 10), and Bin_3 (11 - 15), then the histogram corresponding to our data would look something like this: Bins Frequency Bin_1 (1 - 5) 3 Bin_2 (6 - 10) 5 Bin_3 (11 - 15) 3 What this histogram data tells us is that we have three values between 1 and 5, five between 6 and 10, and three again between 11 and 15. Note that it doesn't tell us what the values are, just that some n values exist in a given bin. A more familiar visual representation of the histogram in discussion is shown as follows: As you can see, the bins have been plotted along the x axis and their corresponding frequencies along the y axis. Now, in the context of images, how is a histogram computed? Well, it's not that difficult to deduce. Since the data that we have comprise pixel intensity values, an image histogram is computed by plotting a histogram using the intensity values of all its constituent pixels. What this essentially means is that the sequence of pixel intensity values in our image becomes the data. Well, this is in fact the simplest kind of histogram that you can compute using the information available to you from the image. Now, coming back to image histograms, there are some basic terminologies (pertaining to histograms in general) that you need to be aware of before you can dip your hands into code. We have explained them in detail here: Histogram size: The histogram size refers to the number of bins in the histogram. Range: The range of a histogram is the range of data that we are dealing with. The range of data as well as the histogram size are both important parameters that define a histogram. Dimensions: Simply put, dimensions refer to the number of the type of items whose values we aggregate in the histogram bins. For example, consider a grayscale image. We might want to construct a histogram using the pixel intensity values for such an image. This would be an example of a single-dimensional histogram because we are just interested in aggregating the pixel intensity values and nothing else. The data, in this case, is spread over a range of 0 to 255. On account of being one-dimensional, such histograms can be represented graphically as 2D plots—one-dimensional data (pixel intensity values) being plotted on the x axis (in the form of bins) along with the corresponding frequency counts along the y axis. We have already seen an example of this before. Now, imagine a color image with three channels: red, green, and blue. Let's say that we want to plot a histogram for the intensities in the red and green channels combined. This means that our data now becomes a pair of values (r, g). A histogram that is plotted for such data will have a dimensionality of 2. The plot for such a histogram will be a 3D plot with the data bins covering the x and y axes and the frequency counts plotted along the z axis. Now that we have discussed the theoretical aspects of image histograms in detail, let's start thinking along the lines of code. We will start with the simplest (and in fact the most ubiquitous) design of image histograms. The range of our data will be from 0 to 255 (both inclusive), which means that all our data points will be integers that fall within the specified range. Also, the number of data points will equal the number of pixels that make up our input image. The simplicity in design comes from the fact that we fix the size of the histogram (the number of bins) as 256. Now, take a moment to think about what this means. There are 256 different possible values that our data points can take and we have a separate bin corresponding to each one of those values. So such an image histogram will essentially depict the 256 possible intensity values along with the counts of the number of pixels in the image that are colored with each of the different intensities. Before taking a peek at what OpenCV has to offer, let's try to implement such a histogram on our own! We define a function named computeHistogram() that takes the grayscale image as an input argument and returns the image histogram. From our earlier discussions, it is evident that the histogram must contain 256 entries (for the 256 bins): one for each integer between 0 and 255. The value stored in the histogram corresponding to each of the 256 entries will be the count of the image pixels that have a particular intensity value. So, conceptually, we can use an array for our implementation such that the value stored in the histogram [ i ] (for 0≤i≤255) will be the count of the number of pixels in the image having the intensity of i. However, instead of using a C++ array, we will comply with the rules and standards followed by OpenCV and represent the histogram as a Mat object. We have already seen that a Mat object is nothing but a multidimensional array store. The implementation is outlined in the following code snippet: Mat computeHistogram(Mat input_image) { Mat histogram = Mat::zeros(256, 1, CV_32S); for (int i = 0; i < input_image.rows; ++i) { for (int j = 0; j < input_image.cols; ++j) { int binIdx = (int) input_image.at<uchar>(i, j); histogram.at<int>(binIdx, 0) += 1; } } return histogram; } As you can see, we have chosen to represent the histogram as a 256-element-column-vector Mat object. We iterate over all the pixels in the input image and keep on incrementing the corresponding counts in the histogram (which had been initialized to 0). As per our description of the image histogram properties, it is easy to see that the intensity value of any pixel is the same as the bin index that is used to index into the appropriate histogram bin to increment the count. Having such an implementation ready, let's test it out with the help of an actual image. The following code demonstrates a main() function that reads an input image, calls the computeHistogram() function that we have defined just now, and displays the contents of the histogram that is returned as a result: int main() { Mat input_image = imread("/home/samyak/Pictures/lena.jpg", IMREAD_GRAYSCALE); Mat histogram = computeHistogram(input_image); cout << "Histogram...n"; for (int i = 0; i < histogram.rows; ++i) cout << i << " : " << histogram.at<int>(i, 0) << "n"; return 0; } We have used the fact that the histogram that is returned from the function will be a single column Mat object. This makes the code that displays the contents of the histogram much cleaner. Histograms in OpenCV We have just seen the implementation of a very basic and minimalistic histogram using the first principles in OpenCV. The image histogram was basic in the sense that all the bins were uniform in size and comprised only a single pixel intensity. This made our lives simple when we designed our code for the implementation; there wasn't any need to explicitly check the membership of a data point (the intensity value of a pixel) with all the bins of our histograms. However, we know that a histogram can have bins whose sizes span more than one. Can you think of the changes that we might need to make in the code that we had written just now to accommodate for bin sizes larger than 1? If this change seems doable to you, try to figure out how to incorporate the possibility of non-uniform bin sizes or multidimensional histograms. By now, things might have started to get a little overwhelming to you. No need to worry. As always, OpenCV has you covered! The developers at OpenCV have provided you with a calcHist() function whose sole purpose is to calculate the histograms for a given set of arrays. By arrays, we refer to the images represented as Mat objects, and we use the term set because the function has the capability to compute multidimensional histograms from the given data: Mat computeHistogram(Mat input_image) { Mat histogram; int channels[] = { 0 }; int histSize[] = { 256 }; float range[] = { 0, 256 }; const float* ranges[] = { range }; calcHist(&input_image, 1, channels, Mat(), histogram, 1, histSize, ranges, true, false); return histogram; } Before we move on to an explanation of the different parameters involved in the calcHist() function call, I want to bring your attention to the abundant use of arrays in the preceding code snippet. Even arguments as simple as histogram sizes are passed to the function in the form of arrays rather than integer values, which at first glance seem quite unnecessary and counter-intuitive. The usage of arrays is due to the fact that the implementation of calcHist() is equipped to handle multidimensional histograms as well, and when we are dealing with such multidimensional histogram data, we require multiple parameters to be passed, one for each dimension. This would become clearer once we demonstrate an example of calculating multidimensional histograms using the calcHist() function. For the time being, we just wanted to clear the immediate confusion that might have popped up in your minds upon seeing the array parameters. Here is a detailed list of the arguments in the calcHist() function call: Source images Number of source images Channel indices Mask Dimensions (dims) Histogram size Ranges Uniform flag Accumulate flag The last couple of arguments (the uniform and accumulate flags) have default values of true and false, respectively. Hence, the function call that you have seen just now can very well be written as follows: calcHist(&input_image, 1, channels, Mat(), histogram, 1, histSize, ranges); Summary Thus in this article we have successfully studied fundamentals of using histograms in OpenCV for image processing. Resources for Article: Further resources on this subject: Remote Sensing and Histogram [article] OpenCV: Image Processing using Morphological Filters [article] Learn computer vision applications in Open CV [article]

0
0
22139

article-image-reactive-python-asynchronous-programming-rescue-part-2

Xavier Bruhiere

10 Oct 2016

5 min read

Reactive Python - Asynchronous programming to the rescue, Part 2

Xavier Bruhiere

10 Oct 2016

5 min read

This two-part series explores asynchronous programming with Python using Asyncio. In Part 1 of this series, we started by building a project that shows how you can use Reactive Python in asynchronous programming. Let’s pick it back up here by exploring peer-to-peer communication and then just touching on service discovery before examining the streaming machine-to-machine concept. Peer-to-peer communication So far we’ve established a websocket connection to process clock events asynchronously. Now that one pin swings between 1's and 0's, let's wire a buzzer and pretend it buzzes on high states (1) and remains silent on low ones (0). We can rephrase that in Python, like so: # filename: sketches.py import factory class Buzzer(factory.FactoryLoop): """Buzz on light changes.""" def setup(self, sound): # customize buzz sound self.sound = sound @factory.reactive async def loop(self, channel, signal): """Buzzing.""" behavior = self.sound if signal == '1' else '...' self.out('signal {} received -> {}'.format(signal, behavior)) return behavior So how do we make them to communicate? Since they share a common parent class, we implement a stream method to send arbitrary data and acknowledge reception with, also, arbitrary data. To sum up, we want IOPin to use this API: class IOPin(factory.FactoryLoop): # [ ... ] @protocol.reactive async def loop(self, channel, msg): # [ ... ] await self.stream('buzzer', bits_stream) return 'acknowledged' Service discovery The first challenge to solve is service discovery. We need to target specific nodes within a fleet of reactive workers. This topic, however, goes past the scope of this post series. The shortcut below will do the job (that is, hardcode the nodes we will start), while keeping us focused on reactive messaging. # -*- coding: utf-8 -*- # vim_fenc=utf-8 # # filename: mesh.py """Provide nodes network knowledge.""" import websockets class Node(object): def __init__(self, name, socket, port): print('[ mesh ] registering new node: {}'.format(name)) self.name = name self._socket = socket self._port = port def uri(self, path): return 'ws://{socket}:{port}/{path}'.format(socket=self._socket, port=self._port, path=path) def connection(self, path=''): # instanciate the same connection as `clock` method return websockets.connect(self.uri(path)) # TODO service discovery def grid(): """Discover and build nodes network.""" # of course a proper service discovery should be used here # see consul or zookkeeper for example # note: clock is not a server so it doesn't need a port return [ Node('clock', 'localhost', None), Node('blink', 'localhost', 8765), Node('buzzer', 'localhost', 8765 + 1) ] Streaming machine-to-machine chat Let's provide FactoryLoop with the knowledge of the grid and implement an asynchronous communication channel. # filename: factory.py (continued) import mesh class FactoryLoop(object): def __init__(self, *args, **kwargs): # now every instance will know about the other ones self.grid = mesh.grid() # ... def node(self, name): """Search for the given node in the grid.""" return next(filter(lambda x: x.name == name, self.grid)) async def stream(self, target, data, channel): self.out('starting to stream message to {}'.format(target)) # use the node webscoket connection defined in mesh.py # the method is exactly the same as the clock async with self.node(target).connection(channel) as ws: for partial in data: self.out('> sending payload: {}'.format(partial)) # websockets requires bytes or strings await ws.send(str(partial)) self.out('< {}'.format(await ws.recv())) We added a bit of debugging lines to better understand how the data flows through the network. Every implementation of the FactoryLoop can both react to events and communicate with other nodes it is aware of. Wrapping up Time to update arduino.py and run our cluster of three reactive workers in three @click.command()# [ ... ]def main(sketch, **flags): # [ ... ] elif sketch == 'buzzer': sketchs.Buzzer(sound='buzz buzz buzz').run(flags['socket'], flags['port']) Launch three terminals or use a tool such as foreman to spawn multiple processes. Either way, keep in mind that you will need to track the scripts output. way, keep in mind that you will need to track the scripts output. $ # start IOPin and Buzzer on the same ports we hardcoded in mesh.py $ ./arduino.py buzzer --port 8766 $ ./arduino.py iopin --port 8765 $ # now that they listen, trigger actions with the clock (targetting IOPin port) $ ./arduino.py clock --port 8765 [ ... ] $ # Profit ! We just saw one worker reacting to a clock and another reacting to randomly generated events. The websocket protocol allowed us to exchange streaming data and receive arbitrary responses, unlocking sophisticated fleet orchestration. While we limited this example to two nodes, a powerful service discovery mechanism could bring to life a distributed network of microservices. By completing this post series, you should now have a better understanding of how to use Python with Asyncio for asynchronous programming. About the author Xavier Bruhiere is a lead developer at AppTurbo in Paris, where he develops innovative prototypes to support company growth. He is addicted to learning, hacking on intriguing hot techs (both soft and hard), and practicing high-intensity sports.

0
0
5890

Masayuki Takagi

07 Oct 2016

11 min read

How to start Chainer

Masayuki Takagi

07 Oct 2016

11 min read

0
0
4495

How-To Tutorials

article-image-modern-natural-language-processing-part-3

Brian McMahan

07 Oct 2016

8 min read

Modern Natural Language Processing – Part 3

Brian McMahan

07 Oct 2016

8 min read

In the previous two posts, I walked through how to preprocess raw data to a cleaner version and then turn that into a form which can be used in a machine learning experiment. I also discussed how you can set up a modular infrastructure so changing components isn't a hassle and your workflow is streamlined. In this final post in the series, I will outline a language model and discuss the modeling choices. I will outline the algorithms needed to both decode from the language model and to sample from it. Note that if you want to do a sequence labeling task instead of a language modeling task, the outputs must become your sequence labels, but the inputs are your sentences. The Language Model A language model has one goal: do not be surprised by the next token given all previous tokens. This translates into trying to maximize the probability of the next word given the previously seen words. It is useful to think of the 'shape' of our data's matrices at each step in our model. Specifically, though, our model will go through the following steps: Take as input the data being served from our server 2-dimensional matrices: (batch, sequence_length) Embed it using the matrix we constructed 3-dimensional tensors: (batch, sequence_length, embedding_size) Use any RNN variant to sequentially go through our data This will assume each example is on the batch dimension and time on the sequence_length dimension It will take vectors of size embedding_size and perform operations on them. Using the RNN output, we will apply a Dense layer, which will perform a classification back to our vocabulary space. This is our target. 0. Imports from keras.layers import Input, Embedding, LSTM, Dropout, Dense, TimeDistributed from keras.engine import Model from keras.optimizers import Adam from keras.callbacks import ModelCheckpoint class TrainingModel(object): def__init__(self, igor): self.igor = igor def make(self): ### code below goes here 1. Input Defining an entry point into Keras is very similar to the other layers. The only difference is that you have to give it information about the shape of the input data. I tend to give it more information than it needs—the batch size in addition to the sequence length—because omitting the batch size is useful when you watch variable batch sizes, but it serves no purpose otherwise. It also quells any paranoid worries that the model will break because it got the shape wrong at some point. words_in = Input(batch_shape=(igor.batch_size, igor.sequence_length), dtype='int32') 2. Embed This is where we can use the embeddings we had previously calculated. Note the mask_zero flag. This is set to True so that the Layer will calculate the mask—where each position in the input tensor is equal to 0. The Layer, in accordance to Keras' underlying functionality, is then pushed through the network to be used in final calculations. emb_W = self.igor.embeddings.astype(K.floatx()) words_embedded = Embedding(igor.vocab_size, igor.embedding_size, mask_zero=True, weights=[emb_W])(words_in) 3. Recurrence word_seq = LSTM(igor.rnn_size, return_sequences=True)(words_embedded) 4. Classification predictions = TimeDistributed(Dense(igor.vocab_size, activation='softmax'))(word_seq) 5. Compile Model Now, we can compile the model. Keras makes this simple: specify the inputs, outputs, loss, optimizer and metrics. I have omitted the custom metrics for now. I will bring them back up in evaluations below. optimizer = Adam(igor.learning_rate) model = Model(input=words_in, output=predictions) model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=custom_metrics) All together ### model.py from keras.layers import Input, Embedding, LSTM, Dropout, Dense, TimeDistributed from keras.engine import Model from keras.optimizers import Adam from keras.callbacks import ModelCheckpoint class TrainingModel(object): def__init__(self, igor): self.igor = igor def make(self): words_in = Input(batch_shape=(igor.batch_size, igor.sequence_length), dtype='int32') words_embedded = Embedding(igor.vocab_size, igor.embedding_size, mask_zero=True)(words_in) word_seq = LSTM(igor.rnn_size, return_sequences=True)(words_embedded) predictions = TimeDistributed(Dense(igor.vocab_size, activation='softmax'))(word_seq) optimizer = Adam(igor.learning_rate) self.model = Model(input=words_in, output=predictions) self.model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=custom_metrics) Training The driver is a useful part of the pipeline. Not only does it give a convenient entry point to the training, but it also allows you to easily switch between training, debugging, and testing. ### driver.py if__name__ == "__main__": import sys if__name__ == "__main__": import sys igor = Igor.from_file(sys.argv[2]) if sys.argv[1] == "train": igor.prep() next(igor.train_gen(forever=True)) model = TrainingModel(igor) model.make() try: model.train() exceptKeyboardInterruptas e: # safe exitting stuff here. # perhaps, model save. print("death by keyboard") Train Function ### model.py class TrainingModel(object): # ... def train(self): igor = self.igor train_data = igor.train_gen(forever=True) dev_data = igor.dev_gen(forever=True) callbacks = [ModelCheckpoint(filepath=igor.checkpoint_filepath, verbose=1, save_best_only=True)] self.model.fit_generator(generator=train_data, samples_per_epoch=igor.num_train_samples, nb_epoch=igor.num_epochs, callbacks=callbacks, verbose=1, validation_data=dev_data, nb_val_samples=igor.num_dev_samples) Failure to learn There are many ways that learning can fail. Stanford's CS231N course has a few things on this. Additionally, here are many Quora and Stack Overflow posts on debugging the learning process. Evaluating Language model evaluations aim to quantify how well the model captures the signal and anticipates the noise. For this, there are two standard metrics. The first is an aggregate of the probabilities of the model: Log Likelihood or Negative Log Likelihood. I will use Negative Log Likelihood (NLL) because it is more interpretable. The other is Perplexity. This is very related to NLL and originates from information theory as a way to quantify the information gain of the model's learned distribution to the empirical distribution of the test dataset. It is usually interpreted as the uniform uncertainty left in the data. At the time of writing this blog, masks in Keras currently do not get used in the accuracy calculations. But this will soon be implemented. Until then, there is a Keras fork that has these implemented. It can be found here. The custom_metrics from above would then simply be ['accuracy', 'perplexity']. Decoding Decoding is the process by which you infer a sequence of labels from a sequence of inputs. The idea and algorithms for it come from the signal processing research in which a noisy channel is emitting a signal and the task is to recover the signal. Typically, there is an encoder at one end that provides information so that a decoder at the other end can decode it. This means sequentially deciding which discrete token each part of the signal represents. In NLP, decoding is essentially the same task. In a sequence, the history of tokens can influence the likelihood of future tokens, so naive decoding by selecting the most likely token at each time step may not be the optimal sequence. The alternative solution, enumerating all possible sequences, is prohibitively expensive because of the combinatoric explosion of paths. Luckily, there are dynamic programming algorithms, such as the Viterbi algorithm, which solve such issues. The idea behind Viterbi is simple: Obtain the maximum likelihood classification at the last time step. At each time t, there is a set of k hypotheses that can be True. By looking at the previous time steps k hypotheses scores and the cost of transition from each to the current k possible, you can compute the best path so far for each of the k hypotheses. Thus, at every time step, you can do a linear update and ensure the optimal set of paths. At every time step, the backpointers were cached and used at the final time step to decode the path through the k states. Viterbi has its own limitations, such as also becoming expensive when the discrete hypothesis space is large. There are additional approximations, such as Beam Search, which uses a subset of the viterbi paths (selected by score at every time step). Several tasks are accomplished with this decoding. Sampling to produce a sentence (or caption an image) is typically done with a Beam search. Additionally, labeling each word in a sentence (such as part of speech tagging or entity tagging) is done with a sequential decoding procedure. Conclusion Now that you have completed this three part series, you can start to run your own NLP experiments! About the author Brian McMahan is in his final year of graduate school at Rutgers University, completing a PhD in computer science and an MS in cognitive psychology. He holds a BS in cognitive science from Minnesota State University, Mankato. At Rutgers, Brian investigates how natural language and computer vision can be brought closer together with the aim of developing interactive machines that can coordinate in the real world. His research uses machine learning models to derive flexible semantic representations of open-ended perceptual language.

0
0
1379

How-To Tutorials

article-image-introduction-neural-networks-chainer-part-3

Hiroyuki Vincent

06 Oct 2016

7 min read

Introduction to Neural Networks with Chainer – Part 3

Hiroyuki Vincent

06 Oct 2016

7 min read

In this final post of this three part series we cover the optimizer, batches, complex and networks and we also discuss running the code of our example on the GPU. Let’s continue from where we left off in Part 2 with training the model to go over optimizers. Optimizers The optimizer module chainer.optimizer is responsible for orchestrating the parameter updates minimizing the loss. It is instantiated with a class that inherits the base class chainer.optimizer.Optimizer. Once instantiated, it needs to be set up by calling its setup() method, passing it a Link to optimize, that in turn contains all the Chainer variables that are to be trained. Remember that we give it the whole model, an instance of the chainer.link.Chain class which is a subclass of the Chainer chainer.link.Link. We can then call the update method every time we want to optimize the parameters in the training loop. The update method can be invoked by passing a loss function and the arguments to it, in this case the input value and the target value. One motivation for defining the __call__ method as the loss function is precisely so that the model instance can be passed to the optimzer here. Note that the loss is both stored in the class instance and returned. It is returned so that the model instance can be directly passed to the optimizer, which expects a loss function. But, the loss is also stored in the model itself so that it can be read by the training loop to compute the average loss over the course of an epoch. The chainer.optimizers.SGD used in this example inherits from chainer.optimizer.GradientMethod which in turn inherits from the base optimizer class. The SGD, or Stochastic Gradient Descent optimizer performs a parameter optimization in each update much similar to the one we did in the previous article when performing a gradient descent. You have probably noted that it takes the learning rate as an argument in its constructor. What the update method actually does is that it first resets the gradients for the variables in the model, computes the loss, runs the backward() method on the parameters and then updates the parameters using the algorithm defined by the optimizer instance, a simple SGD in our case. Other optimization algorithms in the framework include Momentum SGD, AdaGrad, AdaDelta, Adam and RMSProp. To train the model in this example using AdaGrad instead of SGD, you only need to change the instantiation of the optimizer from optimizer = optimizers.SGD(lr=learning_rate) to optimizer = optimizers.AdaGrad(lr=learning_rate, eps=1e-08). The arguments differ from optimizer to optimizer. AdaGrad also takes the epsilon smoothing term for instance, usually a small number that in Chainer defaults to 1e-08. Always in Batches Each time you feed the model with training data (invoke the __call__ method directly on the model instance or use optimizer.update()), the data being passed needs to be in a mini-batch format. This means that you cannot feed the model above with data such as x = Variable(np.array([1, 2, 3], dtype=np.float32)) because you want to do online training. This would result in the following error. ... Invalid operation is performed in: LinearFunction (Forward) Expect: in_types[0].ndim >= 2 Actual: 1 < 2 However, wrapping the array in another list x = Variable(np.array([[1, 2, 3]], dtype=np.float32)) would work since you simply made a mini-batch of size one, which would yield the same results as online training. You can of course train the model using regular batch training by passing the whole dataset to the model to perform one update per epoch. You might have noticed that the loss being returned is multiplied by the size of the batch. This is because the Chainer loss functions such as chainer.functions.mean_squared_error returns the average loss over the given mini-batch. In order to compute the average loss over the complete dataset, we keep track of an accumulated loss over each individual sample. If you don't want to do that, you could simply pass the whole dataset as a single batch into the model at the end of an epoch to compute the loss. Extending to More Complex Networks Adding activation functions or noise such as dropout can be done by adding a single function calls in the model definition. They are all part of the chainer.functions module. If you'd like to add 50% dropout to the hidden layer in our autoencoder, you'd change the first forward pass line of code from h = self.l1(x) to h = F.dropout(self.l1(x), ratio=0.5). Since this is such a small network, you will see that the loss increases quite significantly. Adding a ReLU activation function would look like this, F.relu(self.l1(x)). These methods can be applied to other Links as well and not just the linear connection that we've used in the autoencoder. Creating other types of networks such as convolutional neural networks or recurrent ones are done by changing the layer definitions in the Chain constructor. What you need to be careful of when training is mainly to make sure that the dimensions of the Chainer variables that are passed between the layers and especially in the input layer match. If the first layer of a network is a convolutional layer, that is chainer.links.Convolution2D, the input dimensions is slightly different from this autoencoder example since there is an additional channel, width and height dimensionality to the data. Remember the data is still passed in batches. Running the Same Code on The GPU Assuming that CUDA is installed, you only need to add a few lines of code in order to train the model on the GPU. What you only need to do is to copy the Chainer Links and the training data to the GPU and you can run the same training code. Since the CuPy interface in Chainer implements a subset of the NumPy interface we could write it nicely in the following way. import numpy as np from chainer import cuda from chainer.cuda import cupy as cp xp = None model = Autoencoder() if device_id > 0: cuda.check_cuda_available() # A CUDA installation was found, set the default device cuda.get_device(device_id).use() # Copy the Links to the default device model.to_gpu() xp = cp else: xp = np # Replace all old occurrences of np with xp The device_id in this case identifies a GPU. If you have installed CUDA correctly, you should be able to list all devices with the nvidia-smi command in the CLI to see exactly which devices are available. The id can of course be hardcoded in the code itself but could for example be passed as an argument to the Python script. Depending on the specified id and the availability, the variable xp is set to NumPy or CuPy accordingly. What you need to change in the rest of the code is simply replacing all previous occurrences of np with xp. Saving and Loading Training Data The trained parameters can be written to files for persistence. This is natively supported by the framework using any of the modules in the chainer.serializers. It is also possible to load parameters into existing models and their layers in the same manner. This is useful when you need to stop the training or want to take snapshots during the training process. Summary Defining and training neural networks with Chainer is intuitive and requires little code. It is easy to maintain and experiment with various hyper parameters because of its design. In this second part of the series with Chainer, we implemented a neural network and trained it with randomly generated data and common patterns were introduced such as how to design the model and the loss function to demonstrate this fact. The network covered in this article along with its data is more or less a toy problem. But, hopefully you will try Chainer out on your own and experiment with what it's capable of. About the Author Hiroyuki Vincent Yamazaki is a graduate student at KTH, Royal Institute of Technology in Sweden, currently conducting research in convolutional neural networks at Keio University in Tokyo, partially using Chainer as a part of a double-degree programme. GitHub LinkedIn

0
0
1406

How-To Tutorials

Amit Kothari

06 Oct 2016

6 min read

How to Manage Legacy Code

Amit Kothari

06 Oct 2016

6 min read

In my 10+ years of experience working as a software developer, I have spent more time working on legacy systems than on greenfield projects. What is legacy code and why do companies and teams invest so much to rewrite the code which is already working? What is legacy code? When we talk about legacy code, we think of old code written in old technology. But in reality, any code that is difficult to work with is legacy. If you are working on a code base that is difficult to understand, and making a small change takes a long time, then the code is legacy code. “Code without tests is bad code. It doesn't matter how well written it is; it doesn't matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don't know if our code is getting better or worse.” - Michael C. Feathers, Working Effectively with Legacy Code In today’s competitive world, companies want the software to be able to change quickly. However, working with legacy code is always slow because of the fear of introducing bugs, since there is no way to verify that the system is working as expected. Software modernization In my experience, the most common strategy to deal with legacy code is to replace it with clean code. If we keep adding more code to a legacy system without cleaning it up, we are just increasing to the technical debt. The process of rewriting legacy code to a modern technology is known as software modernization. Rewriting an existing application is quite different from writing a new app. In a greenfield app, we can start with a minimal viable product and then evolve our system based on user feedback. This image nicely explains the development cycle of a new product, so that you solve a user problem and then iteratively make it better. Unfortunately, when rewriting an existing app, we cannot follow the same approach because the end users already have the full-fledged solution. It will be like asking a user to use a skateboard as a replacement for a car. Because of this reason, some teams decide to write a new system in parallel to the legacy system. This means the legacy system is still live and in operation while the team is working on a new system to replace it. I have seen this approach work in a few small projects where the legacy system was quite stable and not in active development. However, in case of a large system that was developed over many years and is still changing based on new requirements, there is a lot of code to rewrite, which in most cases is not well documented and the requirements are not clear. All of this results in projects getting delayed, and business is not getting anything in return of their investments, and eventually projects get cancelled after companies have already invested a lot of time and money into it. Strategy to manage legacy systems Just like a new application, it is not a great idea to follow a big bang approach to tackle a legacy system. Here is the strategy to convert large legacy codebase to clean code. Divide and rule - Divide the system into different domains or modules and start rewriting one module at the time. The smaller the modules, the easier and quicker they will be to replace. The business will see return for their investment, users will get a better system, and along the way the development team can learn how to make the whole process smoother and better for the next module rewrite. Integrate with the existing system - It is worth investigating on how to integrate a module written in the new technology with the existing system. Start with a walking skeleton, that is, a very small functionality written in the new technology, and integrate it with the existing system. It is better to link all of the main architectural components as soon as possible instead of leaving the integration part for later. Test covering - While dealing with legacy code, there is a very good chance that the requirements are not very clear or well documented. We want the new module to work the same way as the existing legacy system. The best way to achieve this is by writing tests. Start with writing tests to run against the existing code and verify its behavior. Then the same tests can be reused and run against the new module to make sure it works the same way as the existing code. Faster feedback cycle - Even if we break down the system into smaller modules, we want to keep the feedback loop as small as possible. Use a continuous delivery approach to release software faster and more frequently to the users. Automate the release and deployment process and keep the development and testing environment in production—try to avoid any last-minute issues on deployment. Build a better system not a replica - We are putting all of this effort in rewriting the system, in order to think about making it better. Improve user experience by making the interface intuitive and simplify workflow, save time and effort by removing features that are not required anymore, and add the features that the users always wanted. Work in feature/domain teams - One key difference I have noticed between successful and failed/delayed projects is the team structure. Projects with cross functional teams have higher chances of meeting their goals, because everyone required to deliver the project is part of the same team. On the other hand, dividing teams based on technology like frontends, backends, and operations team can slow down the delivery process due to the lack of communication, and because each team has different priorities. Hopefully this has given you a good starting point for approaching legacy code. It can certainly be a bit of a challenge, and it demands that you think about the specifics of a given situation and ask yourself what's at stake and what's really important. If you're interested in learning more about dealing with legacy code, check out this article. About the author Amit Kothari is a full-stack software developer based in Melbourne, Australia. He has 10+ years of experience in designing and implementing software mainly in Java/JEE. His recent experience is in building web applications using JavaScript frameworks like React and AngularJS and backend micro services/ REST API in Java. He is passionate about lean software development and continuous delivery.

0
0
4349

How-To Tutorials

article-image-frontend-development-bootstrap-4

Packt

06 Oct 2016

19 min read

Frontend development with Bootstrap 4

Packt

06 Oct 2016

19 min read

0
0
34510

How-To Tutorials

article-image-basics-classes-and-objects

Packt

06 Oct 2016

11 min read

Basics of Classes and Objects

Packt

06 Oct 2016

11 min read

In this article by Steven Lott, the author of the book Modern Python Cookbook, we will see how to use a class to encapsulate data plus processing. (For more resources related to this topic, see here.) Introduction The point of computing is to process data. Even when building something like an interactive game, the game state and the player's actions are the data, the processing computes the next game state and the display update. The data plus processing is ubiquitous. Some games can have a relatively complex internal state. When we think of console games with multiple players and complex graphics, there are complex, real-time state changes. On the other hand, when we think of a very simple casino game like Craps, the game state is very simple. There may be no point established, or one of the numbers 4, 5, 6, 8, 9, 10 may be the established point. The transitions are relatively simple, and are often denoted by moving markers and chips around on the casino table. The data includes the current state, player actions, and rolls of the dice. The processing is the rules of the game. A game like Blackjack has a somewhat more complex internal state change as each card is accepted. In games where the hands can be split, the state of play can become quite complex. The data includes the current game state, the player's commands, and the cards drawn from the deck. Processing is defined by the rules of the game as modified by any house rules. In the case of Craps, the player may place bets. Interestingly, the player's input, has no effect on the game state. The internal state of the game object is determined entirely by the next throw of the dice. This leads to a class design that's relatively easy to visualize. Using a class to encapsulate data plus processing The essential idea of computing is to process data. This is exemplified when we write functions that process data. Often, we'd like to have a number of closely related functions that work with a common data structure. This concept is the heart of object-oriented programming. A class definition will contain a number of methods that will control the internal state of an object. The unifying concept behind a class definition is often captured as a summary of the responsibilities allocated to the class. How can we do this effectively? What's a good way to design a class? Getting Ready Let's look at a simple, stateful object—a pair of dice. The context for this would be an application which simulates the casino game of Craps. The goal is to use simulation of results to help invent a better playing strategy. This will save us from losing real money while we try to beat the house edge. There's an important distinction between the class definition and an instance of the class, called an object. We call this idea – as a whole – Object-Oriented Programming. Our focus is on writing class definitions. Our overall application will create instances of the classes. The behavior that emerges from the collaboration of the instances is the overall goal of the design process. Most of the design effort is on class definitions. Because of this, the name object-oriented programming can be misleading. The idea of emergent behavior is an essential ingredient in object-oriented programming. We don't specify every behavior of a program. Instead, we decompose the program into objects, define the object's state and behavior via the object's classes. The programming decomposes into class definitions based on their responsibilities and collaborations. An object should be viewed as a thing—a noun. The behavior of the class should be viewed as verbs. This gives us a hint as to how we can proceed with design classes that work effectively. Object-oriented design is often easiest to understand when it relates to tangible real-world things. It's often easier to write a software to simulate a playing card than to create a software that implements an Abstract Data Type (ADT). For this example, we'll simulate the rolling of die. For some games – like the casino game of Craps – two dice are used. We'll define a class which models the pair of dice. To be sure that the example is tangible, we'll model the pair of dice in the context of simulating a casino game. How to do it... Write down simple sentences that describe what an instance of the class does. We can call these as the problem statements. It's essential to focus on short sentences, and emphasize the nouns and verbs. The game of Craps has two standard dice. Each die has six faces with point values from 1 to 6. Dice are rolled by a player. The total of the dice changes the state of the craps game. However, those rules are separate from the dice. If the two dice match, the number was rolled the hard way. If the two dice do not match, the number was easy. Some bets depend on this hard vs easy distinction. Identify all of the nouns in the sentences. Nouns may identify different classes of objects. These are collaborators. Examples include player and game. Nouns may also identify attributes of objects in questions. Examples include face and point value. Identify all the verbs in the sentences. Verbs are generally methods of the class in question. Examples include rolled and match. Sometimes, they are methods of other classes. Examples include change the state, which applies to the Craps game. Identify any adjectives. Adjectives are words or phrases which clarify a noun. In many cases, some adjectives will clearly be properties of an object. In other cases, the adjectives will describe relationships among objects. In our example, a phrase like the total of the dice is an example of a prepositional phrase taking the role of an adjective. The the total of phrase modifies the noun the dice. The total is a property of the pair of dice. Start writing the class with the class statement. class Dice: Initialize the object's attributes in the __init__ method. def __init__(self): self.faces = None We'll model the internal state of the dice with the self.faces attribute. The self variable is required to be sure that we're referencing an attribute of a given instance of a class. The object is identified by the value of the instance variable, self We could put some other properties here as well. The alternative is to implement the properties as separate methods. These details of the design decision is the subject for using properties for lazy attributes. Define the object's methods based on the various verbs. In our case, we have several methods that must be defined. Here's how we can implement dice are rolled by a player. def roll(self): self.faces = (random.randint(1,6), random.randint(1,6)) We've updated the internal state of the dice by setting the self.faces attribute. Again, the self variable is essential for identifying the object to be updated. Note that this method mutates the internal state of the object. We've elected to not return a value. This makes our approach somewhat like the approach of Python's built-in collection classes. Any method which mutates the object does not return a value. This method helps implement the total of the dice changes the state of the Craps game. The game is a separate object, but this method provides a total that fits the sentence. def total(self): return sum(self.faces) These two methods help answer the hard way and easy way questions. def hardway(self): return self.faces[0] == self.faces[1] def easyway(self): return self.faces[0] != self.faces[1] It's rare in a casino game to have a rule that has a simple logical inverse. It's more common to have a rare third alternative that has a remarkably bad payoff rule. In this case, we could have defined easy way as return not self.hardway(). Here's an example of using the class. First, we'll seed the random number generator with a fixed value, so that we can get a fixed sequence of results. This is a way to create a unit test for this class. >>> import random >>> random.seed(1) We'll create a Dice object, d1. We can then set its state with the roll() method. We'll then look at the total() method to see what was rolled. We'll examine the state by looking at the faces attribute. >>> from ch06_r01 import Dice >>> d1 = Dice() >>> d1.roll() >>> d1.total() 7 >>> d1.faces (2, 5) We'll create a second Dice object, d2. We can then set its state with the roll() method. We'll look at the result of the total() method, as well as the hardway() method. We'll examine the state by looking at the faces attribute. >>> d2 = Dice() >>> d2.roll() >>> d2.total() 4 >>> d2.hardway() False >>> d2.faces (1, 3) Since the two objects are independent instances of the Dice class, a change to d2 has no effect on d1. >>> d1.total() 7 How it works... The core idea here is to use ordinary rules of grammar – nouns, verbs, and adjectives – as a way to identify basic features of a class. Noun represents things. A good descriptive sentence should focus on tangible, real-world things more than ideas or abstractions. In our example, dice are real things. We try to avoid using abstract terms like randomizers or event generators. It's easier to describe the tangible features of real things, and then locate an abstract implementation that offers some of the tangible features. The idea of rolling the dice is an example of physical action that we can model with a method definition. Clearly, this action changes the state of the object. In rare cases – one time in 36 – the next state will happen to match the previous state. Adjectives often hold the potential for confusion. There are several cases such as: Some adjectives like first, last, least, most, next, previous, and so on will have a simple interpretation. These can have a lazy implementation as a method or an eager implementation as an attribute value. Some adjectives are more complex phrase like "the total of the dice". This is an adjective phrase built from a noun (total) and a preposition (of). This, too, can be seen as a method or an attribute. Some adjectives involve nouns that appear elsewhere in our software. We might have had a phrase like "the state of the Craps game" is a phrase where "state of" modifies another object, the "Craps game". This is clearly only tangentially related to the dice themselves. This may reflect a relationship between "dice" and "game". We might add a sentence to the problem statement like "The dice are part of the game". This can help clarify the presence of a relationship between game and dice. Prepositional phrases like "are part of" can always be reversed to create the a statement from the other object's point of view—"The game contains dice". This can help clarify the relationships among objects. In Python, the attributes of an object are – by default – dynamic. We don't specific a fixed list of attributes. We can initialize some (or all) of the attributes in the __init__() method of a class definition. Since attributes aren't static, we have considerable flexibility in our design. There's more... Capturing the essential internal state, and methods that cause state change is the first step in good class design. We can summarize some helpful design principles using the acronym SOLID. Single Responsibility Principle: A class should have one clearly-defined responsibility. Open/Closed Principle: A class should be open to extension – generally via inheritance – but closed to modification. We should design our classes so that we don't need to tweak the code to add or change features. Liskov Substitution Principle: We need to design inheritance so that a subclass can be used in place of the superclass. Interface Segregation Principle: When writing a problem statement, we want to be sure that collaborating classes have as few dependencies as possible. In many cases, this principle will lead us to decompose large problems into many small class definitions. Dependency Inversion Principle: It's less than ideal for a class to depend directly on other classes. It's better if a class depends on an abstraction, and a concrete implementation class is substituted for the abstract class. The goal is to create classes that have the proper behavior and also adhere to the design principles. Resources for Article: Further resources on this subject: Python Data Structures [article] Web scraping with Python (Part 2) [article] How is Python code organized [article]

0
0
4277

Packt

06 Oct 2016

7 min read

Python for Driving Hardware

Packt

06 Oct 2016

7 min read

In this article by Tim Cox, author of the book Raspberry Pi Cookbook for Python Programmers - Second Edition, we will see how to control Raspberry Pi with the help of your own buttons and switches. (For more resources related to this topic, see here.) Responding to a button Many applications using the Raspberry Pi require that actions are activated without a keyboard and screen attached to it. The GPIO pins provide an excellent way for the Raspberry Pi to be controlled by your own buttons and switches without a mouse/keyboard and screen. Getting ready You will need the following equipment: 2 x DuPont female to male patch wires Mini breadboard (170 tie points) or a larger one Push button switch (momentary close) or a wire connection to make/break the circuit Breadboarding wire (solid core) 1k ohm resistor The switches are as seen in the following diagram: The push button switch and other types of switches The switches used in the following examples are single pole single throw (SPST) momentary close push button switches. Single pole (SP) means that there is one set of contacts that makes a connection. In the case of the push switch used here, the legs on each side are connected together with a single pole switch in the middle. A double pole (DP) switch acts just like a single pole switch, except that the two sides are separated electrically, allowing you to switch two separate components on/off at the same time. Single throw (ST) means the switch will make a connection with just one position; the other side will be left open. Double throw (DT) means both positions of the switch will connect to different parts. Momentary close means that the button will close the switch when pressed and automatically open it when released. A latched push button switch will remain closed until it is pressed again. The layout of the button circuit We will use sound in this example, so you will also need speakers or headphones attached to audio socket of the Raspberry Pi. You will need to install a program called flite using the following command, which will let us make the Raspberry Pi talk: sudo apt-get install flite After it has been installed, you can test it with the following command: sudo flite -t "hello I can talk" If it is a little too quiet (or too loud), you can adjust the volume (0-100 percent) using the following command: amixer set PCM 100% How to do it… Create the btntest.py script as follows: #!/usr/bin/python3 #btntest.py import time import os import RPi.GPIO as GPIO #HARDWARE SETUP # GPIO # 2[==X==1=======]26[=======]40 # 1[=============]25[=======]39 #Button Config BTN = 12 def gpio_setup(): #Setup the wiring GPIO.setmode(GPIO.BOARD) #Setup Ports GPIO.setup(BTN,GPIO.IN,pull_up_down=GPIO.PUD_UP) def main(): gpio_setup() count=0 btn_closed = True while True: btn_val = GPIO.input(BTN) if btn_val and btn_closed: print("OPEN") btn_closed=False elif btn_val==False and btn_closed==False: count+=1 print("CLOSE %s" % count) os.system("flite -t '%s'" % count) btn_closed=True time.sleep(0.1) try: main() finally: GPIO.cleanup() print("Closed Everything. END") #End How it works… We set up the GPIO pin as required, but this time as an input, and we also enable the internal pull-up resistor (refer to the Pull-up and pull-down resistor circuits subsection in the There's more… section of this recipe for more information) using the following code: GPIO.setup(BTN,GPIO.IN,pull_up_down=GPIO.PUD_UP) After the GPIO pin is set up, we create a loop that will continuously check the state of BTN using GPIO.input(). If the value returned is false, the pin has been connected to 0V (ground) through the switch, and we will use flite to count out loud for us each time the button is pressed. Since we have called the main function from within a try/finally condition, it will still call GPIO.cleanup() even if we close the program using Ctrl + Z. We use a short delay in the loop; this ensures that any noise from the contacts on the switch is ignored. This is because when we press the button, there isn't always perfect contact as we press or release it, and it may produce several triggers if we press it again too quickly. This is known as software debouncing; we ignore the bounce in the signal here. There's more… The Raspberry Pi GPIO pins must be used with care; voltages used for inputs should be within specific ranges, and any current drawn from them should be minimized using protective resistors. Safe voltages We must ensure that we only connect inputs that are between 0V (Ground) and 3.3V. Some processors use voltages between 0V and 5V, so extra components are required to interface safely with them. Never connect an input or component that uses 5V unless you are certain it is safe, or you will damage the GPIO ports of the Raspberry Pi. Pull-up and pull-down resistor circuits The previous code sets the GPIO pins to use an internal pull-up resistor. Without a pull-up resistor (or pull-down resistor) on the GPIO pin, the voltage is free to float somewhere between 3.3V and 0V, and the actual logical state remains undetermined (sometimes 1 and sometimes 0). Raspberry Pi's internal pull-up resistors are 50k ohm - 65k ohm and the pull-down resistors are 50k ohm - 65k ohm. External pull-up/pull-down resistors are often used in GPIO circuits (as shown in the following diagram), typically using 10k ohm or larger for similar reasons (giving a very small current draw when not active). A pull-up resistor allows a small amount of current to flow through the GPIO pin and will provide a high voltage when the switch isn't pressed. When the switch is pressed, the small current is replaced by the larger one flowing to 0V, so we get a low voltage on the GPIO pin instead. The switch is active low and logic 0 when pressed. It works as shown in the following diagram: A pull-up resistor circuit Pull-down resistors work in the same way, except the switch is active high (the GPIO pin is logic 1 when pressed). It works as shown in the following diagram: A pull-down resistor circuit Protection resistors In addition to the switch, the circuit includes a resistor in series with the switch to protect the GPIO pin as shown in the following diagram: A GPIO protective current-limiting resistor The purpose of the protection resistor is to protect the GPIO pin if it is accidentally set as an output rather than an input. Imagine, for instance, that we have our switch connected between the GPIO and ground. Now the GPIO pin is set as an output and switched on (driving it to 3.3V) as soon as we press the switch; without a resistor present, the GPIO pin will directly be connected to 0V. The GPIO will still try to drive it to 3.3V; this would cause the GPIO pin to burn out (since it would use too much current to drive the pin to the high state). If we use a 1k ohm resistor here, the pin is able to be driven high using an acceptable amount of current (I = V/R = 3.3/1k = 3.3mA). Resources for Article: Further resources on this subject: Raspberry Pi LED Blueprints [article] Raspberry Pi and 1-Wire [article] Learning BeagleBone Python Programming [article]

0
0
14300

How-To Tutorials

article-image-getting-organized-npm-and-bower

Packt

06 Oct 2016

13 min read

Getting Organized with NPM and Bower

Packt

06 Oct 2016

13 min read

In this article by Philip Klauzinski and John Moore, the authors of the book Mastering JavaScript Single Page Application Development, we will learn about the basics of NMP and Bower. JavaScript was the bane of the web development industry during the early days of the browser-rendered Internet. Now, powers hugely impactful libraries such as jQuery, and JavaScript-rendered content (as opposed to server-side-rendered content) is even indexed by many search engines. What was once largely considered an annoying language used primarily to generate popup windows and alert boxes has now become, arguably, the most popular programming language in the world. (For more resources related to this topic, see here.) Not only is JavaScript now more prevalent than ever in frontend architecture, but it has become a server-side language as well, thanks to the Node.js runtime. We have also now seen the proliferation of document-oriented databases, such as MongoDB, which store and return JSON data. With JavaScript present throughout the development stack, the door is now open for JavaScript developers to become full-stack developers without the need to learn a traditional server-side language. Given the right tools and know-how, any JavaScript developer can create single page applications (SPAs) comprising entirely the language they know best, and they can do so using an architecture such as MEAN (MongoDB, Express, AngularJS, and Node.js). Organization is key to the development of any complex single page application. If you don't get organized from the beginning, you are sure to introduce an inordinate number of regressions to your app. The Node.js ecosystem will help you do this with a full suite of indispensable and open source tools, three of which we will discuss here. In this article, you will learn about: Node Package Manager The Bower front-end package manager What is Node Package Manager? Within any full-stack JavaScript environment, Node Package Manager (NPM) will be your go-to tool for setting up your development environment and managing server-side libraries. NPM can be used within both global and isolated environment contexts. We will first explore the use of NPM globally. Installing Node.js and NPM NPM is a component of Node.js, so before you can use it, you must install Node.js. You can find installers for both Mac and Windows at nodejs.org. Once you have Node.js installed, using NPM is incredibly easy and is done from the command-line interface (CLI). Start by ensuring you have the latest version of NPM installed, as it is updated more often than Node.js itself: $ npm install -g npm When using NPM, the -g option will apply your changes to your global environment. In this case, you want your version of NPM to apply globally. As stated previously, NPM can be used to manage packages both globally and within isolated environments. Therefore, we want essential development tools to be applied globally so that you can use them in multiple projects on the same system. On Mac and some Unix-based systems, you may have to run the npm command as the superuser (prefix the command with sudo) in order to install packages globally, depending on how NPM was installed. If you run into this issue and wish to remove the need to prefix npm with sudo, see docs.npmjs.com/getting-started/fixing-npm-permissions. Configuring your package.json file For any project you develop, you will keep a local package.json file to manage your Node.js dependencies. This file should be stored at the root of your project directory, and it will only pertain to that isolated environment. This allows you to have multiple Node.js projects with different dependency chains on the same system. When beginning a new project, you can automate the creation of the package.json file from the command line: $ npm init Running npm init will take you through a series of JSON property names to define through command-line prompts, including your app's name, version number, description, and more. The name and version properties are required, and your Node.js package will not install without them being defined. Several of the properties will have a default value given within parentheses in the prompt so that you may simply hit Enter to continue. Other properties will simply allow you to hit Enter with a blank entry and will not be saved to the package.json file or be saved with a blank value: name: (my-app) version: (1.0.0) description: entry point: (index.js) The entry point prompt will be defined as the main property in package.json and is not necessary unless you are developing a Node.js application. In our case, we can forgo this field. The npm init command may in fact force you to save the main property, so you will have to edit package.json afterward to remove it; however, that field will have no effect on your web app. You may also choose to create the package.json file manually using a text editor if you know the appropriate structure to employ. Whichever method you choose, your initial version of the package.json file should look similar to the following example: { "name": "my-app", "version": "1.0.0", "author": "Philip Klauzinski", "license": "MIT", "description": "My JavaScript single page application." } If you want your project to be private and want to ensure that it does not accidently get published to the NPM registry, you may want to add the private property to your package.json file and set it to true. Additionally, you may remove some properties that only apply to a registered package: { "name": "my-app", "author": "Philip Klauzinski", "description": "My JavaScript single page application.", "private": true } Once you have your package.json file set up the way you like it, you can begin installing Node.js packages locally for your app. This is where the importance of dependencies begins to surface. NPM dependencies There are three types of dependencies that can be defined for any Node.js project in your package.json file: dependencies, devDependencies, and peerDependencies. For the purpose of building a web-based SPA, you will only need to use the devDependencies declaration. The devDependencies ones are those that are required for developing your application, but not required for its production environment or for simply running it. If other developers want to contribute to your Node.js application, they will need to run npm install from the command line to set up the proper development environment. For information on the other types of dependencies, see docs.npmjs.com. When adding devDependencies to your package.json file, the command line again comes to the rescue. Let's use the installation of Browserify as an example: $ npm install browserify --save-dev This will install Browserify locally and save it along with its version range to the devDependencies object in your package.json file. Once installed, your package.json file should look similar to the following example: { "name": "my-app", "version": "1.0.0", "author": "Philip Klauzinski", "license": "MIT", "devDependencies": { "browserify": "^12.0.1" } } The devDependencies object will store each package as key-value pairs, in which the key is the package name and the value is the version number or version range. Node.js uses semantic versioning, where the three digits of the version number represent MAJOR.MINOR.PATCH. For more information on semantic version formatting, see semver.org. Updating your development dependencies You will notice that the version number of the installed package is preceded by a caret (^) symbol by default. This means that package updates will only allow patch and minor updates for versions above 1.0.0. This is meant to prevent major version changes from breaking your dependency chain when updating your packages to the latest versions. To update your devDependencies and save the new version numbers, you will enter the following from the command line. $ npm update --save-dev Alternatively, you can use the -D option as a shortcut for --save-dev: $ npm update -D To update all globally installed NPM packages to their latest versions, run npm update with the -g option: $ npm update -g For more information on semantic versioning within NPM, see docs.npmjs.com/misc/semver. Now that you have NPM set up and you know how to install your development dependencies, you can move on to installing Bower. Bower Bower is a package manager for frontend web assets and libraries. You will use it to maintain your frontend stack and control version chains for libraries such as jQuery, AngularJS, and any other components necessary to your app's web interface. Installing Bower Bower is also a Node.js package, so you will install it using NPM, much like you did with the Browserify example installation in the previous section, but this time you will be installing the package globally. This will allow you to run bower from the command line anywhere on your system without having to install it locally for each project. $ npm install -g bower You can alternatively install Bower locally as a development dependency so that you may maintain different versions of it for different projects on the same system, but this is generally not necessary. $ npm install bower --save-dev Next, check that Bower is properly installed by querying the version from the command line. $ bower -v Bower also requires the Git version control system (VCS) to be installed on your system in order to work with packages. This is because Bower communicates directly with GitHub for package management data. If you do not have Git installed on your system, you can find instructions for Linux, Mac, and Windows at git-scm.com. Configuring your bower.json file The process of setting up your bower.json file is comparable to that of the package.json file for NPM. It uses the same JSON format, has both dependencies and devDependencies, and can also be automatically created. $ bower init Once you type bower init from the command line, you will be prompted to define several properties with some defaults given within parentheses: ? name: my-app ? version: 0.0.0 ? description: My app description. ? main file: index.html ? what types of modules does this package expose? (Press <space> to? what types of modules does this package expose? globals ? keywords: my, app, keywords ? authors: Philip Klauzinski ? license: MIT ? homepage: http://gui.ninja ? set currently installed components as dependencies? No ? add commonly ignored files to ignore list? Yes ? would you like to mark this package as private which prevents it from being accidentally published to the registry? Yes These questions may vary depending on the version of Bower you install. Most properties in the bower.json file are not necessary unless you are publishing your project to the Bower registry, indicated in the final prompt. You will most likely want to mark your package as private unless you plan to register it and allow others to download it as a Bower package. Once you have created the bower.json file, you can open it in a text editor and change or remove any properties you wish. It should look something like the following example: { "name": "my-app", "version": "0.0.0", "authors": [ "Philip Klauzinski" ], "description": "My app description.", "main": "index.html", "moduleType": [ "globals" ], "keywords": [ "my", "app", "keywords" ], "license": "MIT", "homepage": "http://gui.ninja", "ignore": [ "**/.*", "node_modules", "bower_components", "test", "tests" ], "private": true } If you wish to keep your project private, you can reduce your bower.json file to two properties before continuing: { "name": "my-app", "private": true } Once you have the initial version of your bower.json file set up the way you like it, you can begin installing components for your app. Bower components location and the .bowerrc file Bower will install components into a directory named bower_components by default. This directory will be located directly under the root of your project. If you wish to install your Bower components under a different directory name, you must create a local system file named .bowerrc and define the custom directory name there: { "directory": "path/to/my_components" } An object with only a single directory property name is all that is necessary to define a custom location for your Bower components. There are many other properties that can be configured within a .bowerrc file. For more information on configuring Bower, see bower.io/docs/config/. Bower dependencies Bower also allows you to define both the dependencies and devDependencies objects like NPM. The distinction with Bower, however, is that the dependencies object will contain the components necessary for running your app, while the devDependencies object is reserved for components that you might use for testing, transpiling, or anything that does not need to be included in your frontend stack. Bower packages are managed using the bower command from the CLI. This is a user command, so it does not require super user (sudo) permissions. Let's begin by installing jQuery as a frontend dependency for your app: $ bower install jquery --save The --save option on the command line will save the package and version number to the dependencies object in bower.json. Alternatively, you can use the -S option as a shortcut for --save: $ bower install jquery -S Next, let's install the Mocha JavaScript testing framework as a development dependency: $ bower install mocha --save-dev In this case, we will use --save-dev on the command line to save the package to the devDependencies object instead. Your bower.json file should now look similar to the following example: { "name": "my-app", "private": true, "dependencies": { "jquery": "~2.1.4" }, "devDependencies": { "mocha": "~2.3.4" } } Alternatively, you can use the -D option as a shortcut for --save-dev: $ bower install mocha –D You will notice that the package version numbers are preceded by the tilde (~) symbol by default, in contrast to the caret (^) symbol, as is the case with NPM. The tilde serves as a more stringent guard against package version updates. With a MAJOR.MINOR.PATCH version number, running bower update will only update to the latest patch version. If a version number is composed of only the major and minor versions, bower update will update the package to the latest minor version. Searching the Bower registry All registered Bower components are indexed and searchable through the command line. If you don't know the exact package name of a component you wish to install, you can perform a search to retrieve a list of matching names. Most components will have a list of keywords within their bower.json file so that you can more easily find the package without knowing the exact name. For example, you may want to install PhantomJS for headless browser testing: $ bower search phantomjs The list returned will include any package with phantomjs in the package name or within its keywords list: phantom git://github.com/ariya/phantomjs.git dt-phantomjs git://github.com/keesey/dt-phantomjs qunit-phantomjs-runner git://github.com/jonkemp/... parse-cookie-phantomjs git://github.com/sindresorhus/... highcharts-phantomjs git://github.com/pesla/highcharts-phantomjs.git mocha-phantomjs git://github.com/metaskills/mocha-phantomjs.git purescript-phantomjs git://github.com/cxfreeio/purescript-phantomjs.git You can see from the returned list that the correct package name for PhantomJS is in fact phantom and not phantomjs. You can then proceed to install the package now that you know the correct name: $ bower install phantom --save-dev Now, you have Bower installed and know how to manage your frontend web components and development tools, but how do you integrate them into your SPA? This is where Grunt comes in. Summary Now that you have learned to set up an optimal development environment with NPM and supply it with frontend dependencies using Bower, it's time to start learning more about building a real app. Resources for Article: Further resources on this subject: API with MongoDB and Node.js [article] Tips & Tricks for Ext JS 3.x [article] Responsive Visualizations Using D3.js and Bootstrap [article]

0
0
12818

How-To Tutorials

article-image-reactive-python-asynchronous-programming-rescue-part-1

Xavier Bruhiere

05 Oct 2016

7 min read

Reactive Python – Asynchronous programming to the rescue, Part 1

Xavier Bruhiere

05 Oct 2016

7 min read

On the Confluent website, you can find this title: Stream data changes everything From the createors of Kafka, a real-time messaging system, this is not a surprising assertion. Yet, data streaming infrastructures have gained in popularity and many projects require the data to be processed as soon as it shows up. This contributed to the development of famous technologies like Spark Stremaing, Apache Storm and more broadly websockets. This latest piece of software in particular brought real-time data feeds to web applications, trying to solve low-latency connections. Coupled with the asynchronous Node.js, you can build a powerful event-based reactive system. But what about Python? Given the popularity of the language in data science, would it be possible to bring the benefits of this kind of data ingestion? As this two-part post series will show, it turns out that modern Python (Python 3.4 or later) supports asynchronous data streaming apps. Introducing asyncio Python 3.4 introduced in the standard library the module asyncio to provision the language with: Asynchronous I/O, event loop, coroutines and tasks While Python treats functions as first-class objects (meaning you can assign them to variables and pass them as arguments), most developers follow an imperative programming style. It seems on purpose: It requires super human discipline to write readable code in callbacks and if you don’t believe me look at any piece of JavaScript code. - Guido van Rossum So Asyncio is the pythonic answer to asynchronous programming. This paradigm makes a lot of sense for otherwise costly I/O operations or when we need events to trigger code. Scenario For fun and profit, let's build such a project. We will simulate a dummy electrical circuit composed of three components: A clock regularly ticking A board I/O pin randomly choosing to toggle its binary state on clock events A buzzer buzzing when the I/O pin flips to one This set us up with an interesting machine-to-machine communication problem to solve. Note that the code snippets in this post make use of features like async and await introduced in Python 3.5. While it would be possible to backport to Python 3.4, I highly recommend that you follow along with the same version or newer. Anaconda or Pyenv can ease the installation process if necessary. $ python --version Python 3.5.1 $ pip --version pip 8.1.2 Asynchronous webscoket Client/Server Our first step, the clock, will introduce both asyncio and websocket basics. We need a straightforward method that fires tick signals through a websocket and wait for acknowledgement. # filename: sketch.py async def clock(socket, port, tacks=3, delay=1) The async keyword is sugar syntaxing introduced in Python 3.5 to replace the previous @asyncio.coroutine. The official pep 492 explains it all but the tldr : API quality. To simplify websocket connection plumbing, we can take advantage of the eponymous package: pip install websockets==3.5.1. It hides the protocol's complexity behind an elegant context manager. # filename: sketch.py # the path "datafeed" in this uri will be a parameter available in the other side but we won't use it for this example uri = 'ws://{socket}:{port}/datafeed'.format(socket=socket, port=port) # manage asynchronously the connection async with websockets.connect(uri) as ws: for payload in range(tacks): print('[ clock ] > {}'.format(payload)) # send payload and wait for acknowledgement await ws.send(str(payload)) print('[ clock ] < {}'.format(await ws.recv())) time.sleep(delay) The keyword await was introduced with async and replaces the old yield from to read values from asynchronous functions. Inside the context manager the connection stays open and we can stream data to the server we contacted. The server: IOPin At the core of our application are entities capable of speaking to each other directly. To make things fun, we will expose the same API as Arduino sketches, or a setup method that runs once at startup and a loop called when new data is available. # -*- coding: utf-8 -*- # vim_fenc=utf-8 # # filename: factory.py import abc import asyncio import websockets class FactoryLoop(object): """ Glue components to manage the evented-loop model. """ __metaclass__ = abc.ABCMeta def__init__(self, *args, **kwargs): # call user-defined initialization self.setup(*args, **kwargs) def out(self, text): print('[ {} ] {}'.format(type(self).__name__, text)) @abc.abstractmethod def setup(self, *args, **kwargs): pass @abc.abstractmethod async def loop(self, channel, data): pass def run(self, host, port): try: server = websockets.serve(self.loop, host, port) self.out('serving on {}:{}'.format(host, port)) asyncio.get_event_loop().run_until_complete(server) asyncio.get_event_loop().run_forever() exceptOSError: self.out('Cannot bind to this port! Is the server already running?') exceptKeyboardInterrupt: self.out('Keyboard interruption, aborting.') asyncio.get_event_loop().stop() finally: asyncio.get_event_loop().close() The child objects will be required to implement setup and loop, while this class will take care of: Initializing the sketch Registering a websocket server based on a asynchronous callback (loop) Telling the event loop to poll for... events The websockets states the server callback is expected to have the signature on_connection(websocket, path). This is too low-level for our purpose. Instead, we can write a decorator to manage asyncio details, message passing, or error handling. We will only call self.loop with application-level-relevant information: the actual message and the websocket path. # filename: factory.py import functools import websockets def reactive(fn): @functools.wraps(fn) async def on_connection(klass, websocket, path): """Dispatch events and wrap execution.""" klass.out('** new client connected, path={}'.format(path)) # process messages as long as the connection is opened or # an error is raised whileTrue: try: message = await websocket.recv() aknowledgement = await fn(klass, path, message) await websocket.send(aknowledgement or 'n/a') except websockets.exceptions.ConnectionClosed as e: klass.out('done processing messages: {}n'.format(e)) break return on_connection Now we can develop a readable IOPin object. # filename: sketch.py import factory class IOPin(factory.FactoryLoop): """Set an IO pin to 0 or 1 randomly.""" def setup(self, chance=0.5, sequence=3): self.chance = chance self.sequence = chance def state(self): """Toggle state, sometimes.""" return0if random.random() < self.chance else1 @factory.reactive async def loop(self, channel, msg): """Callback on new data.""" self.out('new tick triggered on {}: {}'.format(channel, msg)) bits_stream = [self.state() for _ in range(self.sequence)] self.out('toggling pin state: {}'.format(bits_stream)) # ... # ... toggle pin state here # ... return'acknowledged' We finally need some glue to run both the clock and IOPin and test if the latter toggles its state when the former fires new ticks. The following snippet uses a convenient library, click 6.6, to parse command-line arguments. #! /usr/bin/env python # -*- coding: utf-8 -*- # vim_fenc=utf-8 # # filename: arduino.py import sys import asyncio import click import sketchs @click.command() @click.argument('sketch') @click.option('-s', '--socket', default='localhost', help='Websocket to bind to') @click.option('-p', '--port', default=8765, help='Websocket port to bind to') @click.option('-t', '--tacks', default=5, help='Number of clock ticks') @click.option('-d', '--delay', default=1, help='Clock intervals') def main(sketch, **flags): if sketch == 'clock': # delegate the asynchronous execution to the event loop asyncio.get_event_loop().run_until_complete(sketchs.clock(**flags)) elif sketch == 'iopin': # arguments in the constructor go as is to our `setup` method sketchs.IOPin(chance=0.6).run(flags['socket'], flags['port']) else: print('unknown sketch, please choose clock, iopin or buzzer') return1 return0 if__name__ == '__main__': sys.exit(main()) Don't forget to chmod +x the script and start the server in a first terminal ./arduino.py iopin. When it is listening for connections, start the clock with ./arduino.py clock and watch them communicate! Note that we used here common default host and port so they can find each other. We have a good start with our app, and now in Part 2 we will further explore peer-to-peer communication, service discovery, and the streaming machine-to-machine concept. About the author Xavier Bruhiere is a lead developer at AppTurbo in Paris, where he develops innovative prototypes to support company growth. He is addicted to learning, hacking on intriguing hot techs (both soft and hard), and practicing high intensity sports.

0
0
7553

article-image-modern-natural-language-processing-part-2

Brian McMahan

05 Oct 2016

10 min read

Modern Natural Language Processing – Part 2

Brian McMahan

05 Oct 2016

10 min read

In this series I am going to first introduce the basics of data munging—converting from raw data into a processed form amenable for machine learning tasks. Then, I will cover the basics of prepping the data for a learning algorithm, including constructing a customized embedding matrix from the current state of the art embeddings (and if you don't know what embeddings are, I will cover that too). I will be going over a useful way of structuring the various components--data manager, training model, driver, and utilities—that simultaneously allows for fast implementation and flexibility for future modifications to the experiment. And finally, I will cover an instance of a training model, showing how it connects up to the infrastructure outlined here, then consequently trained on the data, evaluated for performance, and used for tasks like sampling sentences. Here in Part 2, we cover Igor, embeddings, serving data, and different sized sentences and masking. Prep and Data Servers Given the earlier implementations (see Part 1), the data is a much more amenable format. However, now it needs to be loaded, prepped, and poised for use. Igor The manager for our data and parameters is nicknamed Igor for the assistant to Frankenstein. I will get into many of these functions in the next blog post. For now, it is vital to know that Igor can store the parameters in its __dict__ which allows for referencing using dot notation. ## igor.py import yaml class Igor(object): def__init__(self, config): self.__dict__.update(config) @classmethod def from_file(cls, yaml_file): withopen(yaml_file) as fp: return cls(yaml.load(fp)) Embeddings Now that we have our data in integer format, let's prep the rest of the experiment. A vital component in many modern day NLP systems are what have been called the 'sriracha' of NLP: word embeddings. What exactly are they though? They are individual vectors mapped to tokens (like our integers) that were trained to optimize learning objectives that encouraged similar words to have similar vectors. The reason they are so useful is that it gives the model a head start—it can immediately start associating overlapping signals from similar words in different sentences. We're going to work with GloVe embeddings. You can obtain all of them for free from the Stanford website. The following code assumes an igor that has various vital parameters: #### embedding.conf embedding_size: 300 target_glove: /path/to/glove/glove.840B.300d.txt vocab_file: /path/to/vocab/words.vocab save_dir: /path/to/savedir/ data_file: data.pkl It also assumes that the 300-dimensional, 840 billion common crawl vectors are used. There are smaller ones if they are more appropriate to your task. We will be only using a subset of the vectors. And then you can use a function like the following to compute an embedding matrix. In the next blog post I will cover how to use it. Note that tqdm is used here, but it doesn't have to be. It's a very handy progress bar. Also note this: I use the keras implementation of the Glorot Uniform initializer for words that aren't in the embedding data. ### utils.py from os import makedirs, path import tqdm from keras.initializations import glorot_uniform def embeddings_from_vocab(igor, vocab): print("using vocab and glove file to generate embedding matrix") remaining_vocab = set(vocab.keys()) embeddings = np.zeros((len(vocab), igor.embedding_size)) print("{} words to convert".format(len(remaining_vocab))) if igor.save_dir[-1] != "/": igor.save_dir += "/" if not path.exists(igor.save_dir): makedirs(igor.save_dir) fileiter = open(igor.target_glove).readlines() for line in tqdm(fileiter): line = line.replace("n","").split(" ") try: word, nums = line[0], [float(x.strip()) for x in line[1:]] if word in remaining_vocab: embeddings[vocab[word]] = np.array(nums) remaining_vocab.remove(word) exceptExceptionas e: print("{} broke. exception: {}. line: {}.".format(word, e, x)) print("{} words were not in glove; saving to oov.txt".format(len(remaining_vocab))) withopen(path.join(igor.save_dir, "oov.txt"), "w") as fp: fp.write("n".join(remaining_vocab)) for word in tqdm(remaining_vocab): embeddings[vocab[word]] = np.asarray(glorot_uniform((igor.embedding_size,)).eval()) withopen(path.join(igor.save_dir, "embedding.npy"), "wb") as fp: np.save(fp, embeddings) Serving Data Igor's main task of serving data is broken down into two key functions: serve_single and serve_batch. The class below is more fleshed out than the last Igor class. This time, it includes these two functions as well as some others. There are several main things to notice in the implementation below: 1. Each sentence is placed into a zero-matrix that is potentially larger than it. This is essential to what is known as masking (more on this below). 2. The sentences are offset by one and the target data is the next word. 3. The data is being served in batches. This is to maximize the efficiency of GPU capabilities. 4. The target variable, out_Y is being formatted with a to_categorical function. This encodes an integer as a one-hot vector. A one-hot vector is a vector with zeros at every spot except for one position - It is going to be used here with a cross entropy loss - which basically just means it will compute the dot product between the probability of every output (which is the same size as out_Y) and out_Y. In effect, this is the same thing as selecting a single element from the output probability vector. ### igor.py from keras.utils.data_utils import get_file from keras.utils.np_utils import to_categorical import yaml import itertools import numpy as np try: import cPickle as pickle except: import pickle from utils import Vocabulary class Igor(object): def__init__(self, config): self.__dict__.update(config) @classmethod def from_file(cls, yaml_file): withopen(yaml_file) as fp: return cls(yaml.load(fp)) @property def num_train_batches(self): returnlen(self.train_data)//self.batch_size @property def num_dev_batches(self): returnlen(self.dev_data)//self.batch_size @property def num_test_batches(self): returnlen(self.test_data)//self.batch_size @property def num_train_samples(self): returnself.num_train_batches * self.batch_size @property def num_dev_samples(self): returnself.num_dev_batches * self.batch_size @property def num_test_samples(self): returnself.num_test_batches * self.batch_size def _serve_single(self, data): for data_i in np.random.choice(len(data), len(data), replace=False): in_X = np.zeros(self.sequence_length) out_Y = np.zeros(self.sequence_length, dtype=np.int32) bigram_data = zip(data[data_i][0:-1], data[data_i][1:]) for datum_j,(datum_in, datum_out) in enumerate(bigram_data): in_X[datum_j] = datum_in out_Y[datum_j] = datum_out yield in_X, out_Y def _serve_batch(self, data): dataiter = self._serve_single(data) V = self.vocab_size S = self.sequence_length B = self.batch_size while dataiter: in_X = np.zeros((B, S), dtype=np.int32) out_Y = np.zeros((B, S, V), dtype=np.int32) next_batch = list(itertools.islice(dataiter, 0, self.batch_size)) iflen(next_batch) < self.batch_size: raiseStopIteration for d_i, (d_X, d_Y) in enumerate(next_batch): in_X[d_i] = d_X out_Y[d_i] = to_categorical(d_Y, V) yield in_X, out_Y def _data_gen(self, data, forever=True): ### extra boolean here so that it can go once through while loop working = True while working: for batch in self._serve_batch(data): yield batch working = working and forever def dev_gen(self, forever=True): returnself._data_gen(self.dev_data, forever) def train_gen(self, forever=True): returnself._data_gen(self.train_data, forever) def test_gen(self): returnself._data_gen(self.test_data, False) def prep(self): ## this assumes converted integer data has been placed into a pickle withopen(self.data_file) as fp: self.train_data, self.dev_data, self.test_data = pickle.load(fp) ifself.embeddings_file: self.saved_embeddings = np.load(self.embeddings_file) else: self.saved_embeddings = None self.vocab = Vocabulary.load(self.vocab_file) self.vocab_size = len(self.vocab) self.sequence_length = max(map(len, self.train_data+self.dev_data+self.train_data)) Different Sized Sentences and Masking There is one last piece of set-up information. In order to handle different sized sentences, you need to use a mask. What exactly is a mask, though? Well, since we are loading our data into a matrix that has the same size on each dimension, we have to adjust the values for the sentences of different length. For this task, we will use a specific numeric value set at the positions where there is no data. This is recognized by Keras internally as corresponding to positions where it should mask. More specifically, since we are using the Embedding layer, it will check where the input data equals this masked value. It will then push a binary matrix forward through your construct model so that it gets used in the correct spots. Note: There are a few types of layers which Keras can't push the mask through (without some clever finagling), but for this model it will. I will discuss how the mask gets used in the next post, but just know that the zero indexed token in our Vocabulary and the zeros in the data matrix correspond to masked positions. Conclusion And that's it! The data is now ready to be loaded up and served. An end-of-post note: Most of the prep code should be placed into a single, preprocessing script. It's sometimes easy just to add it to the bottom of the utils file. #### at the bottom of utils.py if__name__ == "__main__": print("getting data") raw_data = get_data() print("processing data") data, indices = process_raw_data(raw_data) print("formatting data") data, vocab = format_data(*data) print("making embeddings") from igor import Igor igor = Igor.from_file('embedding.conf') withopen(igor.data_file, 'w') as fp: pickle.dump(data, fp) vocab.save(path.join(igor.save_dir, igor.vocab_file)) embeddings_from_vocab(igor, vocab) and some of the important igor parameters so far: batch_size: 64 embedding_size: 300 rnn_size: 32 learning_rate: 0.0001 num_epochs: 100 ### set during computation vocab_size: 0 sequence_length: 0 ### file stuff data_file: data.pkl vocab_file: words.vocab embeddings_file: embedding.npy #~ # /path/to/embedding.npy ## or, if none, then ~ checkpoint_filepath: cp_weights.h5 Be sure to read Part 3 where I outline a language model and discuss the modeling choices. I will outline the algorithms needed to both decode from the language model and to sample from it. About the author Brian McMahan is in his final year of graduate school at Rutgers University, completing a PhD in computer science and an MS in cognitive psychology. He holds a BS in cognitive science from Minnesota State University, Mankato. At Rutgers, Brian investigates how natural language and computer vision can be brought closer together with the aim of developing interactive machines that can coordinate in the real world. His research uses machine learning models to derive flexible semantic representations of open-ended perceptual language.

0
0
1618

How-To Tutorials

article-image-reactive-python-real-time-events-processing

Xavier Bruhiere

04 Oct 2016

8 min read

Reactive Python - Real-time events processing

Xavier Bruhiere

04 Oct 2016

8 min read

A recent trend in programming literature promotes functional programming as a sensible alternative to object-oriented programs for many use cases. This subject feeds many discussions and highlights how important program design is as our applications are becoming more and more complex. Although there might be here some seductive intellectual challenge (because yeah, we love to juggle with elegant abstractions), there are also real business values : Building sustainable, maintainable programs Decoupling architecture components for proper team work Limiting bug exposure Better product iteration When developers spot an interesting approach to solve a recurrent issue in our industry, they formalize it as a design pattern. Today, we will discuss a powerful member of this family: the pattern observer. We won't dive into the strict rhetorical details (sorry, not sorry). Instead, we will delve how reactive programming can level up the quality of our work. It's Python Week. That means you can not only save 50% on some of our latest Python products, but you can also pick up a free Python eBook every single day! The scene That was a bold statement; let's illustrate that with a real-world scenario. Say we were tasked to build a monitoring system. We need some way to collect data, analyze it, and take actions when things go unexpected. Anomaly detection is an exciting yet challenging problem. We don't want our data scientists to be bothered by infrastructure failures. And in the same spirit, we need other engineers to focus only on how to react to specific disaster scenarios. The core of our approach consists of two components—a monitoring module firing and forgetting its discoveries on channels and another processing brick intercepting those events with an appropriate response. The UNIX philosophy at its best: do one thing and do it well. We split the infrastructure by concerns and the workers by event types. Assuming that our team defines well-documented interfaces, this is a promising design. The rest of the article will discuss the technical implementation but keep in mind that I/O documentation and proper processing of load estimation are also fundamental. The strategy Our local lab is composed of three elements: The alert module that we will emulate with a simple cli tool, which publishes alert messages. The actual processing unit subscribing to events it knows how to react to. A message broker supporting the Publish / Subscribe (or PUBSUB) pattern. For this purpose, Redis offers a popular, efficient, and rock solid solution. This is highly recommended, but the database isn't designed for this case. NATS, however, presents itself as follows: NATS acts as a central nervous system for distributed systems such as mobile devices, IoT networks, enterprise microservices and cloud native infrastructure. Unlike traditional enterprise messaging systems, NATS provides an always on ‘dial-tone’. Sounds promising! Client libraries are available for major languages, and Apcera, the company sponsoring the technology, has a solid reputation for building reliable distributed systems. Again, we won't delve how processing actually happens, only the orchestration of this three moving parts. The setup Since NATS is a message broker, we need to run a server locally (version 0.8.0 as of today). Gnatsd is the official and scalable first choice. It is written in Go, so we get performances and drop-in binary out of the box. For fans of microservices (as I am), an official Docker image is available for pulling. Also, for lazy ones (as I am), a demo server is already running at nats://demo.nats.io:4222. Services will use Python 3.5.1, but 2.7.10 should do the job with minimal changes. Our scenario is mostly about data analysis and system administration on the backend, and Python has a wide range of tools for both areas. So let's install the requirements: $ pip --version pip 8.1.1 $ pip install -e git+https://github.com/mcuadros/pynats@6851e84eb4b244d22ffae65e9fbf79bd9872a5b3#egg=pynats click==6.6 # for cli integration Thats'all. We are now ready to write services. Publishing events Let's warm up by sending some alerts to the cloud. First, we need to connect to the NATS server: # -*- coding: utf-8 -*- # vim_fenc=utf-8 # # filename: broker.py import pynats def nats_conn(conf): """Connect to nats server from environment variables. The point is to allow easy switching without to change the code. You can read more on this approach stolen from 12 factors apps. """ # the default value comes from docker-compose (https://docs.docker.com/compose/) services link behavior host = conf.get('__BROKER_HOST__', 'nats') port = conf.get('__BROKER_PORT__', 4222) opts = { 'url': conf.get('url', 'nats://{host}:{port}'.format(host=host, port=port)), 'verbose': conf.get('verbose', False) } print('connecting to broker ({opts})'.format(opts=opts)) conn = pynats.Connection(**opts) conn.connect() return conn This should be enough to start our client: # -*- coding: utf-8 -*- # vim_fenc=utf-8 # # filename: observer.py import os import broker def send(channel, msg): # use environment variables for configuration nats = broker.nats_conn(os.environ) nats.publish(channel, msg) nats.close() And right after that, a few lines of code to shape a cli tool: #! /usr/bin/env python # -*- coding: utf-8 -*- # vim_fenc=utf-8 # # filename: __main__.py import click @click.command() @click.argument('command') @click.option('--on', default='some_event', help='messages topic name') def main(command, on): if command == 'send': click.echo('publishing message') observer.send(on, 'Terminator just dropped in our space-time') if__name__ == '__main__': main() chmod +x ./__main__.py gives it execution permission so we can test how our first bytes are doing. $ # `click` package gives us a productive cli interface $ ./__main__.py --help Usage: __main__.py [OPTIONS] COMMAND Options: --on TEXT messages topic name --help Show this message and exit. $ __BROKER_HOST__="demo.nats.io"./__main__.py send --on=click connecting to broker ({'verbose': False, 'url': 'nats://demo.nats.io:4222'}) publishing message ... This is indeed quite poor in feedback, but no exception means that we did connect to the server and published a message. Reacting to events We're done with the heavy lifting! Now that interesting events are flying through the Internet, we can catch them and actually provide business values. Don't forget the point: let the team write reactive programs without worrying how it will be triggered. I found the following snippet to be a readable syntax for such a goal: # filename: __main__.py import observer @observer.On('terminator_detected') def alert_sarah_connor(msg): print(msg.data) As the capitalized letter of On suggests, this is a Python class, wrapping a NATS connection. It aims to call the decorated function whenever a new message goes through the given channel. Here is a naive implementation shamefully ignoring any reasonable error handling and safe connection termination (broker.nats_conn would be much more production-ready as a context manger, but hey, we do things that don't scale, move fast, and break things): # filename: observer.py class On(object): def__init__(self, event_name, **kwargs): self._count = kwargs.pop('count', None) self._event = event_name self._opts = kwargs or os.environ def__call__(self, fn): nats = broker.nats_conn(self._opts) subscription = nats.subscribe(self._event, fn) def inner(): print('waiting for incoming messages') nats.wait(self._count) # we are done nats.unsubscribe(subscription) return nats.close() return inner Instil some life into this file from the __main__.py: # filename: __main__.py @click.command() @click.argument('command') @click.option('--on', default='some_event', help='messages topic name') def main(command, on): if command == 'send': click.echo('publishing message') observer.send(on, 'bad robot detected') elif command == 'listen': try: alert_sarah_connor(): exceptKeyboardInterrupt: click.echo('caught CTRL-C, cleaning after ourselves...') Your linter might complain about the injection of the msg argument in alert_sarah_connor, but no offense, it should just work (tm): $ In a first terminal, listen to messages $ __BROKER_HOST__="demo.nats.io"./__main__.py listen connecting to broker ({'url': 'nats://demo.nats.io:4222', 'verbose': False}) waiting for incoming messages $ And fire up alerts in a second terminal __BROKER_HOST__="demo.nats.io"--on='terminator_detected' The data appears in the first terminal, celebrate! Conclusion Reactive programming implemented with the Publish/Subscribe pattern brings a lot of benefits for events-oriented products. Modular development, decoupled components, scalable distributed infrastructure, single-responsibility principle.One should think about how data flows into the system before diving into the technical details. This kind of approach also gains traction from real-time data processing pipelines (Riemann, Spark, and Kafka). NATS performances, indeed, allow ultra low-latency architectures development without too much of a deployment overhead. We covered in a few lines of Python the basics of a reactive programming design, with a lot of improvement opportunities: events filtering, built-in instrumentation, and infrastructure-wide error tracing. I hope you found in this article the building block to develop upon! About the author Xavier Bruhiere is the lead developer at AppTurbo in Paris, where he develops innovative prototypes to support company growth. He is addicted to learning, hacking on intriguing hot techs (both soft and hard), and practicing high intensity sports.

0
0
6204

Packt

04 Oct 2016

6 min read

Bootstrap and Angular: Saying Hello!

Packt

04 Oct 2016

6 min read

In this article by Sergey Akopkokhyants, author of the book Learning Web Development with Bootstrap and Angular (Second Edition), will establish a development environment for the simplest application possible. (For more resources related to this topic, see here.) Development environment setup It's time to set up your development environment. This process is one of the most overlooked and often frustrating parts of learning to program because developers don't want to think about it. The developers must know nuances how to install and configure many different programs before they start real development. Everyone's computers are different as a result; the same setup may not work on your computer. We will expose and eliminate all of these problems by defining the various pieces of environment you need to setup. Defining shell The shell is a required part of your software development environment. We will use the shell to install software, run commands to build and start the web server to bring the life to your web project. If your computer has installed Linux operating system then you will use the shell called Terminal. There are many Linux-based distributions out there that use diverse desktop environments, but most of them use the equivalent keyboard shortcut to open the Terminal. Use keyboard shortcut Ctrl + Alt + T to open Terminal in Linux. If you have a Mac computer with installed OS X, then you will use the Terminal shell as well. Use keyboard shortcut Command + Space to open the Spotlight, type Terminal to search and run. If you have a computer with installed Windows operation system, you can use the standard command prompt, but we can do better. In a minute later I will show you how can you install the Git on your computer, and you will have Git Bash free. You can open a Terminal with Git Bash shell program on Windows. I will use the shell bash for all exercises in this book whenever I need to work in the Terminal. Installing Node.js The Node.js is technology we will use as a cross-platform runtime environment for running server-side Web applications. It is a combination of native, platform independent runtime based on Google's V8 JavaScript engine and a huge number of modules written in JavaScript. Node.js ships with different connectors and libraries help you use HTTP, TLS, compression, file system access, raw TCP and UDP, and more. You as a developer can write own modules on JavaScript and run them inside Node.js engine. The Node.js runtime makes ease build a network, event-driven application servers. The terms package and library are synonymous in JavaScript so that we will use them interchangeably. Node.js is utilizing JavaScript Object Notation (JSON) format widely in data exchange between server and client sides because it readily expressed in several parse diagrams, notably without complexities of XML, SOAP, and other data exchange formats. You can use Node.js for the development of the service-oriented applications, doing something different than web servers. One of the most popular service-oriented application is Node Package Manager (NPM) we will use to manage library dependencies, deployment systems, and underlies the many platform-as-a-service (PaaS) providers for Node.js. If you do not have Node.js installed on your computer, you shall download the pre-build installer from https://nodejs.org/en/download. You can start to use the Node.js immediately after installation. Open the Terminal and type: node ––version The Node.js must respond with version number of installed runtime: v4.4.3 Setting up NPM The NPM is a package manager for JavaScript. You can use it to find, share, and reuse packages of code from many developers across the world. The number of packages dramatically grows every day and now is more than 250K. NPM is a Node.js package manager and utilizes it to run itself. NPM is included in setup bundle of Node.js and available just after installation. Open the Terminal and type: npm ––version The NPM must answer on your command with version number: 2.15.1 The following command gives us information about Node.js and NPM install: npm config list There are two ways to install NPM packages: locally or globally. In cases when you would like to use the package as a tool better install it globally: npm install ––global <package_name> If you need to find the folder with globally installed packages you can use the next command: npm config get prefix Installation global packages are important, but best avoid if not needed. Mostly you will install packages locally. npm install <package_name> You may find locally installed packages in a node_modules folder of your project. Installing Git You missed a lot if you are not familiar with Git. Git is a distributed version control system and each Git working directory is a full-fledged repository. It keeps the complete history of changes and has full version tracking capabilities. Each repository is entirely independent of network access or a central server. You can install Git on your computer via a set of pre-build installers available on official website https://git-scm.com/downloads. After installation, you can open the Terminal and type git –version Git must respond with version number git version 2.8.1.windows.1 As I said for developers who use computers with installed Windows operation system now, you have Git Bash free on your system. Code editor You can imagine how many programs for code editing exists but we will talk today only about free, open source and runs everywhere Visual Studio Code from Microsoft. You can use any program you prefer for development, but I use only Visual Studio Code in our future exercises, so please install it from http://code.visualstudio.com/Download. Summary This article, we learned about shell concept, how to install Node.js and Git, and setting up node packages. Resources for Article: Further resources on this subject: Gearing Up for Bootstrap 4 [article] API with MongoDB and Node.js [article] Mapping Requirements for a Modular Web Shop App [article]

0
0
29958

How-To Tutorials

article-image-supervised-machine-learning

Packt

04 Oct 2016

13 min read

Supervised Machine Learning

Packt

04 Oct 2016

13 min read

In this article by Anshul Joshi, the author of the book Julia for Data Science, we will learn that data science involves understanding data, gathering data, munging data, taking the meaning out of that data, and then machine learning if needed. Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. (For more resources related to this topic, see here.) The key features offered by Julia are: A general purpose high-level dynamic programming language designed to be effective for numerical and scientific computing A Low-Level Virtual Machine (LLVM) based Just-in-Time (JIT) compiler that enables Julia to approach the performance of statically-compiled languages like C/C++ What is machine learning? Generally, when we talk about machine learning, we get into the idea of us fighting wars with intelligent machines that we created but went out of control. These machines are able to outsmart the human race and become a threat to human existence. These theories are nothing but created for our entertainment. We are still very far away from such machines. So, the question is: what is machine learning? Tom M. Mitchell gave a formal definition- "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E." It says that machine learning is teaching computers to generate algorithms using data without programming them explicitly. It transforms data into actionable knowledge. Machine learning has close association with statistics, probability, and mathematical optimization. As technology grew, there is one thing that grew with it exponentially—data. We have huge amounts of unstructured and structured data growing at a very great pace. Lots of data is generated by space observatories, meteorologists, biologists, fitness sensors, surveys, and so on. It is not possible to manually go through this much amount of data and find patterns or gain insights. This data is very important for scientists, domain experts, governments, health officials, and even businesses. To gain knowledge out of this data, we need self-learning algorithms that can help us in decision making. Machine learning evolved as a subfield of artificial intelligence, which eliminates the need to manually analyze large amounts of data. Instead of using machine learning, we make data-driven decisions by gaining knowledge using self-learning predictive models. Machine learning has become important in our daily lives. Some common use cases include search engines, games, spam filters, and image recognition. Self-driving cars also use machine learning. Some basic terminologies used in machine learning: Features: Distinctive characteristics of the data point or record Training set: This is the dataset that we feed to train the algorithm that helps us to find relationships or build a model Testing set: The algorithm generated using the training dataset is tested on the testing dataset to find the accuracy Feature vector: An n-dimensional vector that contains the features defining an object Sample: An item from the dataset or the record Uses of machine learning Machine learning in one way or another is used everywhere. Its applications are endless. Let's discuss some very common use cases: E-mail spam filtering: Every major e-mail service provider uses machine learning to filter out spam messages from the Inbox to the Spam folder. Predicting storms and natural disasters: Machine learning is used by meteorologists and geologists to predict the natural disasters using weather data, which can help us to take preventive measures. Targeted promotions/campaigns and advertising: On social sites, search engines, and maybe in mailboxes, we see advertisements that somehow suit our taste. This is made feasible using machine learning on the data from our past searches, our social profile or the e-mail contents. Self-driving cars: Technology giants are currently working on self driving cars. This is made possible using machine learning on the feed of the actual data from human drivers, image and sound processing, and various other factors. Machine learning is also used by businesses to predict the market. It can also be used to predict the outcomes of elections and the sentiment of voters towards a particular candidate. Machine learning is also being used to prevent crime. By understanding the pattern of the different criminals, we can predict a crime that can happen in future and can prevent it. One case that got a huge amount of attention was of a big retail chain in the United States using machine learning to identify pregnant women. The retailer thought of the strategy to give discounts on multiple maternity products, so that they would become loyal customers and will purchase items for babies which have a high profit margin. The retailer worked on the algorithm to predict the pregnancy using useful patterns in purchases of different products which are useful for pregnant women. Once a man approached the retailer and asked for the reason that his teenage daughter is receiving discount coupons for maternity items. The retail chain offered an apology but later the father himself apologized when he got to know that his daughter was indeed pregnant. This story may or may not be completely true, but retailers indeed analyze their customers' data routinely to find out patterns and for targeted promotions, campaigns, and inventory management. Machine learning and ethics Let's see where machine learning is used very frequently: Retailers: In the previous example, we mentioned how retail chains use data for machine learning to increase their revenue as well as to retain their customers. Spam filtering: E-mails are processed using various machine learning algorithms for spam filtering. Targeted advertisements: In our mailbox, social sites, or search engines, we see advertisements of our liking. These are only some of the actual use cases that are implemented in the world today. One thing that is common between them is the user data. In the first example, retailers are using the history of transactions done by the user for targeted promotions and campaigns and for inventory management, among other things. Retail giants do this by providing users a loyalty or sign-up card. In the second example, the e-mail service provider uses trained machine learning algorithms to detect and flag spam. It does by going through the contents of e-mail/attachments and classifying the sender of the e-mail. In the third example, again the e-mail provider, social network, or search engine will go through our cookies, our profile, or our mails to do the targeted advertising. In all of these examples, it is mentioned in the terms and conditions of the agreement when we sign up with the retailer, e-mail provider, or social network that the user's data will be used but privacy will not be violated. It is really important that before using data that is not publicly available, we take the required permissions. Also, our machine learning models shouldn't do discrimination on the basis of region, race, and sex or of any other kind. The data provided should not be used for purposes not mentioned in the agreement or illegal in the region or country of existence. Machine learning – the process Machine learning algorithms are trained in keeping with the idea of how the human brain works. They are somewhat similar. Let's discuss the whole process. The machine learning process can be described in three steps: Input Abstraction Generalization These three steps are the core of how the machine learning algorithm works. Although the algorithm may or may not be divided or represented in such a way, this explains the overall approach. The first step concentrates on what data should be there and what shouldn't. On the basis of that, it gathers, stores, and cleans the data as per the requirements. The second step involves that the data be translated to represent the bigger class of data. This is required as we cannot capture everything and our algorithm should not be applicable for only the data that we have. The third step focuses on the creation of the model or an action that will use this abstracted data, which will be applicable for the broader mass. So, what should be the flow of approaching a machine learning problem? In this particular figure, we see that the data goes through the abstraction process before it can be used to create the machine learning algorithm. This process itself is cumbersome. The process follows the training of the model, which is fitting the model into the dataset that we have. The computer does not pick up the model on its own, but it is dependent on the learning task. The learning task also includes generalizing the knowledge gained on the data that we don't have yet. Therefore, training the model is on the data that we currently have and the learning task includes generalization of the model for future data. It depends on our model how it deduces knowledge from the dataset that we currently have. We need to make such a model that can gather insights into something that wasn't known to us before and how it is useful and can be linked to the future data. Different types of machine learning Machine learning is divided mainly into three categories: Supervised learning Unsupervised learning Reinforcement learning In supervised learning, the model/machine is presented with inputs and the outputs corresponding to those inputs. The machine learns from these inputs and applies this learning in further unseen data to generate outputs. Unsupervised learning doesn't have the required outputs; therefore it is up to the machine to learn and find patterns that were previously unseen. In reinforcement learning, the machine continuously interacts with the environment and learns through this process. This includes a feedback loop. Understanding decision trees Decision tree is a very good example of divide and conquer. It is one of the most practical and widely used methods for inductive inference. It is a supervised learning method that can be used for both classification and regression. It is non-parametric and its aim is to learn by inferring simple decision rules from the data and create such a model that can predict the value of the target variable. Before taking a decision, we analyze the probability of the pros and cons by weighing the different options that we have. Let's say we want to purchase a phone and we have multiple choices in the price segment. Each of the phones has something really good, and maybe better than the other. To make a choice, we start by considering the most important feature that we want. And like this, we create a series of features that it has to pass to become the ultimate choice. In this section, we will learn about: Decision trees Entropy measures Random forests We will also learn about famous decision tree learning algorithms such as ID3 and C5.0. Decision tree learning algorithms There are various decision tree learning algorithms that are actually variations of the core algorithm. The core algorithm is actually a top-down, greedy search through all possible trees. We are going to discuss two algorithms: ID3 C4.5 and C5.0 The first algorithm, Iterative Dichotomiser 3 (ID3), was developed by Ross Quinlan in 1986. The algorithm proceeds by creating a multiway tree, where it uses greedy search to find each node and the features that can yield maximum information gain for the categorical targets. As trees can grow to the maximum size, which can result in over-fitting of data, pruning is used to make the generalized model. C4.5 came after ID3 and eliminated the restriction that all features must be categorical. It does this by defining dynamically a discrete attribute based on the numerical variables. This partitions into a discrete set of intervals from the continuous attribute value. C4.5 creates sets of if-then rules from the trained trees of the ID3 algorithm. C5.0 is the latest version; it builds smaller rule sets and uses comparatively lesser memory. An example Let's apply what we've learned to create a decision tree using Julia. We will be using the example available for Python on scikit-learn.org and Scikitlearn.jl by Cedric St-Jean. We will first have to add the required packages: We will first have to add the required packages: julia> Pkg.update() julia> Pkg.add("DecisionTree") julia> Pkg.add("ScikitLearn") julia> Pkg.add("PyPlot") ScikitLearn provides the interface to the much-famous library of machine learning for Python to Julia: julia> using ScikitLearn julia> using DecisionTree julia> using PyPlot After adding the required packages, we will create the dataset that we will be using in our example: julia> # Create a random dataset julia> srand(100) julia> X = sort(5 * rand(80)) julia> XX = reshape(X, 80, 1) julia> y = sin(X) julia> y[1:5:end] += 3 * (0.5 – rand(16)) This will generate a 16-element Array{Float64,1}. Now we will create instances of two different models. One model is where we will not limit the depth of the tree, and in other model, we will prune the decision tree on the basis of purity: We will now fit the models to the dataset that we have. We will fit both the models. This is the first model. Here our decision tree has 25 leaf nodes and a depth of 8. This is the second model. Here we prune our decision tree. This has six leaf nodes and a depth of 4. Now we will use the models to predict on the test dataset: julia> # Predict julia> X_test = 0:0.01:5.0 julia> y_1 = predict(regr_1, hcat(X_test)) julia> y_2 = predict(regr_2, hcat(X_test)) This creates a 501-element Array{Float64,1}. To better understand the results, let's plot both the models on the dataset that we have: julia> # Plot the results julia> scatter(X, y, c="k", label="data") julia> plot(X_test, y_1, c="g", label="no pruning", linewidth=2) julia> plot(X_test, y_2, c="r", label="pruning_purity_threshold=0.05", linewidth=2) julia> xlabel("data") julia> ylabel("target") julia> title("Decision Tree Regression") julia> legend(prop=Dict("size"=>10)) Decision trees can tend to overfit data. It is required to prune the decision tree to make it more generalized. But if we do more pruning than required, then it may lead to an incorrect model. So, it is required that we find the most optimized pruning level. It is quite evident that the first decision tree overfits to our dataset, whereas the second decision tree model is comparatively more generalized. Summary In this article, we learned about machine learning and its uses. Providing computers the ability to learn and improve has far-reaching uses in this world. It is used in predicting disease outbreaks, predicting weather, games, robots, self-driving cars, personal assistants, and lot more. There are three different types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. We also learned about decision trees. Resources for Article: Further resources on this subject: Specialized Machine Learning Topics [article] Basics of Programming in Julia [article] More about Julia [article]

0
0
1994

Basics of Image Histograms in OpenCV

Reactive Python - Asynchronous programming to the rescue, Part 2

How to start Chainer

Modern Natural Language Processing – Part 3

Introduction to Neural Networks with Chainer – Part 3

How to Manage Legacy Code

Frontend development with Bootstrap 4

Basics of Classes and Objects

Python for Driving Hardware

Getting Organized with NPM and Bower

Trending Topics

Reactive Python – Asynchronous programming to the rescue, Part 1

Modern Natural Language Processing – Part 2

Reactive Python - Real-time events processing

Bootstrap and Angular: Saying Hello!

Supervised Machine Learning

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access