
How-To Tutorials

7019 Articles
Xavier Bruhiere
05 Oct 2016
7 min read

Reactive Python – Asynchronous programming to the rescue, Part 1

On the Confluent website, you can find this title: "Stream data changes everything". Coming from the creators of Kafka, a real-time messaging system, this is not a surprising assertion. Yet data streaming infrastructures have gained in popularity, and many projects require data to be processed as soon as it shows up. This contributed to the development of well-known technologies like Spark Streaming, Apache Storm and, more broadly, websockets.

Websockets in particular brought real-time data feeds to web applications, addressing the need for low-latency connections. Coupled with the asynchronous Node.js, they let you build a powerful event-based reactive system. But what about Python? Given the popularity of the language in data science, would it be possible to bring it the benefits of this kind of data ingestion? As this two-part post series will show, it turns out that modern Python (Python 3.4 or later) supports asynchronous data streaming apps.

Introducing asyncio

Python 3.4 introduced the asyncio module in the standard library to provision the language with asynchronous I/O, an event loop, coroutines and tasks.

While Python treats functions as first-class objects (meaning you can assign them to variables and pass them as arguments), most developers follow an imperative programming style. It seems to be on purpose:

"It requires super human discipline to write readable code in callbacks and if you don't believe me look at any piece of JavaScript code." - Guido van Rossum

So asyncio is the Pythonic answer to asynchronous programming. This paradigm makes a lot of sense for otherwise costly I/O operations, or when we need events to trigger code.

Scenario

For fun and profit, let's build such a project.
We will simulate a dummy electrical circuit composed of three components:

  • A clock ticking regularly
  • A board I/O pin randomly choosing to toggle its binary state on clock events
  • A buzzer buzzing when the I/O pin flips to one

This sets us up with an interesting machine-to-machine communication problem to solve.

Note that the code snippets in this post make use of features like async and await, introduced in Python 3.5. While it would be possible to backport them to Python 3.4, I highly recommend that you follow along with the same version or newer. Anaconda or Pyenv can ease the installation process if necessary.

    $ python --version
    Python 3.5.1
    $ pip --version
    pip 8.1.2

Asynchronous websocket client/server

Our first step, the clock, will introduce both asyncio and websocket basics. We need a straightforward method that fires tick signals through a websocket and waits for acknowledgement.

    # filename: sketch.py

    import asyncio

    import websockets


    async def clock(socket, port, tacks=3, delay=1):
        ...

The async keyword is syntactic sugar introduced in Python 3.5 to replace the previous @asyncio.coroutine decorator. PEP 492 explains it all, but the tl;dr is: API quality.

To simplify the websocket connection plumbing, we can take advantage of the eponymous package: pip install websockets==3.5.1. It hides the protocol's complexity behind an elegant context manager.

    # filename: sketch.py (body of clock)

        # the path "datafeed" in this uri will be a parameter available on the
        # other side, but we won't use it for this example
        uri = 'ws://{socket}:{port}/datafeed'.format(socket=socket, port=port)

        # manage the connection asynchronously
        async with websockets.connect(uri) as ws:
            for payload in range(tacks):
                print('[ clock ] > {}'.format(payload))
                # send payload and wait for acknowledgement
                await ws.send(str(payload))
                print('[ clock ] < {}'.format(await ws.recv()))
                # pause without blocking the event loop (time.sleep would block it)
                await asyncio.sleep(delay)

The keyword await was introduced along with async and replaces the old yield from to read values from asynchronous functions.
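To make the async/await mechanics concrete without any websocket plumbing, here is a minimal sketch that uses only the standard library. The names ticker and main, and the delays, are made up for illustration and are not part of the original project; the point is simply that awaiting asyncio.sleep lets two coroutines interleave on a single event loop.

```python
import asyncio


async def ticker(name, ticks, delay, log):
    """Record one entry per tick, yielding to the event loop between ticks."""
    for i in range(ticks):
        log.append((name, i))
        # asyncio.sleep suspends only this coroutine, so others can run
        await asyncio.sleep(delay)
    return name


async def main(log):
    # run both tickers concurrently; results come back in argument order
    return await asyncio.gather(
        ticker('fast', 3, 0.01, log),
        ticker('slow', 2, 0.02, log),
    )


# explicit loop management also works on Python 3.5, the version used here
loop = asyncio.new_event_loop()
try:
    log = []
    results = loop.run_until_complete(main(log))
finally:
    loop.close()

print(results)
print(log)
```

Both tickers record their first tick before either finishes, which shows that neither blocks the other while it sleeps.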
Inside the context manager, the connection stays open and we can stream data to the server we contacted.

The server: IOPin

At the core of our application are entities capable of speaking to each other directly. To make things fun, we will expose the same API as Arduino sketches: a setup method that runs once at startup, and a loop method called when new data is available.

    # -*- coding: utf-8 -*-
    # vim_fenc=utf-8
    #
    # filename: factory.py

    import abc
    import asyncio

    import websockets


    # Python 3 metaclass syntax; a class-level __metaclass__ attribute
    # is ignored in Python 3 and would not enforce the abstract methods
    class FactoryLoop(metaclass=abc.ABCMeta):
        """Glue components to manage the evented-loop model."""

        def __init__(self, *args, **kwargs):
            # call user-defined initialization
            self.setup(*args, **kwargs)

        def out(self, text):
            print('[ {} ] {}'.format(type(self).__name__, text))

        @abc.abstractmethod
        def setup(self, *args, **kwargs):
            pass

        @abc.abstractmethod
        async def loop(self, channel, data):
            pass

        def run(self, host, port):
            try:
                server = websockets.serve(self.loop, host, port)
                self.out('serving on {}:{}'.format(host, port))
                asyncio.get_event_loop().run_until_complete(server)
                asyncio.get_event_loop().run_forever()
            except OSError:
                self.out('Cannot bind to this port! Is the server already running?')
            except KeyboardInterrupt:
                self.out('Keyboard interruption, aborting.')
                asyncio.get_event_loop().stop()
            finally:
                asyncio.get_event_loop().close()

The child objects will be required to implement setup and loop, while this class will take care of:

  • Initializing the sketch
  • Registering a websocket server based on an asynchronous callback (loop)
  • Telling the event loop to poll for... events

The websockets documentation states that the server callback is expected to have the signature on_connection(websocket, path). This is too low-level for our purpose. Instead, we can write a decorator to manage asyncio details, message passing, and error handling. We will only call self.loop with application-level-relevant information: the actual message and the websocket path.
    # filename: factory.py

    import functools

    import websockets


    def reactive(fn):

        @functools.wraps(fn)
        async def on_connection(klass, websocket, path):
            """Dispatch events and wrap execution."""
            klass.out('** new client connected, path={}'.format(path))
            # process messages as long as the connection is open
            # or until an error is raised
            while True:
                try:
                    message = await websocket.recv()
                    acknowledgement = await fn(klass, path, message)
                    await websocket.send(acknowledgement or 'n/a')
                except websockets.exceptions.ConnectionClosed as e:
                    klass.out('done processing messages: {}\n'.format(e))
                    break

        return on_connection

Now we can develop a readable IOPin object.

    # filename: sketch.py

    import random

    import factory


    class IOPin(factory.FactoryLoop):
        """Set an IO pin to 0 or 1 randomly."""

        def setup(self, chance=0.5, sequence=3):
            self.chance = chance
            self.sequence = sequence

        def state(self):
            """Toggle state, sometimes."""
            return 0 if random.random() < self.chance else 1

        @factory.reactive
        async def loop(self, channel, msg):
            """Callback on new data."""
            self.out('new tick triggered on {}: {}'.format(channel, msg))
            bits_stream = [self.state() for _ in range(self.sequence)]
            self.out('toggling pin state: {}'.format(bits_stream))
            # ...
            # ... toggle pin state here
            # ...
            return 'acknowledged'

We finally need some glue to run both the clock and IOPin, and to test whether the latter toggles its state when the former fires new ticks. The following snippet uses a convenient library, click 6.6, to parse command-line arguments.

    #! /usr/bin/env python
    # -*- coding: utf-8 -*-
    # vim_fenc=utf-8
    #
    # filename: arduino.py

    import sys
    import asyncio

    import click

    # the module defined in sketch.py, aliased because main() takes
    # an argument that is also named `sketch`
    import sketch as sketchs


    @click.command()
    @click.argument('sketch')
    @click.option('-s', '--socket', default='localhost', help='Websocket to bind to')
    @click.option('-p', '--port', default=8765, help='Websocket port to bind to')
    @click.option('-t', '--tacks', default=5, help='Number of clock ticks')
    @click.option('-d', '--delay', default=1, help='Clock intervals')
    def main(sketch, **flags):
        if sketch == 'clock':
            # delegate the asynchronous execution to the event loop
            asyncio.get_event_loop().run_until_complete(sketchs.clock(**flags))
        elif sketch == 'iopin':
            # arguments in the constructor go as is to our `setup` method
            sketchs.IOPin(chance=0.6).run(flags['socket'], flags['port'])
        else:
            print('unknown sketch, please choose clock, iopin or buzzer')
            return 1

        return 0


    if __name__ == '__main__':
        sys.exit(main())

Don't forget to chmod +x the script, then start the server in a first terminal with ./arduino.py iopin. Once it is listening for connections, start the clock with ./arduino.py clock and watch them communicate! Note that both use the same default host and port, so they can find each other.

We have a good start with our app, and in Part 2 we will further explore peer-to-peer communication, service discovery, and the streaming machine-to-machine concept.

About the author

Xavier Bruhiere is a lead developer at AppTurbo in Paris, where he develops innovative prototypes to support company growth. He is addicted to learning, hacking on intriguing hot techs (both soft and hard), and practicing high intensity sports.

Brian McMahan
05 Oct 2016
10 min read

Modern Natural Language Processing – Part 2

In this series I am going to first introduce the basics of data munging: converting raw data into a processed form amenable to machine learning tasks. Then, I will cover the basics of prepping the data for a learning algorithm, including constructing a customized embedding matrix from current state-of-the-art embeddings (and if you don't know what embeddings are, I will cover that too). I will go over a useful way of structuring the various components (data manager, training model, driver, and utilities) that simultaneously allows for fast implementation and flexibility for future modifications to the experiment. And finally, I will cover an instance of a training model, showing how it connects to the infrastructure outlined here, how it is then trained on the data and evaluated for performance, and how it is used for tasks like sampling sentences.

Here in Part 2, we cover Igor, embeddings, serving data, and different sized sentences and masking.

Prep and Data Servers

Given the earlier implementations (see Part 1), the data is in a much more amenable format. However, now it needs to be loaded, prepped, and poised for use.

Igor

The manager for our data and parameters is nicknamed Igor, after Frankenstein's assistant. I will get into many of its functions in the next blog post. For now, it is vital to know that Igor stores the parameters in its __dict__, which allows them to be referenced using dot notation.

    ## igor.py

    import yaml


    class Igor(object):
        def __init__(self, config):
            self.__dict__.update(config)

        @classmethod
        def from_file(cls, yaml_file):
            with open(yaml_file) as fp:
                # safe_load avoids the Loader argument that recent PyYAML
                # versions require for yaml.load
                return cls(yaml.safe_load(fp))

Embeddings

Now that we have our data in integer format, let's prep the rest of the experiment. A vital component of many modern-day NLP systems is what has been called the 'sriracha' of NLP: word embeddings. What exactly are they though?
They are individual vectors mapped to tokens (like our integers), trained to optimize learning objectives that encourage similar words to have similar vectors. The reason they are so useful is that they give the model a head start: it can immediately start associating overlapping signals from similar words in different sentences.

We're going to work with GloVe embeddings. You can obtain all of them for free from the Stanford website. The following code assumes an igor that has various vital parameters:

    #### embedding.conf
    embedding_size: 300
    target_glove: /path/to/glove/glove.840B.300d.txt
    vocab_file: /path/to/vocab/words.vocab
    save_dir: /path/to/savedir/
    data_file: data.pkl

It also assumes that the 300-dimensional vectors trained on 840 billion tokens of Common Crawl data are used. There are smaller ones if they are more appropriate to your task. We will only be using a subset of the vectors.

You can then use a function like the following to compute an embedding matrix. In the next blog post I will cover how to use it. Note that tqdm is used here, but it doesn't have to be; it's a very handy progress bar. Also note that I use the Keras implementation of the Glorot Uniform initializer for words that aren't in the embedding data.
    ### utils.py

    from os import makedirs, path

    import numpy as np
    from tqdm import tqdm
    from keras.initializations import glorot_uniform


    def embeddings_from_vocab(igor, vocab):
        print("using vocab and glove file to generate embedding matrix")
        remaining_vocab = set(vocab.keys())
        embeddings = np.zeros((len(vocab), igor.embedding_size))
        print("{} words to convert".format(len(remaining_vocab)))

        if igor.save_dir[-1] != "/":
            igor.save_dir += "/"
        if not path.exists(igor.save_dir):
            makedirs(igor.save_dir)

        fileiter = open(igor.target_glove).readlines()
        for line in tqdm(fileiter):
            line = line.replace("\n", "").split(" ")
            try:
                word, nums = line[0], [float(x.strip()) for x in line[1:]]
                if word in remaining_vocab:
                    embeddings[vocab[word]] = np.array(nums)
                    remaining_vocab.remove(word)
            except Exception as e:
                # report the offending line; `word` may not be bound yet here
                print("{} broke. exception: {}. line: {}.".format(line[0], e, line))

        print("{} words were not in glove; saving to oov.txt".format(len(remaining_vocab)))
        with open(path.join(igor.save_dir, "oov.txt"), "w") as fp:
            fp.write("\n".join(remaining_vocab))

        # words missing from GloVe get a randomly initialized vector
        for word in tqdm(remaining_vocab):
            embeddings[vocab[word]] = np.asarray(glorot_uniform((igor.embedding_size,)).eval())

        with open(path.join(igor.save_dir, "embedding.npy"), "wb") as fp:
            np.save(fp, embeddings)

Serving Data

Igor's main task of serving data is broken down into two key functions: serve_single and serve_batch. The class below is more fleshed out than the last Igor class. This time, it includes these two functions as well as some others. There are several main things to notice in the implementation below:

  1. Each sentence is placed into a zero-matrix that is potentially larger than it. This is essential to what is known as masking (more on this below).
  2. The sentences are offset by one, and the target data is the next word.
  3. The data is being served in batches. This is to maximize the efficiency of GPU capabilities.
  4. The target variable, out_Y, is being formatted with a to_categorical function. This encodes an integer as a one-hot vector.
A one-hot vector is a vector with zeros at every spot except for one position, which holds a one. It is going to be used here with a cross-entropy loss, which basically means it will compute the dot product between out_Y and the log of the predicted probability of every output (a vector of the same size as out_Y), then negate it. In effect, this is the same thing as selecting a single element from the output probability vector: the one for the correct next word.

    ### igor.py

    from keras.utils.data_utils import get_file
    from keras.utils.np_utils import to_categorical
    import yaml
    import itertools
    import numpy as np

    try:
        import cPickle as pickle
    except ImportError:
        import pickle

    from utils import Vocabulary


    class Igor(object):
        def __init__(self, config):
            self.__dict__.update(config)

        @classmethod
        def from_file(cls, yaml_file):
            with open(yaml_file) as fp:
                return cls(yaml.safe_load(fp))

        @property
        def num_train_batches(self):
            return len(self.train_data) // self.batch_size

        @property
        def num_dev_batches(self):
            return len(self.dev_data) // self.batch_size

        @property
        def num_test_batches(self):
            return len(self.test_data) // self.batch_size

        @property
        def num_train_samples(self):
            return self.num_train_batches * self.batch_size

        @property
        def num_dev_samples(self):
            return self.num_dev_batches * self.batch_size

        @property
        def num_test_samples(self):
            return self.num_test_batches * self.batch_size

        def _serve_single(self, data):
            # visit the sentences in random order, without replacement
            for data_i in np.random.choice(len(data), len(data), replace=False):
                in_X = np.zeros(self.sequence_length)
                out_Y = np.zeros(self.sequence_length, dtype=np.int32)
                # inputs are the tokens; targets are the tokens shifted by one
                bigram_data = zip(data[data_i][0:-1], data[data_i][1:])
                for datum_j, (datum_in, datum_out) in enumerate(bigram_data):
                    in_X[datum_j] = datum_in
                    out_Y[datum_j] = datum_out
                yield in_X, out_Y

        def _serve_batch(self, data):
            dataiter = self._serve_single(data)
            V = self.vocab_size
            S = self.sequence_length
            B = self.batch_size
            while dataiter:
                in_X = np.zeros((B, S), dtype=np.int32)
                out_Y = np.zeros((B, S, V), dtype=np.int32)
                next_batch = list(itertools.islice(dataiter, 0, self.batch_size))
                if len(next_batch) < self.batch_size:
                    # end the generator; raising StopIteration inside a
                    # generator is a RuntimeError since Python 3.7 (PEP 479)
                    return
                for d_i, (d_X, d_Y) in enumerate(next_batch):
                    in_X[d_i] = d_X
                    out_Y[d_i] = to_categorical(d_Y, V)
                yield in_X, out_Y

        def _data_gen(self, data, forever=True):
            ### extra boolean here so that it can go once through while loop
            working = True
            while working:
                for batch in self._serve_batch(data):
                    yield batch
                working = working and forever

        def dev_gen(self, forever=True):
            return self._data_gen(self.dev_data, forever)

        def train_gen(self, forever=True):
            return self._data_gen(self.train_data, forever)

        def test_gen(self):
            return self._data_gen(self.test_data, False)

        def prep(self):
            ## this assumes converted integer data has been placed into a pickle
            with open(self.data_file, 'rb') as fp:
                self.train_data, self.dev_data, self.test_data = pickle.load(fp)

            if self.embeddings_file:
                self.saved_embeddings = np.load(self.embeddings_file)
            else:
                self.saved_embeddings = None

            self.vocab = Vocabulary.load(self.vocab_file)
            self.vocab_size = len(self.vocab)
            # longest sentence across all three splits
            self.sequence_length = max(map(len, self.train_data + self.dev_data + self.test_data))

Different Sized Sentences and Masking

There is one last piece of set-up information. In order to handle different sized sentences, you need to use a mask. What exactly is a mask, though? Well, since we are loading our data into a matrix in which every row has the same length, we have to pad the sentences that are shorter than that length. For this task, we will use a specific numeric value set at the positions where there is no data. This is recognized by Keras internally as corresponding to positions where it should mask. More specifically, since we are using the Embedding layer, it will check where the input data equals this masked value. It will then push a binary matrix forward through your constructed model so that it gets used in the correct spots.

Note: There are a few types of layers which Keras can't push the mask through (without some clever finagling), but for this model it will.
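The padding, the one-step offset, and the one-hot targets described above can be illustrated without Keras or NumPy. The following is only a sketch in plain Python: the helper names pad_and_shift and one_hot, the token ids, and the probability vector are all made up for the example, and zero is assumed to be the padding value, as in the post.

```python
import math


def pad_and_shift(sentence, sequence_length, pad_value=0):
    """Return (inputs, targets); targets are the inputs shifted by one token."""
    in_x = [pad_value] * sequence_length
    out_y = [pad_value] * sequence_length
    for j, (tok_in, tok_out) in enumerate(zip(sentence[:-1], sentence[1:])):
        in_x[j] = tok_in
        out_y[j] = tok_out
    return in_x, out_y


def one_hot(index, size):
    """Encode an integer as a vector of zeros with a single one."""
    vec = [0] * size
    vec[index] = 1
    return vec


sentence = [4, 2, 3]  # hypothetical token ids; 0 is reserved for padding
in_x, out_y = pad_and_shift(sentence, sequence_length=5)

# positions holding the padding value are the ones a mask would hide
mask = [tok != 0 for tok in in_x]

# made-up model output over a 5-word vocabulary for the first position
probs = [0.1, 0.1, 0.6, 0.1, 0.1]
target = one_hot(out_y[0], 5)

# the dot product with a one-hot vector selects a single probability
selected = sum(p * t for p, t in zip(probs, target))
loss = -math.log(selected)

print(in_x, out_y, mask)
print(selected, loss)
```

Only the first two positions carry real data here; the remaining three are padding and would be masked.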
I will discuss how the mask gets used in the next post, but just know that the zero-indexed token in our Vocabulary and the zeros in the data matrix correspond to masked positions.

Conclusion

And that's it! The data is now ready to be loaded up and served.

An end-of-post note: most of the prep code should be placed into a single preprocessing script. It's sometimes easy just to add it to the bottom of the utils file.

    #### at the bottom of utils.py

    if __name__ == "__main__":
        print("getting data")
        raw_data = get_data()
        print("processing data")
        data, indices = process_raw_data(raw_data)
        print("formatting data")
        data, vocab = format_data(*data)
        print("making embeddings")
        from igor import Igor
        igor = Igor.from_file('embedding.conf')
        # pickle needs a file opened in binary mode on Python 3
        with open(igor.data_file, 'wb') as fp:
            pickle.dump(data, fp)
        vocab.save(path.join(igor.save_dir, igor.vocab_file))
        embeddings_from_vocab(igor, vocab)

And here are some of the important igor parameters so far:

    batch_size: 64
    embedding_size: 300
    rnn_size: 32
    learning_rate: 0.0001
    num_epochs: 100

    ### set during computation
    vocab_size: 0
    sequence_length: 0

    ### file stuff
    data_file: data.pkl
    vocab_file: words.vocab
    embeddings_file: embedding.npy   # /path/to/embedding.npy; or, if none, then ~
    checkpoint_filepath: cp_weights.h5

Be sure to read Part 3, where I outline a language model and discuss the modeling choices. I will outline the algorithms needed both to decode from the language model and to sample from it.

About the author

Brian McMahan is in his final year of graduate school at Rutgers University, completing a PhD in computer science and an MS in cognitive psychology. He holds a BS in cognitive science from Minnesota State University, Mankato. At Rutgers, Brian investigates how natural language and computer vision can be brought closer together with the aim of developing interactive machines that can coordinate in the real world.
His research uses machine learning models to derive flexible semantic representations of open-ended perceptual language.

Packt
04 Oct 2016
6 min read

Bootstrap and Angular: Saying Hello!

In this article, Sergey Akopkokhyants, author of the book Learning Web Development with Bootstrap and Angular (Second Edition), establishes a development environment for the simplest application possible.

Development environment setup

It's time to set up your development environment. This process is one of the most overlooked and often frustrating parts of learning to program, because developers don't want to think about it. Developers must know the nuances of installing and configuring many different programs before they start real development. Everyone's computer is different; as a result, the same setup may not work on your computer. We will expose and eliminate all of these problems by defining the various pieces of the environment you need to set up.

Defining shell

The shell is a required part of your software development environment. We will use the shell to install software and to run the commands that build the project and start the web server that brings your web project to life.

If your computer runs Linux, you will use the shell called Terminal. There are many Linux-based distributions out there that use diverse desktop environments, but most of them use the same keyboard shortcut to open the Terminal: Ctrl + Alt + T.

If you have a Mac running OS X, you will use the Terminal shell as well. Use the keyboard shortcut Command + Space to open Spotlight, then type Terminal to search for it and run it.

If your computer runs Windows, you can use the standard command prompt, but we can do better. In a minute I will show you how to install Git on your computer, which gives you Git Bash for free. You can then open a terminal with the Git Bash shell on Windows.

I will use the bash shell for all exercises in this book whenever I need to work in the Terminal.
Installing Node.js

Node.js is the technology we will use as a cross-platform runtime environment for running server-side web applications. It combines a native, platform-independent runtime based on Google's V8 JavaScript engine with a huge number of modules written in JavaScript. Node.js ships with connectors and libraries that help you use HTTP, TLS, compression, file system access, raw TCP and UDP, and more. As a developer, you can write your own modules in JavaScript and run them inside the Node.js engine. The Node.js runtime makes it easy to build networked, event-driven application servers.

The terms package and library are synonymous in JavaScript, so we will use them interchangeably.

Node.js uses the JavaScript Object Notation (JSON) format widely for data exchange between the server and client sides, because it is easily expressed in a few simple parse diagrams, without the complexities of XML, SOAP, and other data exchange formats.

You can also use Node.js to develop service-oriented applications that do something other than serve web pages. One of the most popular of these applications is the Node Package Manager (NPM), which we will use to manage library dependencies; it also serves deployment systems and underlies many platform-as-a-service (PaaS) providers for Node.js.

If you do not have Node.js installed on your computer, download the pre-built installer from https://nodejs.org/en/download. You can start using Node.js immediately after installation. Open the Terminal and type:

    node --version

Node.js should respond with the version number of the installed runtime:

    v4.4.3

Setting up NPM

NPM is a package manager for JavaScript. You can use it to find, share, and reuse packages of code from many developers across the world. The number of packages grows dramatically every day and is now more than 250K. NPM is itself a Node.js application and needs Node.js to run. It is included in the Node.js setup bundle and is available right after installation.
Open the Terminal and type:

    npm --version

NPM should answer with its version number:

    2.15.1

The following command gives us information about the Node.js and NPM installation:

    npm config list

There are two ways to install NPM packages: locally or globally. When you would like to use a package as a tool, it is better to install it globally:

    npm install --global <package_name>

If you need to find the folder with globally installed packages, you can use the following command:

    npm config get prefix

Installing packages globally is useful, but best avoided when not needed. Most of the time you will install packages locally:

    npm install <package_name>

You will find locally installed packages in the node_modules folder of your project.

Installing Git

You are missing a lot if you are not familiar with Git. Git is a distributed version control system, and each Git working directory is a full-fledged repository. It keeps the complete history of changes and has full version tracking capabilities. Each repository is entirely independent of network access or a central server.

You can install Git on your computer via one of the pre-built installers available on the official website, https://git-scm.com/downloads. After installation, open the Terminal and type:

    git --version

Git should respond with its version number:

    git version 2.8.1.windows.1

As I said, developers who use Windows now have Git Bash for free on their system.

Code editor

You can imagine how many programs for code editing exist, but today we will talk only about Visual Studio Code from Microsoft, which is free, open source, and runs everywhere. You can use any program you prefer for development, but I use only Visual Studio Code in our future exercises, so please install it from http://code.visualstudio.com/Download.

Summary

In this article, we learned about the shell concept, how to install Node.js and Git, and how to set up Node packages.

Packt
04 Oct 2016
16 min read

Thinking Probabilistically

In this article by Osvaldo Martin, the author of the book Bayesian Analysis with Python, we will learn that Bayesian statistics has been developing for more than 250 years now. During this time, it has enjoyed as much recognition and appreciation as disdain and contempt. In the last few decades, it has gained an increasing amount of attention from people in the field of statistics and almost all the other sciences, engineering, and even outside the walls of the academic world. This revival has been possible due to theoretical and computational developments; modern Bayesian statistics is mostly computational statistics. The necessity for flexible and transparent models and a more intuitive interpretation of the results of a statistical analysis has only contributed to the trend.

Here, we will adopt a pragmatic approach to Bayesian statistics, and we will not care too much about other statistical paradigms and their relationship with Bayesian statistics. The aim of this book is to learn how to do Bayesian statistics with Python; philosophical discussions are interesting, but they have already been covered elsewhere in a much richer way than we could manage in these pages.

We will use a computational and modeling approach, and we will learn to think in terms of probabilistic models and apply Bayes' theorem to derive the logical consequences of our models and data. Models will be coded using Python and PyMC3, a great library for Bayesian statistics that hides most of the mathematical details of Bayesian analysis from the user. Bayesian statistics is theoretically grounded in probability theory, and hence it is no wonder that many books about Bayesian statistics are full of mathematical formulas requiring a certain level of mathematical sophistication. Nevertheless, programming allows us to learn and do Bayesian statistics with only modest mathematical knowledge.
This is not to say that learning the mathematical foundations of statistics is useless; don't get me wrong, that could certainly help you build better models and gain an understanding of problems, models, and results.

In this article, we will cover the following topics:

  • Statistical modeling
  • Probabilities and uncertainty

Statistical modeling

Statistics is about collecting, organizing, analyzing, and interpreting data, and hence statistical knowledge is essential for data analysis. Another useful skill when analyzing data is knowing how to write code in a programming language such as Python. Manipulating data is usually necessary given that we live in a messy world with even messier data, and coding helps to get things done. Even if your data is clean and tidy, programming will still be very useful since, as we will see, modern Bayesian statistics is mostly computational statistics.

Most introductory statistical courses, at least for non-statisticians, are taught as a collection of recipes that more or less go like this: go to the statistical pantry, pick one can and open it, add data to taste, and stir until you obtain a consistent p-value, preferably under 0.05 (if you don't know what a p-value is, don't worry; we will not use them in this book). The main goal in this type of course is to teach you how to pick the proper can. We will take a different approach: we will also learn some recipes, but this will be home-made food rather than canned food; we will learn how to mix fresh ingredients that will suit different gastronomic occasions. But before we can cook, we must learn some statistical vocabulary and also some concepts.

Exploratory data analysis

Data is an essential ingredient of statistics. Data comes from several sources, such as experiments, computer simulations, surveys, field observations, and so on.
If we are the ones that will be generating or gathering the data, it is always a good idea to first think carefully about the questions we want to answer and which methods we will use, and only then proceed to get the data. In fact, there is a whole branch of statistics dealing with data collection known as experimental design. In the era of data deluge, we can sometimes forget that getting data is not always cheap. For example, while it is true that the Large Hadron Collider (LHC) produces hundreds of terabytes a day, its construction took years of manual and intellectual effort. In this book we will assume that we already have collected the data and also that the data is clean and tidy, something rarely true in the real world. We will make these assumptions in order to focus on the subject of this book. If you want to learn how to use Python for cleaning and manipulating data and also a primer on statistics and machine learning, you should probably read Python Data Science Handbook by Jake VanderPlas.

OK, so let's assume we have our dataset; usually, a good idea is to explore and visualize it in order to get some idea of what we have in our hands. This can be achieved through what is known as Exploratory Data Analysis (EDA), which basically consists of the following:

  • Descriptive statistics
  • Data visualization

The first one, descriptive statistics, is about how to use some measures (or statistics) to summarize or characterize the data in a quantitative manner. You probably already know that you can describe data using the mean, mode, standard deviation, interquartile ranges, and so forth. The second one, data visualization, is about visually inspecting the data; you probably are familiar with representations such as histograms, scatter plots, and others.
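As a small illustration of the descriptive statistics mentioned above, the following sketch uses Python's standard statistics module. The data values are invented for the example, and statistics.quantiles assumes Python 3.8 or later; the interquartile range is taken here as the difference between the third and first quartiles.

```python
import statistics

# made-up sample, purely for illustration
data = [2, 3, 3, 5, 7, 8, 9, 11]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
stdev = statistics.stdev(data)  # sample standard deviation

# quantiles with n=4 returns the three quartile cut points (Python 3.8+)
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("standard deviation:", round(stdev, 3))
print("interquartile range:", iqr)
```

A handful of numbers like these is often the first thing to look at before plotting the data or building any model.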
While EDA was originally thought of as something you apply to data before doing any complex analysis or even as an alternative to complex model-based analysis, throughout the book we will learn that EDA is also applicable to understanding, interpreting, checking, summarizing, and communicating the results of Bayesian analysis.

Inferential statistics

Sometimes, plotting our data and computing simple numbers, such as the average of our data, is all we need. Other times, we want to go beyond our data to understand the underlying mechanism that could have generated the data, or maybe we want to make predictions for future data, or we need to choose among several competing explanations for the same data. That's the job of inferential statistics. To do inferential statistics we will rely on probabilistic models. There are many types of models, and most of science (and, I would add, all of our understanding of the real world) works through models. The brain is just a machine that models reality (whatever reality might be) http://www.tedxriodelaplata.org/videos/m%C3%A1quina-construye-realidad.

What are models? Models are simplified descriptions of a given system (or process). Those descriptions are purposely designed to capture only the most relevant aspects of the system, and hence, most models do not pretend to be able to explain everything; on the contrary, if we have a simple and a complex model and both models explain the data well, we will generally prefer the simpler one.

Model building, no matter which type of model you are building, is an iterative process following more or less the same basic rules. We can summarize the Bayesian modeling process using three steps:

Given some data and some assumptions on how this data could have been generated, we will build models. Most of the time, models will be crude approximations, but most of the time this is all we need.
Then we will use Bayes' theorem to add data to our models and derive the logical consequences of mixing the data and our assumptions. We say we are conditioning the model on our data.

Lastly, we will check that the model makes sense according to different criteria, including our data and our expertise on the subject we are studying.

In general, we will find ourselves performing these three steps in a non-linear iterative fashion. Sometimes we will retrace our steps at any given point: maybe we made a silly programming mistake, maybe we found a way to change the model and improve it, maybe we need to add more data. Bayesian models are also known as probabilistic models because they are built using probabilities. Why probabilities? Because probabilities are the correct mathematical tool for dealing with uncertainty in our data and models, so let's take a walk through the garden of forking paths.

Probabilities and uncertainty

While probability theory is a mature and well-established branch of mathematics, there is more than one interpretation of what probabilities are. To a Bayesian, a probability is a measure that quantifies the uncertainty level of a statement. If we know nothing about coins and we do not have any data about coin tosses, it is reasonable to think that the probability of a coin landing heads could take any value between 0 and 1; that is, in the absence of information, all values are equally likely and our uncertainty is at its maximum. If we know instead that coins tend to be balanced, then we may say that the probability of a coin landing heads is exactly 0.5, or around 0.5 if we admit that the balance is not perfect. If we collect data, we can update these prior assumptions and hopefully reduce the uncertainty about the bias of the coin.
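The coin example can be sketched numerically. The snippet below is only an illustration under stated assumptions: it places a flat prior on a grid of candidate values for the probability of heads, assumes independent tosses (so the likelihood of a sequence is proportional to theta**heads * (1 - theta)**tails), and normalizes the product of prior and likelihood. The function names `grid_posterior` and `posterior_mean` are hypothetical helpers, not code from the book.

```python
def grid_posterior(heads, tails, grid_size=101):
    """Approximate the posterior over a coin's bias on an evenly spaced grid.

    Assumes a flat prior and independent tosses; both are illustrative choices.
    """
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size  # flat prior: every bias is equally plausible
    likelihood = [t ** heads * (1 - t) ** tails for t in thetas]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    # Dividing by the total makes the grid values sum to 1
    posterior = [u / total for u in unnormalized]
    return thetas, posterior


def posterior_mean(thetas, posterior):
    """Average bias weighted by its posterior probability."""
    return sum(t * p for t, p in zip(thetas, posterior))


# Before seeing data the mean sits at 0.5; after 6 heads and 2 tails it moves up.
thetas, no_data = grid_posterior(heads=0, tails=0)
thetas, with_data = grid_posterior(heads=6, tails=2)
print(posterior_mean(thetas, no_data), posterior_mean(thetas, with_data))
```

With no data the posterior is the prior itself; adding tosses concentrates the probability mass around the biases that explain the observations best.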
Under this definition of probability, it is totally valid and natural to ask about the probability of life on Mars, the probability of the mass of the electron being 9.1 × 10⁻³¹ kg, or the probability of the 9th of July of 1816 being a sunny day. Notice, for example, that life on Mars either exists or it does not; it is a binary outcome, but what we are really asking is how likely it is to find life on Mars given our data and what we know about biology and the physical conditions on that planet. The statement is about our state of knowledge and not, directly, about a property of nature. We are using probabilities because we cannot be sure about the events, not because the events are necessarily random. Since this definition of probability is about our epistemic state of mind, sometimes it is referred to as the subjective definition of probability, explaining the slogan of subjective statistics often attached to the Bayesian paradigm. Nevertheless, this definition does not mean all statements should be treated as equally valid and so anything goes; this definition is about acknowledging that our understanding of the world is imperfect and conditioned by the data and models we have made.

There is no such thing as a model-free or theory-free understanding of the world; even if it were possible to free ourselves from our social preconditioning, we would end up with a biological limitation: our brain, subject to the evolutionary process, has been wired with models of the world. We are doomed to think like humans and we will never think like bats or anything else! Moreover, the universe is an uncertain place and all we can do is make probabilistic statements about it. Notice that it does not matter if the underlying reality of the world is deterministic or stochastic; we are using probability as a tool to quantify uncertainty.

Logic is about thinking without making mistakes. In Aristotelian or classical logic, we can only have statements that are true or false.
In the Bayesian definition of probability, certainty is just a special case: a true statement has a probability of 1, a false one has a probability of 0. We would assign a probability of 1 to life on Mars only after having conclusive data indicating something is growing and reproducing and doing other activities we associate with living organisms. Notice, however, that assigning a probability of 0 is harder because we can always think that there is some Martian spot that is unexplored, or that we have made mistakes with some experiment, or several other reasons that could lead us to falsely believe life is absent on Mars when it is not. Interestingly enough, Cox mathematically proved that if we want to extend logic to contemplate uncertainty we must use probabilities and probability theory, from which Bayes' theorem is just a logical consequence, as we will see soon. Hence, another way of thinking about Bayesian statistics is as an extension of logic when dealing with uncertainty, something that clearly has nothing to do with subjective reasoning in the pejorative sense.

Now that we know the Bayesian interpretation of probability, let's see some of the mathematical properties of probabilities. For a more detailed study of probability theory, you can read Introduction to probability by Joseph K Blitzstein & Jessica Hwang. Probabilities are numbers in the interval [0, 1], that is, numbers between 0 and 1, including both extremes. Probabilities follow some rules; one of these rules is the product rule:

$$p(A, B) = p(A|B)\,p(B)$$

We read this as follows: the probability of A and B is equal to the probability of A given B, multiplied by the probability of B. The expression p(A|B) is used to indicate a conditional probability; the name refers to the fact that the probability of A is conditioned by knowing B. For example, the probability that a pavement is wet is different from the probability that the pavement is wet if we know (or given that) it is raining.
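The product rule can be checked by brute-force counting. The following is a small sketch using two fair dice, an example chosen here purely for illustration (the events A and B and the helper `prob` are assumptions, not part of the original text). Exact fractions are used so that the equality holds without rounding error.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event, computed as favorable cases over all cases."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


def is_a(outcome):
    """A: the two dice add up to 8."""
    return outcome[0] + outcome[1] == 8


def is_b(outcome):
    """B: the first die shows a 6."""
    return outcome[0] == 6


p_a = prob(is_a)
p_b = prob(is_b)
p_a_and_b = prob(lambda outcome: is_a(outcome) and is_b(outcome))

# Conditional probability obtained by rearranging the product rule
p_a_given_b = p_a_and_b / p_b

print("p(A) =", p_a)
print("p(B) =", p_b)
print("p(A, B) =", p_a_and_b)
print("p(A|B) =", p_a_given_b)
```

In this toy case, knowing that the first die shows a 6 changes the probability of the sum being 8, which is exactly what the notation p(A|B) is meant to express.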
In fact, a conditional probability can be larger than, smaller than, or equal to the unconditioned probability. If knowing B does not provide us with information about A, then p(A|B)=p(A). That is, A and B are independent of each other. On the contrary, if knowing B gives us useful information about A, then p(A|B) differs from p(A): it is larger when B makes A more plausible and smaller when B makes A less plausible. Conditional probabilities are a key concept in statistics, and understanding them is crucial to understanding Bayes' theorem, as we will see soon. Let's try to understand them from a different perspective. If we reorder the equation for the product rule, we get the following:

$$p(A|B) = \frac{p(A, B)}{p(B)}$$

Hence, p(A|B) is the probability that both A and B happen, relative to the probability of B happening. Why do we divide by p(B)? Knowing B is equivalent to saying that we have restricted the space of possible events to B and thus, to find the conditional probability, we take the favorable cases and divide them by the total number of events. It is important to realize that all probabilities are indeed conditional; there is no such thing as an absolute probability floating in vacuum space. There is always some model, assumptions, or conditions, even if we don't notice or know them. The probability of rain is not the same if we are talking about Earth, Mars, or some other place in the Universe, the same way the probability of a coin landing heads or tails depends on our assumptions of the coin being biased in one way or another. Now that we are more familiar with the concept of probability, let's jump to the next topic, probability distributions.

Probability distributions

A probability distribution is a mathematical object that describes how likely different events are. In general, these events are restricted somehow to a set of possible events. A common and useful conceptualization in statistics is to think that data was generated from some probability distribution with unobserved parameters.
Since the parameters are unobserved and we only have data, we will use Bayes' theorem to invert the relationship, that is, to go from the data to the parameters. Probability distributions are the building blocks of Bayesian models; by combining them in proper ways we can get useful complex models. We will meet several probability distributions throughout the book; every time we discover one we will take a moment to try to understand it. Probably the most famous of all of them is the Gaussian or normal distribution. A variable x follows a Gaussian distribution if its values are dictated by the following formula:

$$pdf(x \mid \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

In the formula, $\mu$ and $\sigma$ are the parameters of the distribution. The first one can take any real value, that is, $\mu \in \mathbb{R}$, and dictates the mean of the distribution (and also the median and mode, which are all equal). The second is the standard deviation, which can only be positive and dictates the spread of the distribution. Since there are an infinite number of possible combinations of $\mu$ and $\sigma$ values, there is an infinite number of instances of the Gaussian distribution and all of them belong to the same Gaussian family. Mathematical formulas are concise and unambiguous and some people say even beautiful, but we must admit that meeting them can be intimidating; a good way to break the ice is to use Python to explore them.
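As a first step, the Gaussian density can be transcribed almost literally into plain Python. This is a minimal sketch using only the standard library; the helper names `gaussian_pdf` and `approximate_area` are made up for illustration. The second function is a simple midpoint sum used to check that the curve integrates to roughly 1 over a wide interval.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


def approximate_area(mu, sigma, lower, upper, steps=10000):
    """Midpoint-rule approximation of the area under the density."""
    width = (upper - lower) / steps
    heights = (gaussian_pdf(lower + (k + 0.5) * width, mu, sigma)
               for k in range(steps))
    return sum(heights) * width


# The peak of the standard Gaussian and the total area of a shifted, narrow one
print(gaussian_pdf(0.0, mu=0.0, sigma=1.0))
print(approximate_area(mu=1.0, sigma=0.5, lower=-7.0, upper=7.0))
```

Changing `mu` slides the curve left or right, while changing `sigma` makes it wider and lower or narrower and taller; the area stays at 1 either way.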
Let's see what the Gaussian distribution family looks like:

    import matplotlib.pyplot as plt
    import numpy as np
    from scipy import stats
    import seaborn as sns

    mu_params = [-1, 0, 1]
    sd_params = [0.5, 1, 1.5]
    x = np.linspace(-7, 7, 100)
    # one panel per (mu, sd) combination, all sharing the same axes
    f, ax = plt.subplots(len(mu_params), len(sd_params), sharex=True, sharey=True)
    for i in range(3):
        for j in range(3):
            mu = mu_params[i]
            sd = sd_params[j]
            y = stats.norm(mu, sd).pdf(x)
            ax[i,j].plot(x, y)
            ax[i,j].set_ylim(0, 1)
            # invisible point whose only purpose is to carry the legend text
            ax[i,j].plot(0, 0, label="$\\mu$ = {:3.2f}\n$\\sigma$ = {:3.2f}".format(mu, sd), alpha=0)
            ax[i,j].legend()
    ax[2,1].set_xlabel('$x$')
    ax[1,0].set_ylabel('$pdf(x)$')

The output of the preceding code is as follows:

[Figure: a 3 × 3 grid of Gaussian curves, one panel per combination of mu and sd, with x on the horizontal axis and pdf(x) on the vertical axis]

A variable, such as x, that comes from a probability distribution is called a random variable. It is not that the variable can take any possible value. On the contrary, the values are strictly dictated by the probability distribution; the randomness arises from the fact that we cannot predict which value the variable will take, but only the probability of observing those values. A common notation used to say that a variable is distributed as a Gaussian or normal distribution with parameters $\mu$ and $\sigma$ is as follows:

$$x \sim N(\mu, \sigma)$$

The symbol ~ is read as is distributed as.

There are two types of random variable, continuous and discrete. Continuous variables can take any value from some interval (we can use Python floats to represent them), and discrete variables can take only certain values (we can use Python integers to represent them).

Many models assume that successive values of a random variable are all sampled from the same distribution and that those values are independent of each other. In such a case, we will say that the variables are independently and identically distributed, or iid variables for short.
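To make the idea of iid variables more concrete, here is a small sketch that draws repeated values from a single Gaussian with the standard library. It is only an illustration: the function name `draw_iid_gaussian` and the chosen parameters are assumptions, and a fixed seed is used so that the run is repeatable. Every draw uses the same mu and sigma (identically distributed) and does not look at earlier draws (independent).

```python
import random
import statistics


def draw_iid_gaussian(n, mu, sigma, seed=0):
    """Draw n values, each from the same Gaussian and independently of the rest."""
    rng = random.Random(seed)  # private generator so the result is repeatable
    return [rng.gauss(mu, sigma) for _ in range(n)]


sample = draw_iid_gaussian(10000, mu=1.0, sigma=0.5)

# With many iid draws, the sample statistics approach the parameters used
print("sample mean:", statistics.mean(sample))
print("sample standard deviation:", statistics.stdev(sample))
```

We cannot predict any individual value in `sample`, yet the collection as a whole behaves as the distribution dictates.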
Using mathematical notation, we can say that two variables are independent if for every value of x and y:

$$p(x, y) = p(x)\,p(y)$$

A common example of non-iid variables is a time series, where a temporal dependency in the random variable is a key feature that should be taken into account.

Summary

In this article we took a practical approach to Bayesian statistics and saw how to explore its building blocks with Python. We learned to think of problems in terms of probability and uncertainty, which is the basis for applying Bayes' theorem to derive results.

Resources for Article:

Further resources on this subject:
Python Data Science Up and Running [article]
Mining Twitter with Python – Influence and Engagement [article]
Exception Handling in MySQL for Python [article]
Packt
04 Oct 2016
13 min read

Supervised Machine Learning

In this article by Anshul Joshi, the author of the book Julia for Data Science, we will learn that data science involves understanding data, gathering data, munging data, taking the meaning out of that data, and then machine learning if needed. Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. (For more resources related to this topic, see here.) The key features offered by Julia are: A general purpose high-level dynamic programming language designed to be effective for numerical and scientific computing A Low-Level Virtual Machine (LLVM) based Just-in-Time (JIT) compiler that enables Julia to approach the performance of statically-compiled languages like C/C++ What is machine learning? Generally, when we talk about machine learning, we get into the idea of us fighting wars with intelligent machines that we created but went out of control. These machines are able to outsmart the human race and become a threat to human existence. These theories are nothing but created for our entertainment. We are still very far away from such machines. So, the question is: what is machine learning? Tom M. Mitchell gave a formal definition- "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E." It says that machine learning is teaching computers to generate algorithms using data without programming them explicitly. It transforms data into actionable knowledge. Machine learning has close association with statistics, probability, and mathematical optimization. As technology grew, there is one thing that grew with it exponentially—data. We have huge amounts of unstructured and structured data growing at a very great pace. 
Lots of data is generated by space observatories, meteorologists, biologists, fitness sensors, surveys, and so on. It is not possible to manually go through this much data and find patterns or gain insights. This data is very important for scientists, domain experts, governments, health officials, and even businesses. To gain knowledge out of this data, we need self-learning algorithms that can help us in decision making. Machine learning evolved as a subfield of artificial intelligence, which eliminates the need to manually analyze large amounts of data. Instead, using machine learning, we make data-driven decisions by gaining knowledge from self-learning predictive models. Machine learning has become important in our daily lives. Some common use cases include search engines, games, spam filters, and image recognition. Self-driving cars also use machine learning.

Some basic terminologies used in machine learning:

Features: Distinctive characteristics of the data point or record
Training set: This is the dataset that we feed to train the algorithm that helps us to find relationships or build a model
Testing set: The algorithm generated using the training dataset is tested on the testing dataset to find the accuracy
Feature vector: An n-dimensional vector that contains the features defining an object
Sample: An item from the dataset or the record

Uses of machine learning

Machine learning in one way or another is used everywhere. Its applications are endless. Let's discuss some very common use cases:

E-mail spam filtering: Every major e-mail service provider uses machine learning to filter out spam messages from the Inbox to the Spam folder.
Predicting storms and natural disasters: Machine learning is used by meteorologists and geologists to predict natural disasters using weather data, which can help us to take preventive measures.
Targeted promotions/campaigns and advertising: On social sites, search engines, and maybe in mailboxes, we see advertisements that somehow suit our taste. This is made feasible using machine learning on the data from our past searches, our social profile or the e-mail contents. Self-driving cars: Technology giants are currently working on self driving cars. This is made possible using machine learning on the feed of the actual data from human drivers, image and sound processing, and various other factors. Machine learning is also used by businesses to predict the market. It can also be used to predict the outcomes of elections and the sentiment of voters towards a particular candidate. Machine learning is also being used to prevent crime. By understanding the pattern of the different criminals, we can predict a crime that can happen in future and can prevent it. One case that got a huge amount of attention was of a big retail chain in the United States using machine learning to identify pregnant women. The retailer thought of the strategy to give discounts on multiple maternity products, so that they would become loyal customers and will purchase items for babies which have a high profit margin. The retailer worked on the algorithm to predict the pregnancy using useful patterns in purchases of different products which are useful for pregnant women. Once a man approached the retailer and asked for the reason that his teenage daughter is receiving discount coupons for maternity items. The retail chain offered an apology but later the father himself apologized when he got to know that his daughter was indeed pregnant. This story may or may not be completely true, but retailers indeed analyze their customers' data routinely to find out patterns and for targeted promotions, campaigns, and inventory management. 
Machine learning and ethics Let's see where machine learning is used very frequently: Retailers: In the previous example, we mentioned how retail chains use data for machine learning to increase their revenue as well as to retain their customers. Spam filtering: E-mails are processed using various machine learning algorithms for spam filtering. Targeted advertisements: In our mailbox, social sites, or search engines, we see advertisements of our liking. These are only some of the actual use cases that are implemented in the world today. One thing that is common between them is the user data. In the first example, retailers are using the history of transactions done by the user for targeted promotions and campaigns and for inventory management, among other things. Retail giants do this by providing users a loyalty or sign-up card. In the second example, the e-mail service provider uses trained machine learning algorithms to detect and flag spam. It does by going through the contents of e-mail/attachments and classifying the sender of the e-mail. In the third example, again the e-mail provider, social network, or search engine will go through our cookies, our profile, or our mails to do the targeted advertising. In all of these examples, it is mentioned in the terms and conditions of the agreement when we sign up with the retailer, e-mail provider, or social network that the user's data will be used but privacy will not be violated. It is really important that before using data that is not publicly available, we take the required permissions. Also, our machine learning models shouldn't do discrimination on the basis of region, race, and sex or of any other kind. The data provided should not be used for purposes not mentioned in the agreement or illegal in the region or country of existence. Machine learning – the process Machine learning algorithms are trained in keeping with the idea of how the human brain works. They are somewhat similar. 
Let's discuss the whole process. The machine learning process can be described in three steps: Input Abstraction Generalization These three steps are the core of how the machine learning algorithm works. Although the algorithm may or may not be divided or represented in such a way, this explains the overall approach. The first step concentrates on what data should be there and what shouldn't. On the basis of that, it gathers, stores, and cleans the data as per the requirements. The second step involves that the data be translated to represent the bigger class of data. This is required as we cannot capture everything and our algorithm should not be applicable for only the data that we have. The third step focuses on the creation of the model or an action that will use this abstracted data, which will be applicable for the broader mass. So, what should be the flow of approaching a machine learning problem? In this particular figure, we see that the data goes through the abstraction process before it can be used to create the machine learning algorithm. This process itself is cumbersome. The process follows the training of the model, which is fitting the model into the dataset that we have. The computer does not pick up the model on its own, but it is dependent on the learning task. The learning task also includes generalizing the knowledge gained on the data that we don't have yet. Therefore, training the model is on the data that we currently have and the learning task includes generalization of the model for future data. It depends on our model how it deduces knowledge from the dataset that we currently have. We need to make such a model that can gather insights into something that wasn't known to us before and how it is useful and can be linked to the future data. 
Different types of machine learning Machine learning is divided mainly into three categories: Supervised learning Unsupervised learning Reinforcement learning In supervised learning, the model/machine is presented with inputs and the outputs corresponding to those inputs. The machine learns from these inputs and applies this learning in further unseen data to generate outputs. Unsupervised learning doesn't have the required outputs; therefore it is up to the machine to learn and find patterns that were previously unseen. In reinforcement learning, the machine continuously interacts with the environment and learns through this process. This includes a feedback loop. Understanding decision trees Decision tree is a very good example of divide and conquer. It is one of the most practical and widely used methods for inductive inference. It is a supervised learning method that can be used for both classification and regression. It is non-parametric and its aim is to learn by inferring simple decision rules from the data and create such a model that can predict the value of the target variable. Before taking a decision, we analyze the probability of the pros and cons by weighing the different options that we have. Let's say we want to purchase a phone and we have multiple choices in the price segment. Each of the phones has something really good, and maybe better than the other. To make a choice, we start by considering the most important feature that we want. And like this, we create a series of features that it has to pass to become the ultimate choice. In this section, we will learn about: Decision trees Entropy measures Random forests We will also learn about famous decision tree learning algorithms such as ID3 and C5.0. Decision tree learning algorithms There are various decision tree learning algorithms that are actually variations of the core algorithm. The core algorithm is actually a top-down, greedy search through all possible trees. 
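The greedy step of such a search can be sketched with entropy, one of the measures listed above: at each node, the algorithm picks the feature whose split reduces the entropy of the target the most, a quantity known as information gain. The following Python sketch is an illustration only, not the implementation used by any particular library; the toy phone dataset and the function names are made up for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the rows on one categorical feature."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the groups created by the split
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows, labels):
    """Greedy choice: the feature with the largest information gain."""
    return max(rows[0].keys(), key=lambda f: information_gain(rows, labels, f))


# Hypothetical phone-buying data: here the camera decides, the price does not
rows = [
    {"price": "low", "camera": "good"},
    {"price": "high", "camera": "good"},
    {"price": "low", "camera": "poor"},
    {"price": "high", "camera": "poor"},
]
labels = ["buy", "buy", "skip", "skip"]

for feature in ("price", "camera"):
    print(feature, information_gain(rows, labels, feature))
print("split on:", best_feature(rows, labels))
```

A full tree learner would apply the same choice recursively to each group created by the split, stopping when a group is pure or a pruning criterion is met.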
We are going to discuss two algorithms:

ID3
C4.5 and C5.0

The first algorithm, Iterative Dichotomiser 3 (ID3), was developed by Ross Quinlan in 1986. The algorithm proceeds by creating a multiway tree, where it uses greedy search to find each node and the features that can yield maximum information gain for the categorical targets. As trees can grow to the maximum size, which can result in over-fitting of data, pruning is used to make the generalized model. C4.5 came after ID3 and eliminated the restriction that all features must be categorical. It does this by dynamically defining a discrete attribute based on the numerical variables, which partitions the continuous attribute value into a discrete set of intervals. C4.5 creates sets of if-then rules from the trained trees of the ID3 algorithm. C5.0 is the latest version; it builds smaller rule sets and uses comparatively less memory.

An example

Let's apply what we've learned to create a decision tree using Julia. We will be using the example available for Python on scikit-learn.org and Scikitlearn.jl by Cedric St-Jean. We will first have to add the required packages:

    julia> Pkg.update()
    julia> Pkg.add("DecisionTree")
    julia> Pkg.add("ScikitLearn")
    julia> Pkg.add("PyPlot")

ScikitLearn provides a Julia interface to the well-known Python machine learning library:

    julia> using ScikitLearn
    julia> using DecisionTree
    julia> using PyPlot

After adding the required packages, we will create the dataset that we will be using in our example:

    julia> # Create a random dataset
    julia> srand(100)
    julia> X = sort(5 * rand(80))
    julia> XX = reshape(X, 80, 1)
    julia> y = sin(X)
    julia> y[1:5:end] += 3 * (0.5 - rand(16))

This will generate a 16-element Array{Float64,1}. Now we will create instances of two different models.
One model is where we will not limit the depth of the tree, and in the other model, we will prune the decision tree on the basis of purity. We will now fit the models to the dataset that we have. We will fit both the models. This is the first model. Here our decision tree has 25 leaf nodes and a depth of 8. This is the second model. Here we prune our decision tree. This has six leaf nodes and a depth of 4. Now we will use the models to predict on the test dataset:

    julia> # Predict
    julia> X_test = 0:0.01:5.0
    julia> y_1 = predict(regr_1, hcat(X_test))
    julia> y_2 = predict(regr_2, hcat(X_test))

This creates a 501-element Array{Float64,1}. To better understand the results, let's plot both the models on the dataset that we have:

    julia> # Plot the results
    julia> scatter(X, y, c="k", label="data")
    julia> plot(X_test, y_1, c="g", label="no pruning", linewidth=2)
    julia> plot(X_test, y_2, c="r", label="pruning_purity_threshold=0.05", linewidth=2)
    julia> xlabel("data")
    julia> ylabel("target")
    julia> title("Decision Tree Regression")
    julia> legend(prop=Dict("size"=>10))

Decision trees can tend to overfit data. It is required to prune the decision tree to make it more generalized. But if we do more pruning than required, then it may lead to an incorrect model. So, it is required that we find the most optimized pruning level. It is quite evident that the first decision tree overfits to our dataset, whereas the second decision tree model is comparatively more generalized.

Summary

In this article, we learned about machine learning and its uses. Providing computers the ability to learn and improve has far-reaching uses in this world. It is used in predicting disease outbreaks, predicting weather, games, robots, self-driving cars, personal assistants, and a lot more. There are three different types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. We also learned about decision trees.
Resources for Article: Further resources on this subject: Specialized Machine Learning Topics [article] Basics of Programming in Julia [article] More about Julia [article]
Xavier Bruhiere
04 Oct 2016
8 min read

Reactive Python - Real-time events processing

A recent trend in programming literature promotes functional programming as a sensible alternative to object-oriented programs for many use cases. This subject feeds many discussions and highlights how important program design is as our applications are becoming more and more complex. Although there might be some seductive intellectual challenge here (because yeah, we love to juggle with elegant abstractions), there is also real business value:

Building sustainable, maintainable programs
Decoupling architecture components for proper team work
Limiting bug exposure
Better product iteration

When developers spot an interesting approach to solve a recurrent issue in our industry, they formalize it as a design pattern. Today, we will discuss a powerful member of this family: the observer pattern. We won't dive into the strict rhetorical details (sorry, not sorry). Instead, we will delve into how reactive programming can level up the quality of our work.

The scene

That was a bold statement; let's illustrate that with a real-world scenario. Say we were tasked to build a monitoring system. We need some way to collect data, analyze it, and take action when something unexpected happens. Anomaly detection is an exciting yet challenging problem. We don't want our data scientists to be bothered by infrastructure failures. And in the same spirit, we need other engineers to focus only on how to react to specific disaster scenarios. The core of our approach consists of two components: a monitoring module firing and forgetting its discoveries on channels, and another processing brick intercepting those events with an appropriate response. The UNIX philosophy at its best: do one thing and do it well. We split the infrastructure by concerns and the workers by event types.
Assuming that our team defines well-documented interfaces, this is a promising design. The rest of the article will discuss the technical implementation but keep in mind that I/O documentation and proper processing of load estimation are also fundamental.

The strategy

Our local lab is composed of three elements:

The alert module that we will emulate with a simple cli tool, which publishes alert messages.
The actual processing unit subscribing to events it knows how to react to.
A message broker supporting the Publish / Subscribe (or PUBSUB) pattern.

For this purpose, Redis offers a popular, efficient, and rock solid solution. This is highly recommended, but the database isn't designed for this case. NATS, however, presents itself as follows:

NATS acts as a central nervous system for distributed systems such as mobile devices, IoT networks, enterprise microservices and cloud native infrastructure. Unlike traditional enterprise messaging systems, NATS provides an always on ‘dial-tone’.

Sounds promising! Client libraries are available for major languages, and Apcera, the company sponsoring the technology, has a solid reputation for building reliable distributed systems. Again, we won't delve into how processing actually happens, only the orchestration of these three moving parts.

The setup

Since NATS is a message broker, we need to run a server locally (version 0.8.0 as of today). Gnatsd is the official and scalable first choice. It is written in Go, so we get performance and a drop-in binary out of the box. For fans of microservices (as I am), an official Docker image is available for pulling. Also, for lazy ones (as I am), a demo server is already running at nats://demo.nats.io:4222. Services will use Python 3.5.1, but 2.7.10 should do the job with minimal changes. Our scenario is mostly about data analysis and system administration on the backend, and Python has a wide range of tools for both areas.
So let's install the requirements:

    $ pip --version
    pip 8.1.1
    $ pip install -e git+https://github.com/mcuadros/pynats@6851e84eb4b244d22ffae65e9fbf79bd9872a5b3#egg=pynats click==6.6  # for cli integration

That's all. We are now ready to write services.

Publishing events

Let's warm up by sending some alerts to the cloud. First, we need to connect to the NATS server:

    # -*- coding: utf-8 -*-
    # vim_fenc=utf-8
    #
    # filename: broker.py

    import pynats


    def nats_conn(conf):
        """Connect to nats server from environment variables.

        The point is to allow easy switching without having to change the code.
        You can read more on this approach stolen from 12 factors apps.
        """
        # the default value comes from docker-compose (https://docs.docker.com/compose/) services link behavior
        host = conf.get('__BROKER_HOST__', 'nats')
        port = conf.get('__BROKER_PORT__', 4222)
        opts = {
            'url': conf.get('url', 'nats://{host}:{port}'.format(host=host, port=port)),
            'verbose': conf.get('verbose', False)
        }
        print('connecting to broker ({opts})'.format(opts=opts))
        conn = pynats.Connection(**opts)
        conn.connect()
        return conn

This should be enough to start our client:

    # -*- coding: utf-8 -*-
    # vim_fenc=utf-8
    #
    # filename: observer.py

    import os

    import broker


    def send(channel, msg):
        # use environment variables for configuration
        nats = broker.nats_conn(os.environ)
        nats.publish(channel, msg)
        nats.close()

And right after that, a few lines of code to shape a cli tool:

    #! /usr/bin/env python
    # -*- coding: utf-8 -*-
    # vim_fenc=utf-8
    #
    # filename: __main__.py

    import click

    import observer  # the module defined above


    @click.command()
    @click.argument('command')
    @click.option('--on', default='some_event', help='messages topic name')
    def main(command, on):
        if command == 'send':
            click.echo('publishing message')
            observer.send(on, 'Terminator just dropped in our space-time')


    if __name__ == '__main__':
        main()

chmod +x ./__main__.py gives it execution permission so we can test how our first bytes are doing.
    $ # `click` package gives us a productive cli interface
    $ ./__main__.py --help
    Usage: __main__.py [OPTIONS] COMMAND

    Options:
      --on TEXT  messages topic name
      --help     Show this message and exit.

    $ __BROKER_HOST__="demo.nats.io" ./__main__.py send --on=click
    connecting to broker ({'verbose': False, 'url': 'nats://demo.nats.io:4222'})
    publishing message
    ...

This is indeed quite poor in feedback, but no exception means that we did connect to the server and publish a message.

Reacting to events

We're done with the heavy lifting! Now that interesting events are flying through the Internet, we can catch them and actually provide business value. Don't forget the point: let the team write reactive programs without worrying about how they will be triggered. I found the following snippet to be a readable syntax for such a goal:

    # filename: __main__.py

    import observer


    @observer.On('terminator_detected')
    def alert_sarah_connor(msg):
        print(msg.data)

As the capital letter of On suggests, this is a Python class, wrapping a NATS connection. It aims to call the decorated function whenever a new message goes through the given channel.
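The mechanics of a class used as a decorator can be sketched without any network connection. In the snippet below, `FakeConnection` and its `deliver` method are hypothetical stand-ins invented for this illustration, not the pynats API, and this `On` simply returns the decorated function unchanged rather than wrapping a real connection.

```python
class FakeConnection:
    """Hypothetical in-memory connection standing in for a broker client."""

    def __init__(self):
        self.handlers = {}

    def subscribe(self, channel, fn):
        self.handlers.setdefault(channel, []).append(fn)

    def deliver(self, channel, data):
        # Call every handler registered on this channel.
        for fn in self.handlers.get(channel, []):
            fn(data)


connection = FakeConnection()


class On:
    """Class used as a decorator: On('name') builds an instance, which is
    then called with the decorated function."""

    def __init__(self, event_name):
        self._event = event_name

    def __call__(self, fn):
        # Register the function as a handler, then hand it back unchanged.
        connection.subscribe(self._event, fn)
        return fn


received = []


@On('terminator_detected')
def alert_sarah_connor(data):
    received.append(data)


connection.deliver('terminator_detected', 'bad robot detected')
connection.deliver('something_else', 'ignored')
print(received)
```

The two-step call is the key point: `On('terminator_detected')` runs `__init__`, and the resulting instance is then invoked through `__call__` with the function being decorated.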
Here is a naive implementation, shamefully ignoring any reasonable error handling and safe connection termination (broker.nats_conn would be much more production-ready as a context manager, but hey, we do things that don't scale, move fast, and break things):

    # filename: observer.py

    class On(object):

        def __init__(self, event_name, **kwargs):
            self._count = kwargs.pop('count', None)
            self._event = event_name
            self._opts = kwargs or os.environ

        def __call__(self, fn):
            nats = broker.nats_conn(self._opts)
            subscription = nats.subscribe(self._event, fn)

            def inner():
                print('waiting for incoming messages')
                nats.wait(self._count)
                # we are done
                nats.unsubscribe(subscription)
                return nats.close()

            return inner

Instil some life into this file from the __main__.py:

    # filename: __main__.py

    @click.command()
    @click.argument('command')
    @click.option('--on', default='some_event', help='messages topic name')
    def main(command, on):
        if command == 'send':
            click.echo('publishing message')
            observer.send(on, 'bad robot detected')
        elif command == 'listen':
            try:
                alert_sarah_connor()
            except KeyboardInterrupt:
                click.echo('caught CTRL-C, cleaning after ourselves...')

Your linter might complain about the injection of the msg argument in alert_sarah_connor, but no offense, it should just work (tm):

    $ # In a first terminal, listen to messages
    $ __BROKER_HOST__="demo.nats.io" ./__main__.py listen
    connecting to broker ({'url': 'nats://demo.nats.io:4222', 'verbose': False})
    waiting for incoming messages

    $ # And fire up alerts in a second terminal
    $ __BROKER_HOST__="demo.nats.io" ./__main__.py send --on='terminator_detected'

The data appears in the first terminal, celebrate!

Conclusion

Reactive programming implemented with the Publish/Subscribe pattern brings a lot of benefits for event-oriented products: modular development, decoupled components, scalable distributed infrastructure, and the single-responsibility principle. One should think about how data flows into the system before diving into the technical details.
This kind of approach also gains traction from real-time data processing pipelines (Riemann, Spark, and Kafka). NATS performance, indeed, allows the development of ultra-low-latency architectures without too much deployment overhead. We covered in a few lines of Python the basics of a reactive programming design, with a lot of room for improvement: event filtering, built-in instrumentation, and infrastructure-wide error tracing. I hope you found in this article a building block to build upon!

About the author

Xavier Bruhiere is the lead developer at AppTurbo in Paris, where he develops innovative prototypes to support company growth. He is addicted to learning, hacking on intriguing hot techs (both soft and hard), and practicing high intensity sports.
Hiroyuki Vincent
04 Oct 2016
7 min read

Introduction to Neural Networks with Chainer – Part 2

In this second and third part of this series with Chainer, we are going to train an autoencoder. These two parts will mostly consist of code and explanations about the internal architecture of the framework. The autoencoder that we will implement will have one hidden layer with 2 nodes and 3 nodes in the input and output. The dimension of the output layer needs to equal the dimensions of the input layer since we are training an autoencoder. The goal is to train this model to compress the 3-dimensional data into 2 dimensions in the hidden layer and then be able to reconstruct the initial data from the hidden layer. Preparing the Data Let's create 1000 training samples, each one with 3 random floating-point values, the same dimensions as the input layer of the model. This data is stored in one single NumPy array of shape (1000, 3). A copy of this data is created so that it can be used as the target values during the training phase. This is not specific to Chainer but simple Python using NumPy. It is worth noting that this is all the data we need to prepare. Later during the training phase, we will convert this data into Chainer variables as described in Part 1 of this series. import numpy as np input_size = 3 train_size = 1000 x_train = np.random.rand(train_size, input_size).astype(np.float32) y_train = x_train.copy() Creating a Model Defining a model in Chainer is done via code in contrast to other frameworks such as Caffe where the models are defined in configuration files (.prototxt). It is therefore quite easy to debug. You might, however, need to get used to the assertion errors that Chainer throws at you when the model layers aren't compatible or variable dimensions aren't what the framework expects. You can wrap all of the network definitions such as the layers and the loss functions in a class that inherits chainer.Chain as follows. 
    from chainer import Chain
    from chainer import links as L
    from chainer import functions as F


    class Autoencoder(Chain):
        def __init__(self):
            # Layer / Connection definitions in the constructor
            super().__init__(
                l1=L.Linear(3, 2),
                l2=L.Linear(2, 3)
            )
            self.train = True

        def __call__(self, x, t=None):
            # Forward pass (t may be omitted when self.train is False)
            h = self.l1(x)
            y = self.l2(h)

            if self.train:
                self.loss = F.mean_squared_error(y, t)
                return self.loss
            else:
                return y

This is truly elegant and readable. This is not the only way to implement a model in Chainer, but it is a commonly seen pattern. For instance, notice that the __call__ method, which is invoked directly on an instance of this class (for example, model = Autoencoder(); model(x, t)), acts as the loss function when the model's train property is set to True. It takes both the input and the target; it performs a forward pass and then computes and returns the loss. The model can be used as a regular feed-forward network by setting the train property to False and skipping the target argument. Let's take a closer look at the connections and the loss function.

Links

chainer.links.Linear, which defines the layer connections in this case, is a subclass of chainer.link.Link, a basic building block for a network. The linear link included in this autoencoder is usually referred to as the fully connected layer. When no other arguments are passed to the constructor of a Link, a bias vector is created behind the scenes with an initial value of 0. You may pass nobias=True to skip bias nodes altogether or set the initial bias vector to any arbitrary values. Links other than the linear one include convolution layers, inception layers from GoogLeNet, and LSTM layers, to mention a few. The actual model itself inherits from chainer.link.Chain, which is a subclass of chainer.link.Link and is basically a container for multiple links.
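To make the forward pass concrete, here is a minimal pure-Python sketch of what the two linear links and the mean squared error compute for a single sample. It does not use Chainer at all: the helper functions, the weight values, and the input are made up for illustration, and the weight layout (one row per output node) is an assumption chosen for this sketch.

```python
def linear(x, weights, bias):
    """Fully connected layer for one sample:
    y[j] = sum_i x[i] * weights[j][i] + bias[j]."""
    return [
        sum(x_i * w_ji for x_i, w_ji in zip(x, row)) + b_j
        for row, b_j in zip(weights, bias)
    ]


def mean_squared_error(y, t):
    """Average of the squared differences between output and target."""
    return sum((y_i - t_i) ** 2 for y_i, t_i in zip(y, t)) / len(y)


# Made-up parameters for a 3 -> 2 -> 3 autoencoder (biases start at 0).
w1 = [[1.0, 0.0, 0.0],
      [0.0, 1.0, 0.0]]   # 2 rows (outputs) x 3 columns (inputs)
b1 = [0.0, 0.0]
w2 = [[1.0, 0.0],
      [0.0, 1.0],
      [0.0, 0.0]]        # 3 rows (outputs) x 2 columns (inputs)
b2 = [0.0, 0.0, 0.0]

x = [0.5, 0.25, 1.0]
h = linear(x, w1, b1)             # hidden layer, 2 values
y = linear(h, w2, b2)             # reconstruction, 3 values
loss = mean_squared_error(y, x)   # the target is the input itself

print(h, y, loss)
```

With these hand-picked weights the third input value is lost in the 2-node hidden layer, so the loss is non-zero; training adjusts the weights and biases to shrink that reconstruction error.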
Sets of weights and biases for each layer can be accessed directly using chainer.links.Linear.W or chainer.links.Linear.b, so we'd use model.l1.b to get the bias vector from the first layer. If a model is trained using another framework such as Caffe, those parameters could be loaded into memory and then copied over to any Chainer model by directly accessing the weights and bias values. It is therefore quite easy to convert a Caffe model to a Chainer model.

Functions

The function module, chainer.functions, contains various loss functions such as the mean squared error used in the example, and activation functions such as sigmoid, tanh, ReLU and Softmax. It also contains dropout, pooling functions, accuracy evaluation and basic arithmetic. It is a wide set of functions, but they all inherit the chainer.function.Function base class. This means that they all implement the forward pass and back propagation logic. If, for instance, you want to implement your own loss function in Chainer, you'd have to inherit chainer.function.Function too and implement those necessary methods. More functions are introduced in later sections.

Training a Model

The code below shows the full training loop. It runs for 1000 epochs, meaning that it goes through the complete training set 1000 times. The order in which the training samples are iterated over is randomly shuffled in each epoch. We also split up the training samples into 10 batches, so the weights are only updated 10 times in each epoch.

    # Assume that this code follows the previous data preparation and model
    # definition code
    from chainer import Variable, optimizers

    model = Autoencoder()

    learning_rate = 0.1

    # Introducing the optimizer. It will be explained in the next section
    optimizer = optimizers.SGD(lr=learning_rate)
    optimizer.setup(model)

    epochs = 1000
    batch_size = int(train_size / 10)

    for epoch in range(epochs):
        # Randomly change the order of the training samples in each epoch
        indexes = np.random.permutation(train_size)

        # Accumulate the loss over the epoch
        epoch_sum_loss = 0

        for i in range(0, train_size, batch_size):
            batch_indexes = indexes[i : i + batch_size]
            batch_x_data = np.asarray(x_train[batch_indexes])
            batch_y_data = np.asarray(y_train[batch_indexes])

            x = Variable(batch_x_data)
            t = Variable(batch_y_data)

            optimizer.update(model, x, t)
            epoch_sum_loss += model.loss * batch_size

        epoch_avg_loss = epoch_sum_loss / train_size
        print('Epoch: {} Loss: {}'.format(epoch, epoch_avg_loss.data))

Running the code above might output something like this. You can see that the average loss is decreasing.

    Epoch: 0 Loss: 0.48210659623146057
    Epoch: 1 Loss: 0.1797855794429779
    Epoch: 2 Loss: 0.1128358468413353
    Epoch: 3 Loss: 0.08458908647298813
    Epoch: 4 Loss: 0.07391669601202011
    Epoch: 5 Loss: 0.06650342792272568
    Epoch: 6 Loss: 0.05791966989636421
    Epoch: 7 Loss: 0.0553070604801178
    Epoch: 8 Loss: 0.05461772903800011
    Epoch: 9 Loss: 0.05078549310564995
    ...

Most of the code should be familiar, but you might wonder what the optimizer is. It is covered in Part 3, along with batches, more complex networks, running the code on the GPU, and saving and loading data.

Summary

Defining and training neural networks with Chainer is intuitive and requires little code. Its design makes it easy to maintain models and to experiment with various hyperparameters. To demonstrate this, in this second part of the series we implemented a neural network, trained it with randomly generated data, and introduced common patterns such as how to design the model and the loss function. Stay tuned for Part 3, where we cover the optimizer, batches, complex networks and running the code on the GPU.
About the Author Hiroyuki Vincent Yamazaki is a graduate student at KTH, Royal Institute of Technology in Sweden, currently conducting research in convolutional neural networks at Keio University in Tokyo, partially using Chainer as a part of a double-degree programme. GitHub LinkedIn 
Packt
03 Oct 2016
6 min read

Cloud and Async Communication

In this article by Matteo Bortolu and Engin Polat, the authors of the book Xamarin 4 By Example, we are going to create a new project called Fast Food with the help of a Service layer and a Presentation layer.

Example project – Xamarin fast food

First of all, we create a new Xamarin.Forms PCL project. Prepare the empty subfolders of Core to define the Business Logic of our project. To use the Base classes, we need to import the SQLite.Net PCL into our project from the NuGet Package manager. It is a good practice to update all the packages before you start. As soon as an update is available for a package, we will be notified in the Packages folder. To update the package, right-click on the Packages folder and select Update from the contextual menu.

Under the Business subfolder of Core, we can create the class MenuItem, which contains the properties of the items available to order. A MenuItem will have:

  • Name
  • Price
  • Required seconds

The class will be developed as:

    public class MenuItem : BaseEntity<int>
    {
        public string Name { get; set; }
        public int RequiredSeconds { get; set; }
        public float Price { get; set; }
    }

We will also prepare the Data Layer element and the Business Layer element for this class. At first, they will only inherit from the base classes. The Data layer will be coded like this:

    public class MenuItemData : BaseData<MenuItem, int>
    {
        public MenuItemData () { }
    }

and the Business layer will look like:

    public class MenuItemBusiness : BaseBusiness<MenuItem, int>
    {
        public MenuItemBusiness () : base (new MenuItemData ()) { }
    }

Now we can add a new base class under the Services subfolder of the base layer.

Service layer

In this example we will develop a simple service that makes the request wait for the required seconds. We will change the base service later in the article in order to make server requests.
We will define our Base Service using a generic Base Entity type:

    public class BaseService<TEntity, TKey> where TEntity : BaseEntity<TKey>
    {
        // we will write here the code for the base service
    }

Inside the Base Service we need to define an event to raise when the response is ready to be dispatched:

    public event ResponseReceivedHandler ResponseReceived;
    public delegate void ResponseReceivedHandler (TEntity item);

We will raise this event when our process has been completed. Before we raise an event, we always need to check whether anyone has subscribed to it. It is a good practice to use a design pattern called observer. A design pattern is a model solution to a common problem, and design patterns help us reuse the design of the software. To be compliant with the Observer pattern, we only need to add the following code snippet, which raises the event only when it has subscribers:

    protected void OnResponseReceived (TEntity item)
    {
        if (ResponseReceived != null)
        {
            ResponseReceived (item);
        }
    }

The only thing we need to do in order to raise the ResponseReceived event is to call the method OnResponseReceived. Now we will write a base method that gives us a response after a number of seconds, passed as a parameter, as seen in the following code:

    public virtual async Task<TEntity> GetDelayedResponse (TEntity item, int seconds)
    {
        await Task.Delay (seconds * 1000);
        OnResponseReceived (item);
        return item;
    }

We will use this base method to simulate a delayed response. Let's create the Core service layer object for MenuItem. We can name it MenuItemService, and it will inherit from BaseService as follows:

    public class MenuItemService : BaseService<MenuItem, int>
    {
        public MenuItemService () { }
    }

We now have all the core ingredients to start writing our UI. Add a new empty class named OrderPage in the Presentation subfolder of Core.
We will insert here a label to read the results and three buttons to make the requests:

    public class OrderPage : ContentPage
    {
        public OrderPage () : base ()
        {
            Label response = new Label ();
            Button buttonSandwich = new Button { Text = "Order Sandwich" };
            Button buttonSoftdrink = new Button { Text = "Order Drink" };
            Button buttonShowReceipt = new Button { Text = "Show Receipt" };
            // ... insert here the presentation logic
        }
    }

Presentation layer

We can now define the presentation logic by creating instances of the business object and the service object. We will also define our items.

    MenuItemBusiness menuManager = new MenuItemBusiness ();
    MenuItemService service = new MenuItemService ();
    MenuItem sandwich = new MenuItem { Name = "Sandwich", RequiredSeconds = 10, Price = 5 };
    MenuItem softdrink = new MenuItem { Name = "Sprite", RequiredSeconds = 5, Price = 2 };

Now we need to subscribe to the buttons' click events to send the order to our service. The GetDelayedResponse method of the service simulates a slow response. In a real application, the delay would depend on the network availability and on the time that the remote server needs to process the request and send back a response:

    buttonSandwich.Clicked += (sender, e) =>
    {
        service.GetDelayedResponse (sandwich, sandwich.RequiredSeconds);
    };

    buttonSoftdrink.Clicked += (sender, e) =>
    {
        service.GetDelayedResponse (softdrink, softdrink.RequiredSeconds);
    };

Our service will raise an event when the response is ready. We can subscribe to this event to present the results on the label and to save the items in our local database:

    service.ResponseReceived += (item) =>
    {
        // Append the received item to the label
        response.Text += String.Format ("\nReceived: {0} ({1}$)", item.Name, item.Price);
        // Read the data from the local database
        List<MenuItem> itemlist = menuManager.Read ();
        // Calculate the new database key for the item
        item.Key = itemlist.Count == 0 ? 0 : itemlist.Max (x => x.Key) + 1;
        // Add the item to the local database
        menuManager.Create (item);
    };

We can now subscribe to the click event of the receipt button in order to display an alert that shows the number of items saved in the local database and the total price to pay:

    buttonShowReceipt.Clicked += (object sender, EventArgs e) =>
    {
        List<MenuItem> itemlist = menuManager.Read ();
        float total = itemlist.Sum (x => x.Price);
        DisplayAlert (
            "Receipt",
            String.Format ("Total:{0}$ ({1} items)", total, itemlist.Count),
            "OK");
    };

The last step is to add the components to the content page:

    Content = new StackLayout
    {
        VerticalOptions = LayoutOptions.CenterAndExpand,
        HorizontalOptions = LayoutOptions.CenterAndExpand,
        Children = { response, buttonSandwich, buttonSoftdrink, buttonShowReceipt }
    };

At this point we are ready to run the iOS version and try it out. In order to make the Android version work, we need to set the permissions to read and write the database file. To do that, we can double-click the Droid project and, under the section Android Application, check the ReadExternalStorage and WriteExternalStorage permissions.

In the OnCreate method of the MainActivity of the Droid project we also need to:

  • Create the database file when it hasn't been created yet.
  • Set the database path in the Configuration file.

    var path = System.Environment.GetFolderPath (
        System.Environment.SpecialFolder.ApplicationData
    );

    if (!Directory.Exists (path))
    {
        Directory.CreateDirectory (path);
    }

    var filename = Path.Combine (path, "fastfood.db");

    if (!File.Exists (filename))
    {
        // File.Create returns an open stream; dispose it to release the file handle
        File.Create (filename).Dispose ();
    }

    Configuration.DatabasePath = filename;

Summary

In this article, we have learned how to create a project in Xamarin with the help of a Service layer and a Presentation layer. We have also seen how to set the read and write permissions needed to make the Android version work.
Packt
03 Oct 2016
15 min read

An Introduction to AWS Lumberyard Game Development

In this article by Dr. Edward Lavieri, author of the book Learning AWS Lumberyard Game Development, you will learn what the Lumberyard game engine means for game developers and the game development industry. (For more resources related to this topic, see here.) What is Lumberyard? Lumberyard is a free 3D game engine that has, in addition to typical 3D game engine capabilities, an impressive set of unique qualities. Most impressively, Lumberyard integrates with Amazon Web Services (AWS) for cloud computing and storage. Lumberyard, also referred to as Amazon Lumberyard, integrates with Twitch to facilitate in-game engagement with fans. Another component that makes Lumberyard unique among other game engines is the tremendous support for multiplayer games. The use of Amazon GameLift empowers developers to instantiate multiplayer game sessions with relative ease. Lumberyard is presented as a game engine intended for creating cross-platform AAA games. There are two important components of that statement. First, cross-platform refers to, in the case of Lumberyard, the ability to develop games for PC/Windows, PlayStation 4, and Xbox One. There is even additional support for Mac OS, iOS, and Android devices. The second component of the earlier statement is AAA games. A triple-A (AAA) game is like a top-grossing movie, one that had a tremendous budget, was extensively advertised, and wildly successful. If you can think of a console game (for Xbox One and/or PlayStation 4) that is advertised on national television, it is a sign the title is an AAA game. Now that this AAA game engine is available for free, it is likely that more than just AAA games will be developed using Lumberyard. This is an exciting time to be a game developer. More specifically, Amazon hopes that Lumberyard will be used to develop multiplayer online games that use AWS for cloud computing and storage, and that integrates with Twitch for user engagement. The engine is free, but AWS usage is not. 
Don't worry, you can create single player games with Lumberyard as well. System requirements Amazon recommends a system with the following specifications for developing games with Lumberyard: PC running a 64-bit version of Windows 7 or Windows 10 At least 8 GB RAM Minimum of 60 GB hard disk storage A 3 GHz or greater quad-core processor A DirectX 11 (DX11) compatible video card with at least 2 GB of video RAM (VRAM) As mentioned earlier, there is no support for running Lumberyard on a Mac OS or Linux computer. The game engine is a very large and complex software suite. You should take the system requirements seriously and, if at all possible, exceed the minimum requirements. Beta software As you likely know, the Lumberyard game engine is, at the time of this book's publication, in beta. What does that mean? It means a couple of things that are worth exploring. First, developers (that's you!) get early access to amazing software. Other than the cool factor of being able to experiment with a new game engine, it can accelerate game projects. There are several detractors to this as well. Here are the primary detractors from using beta software: Not all functions and features will be implemented. Depending on the engine's specific limitations, this can be a showstopper for your game project. Some functions and features might be partially implemented, not function correctly, or be unreliable. If the features that have these characteristics are not the ones you plan to use, then this is not an issue for you. This, of course, can be a tremendous problem. For example, let's say that the engine's gravity system is buggy. That would make testing your game very difficult as you would not be able to rely on the gravity system and not know if your code has issues or not. Things can change from release to release. Anything done in one beta version is apt to work just fine in subsequent beta releases. 
Things that tend to change between beta versions, other than bug fixes and improvements, are interface changes. This can slow a project up considerably as development workflows you have adopted may no longer work. In the next section, you will see what changes were ushered in with each sequential beta release. Release notes Amazon initially launched the Lumberyard game engine in February 2016. Since then, there have been several new versions. At the time of this book's publication, there were five releases: 1.0, 1.1, 1.2, 1.3, and 1.4. The following graphic shows the timeline of the five releases: Let's look at the major offerings of each beta release. Beta 1.0 The initial beta of the Lumberyard game engine was released on February 9, 2016. This was an innovative offering from Amazon. Lumberyard was released as a triple-A cross-platform game engine at no cost to developers. Developers had full access to the game engine along with the underlying source code. This permits developers to release games without a revenue share and to even create their own game engines using the Lumberyard source code as a base. Beta 1.1 Beta 1.1 was released just a few short weeks after Beta 1.0. According to Amazon, there were 208 feature upgrades, bug fixes, and improvements with this release. Here are the highlights: Autoscaling features Component Entity System FBX Importer New Game Gems Cloud Canvas Resource Manager New Twitch ChatPlay Features Beta 1.2 In just a few short weeks after Beta 1.1 was released, Beta 1.2 was made available. The rapid release of sequential beta versions is indicative of tremendous development work by the Lumberyard team. This also gives some indication as to the amount of support the game engine is likely to have once it is no longer in beta. With this beta, Amazon announced 218 enhancements and bug fixes to nearly two-dozen core Lumberyard components. 
Here are the largest Lumberyard game engine components to be upgraded in Beta 1.2: Particle editor Mannequin Geppetto FBX Importer Multiplayer Gem Cloud Canvas Resource Manager Beta 1.3 The three previous beta versions were released in subsequent months. This was an impressive pace, but not likely sustainable due to the tremendous complexities of game engine modifications and the fact that the Lumberyard game engine continues to mature. Released in June 2016, the Beta 1.3 release of Lumberyard introduced support for Virtual Reality (VR) and High Dynamic Range (HDR). Adding support for VR and HDR is enough reason to release a new beta version. Impressively, this release also contained over 130 enhancements and bug fixes to the game engine. Here is partial list of game engine components that were updated in this release: Volumetric fog Motion blur Height Mapped Ambient Occlusion Depth of field Emittance Integrated Graphics Profiler FBX Importer UI Editor FlowGraph Nodes Cloud Canvas Resource Manager Beta 1.4 At the time of this book's publication, the current beta version of Lumberyard was 1.4, which was released in August 2016. This release contained over 230 enhancements and bug fixes as well as some new features. The primary focus of this release seemed to focus on multiplayer games and making them more efficient. The result of the changes provided in this release are greater cost-efficiencies for multiplayer games when using Amazon GameLift. Sample game content Creating triple-A games is a complex process that typically involves a large number of people in a variety of roles including developers, designers, artists, and more. There is no industry average for how long it takes to create a Triple-A game because there are too many variables including budget, team size, game specifications, and individual and team experience. This being said, it is likely to take up to 2 years to create a triple-A game from scratch. 
Triple-A, or AAA, games typically have very large budgets, large design and development teams, large advertising efforts, and are largely successful. In a nutshell, Triple-A games are large! Around 2 years is a long time, so we have shortened things for you in this book by using the game samples that come bundled with the game engine. As illustrated in the following section, Lumberyard comes with a great set of starter content and sample games.

Starter content

When you first launch Lumberyard, you are able to create a new level or open an existing one. In the Open a Level dialog window, you will find nine levels listed under Levels | GettingStartedFiles. Each of these levels presents a unique opportunity to explore the game engine and learn how things are done. Let's look at each of these next.

getting-started-completed-level

As the level name suggests, this is a complete game level featuring a small game grid and a player-controlled robot character. The character is moved with the standard WASD keyboard keys, rotated with the mouse, and jumps with the spacebar. The level does a great job of demonstrating physics. It contains a wall of blocks that have natural physics applied. The robot can run into the wall and fallen blocks as well as use the ramp to launch into the wall. More than just playing the game, you can examine how it was created. This level is fully playable. Simply use the Ctrl + G keyboard combination to enter Game Mode. When you are through playing the game, press the Esc key to exit. This completed level can be further examined by loading the remaining levels featured in this section. These subsequent levels make it easier to examine specific components of the level.

start-section03-terrain

This section simply contains the terrain. The complete terrain is provided as well as additional space to explore and practice creating, duplicating, and placing objects.
start-section04-lighting This level is presented with an expanded terrain to help you explore lighting options. There are environmental lighting effects as well as street lamp objects that can be used to emit light and generate shadows in the game. This level is not playable and is provided to aid your learning of Lumberyard's lighting system. start-section05-camera-playerstart This non-playable level is convenient for examining camera placement and discovering how that impacts the player's starting position on game launch. start-section06-designer-objects This level is playable, but only to the extent that you can control the robot character and explore the game's environment. With this level, you can focus your exploration on editing the objects. start-section07-materials This level includes the full game environment along with two natively created objects: a block and a sphere. You can freely edit these objects and see how they look in Game Mode. This represents a great way to learn as it is a no-risk situation. This simply means that you do not have to save your changes, as you are essentially working in a sandbox with no impact to a real game project. This is a playable level that allows you to explore the game environment and preview any changes you make to the level. start-section08-physics This starter level has the same two 3D objects (block and sphere) as the previous starter level. In this level, the objects have textures. No physics are already applied to this level, so it is a good level to use to practice creating objects with physics. One option is to attempt to replicate the wall of stacked objects that is present in the completed level. start-section09-flowgraph-scripting This playable level contains the wall of 3D blocks that can be knocked over. The game's gameplay is instantiated with FlowGraphs, which can be viewed and edited using this starter level. 

start-section10-audio

This final starter level contains the full playable game that serves as a testing ground for implementing audio in the game.

Sample games

There are six unrelated sample game levels accessible through the Open a Level dialog window, listed under Levels | Samples. Each of these levels is a single-level game that demonstrates specific functionality and gameplay. Let's look at each of these next.

Animation_Basic_Sample

This game level contains an animated character in an empty game environment. There is a camera and a light. You can play the game to watch the animated character's idle animation. When you create 3D characters, you will use Geppetto, Lumberyard's animation tool.

Camera_Sample

You can use this sample game to help learn how to create gameplay. The sample game includes a Heads Up Display (HUD) that presents the player with three game modes, each with a different camera. This game can also be used to further explore FlowGraphs and the FlowGraph Editor.

Dont_Die

The Don't Die game level provides a colorful example of full gameplay with an interactive menu system. The game starts with a Press any key to start message followed by a color selection menu depicted here. The selected color is applied to the spacecraft used in the game.

Movers_Sample

With this game, you can learn how to instantiate animations triggered by user input. Using this game, you will also gain exposure to FlowGraphs and the FlowGraph Editor.

Trigger_Sample

This sample game provides several examples of Proximity and Area Triggers.
Here is the list of triggers instantiated in this sample game:

Proximity trigger
  • Player only
  • Any entity
  • One entity at a time
  • Only selected entity
  • Three entities required to trigger

Area trigger
  • Two volumes use same trigger
  • Stand in all three volumes at the same time
  • Step inside each trigger in any order
  • Step inside each trigger in correct sequence

UIEditor_Sample

This sample game is not playable but provides a commercial-quality User Interface (UI) example. If you run the level in Game Mode, you will not have a game to play, but the stunning visuals of the UI give you a glimpse of what is possible with the Lumberyard game engine.

Amazon Web Services

AWS is a family of scalable cloud-based services that, in the context of Lumberyard, support your game. These services include:

  • Cloud Canvas
  • Cloud Computing
  • GameLift
  • Simple Notification Service (SNS)
  • Simple Queue Service (SQS)
  • Simple Storage Service (S3)

Asset creation

Game assets include graphic files such as materials, textures, color palettes, 2D objects, and 3D objects. These assets are used to bring a game to life. A terrain, for example, is nothing without grass and dirt textures applied to it. Much of this content is likely to be created with external tools. One internal tool used to implement the externally created graphical assets is the Material Editor.

Audio system

Lumberyard has an Audio System that controls how in-game audio is instantiated. No audio sounds are created directly in Lumberyard. Instead, they are created using Wwise (Wave Works Interactive Sound Engine) by Audiokinetic. Because audio is created external to Lumberyard, a game project's audio team will likely consist of content creators and developers who implement the content in the Lumberyard game.

Cinematics system

Lumberyard has a Cinematics System that can be used to create cut-scenes and promotional videos.
With this system, you can also make your cinematics interactive.

Flow graph system

Lumberyard's flow graph system is a visual scripting system for creating gameplay. This tool is likely to be used by many of your smaller teams. It can be beneficial to have someone who oversees all Flow Graphs to ensure compatibility and standardization.

Geppetto

Geppetto is Lumberyard's character tool. A character team will likely create the game's characters using external tools such as Maya or 3D Studio Max. Using those systems, they can export the necessary files to support importing the character assets into your Lumberyard game. Lumberyard has an FBX Importer tool that is used to import characters created in external programs.

Mannequin editor

Animating objects, especially 3D objects, is a complex process that takes artistic talent and technical expertise. Some projects incorporate separate teams for object creation and animation. For example, you might have a small team that creates robot characters and another team that generates their animations.

Production team

The production team is responsible for creating builds and distributing releases. They will also handle testing coordination. One of their primary tools will be the Waf Build System.

Terrain editor

A game's environment consists of terrain and objects. The terrain is the foundation for the entire game experience and is the focus of exacting efforts. The creation of a terrain starts when a new level is created. The Height Map resolution is the first decision a level editor, or person responsible for creating terrain, is faced with.

Twitch ChatPlay system

Twitch integration represents exciting game possibilities. Twitch integration allows you to engage your game's users in unique ways.

UI editor

Creating a user interface is often, at least on very large projects, the responsibility of a specialized team. This team, or individual, will create the user interface components on each game level to ensure consistency.
Artwork required for the user interfaces is likely to be produced by the Asset team.

Summary

In this article, you learned about AWS Lumberyard and what it is capable of. You gained an appreciation for Lumberyard's significance to the game development industry. You also learned about the beta history of Lumberyard and how quickly it is maturing into a game engine of choice.
Packt
03 Oct 2016
14 min read

Extending Yii

Introduction

In this article by Dmitry Eliseev, the author of the book Yii Application Development Cookbook Third Edition, we will look at three kinds of Yii extensions: helpers, behaviors, and components. In addition, we will learn how to make your extension reusable and useful for the community and will focus on the many things you should do in order to make your extension as efficient as possible.

Helpers

There are a lot of built-in framework helpers, like StringHelper in the yii\helpers namespace. They contain sets of helpful static methods for manipulating strings, files, arrays, and other subjects. In many cases, for additional behavior you can create your own helper and put any static functions into it. For example, we will implement a number helper in this recipe.

Getting ready

Create a new yii2-app-basic application by using composer, as described in the official guide at http://www.yiiframework.com/doc-2.0/guide-start-installation.html.

How to do it…

Create the helpers directory in your project and write the NumberHelper class:

    <?php
    namespace app\helpers;

    class NumberHelper
    {
        public static function format($value, $decimal = 2)
        {
            return number_format($value, $decimal, '.', ',');
        }
    }

Add the actionNumbers method into SiteController:

    <?php
    ...
    class SiteController extends Controller
    {
        …
        public function actionNumbers()
        {
            return $this->render('numbers', ['value' => 18878334526.3]);
        }
    }

Add the views/site/numbers.php view:

    <?php
    use app\helpers\NumberHelper;
    use yii\helpers\Html;

    /* @var $this yii\web\View */
    /* @var $value float */

    $this->title = 'Numbers';
    $this->params['breadcrumbs'][] = $this->title;
    ?>
    <div class="site-numbers">
        <h1><?= Html::encode($this->title) ?></h1>
        <p>
            Raw number:<br />
            <b><?= $value ?></b>
        </p>
        <p>
            Formatted number:<br />
            <b><?= NumberHelper::format($value) ?></b>
        </p>
    </div>

Open the action and see this result:

In other cases you can specify a different number of decimal places; for example:

    NumberHelper::format($value, 3)

How it works…

Any helper in Yii2 is just a set of functions implemented as static methods in corresponding classes. You can use a helper to implement any custom output format, to manipulate the values of variables, and for similar tasks.

Note: Usually, static helpers are lightweight, clean functions with a small number of arguments. Avoid putting your business logic and other complicated manipulations into helpers. Use widgets or other components instead of helpers in those cases.

See also

For more information about helpers, refer to http://www.yiiframework.com/doc-2.0/guide-helper-overview.html. For examples of built-in helpers, see the sources in the helpers directory of the framework at https://github.com/yiisoft/yii2/tree/master/framework/helpers.

Creating model behaviors

There are many similar solutions in today's web applications. Leading products such as Google's Gmail are defining nice UI patterns; one of these is soft delete. Instead of a permanent deletion with multiple confirmations, Gmail allows users to immediately mark messages as deleted and then easily undo it. The same behavior can be applied to any object such as blog posts, comments, and so on.
In this recipe, we'll create a behavior that converts Markdown text stored in one model attribute into HTML stored in another attribute each time the model is saved. We'll follow a test-driven development approach to plan the behavior and test if the implementation is correct.

Getting ready

Create a new yii2-app-basic application by using composer, as described in the official guide at http://www.yiiframework.com/doc-2.0/guide-start-installation.html.

Create two databases, one for working and one for tests. Configure Yii to use the first database in your primary application in config/db.php. Make sure the test application uses the second database in tests/codeception/config/config.php.

Create a new migration:

    <?php
    use yii\db\Migration;

    class m160427_103115_create_post_table extends Migration
    {
        public function up()
        {
            $this->createTable('{{%post}}', [
                'id' => $this->primaryKey(),
                'title' => $this->string()->notNull(),
                'content_markdown' => $this->text(),
                'content_html' => $this->text(),
            ]);
        }

        public function down()
        {
            $this->dropTable('{{%post}}');
        }
    }

Apply the migration to both the working and testing databases:

    ./yii migrate
    tests/codeception/bin/yii migrate

Create a Post model:

    <?php
    namespace app\models;

    use app\behaviors\MarkdownBehavior;
    use yii\db\ActiveRecord;

    /**
     * @property integer $id
     * @property string $title
     * @property string $content_markdown
     * @property string $content_html
     */
    class Post extends ActiveRecord
    {
        public static function tableName()
        {
            return '{{%post}}';
        }

        public function rules()
        {
            return [
                [['title'], 'required'],
                [['content_markdown'], 'string'],
                [['title'], 'string', 'max' => 255],
            ];
        }
    }

How to do it…

Let's prepare a test environment, starting with defining the fixtures for the Post model.

Create the tests/codeception/unit/fixtures/PostFixture.php file:

    <?php
    namespace app\tests\codeception\unit\fixtures;

    use yii\test\ActiveFixture;

    class PostFixture extends ActiveFixture
    {
        public $modelClass = 'app\models\Post';
        public $dataFile = '@tests/codeception/unit/fixtures/data/post.php';
    }

Add a fixture data file in tests/codeception/unit/fixtures/data/post.php:

    <?php
    return [
        [
            'id' => 1,
            'title' => 'Post 1',
            'content_markdown' => 'Stored *markdown* text 1',
            'content_html' => "<p>Stored <em>markdown</em> text 1</p>\n",
        ],
    ];

Then, we need to create a test case, tests/codeception/unit/MarkdownBehaviorTest.php:

    <?php
    namespace app\tests\codeception\unit;

    use app\models\Post;
    use app\tests\codeception\unit\fixtures\PostFixture;
    use yii\codeception\DbTestCase;

    class MarkdownBehaviorTest extends DbTestCase
    {
        public function testNewModelSave()
        {
            $post = new Post();
            $post->title = 'Title';
            $post->content_markdown = 'New *markdown* text';

            $this->assertTrue($post->save());
            $this->assertEquals("<p>New <em>markdown</em> text</p>\n", $post->content_html);
        }

        public function testExistingModelSave()
        {
            $post = Post::findOne(1);
            $post->content_markdown = 'Other *markdown* text';

            $this->assertTrue($post->save());
            $this->assertEquals("<p>Other <em>markdown</em> text</p>\n", $post->content_html);
        }

        public function fixtures()
        {
            return [
                'posts' => [
                    'class' => PostFixture::className(),
                ]
            ];
        }
    }

Run the unit tests:

    codecept run unit MarkdownBehaviorTest

and make sure that the tests fail, since the behavior does not exist yet:

    Codeception PHP Testing Framework v2.0.9
    Powered by PHPUnit 4.8.27 by Sebastian Bergmann and contributors.

    Unit Tests (2) ---------------------------------------------------------------------------
    Trying to test ... MarkdownBehaviorTest::testNewModelSave             Error
    Trying to test ... MarkdownBehaviorTest::testExistingModelSave        Error
    ---------------------------------------------------------------------------

    Time: 289 ms, Memory: 16.75MB

Now we need to implement a behavior, attach it to the model, and make sure the tests pass. Create a new directory, behaviors. Under this directory, create the MarkdownBehavior class:

    <?php
    namespace app\behaviors;

    use yii\base\Behavior;
    use yii\base\Event;
    use yii\base\InvalidConfigException;
    use yii\db\ActiveRecord;
    use yii\helpers\Markdown;

    class MarkdownBehavior extends Behavior
    {
        public $sourceAttribute;
        public $targetAttribute;

        public function init()
        {
            if (empty($this->sourceAttribute) || empty($this->targetAttribute)) {
                throw new InvalidConfigException('Source and target must be set.');
            }
            parent::init();
        }

        public function events()
        {
            return [
                ActiveRecord::EVENT_BEFORE_INSERT => 'onBeforeSave',
                ActiveRecord::EVENT_BEFORE_UPDATE => 'onBeforeSave',
            ];
        }

        public function onBeforeSave(Event $event)
        {
            if ($this->owner->isAttributeChanged($this->sourceAttribute)) {
                $this->processContent();
            }
        }

        private function processContent()
        {
            $model = $this->owner;
            $source = $model->{$this->sourceAttribute};
            $model->{$this->targetAttribute} = Markdown::process($source);
        }
    }

Let's attach the behavior to the Post model:

    class Post extends ActiveRecord
    {
        ...
        public function behaviors()
        {
            return [
                'markdown' => [
                    'class' => MarkdownBehavior::className(),
                    'sourceAttribute' => 'content_markdown',
                    'targetAttribute' => 'content_html',
                ],
            ];
        }
    }

Run the tests and make sure they pass:

    Codeception PHP Testing Framework v2.0.9
    Powered by PHPUnit 4.8.27 by Sebastian Bergmann and contributors.

    Unit Tests (2) ---------------------------------------------------------------------------
    Trying to test ... MarkdownBehaviorTest::testNewModelSave             Ok
    Trying to test ... MarkdownBehaviorTest::testExistingModelSave        Ok
    ---------------------------------------------------------------------------

    Time: 329 ms, Memory: 17.00MB

That's it.
We've created a reusable behavior and can use it in all future projects by just connecting it to a model.

How it works…

Let's start with the test case. Since we want to use a set of models, we define fixtures. A fixture set is put into the DB each time a test method is executed.

We prepare unit tests to specify how the behavior works:

First, we test processing new model content. The behavior must convert Markdown text from the source attribute to HTML and store the result in the target attribute.

Second, we test updated content of an existing model. After changing the Markdown content and saving the model, we must get updated HTML content.

Now let's move to the interesting implementation details. In a behavior, we can add our own methods that will be mixed into the model that the behavior is attached to. We can also subscribe to the owner component's events. We use this to add our own listener:

    public function events()
    {
        return [
            ActiveRecord::EVENT_BEFORE_INSERT => 'onBeforeSave',
            ActiveRecord::EVENT_BEFORE_UPDATE => 'onBeforeSave',
        ];
    }

And now we can implement this listener:

    public function onBeforeSave(Event $event)
    {
        if ($this->owner->isAttributeChanged($this->sourceAttribute)) {
            $this->processContent();
        }
    }

In all methods, we can use the owner property to get the object the behavior is attached to. In general, we can attach any behavior to models, controllers, the application, and other components that extend the yii\base\Component class. We can also attach the same behavior several times to one model to process different attributes:

    class Post extends ActiveRecord
    {
        ...
        public function behaviors()
        {
            return [
                [
                    'class' => MarkdownBehavior::className(),
                    'sourceAttribute' => 'description_markdown',
                    'targetAttribute' => 'description_html',
                ],
                [
                    'class' => MarkdownBehavior::className(),
                    'sourceAttribute' => 'content_markdown',
                    'targetAttribute' => 'content_html',
                ],
            ];
        }
    }

Besides, we can also extend the yii\behaviors\AttributeBehavior class, as yii\behaviors\TimestampBehavior does, to update specified attributes on any event.

See also

To learn more about behaviors and events, refer to the following pages:

  • http://www.yiiframework.com/doc-2.0/guide-concept-behaviors.html
  • http://www.yiiframework.com/doc-2.0/guide-concept-events.html

For more information about Markdown syntax, refer to http://daringfireball.net/projects/markdown/.

Creating components

If you have some code that looks like it can be reused but you don't know if it's a behavior, widget, or something else, it's most probably a component. The component should be inherited from the yii\base\Component class. Later on, the component can be attached to the application and configured using the components section of a configuration file. That's the main benefit compared to using just a plain PHP class. We also get support for behaviors, events, getters, and setters.

For our example, we'll implement a simple Exchange application component that will be able to get currency rates from the http://fixer.io site, attach it to the application, and use it.

Getting ready

Create a new yii2-app-basic application by using composer, as described in the official guide at http://www.yiiframework.com/doc-2.0/guide-start-installation.html.

How to do it…

To get a currency rate, our component should send an HTTP GET query to a service URL, like http://api.fixer.io/2016-05-14?base=USD.

The service returns all supported rates for the nearest working day:

    {
        "base":"USD",
        "date":"2016-05-13",
        "rates": {
            "AUD":1.3728,
            "BGN":1.7235,
            ...
            "ZAR":15.168,
            "EUR":0.88121
        }
    }

The component should extract the needed currency from the JSON response and return the target rate.

Create a components directory in your application structure.

Create the component class with the following interface:

    <?php
    namespace app\components;

    use yii\base\Component;

    class Exchange extends Component
    {
        public function getRate($source, $destination, $date = null)
        {
        }
    }

Implement the component's functionality:

    <?php
    namespace app\components;

    use yii\base\Component;
    use yii\base\InvalidConfigException;
    use yii\base\InvalidParamException;
    use yii\caching\Cache;
    use yii\di\Instance;
    use yii\helpers\Json;

    class Exchange extends Component
    {
        /**
         * @var string remote host
         */
        public $host = 'http://api.fixer.io';

        /**
         * @var bool cache results or not
         */
        public $enableCaching = false;

        /**
         * @var string|Cache component ID
         */
        public $cache = 'cache';

        public function init()
        {
            if (empty($this->host)) {
                throw new InvalidConfigException('Host must be set.');
            }
            if ($this->enableCaching) {
                $this->cache = Instance::ensure($this->cache, Cache::className());
            }
            parent::init();
        }

        public function getRate($source, $destination, $date = null)
        {
            $this->validateCurrency($source);
            $this->validateCurrency($destination);
            $date = $this->validateDate($date);
            $cacheKey = $this->generateCacheKey($source, $destination, $date);
            if (!$this->enableCaching || ($result = $this->cache->get($cacheKey)) === false) {
                $result = $this->getRemoteRate($source, $destination, $date);
                if ($this->enableCaching) {
                    $this->cache->set($cacheKey, $result);
                }
            }
            return $result;
        }

        private function getRemoteRate($source, $destination, $date)
        {
            $url = $this->host . '/' . $date . '?base=' . $source;
            $response = Json::decode(file_get_contents($url));
            if (!isset($response['rates'][$destination])) {
                // Leading backslash: use PHP's global RuntimeException from inside this namespace
                throw new \RuntimeException('Rate not found.');
            }
            return $response['rates'][$destination];
        }

        private function validateCurrency($source)
        {
            if (!preg_match('#^[A-Z]{3}$#s', $source)) {
                throw new InvalidParamException('Invalid currency format.');
            }
        }

        private function validateDate($date)
        {
            if (!empty($date) && !preg_match('#\d{4}-\d{2}-\d{2}#s', $date)) {
                throw new InvalidParamException('Invalid date format.');
            }
            if (empty($date)) {
                $date = date('Y-m-d');
            }
            return $date;
        }

        private function generateCacheKey($source, $destination, $date)
        {
            return [__CLASS__, $source, $destination, $date];
        }
    }

Attach our component in the config/console.php or config/web.php configuration files:

    'components' => [
        'cache' => [
            'class' => 'yii\caching\FileCache',
        ],
        'exchange' => [
            'class' => 'app\components\Exchange',
            'enableCaching' => true,
        ],
        // ...
        'db' => $db,
    ],

We can now use the new component directly or via the get method:

    echo Yii::$app->exchange->getRate('USD', 'EUR');
    echo Yii::$app->get('exchange')->getRate('USD', 'EUR', '2014-04-12');

Create a demonstration console controller:

    <?php
    namespace app\commands;

    use Yii;
    use yii\console\Controller;

    class ExchangeController extends Controller
    {
        public function actionTest($currency, $date = null)
        {
            echo Yii::$app->exchange->getRate('USD', $currency, $date) . PHP_EOL;
        }
    }

And try to run some commands:

    $ ./yii exchange/test EUR
    > 0.90196
    $ ./yii exchange/test EUR 2015-11-24
    > 0.93888
    $ ./yii exchange/test OTHER
    > Exception 'yii\base\InvalidParamException' with message 'Invalid currency format.'
    $ ./yii exchange/test EUR 2015/24/11
    > Exception 'yii\base\InvalidParamException' with message 'Invalid date format.'
    $ ./yii exchange/test ASD
    > Exception 'RuntimeException' with message 'Rate not found.'

As a result, you should see rate values in successful cases or specific exceptions in error cases.

In addition to creating your own components, you can do more.

Overriding existing application components

Most of the time there will be no need to create your own application components, since other types of extensions, such as widgets or behaviors, cover almost all types of reusable code. However, overriding core framework components is a common practice and can be used to customize the framework's behavior for your specific needs without hacking into the core.

For example, to be able to format numbers using the Yii::$app->formatter->asNumber($value) method instead of the NumberHelper::format method from the Helpers recipe, follow the next steps:

Extend the yii\i18n\Formatter component like the following:

    <?php
    namespace app\components;

    class Formatter extends \yii\i18n\Formatter
    {
        public function asNumber($value, $decimal = 2)
        {
            return number_format($value, $decimal, '.', ',');
        }
    }

Override the class of the built-in formatter component:

    'components' => [
        // ...
        'formatter' => [
            'class' => 'app\components\Formatter',
        ],
        // …
    ],

Right now, we can use this method directly:

    echo Yii::$app->formatter->asNumber(1534635.2, 3);

or as a new format for GridView and DetailView widgets:

    <?= yii\grid\GridView::widget([
        'dataProvider' => $dataProvider,
        'columns' => [
            'id',
            'created_at:datetime',
            'title',
            'value:number',
        ],
    ]) ?>

In this way, you can extend any existing component without overwriting its source code.

How it works…

To be able to attach a component to an application, it must extend the yii\base\Component class. Attaching is as simple as adding a new array to the components section of the configuration. There, the class value specifies the component's class, and all other values are passed to the component through its public properties and setter methods.

The implementation itself is very straightforward: we wrap http://api.fixer.io calls in a comfortable API with validators and caching. We can access our class by its component name using Yii::$app. In our case, it will be Yii::$app->exchange.

See also

For official information about components, refer to http://www.yiiframework.com/doc-2.0/guide-concept-components.html.

For the NumberHelper class sources, see the Helpers recipe.

Summary

In this article we learnt about three kinds of Yii extensions: helpers, behaviors, and components. Helpers contain sets of helpful static methods for manipulating strings, files, arrays, and other subjects. Behaviors allow you to enhance the functionality of an existing component class without needing to change the class's inheritance. Components are the main building blocks of Yii applications. A component is an instance of yii\base\Component or a class derived from it. Using a component mainly involves accessing its properties and raising or handling its events.
John Oerter
30 Sep 2016
10 min read

Build a Universal JavaScript App, Part 2

In this post series, we will walk through how to write a universal (or isomorphic) JavaScript app. Part 1 covered what a universal JavaScript application is, why it is such an exciting concept, and the first two steps for creating our app, which are serving post data and adding React. In this second part of the series, we walk through steps 3-6, which are client-side routing with React Router, server rendering, data flow refactoring, and data loading of the app. Let's get started.

Step 3: Client-side routing with React Router

    git checkout client-side-routing && npm install

Now that we're pulling and displaying posts, let's add some navigation to individual pages for each post. To do this, we will turn our list of posts from step 2 (see the Part 1 post) into links that are always present on the page. Each post will live at http://localhost:3000/:postId/:postName. We can use React Router and a routes.js file to set up this structure:

    // components/routes.js
    import React from 'react'
    import { Route } from 'react-router'
    import App from './App'
    import Post from './Post'

    module.exports = (
      <Route path="/" component={App}>
        <Route path="/:postId/:postName" component={Post} />
      </Route>
    )

We've changed the render method in App.js to render links to posts instead of just <li> tags:

    // components/App.js
    import React from 'react'
    import { Link } from 'react-router'

    const allPostsUrl = '/api/post'

    class App extends React.Component {
      constructor(props) {
        super(props)
        this.state = {
          posts: []
        }
      }

      ...

      render() {
        const posts = this.state.posts.map((post) => {
          const linkTo = `/${post.id}/${post.slug}`;
          return (
            <li key={post.id}>
              <Link to={linkTo}>{post.title}</Link>
            </li>
          )
        })

        return (
          <div>
            <h3>Posts</h3>
            <ul>
              {posts}
            </ul>
            {this.props.children}
          </div>
        )
      }
    }

    export default App

And, we'll add a Post.js component to render each post's content:

    // components/Post.js
    import React from 'react'

    class Post extends React.Component {
      constructor(props) {
        super(props)
        this.state = {
          title: '',
          content: ''
        }
      }

      fetchPost(id) {
        const request = new XMLHttpRequest()
        request.open('GET', '/api/post/' + id, true)
        request.setRequestHeader('Content-type', 'application/json');
        request.onload = () => {
          if (request.status === 200) {
            const response = JSON.parse(request.response)
            this.setState({
              title: response.title,
              content: response.content
            });
          }
        }
        request.send();
      }

      componentDidMount() {
        this.fetchPost(this.props.params.postId)
      }

      componentWillReceiveProps(nextProps) {
        this.fetchPost(nextProps.params.postId)
      }

      render() {
        return (
          <div>
            <h3>{this.state.title}</h3>
            <p>{this.state.content}</p>
          </div>
        )
      }
    }

    export default Post

The componentDidMount() and componentWillReceiveProps() methods are important because they let us know when we should fetch a post from the server. componentDidMount() will handle the first time the Post.js component is rendered, and then componentWillReceiveProps() will take over as React Router handles rerendering the component with different props.

Run npm run build:client && node server.js again to build and run the app. You will now be able to go to http://localhost:3000 and navigate around to the different posts. However, if you try to refresh on a single post page, you will get something like Cannot GET /3/debugging-node-apps. That's because our Express server doesn't know how to handle that kind of route. React Router is handling it completely on the front end. Onward to server rendering!
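
To make the /:postId/:postName pattern from this step more concrete, here is a minimal sketch of how a URL such as /3/debugging-node-apps could be split into the params that the Post component reads. The matchPath helper is hypothetical and written only for illustration; it is not React Router's implementation, which handles many more cases.

```javascript
// Hypothetical helper, for illustration only (not part of React Router).
// Matches a pathname against a pattern made of static segments and
// ":name" placeholders, returning the extracted params or null.
function matchPath(pattern, pathname) {
  const patternParts = pattern.split('/').filter(Boolean)
  const pathParts = pathname.split('/').filter(Boolean)
  if (patternParts.length !== pathParts.length) return null

  const params = {}
  for (let i = 0; i < patternParts.length; i++) {
    const expected = patternParts[i]
    const actual = pathParts[i]
    if (expected.startsWith(':')) {
      // Placeholder segment: store the URL value under the placeholder name
      params[expected.slice(1)] = decodeURIComponent(actual)
    } else if (expected !== actual) {
      // Static segment that does not match
      return null
    }
  }
  return params
}

const params = matchPath('/:postId/:postName', '/3/debugging-node-apps')
console.log(params) // { postId: '3', postName: 'debugging-node-apps' }
```

Note that in this sketch the extracted values are strings, so a postId of '3' would need a loose comparison or a conversion before being matched against a numeric post id.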

Step 4: Server rendering

    git checkout server-rendering && npm install

Okay, now we're finally getting to the good stuff. In this step, we'll use React Router to help our server take application requests and render the appropriate markup. To do that, we also need to build a server bundle, just as we build a client bundle, so that the server can understand JSX. Therefore, we've added the webpack.server.config.js below:

    // webpack.server.config.js
    var fs = require('fs')
    var path = require('path')

    module.exports = {
      entry: path.resolve(__dirname, 'server.js'),
      output: {
        filename: 'server.bundle.js'
      },
      target: 'node',
      // keep node_module paths out of the bundle
      externals: fs.readdirSync(path.resolve(__dirname, 'node_modules')).concat([
        'react-dom/server', 'react/addons',
      ]).reduce(function (ext, mod) {
        ext[mod] = 'commonjs ' + mod
        return ext
      }, {}),
      node: {
        __filename: true,
        __dirname: true
      },
      module: {
        loaders: [
          {
            test: /\.js$/,
            exclude: /node_modules/,
            loader: 'babel-loader?presets[]=es2015&presets[]=react'
          }
        ]
      }
    }

We've also added the following code to server.js:

    // server.js
    import React from 'react'
    import { renderToString } from 'react-dom/server'
    import { match, RouterContext } from 'react-router'
    import routes from './components/routes'

    const app = express()

    ...

    app.get('*', (req, res) => {
      match({ routes: routes, location: req.url }, (err, redirect, props) => {
        if (err) {
          res.status(500).send(err.message)
        } else if (redirect) {
          res.redirect(redirect.pathname + redirect.search)
        } else if (props) {
          const appHtml = renderToString(<RouterContext {...props} />)
          res.send(renderPage(appHtml))
        } else {
          res.status(404).send('Not Found')
        }
      })
    })

    function renderPage(appHtml) {
      return `
        <!DOCTYPE html>
        <html lang="en">
        <head>
          <meta charset="UTF-8">
          <title>Universal Blog</title>
        </head>
        <body>
          <div id="app">${appHtml}</div>
          <script src="/bundle.js"></script>
        </body>
        </html>
      `
    }

    ...
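
The catch-all handler above has four possible outcomes. As a rough sketch, the same branching can be written as a plain function that needs neither Express nor React Router. The decideResponse name and the shape of the object it returns are hypothetical and exist only to make the branches easy to follow; the 302 status reflects the default that Express uses for res.redirect.

```javascript
// Hypothetical helper, for illustration only: mirrors the branches of the
// match() callback and describes the response instead of sending it.
function decideResponse(err, redirect, props) {
  if (err) {
    // Route matching itself failed
    return { status: 500, body: err.message }
  }
  if (redirect) {
    // The route configuration asked for a redirect
    return { status: 302, location: redirect.pathname + redirect.search }
  }
  if (props) {
    // A route matched: this is where renderToString would produce the markup
    return { status: 200, body: 'rendered markup' }
  }
  // No error, no redirect, no matching route
  return { status: 404, body: 'Not Found' }
}

console.log(decideResponse(null, null, null).status) // 404
console.log(decideResponse(null, { pathname: '/new', search: '?a=1' }, null).location) // /new?a=1
```

Keeping the decision separate from the Express response object in this way is one option for unit testing the routing outcomes without starting a server.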

Using React Router's match function, the server can find the requested route, render it with renderToString, and send the markup down the wire. Run npm start to build the client and server bundles and start the app. Fantastic, right?

We're not done yet. Even though the markup is being generated on the server, we're still fetching all the data client side. Go ahead and click through the posts with your dev tools open, and you'll see the requests. It would be far better to load the data while we're rendering the markup instead of having to request it separately on the client.

Since server rendering and universal apps are still bleeding-edge, there aren't really any established best practices for data loading. If you're using some kind of Flux implementation, there may be some specific guidance. But for this use case, we will simply grab all the posts and feed them through our app. In order to do this, we first need to do some refactoring on our current architecture.

Step 5: Data Flow Refactor

    git checkout data-flow-refactor && npm install

It's a little weird how each post page has to make a request to the server for its content, even though the App component already has all the posts in its state. A better solution would be to have App simply pass the appropriate content down to the Post component.

    // components/routes.js
    import React from 'react'
    import { Route } from 'react-router'
    import App from './App'
    import Post from './Post'

    module.exports = (
      <Route path="/" component={App}>
        <Route path="/:postId/:postName" />
      </Route>
    )

In our routes.js, we've made the Post route a componentless route. It's still a child of the App route, but now has to completely rely on the App component for rendering. Below are the changes to App.js:

    // components/App.js
    ...
      render() {
        const posts = this.state.posts.map((post) => {
          const linkTo = `/${post.id}/${post.slug}`;
          return (
            <li key={post.id}>
              <Link to={linkTo}>{post.title}</Link>
            </li>
          )
        })

        const { postId, postName } = this.props.params;
        let postTitle, postContent
        if (postId && postName) {
          const post = this.state.posts.find(p => p.id == postId)
          // posts may not be loaded yet, so guard against a missing post
          if (post) {
            postTitle = post.title
            postContent = post.content
          }
        }

        return (
          <div>
            <h3>Posts</h3>
            <ul>
              {posts}
            </ul>
            {postTitle && postContent ? (
              <Post title={postTitle} content={postContent} />
            ) : (
              <h1>Welcome to the Universal Blog!</h1>
            )}
          </div>
        )
      }
    }

    export default App

If we are on a post page, then props.params.postId and props.params.postName will both be defined and we can use them to grab the desired post and pass the data on to the Post component to be rendered. If those properties are not defined, then we're on the home page and can simply render a greeting.

Now, our Post.js component can be a stateless functional component that simply renders its properties.

    // components/Post.js
    import React from 'react'

    const Post = ({title, content}) => (
      <div>
        <h3>{title}</h3>
        <p>{content}</p>
      </div>
    )

    export default Post

With that refactoring complete, we're ready to implement data loading.

Step 6: Data Loading

    git checkout data-loading && npm install

For this final step, we just need to make two small changes in server.js and App.js:

    // server.js
    ...
    app.get('*', (req, res) => {
      match({ routes: routes, location: req.url }, (err, redirect, props) => {
        if (err) {
          res.status(500).send(err.message)
        } else if (redirect) {
          res.redirect(redirect.pathname + redirect.search)
        } else if (props) {
          const routerContextWithData = (
            <RouterContext
              {...props}
              createElement={(Component, props) => {
                return <Component posts={posts} {...props} />
              }}
            />
          )
          const appHtml = renderToString(routerContextWithData)
          res.send(renderPage(appHtml))
        } else {
          res.status(404).send('Not Found')
        }
      })
    })
    ...

    // components/App.js
    import React from 'react'
    import Post from './Post'
    import { Link, IndexLink } from 'react-router'

    const allPostsUrl = '/api/post'

    class App extends React.Component {
      constructor(props) {
        super(props)
        this.state = {
          posts: props.posts || []
        }
      }
      ...

In server.js, we're changing how the RouterContext creates elements by overriding its createElement function and passing in our data as additional props. These props will get passed to any component that is matched by the route, which in this case will be our App component. Then, when the App component is initialized, it sets its posts state property to what it got from props, or to an empty array if no posts were passed.

That's it! Run npm start one last time, and cruise through your app. You can even disable JavaScript, and the app will automatically degrade to requesting whole pages.

Thanks for reading!

About the author

John Oerter is a software engineer from Omaha, Nebraska, USA. He has a passion for continuous improvement and learning in all areas of software development, including Docker, JavaScript, and C#. He blogs at here.

Packt
30 Sep 2016
15 min read

Functions in Swift

In this article by Dr. Fatih Nayebi, the author of the book Swift 3 Functional Programming, we dive deeper into functions, the fundamental building blocks of functional programming, and explain all the aspects of defining and using functions in functional Swift. This article will cover the following topics with coding examples: the general syntax of functions; defining and using function parameters; setting internal and external parameters; setting default parameter values; defining and using variadic functions; returning values from functions; defining and using nested functions. What is a function? Object-oriented programming (OOP) looks very natural to most developers as it simulates a real-life situation of classes or, in other words, blueprints and their instances, but it brought a lot of complexities and problems such as instance and memory management, complex multithreading, and concurrency programming. Before OOP became mainstream, we were used to developing in procedural languages. In the C programming language, we did not have objects and classes; we would use structs and function pointers. So now we are talking about functional programming that relies mostly on functions just as procedural languages relied on procedures. We are able to develop very powerful programs in C without classes; in fact, most operating systems are developed in C. There are other multipurpose programming languages, such as Go by Google, that are not object-oriented and are getting very popular because of their performance and simplicity. So, are we going to be able to write very complex applications without classes in Swift? We might wonder why we should do this. Generally, we should not, but attempting it will introduce us to the capabilities of functional programming. 
A function is a block of code that executes a specific task, can be stored, can persist data, and can be passed around. We define them in standalone Swift files as global functions or inside other building blocks such as classes, structs, enums, and protocols as methods. They are called methods if they are defined in classes but in terms of definition, there is no difference between a function and method in Swift. Defining them in other building blocks enables methods to use the scope of the parent or to be able to change them. They can access the scope of their parent and they have their own scope. Any variable that is defined inside a function is not accessible outside of it. The variables defined inside them and the corresponding allocated memory goes away when the function terminates. Functions are very powerful in Swift. We can compose a program with only functions as functions can receive and return functions, capture variables that exist in the context they were declared, and can persist data inside themselves. To understand the functional programming paradigms, we need to understand the capability of functions in detail. We need to think if we can avoid classes and only use functions so we will cover all the details related to functions in the upcoming sections of this article. The general syntax of functions and methods We can define functions or methods as follows: accessControl func functionName(parameter: ParameterType) throws -> ReturnType { } As we know already, when functions are defined in objects, they become methods. The first step to define a method is to tell the compiler from where it can be accessed. This concept is called access control in Swift and there are three levels of access control. We are going to explain them for methods as follows: Public access: Any entity can access a method that is defined as public if it is in the same module. 
If an entity is not in the same module, we will need to import the module to be able to call the method. We need to mark our methods and objects as public when we develop frameworks in order to enable other modules to use them. Internal access: Any method that is defined as internal can be accessed from other entities in a module but cannot be accessed from other modules. Private access: Any method that is defined as private can be accessed only from the same source file. By default, if we do not provide the access modifier, a variable or function becomes internal. Using these access modifiers, we can structure our code properly, for instance, we can hide details from other modules if we define an entity as internal. We can even hide the details of a method from other files if we define them as private. Before Swift 2.0, we had to define everything as public or add all source files to the testing target. Swift 2.0 introduced the @testable import syntax that enables us to define internal or private methods that can be accessed from testing modules. Methods can generally be in three forms: Instance methods: We need to obtain an instance of an object (In this article we will refer to classes, structs, and enums as objects) in order to be able to call the method defined in it, and then we will be able to access the scope and data of the object. Static methods: Swift names them type methods also. They do not need any instances of objects and they cannot access the instance data. They are called by putting a dot after the name of the object type (for example, Person.sayHi()). The static methods cannot be overridden by the subclasses of the object that they reside in. Class methods: Class methods are like the static methods but they can be overridden by subclasses. We have covered the keywords that are required for method definitions; now we will concentrate on the syntax that is shared among functions and methods. 
There are other concepts related to methods that are out of the scope of this article, as we will concentrate on functional programming in Swift. Continuing with the function definition, next comes the func keyword, which is mandatory and tells the compiler that it is going to deal with a function. Then comes the function name, which is mandatory and is recommended to be camel-cased with the first letter in lowercase. The function name should state what the function does and is recommended to be in the form of a verb when we define our methods in objects. Basically, our classes will be named with nouns and methods will be verbs that are in the form of orders to the class. In pure functional programming, as functions do not reside in other objects, they can be named by their functionalities. Parameters follow the function name. They are defined in parentheses to pass arguments to the function. Parentheses are mandatory even if we do not have any parameters. We will cover all aspects of parameters in an upcoming section of this article. Then comes throws, which is not mandatory. A function or method that is marked with the throws keyword may or may not throw errors. At this point, it is enough to know what it means when we see it in a function or method signature. The next entity in a function type declaration is the return type. If a function is not void, the return type will come after the -> sign. The return type indicates the type of entity that is going to be returned from a function. We will cover return types in detail in an upcoming section in this article, so now we can move on to the last piece of a function that is present in most programming languages, our beloved { }. We defined functions as blocks of functionality and {} defines the borders of the block so that the function body is declared and execution happens in there. We will write the functionality inside {}. 
Best practices in function definition There are proven best practices for function and method definition provided by amazing software engineering resources, such as Clean Code, Code Complete, and Coding Horror, that we can summarize as follows: Try not to exceed 8-10 lines of code in each function as shorter functions or methods are easier to read, understand, and maintain. Keep the number of parameters minimal because the more parameters a function has, the more complex it is. Functions should have at least one parameter and one return value. Avoid using type names in function names as it is going to be redundant. Aim for one and only one functionality in a function. Name a function or method in a way that it describes its functionality properly and is easy to understand. Name functions and methods consistently. If we have a connect function, we can have a disconnect one. Write functions to solve the current problem and generalize it when needed. Try to avoid what if scenarios as probably you aren't going to need it (YAGNI). Calling functions We have covered a general syntax to define a function and method if it resides in an object. Now it is time to talk about how we call our defined functions and methods. To call a function, we will use its name and provide its required parameters. There are complexities with providing parameters that we will cover in the upcoming section. For now, we are going to cover the most basic type of parameter providing as follows: funcName(paramName, secondParam: secondParamName) This type of function calling should be familiar to Objective-C developers as the first parameter name is not named and the rest are named. To call a method, we need to use the dot notation provided by Swift. 
The following examples are for class instance methods and static class methods: let someClassInstance = SomeClass() someClassInstance.funcName(paramName, secondParam: secondParamName) StaticClass.funcName(paramName, secondParam: secondParamName) Defining and using function parameters In function definition, parameters follow the function name and they are constants by default, so we will not be able to alter them inside the function body if we do not mark them with var. In functional programming, we avoid mutability; therefore, we would never use mutable parameters in functions. Parameters should be inside parentheses. If we do not have any parameters, we simply put open and close parentheses without any characters between them: func functionName() { } In functional programming, it is important to have functions that have at least one parameter. We will explain why it is important in upcoming sections. We can have multiple parameters separated by commas. In Swift, parameters are named, so we need to provide the parameter name followed by a colon and the type, as shown in the following example: func functionName(parameter: ParameterType, secondParameter: ParameterType) { } // To call: functionName(parameter, secondParameter: secondParam) ParameterType can also be an optional type, so the function becomes the following if our parameters need to be optionals: func functionName(parameter: ParameterType?, secondParameter: ParameterType?) { } Swift enables us to provide external parameter names that will be used when functions are called. The following example presents the syntax: func functionName(externalParamName localParamName: ParameterType) // To call: functionName(externalParamName: parameter) Only the local parameter name is usable in the function body. 
It is possible to omit the parameter names with the _ syntax; for instance, if we do not want to provide any parameter name when the function is called, we can use _ as externalParamName for the second or subsequent parameters. If we want to have a parameter name for the first parameter in function calls, we can provide the local parameter name as the external name as well. In this article, we are going to use the default function parameter definition. Parameters can have default values as follows: func functionName(parameter: Int = 3) { print("\(parameter) is provided.") } functionName(5) // prints "5 is provided." functionName() // prints "3 is provided." Parameters can be defined as inout to let function callers receive values that are changed in the body of a function. As we can use tuples for function returns, it is not recommended to use inout parameters unless we really need them. We can define function parameters as tuples. For instance, the following example function accepts a tuple of the (Int, Int) type: func functionWithTupleParam(tupleParam: (Int, Int)) {} As, under the hood, variables are represented by tuples in Swift, the parameters to a function can also be tuples. For instance, let's have a simple convert function that takes an array of Int and a multiplier and converts it to a different structure. Let's not worry about the implementation of this function for now: let numbers = [3, 5, 9, 10] func convert(numbers: [Int], multiplier: Int) -> [String] { let convertedValues = numbers.enumerate().map { (index, element) in return "\(index): \(element * multiplier)" } return convertedValues } If we use this function as convert(numbers, multiplier: 3), the result is going to be ["0: 9", "1: 15", "2: 27", "3: 30"]. We can call our function with a tuple. Let's create a tuple and pass it to our function: let parameters = (numbers, multiplier: 3) convert(parameters) The result is identical to our previous function call. 
However, passing tuples in function calls is deprecated and will be removed in Swift 3.0, so it is not recommended to use them. We can define higher-order functions that can receive functions as parameters. In the following example, we define funcParam as a function type of (Int, Int) -> Int: func functionWithFunctionParam(funcParam: (Int, Int) -> Int) In Swift, parameters can be of a generic type. The following example presents a function that has two generic parameters. In this syntax, any type (for example, T or V) that we put inside <> should be used in the parameter definition: func functionWithGenerics<T, V>(firstParam: T, secondParam: V) Defining and using variadic functions Swift enables us to define functions with variadic parameters. A variadic parameter accepts zero or more values of a specified type. Variadic parameters are similar to array parameters but they are more readable and can only be used as the last parameter in multiparameter functions. As variadic parameters can accept zero values, we will need to check whether the parameter is empty. The following example presents a function with a variadic parameter of the String type: func greet(names: String...) { for name in names { print("Greetings, \(name)") } } // To call this function greet("Steve", "Craig") // prints twice greet("Steve", "Craig", "Johny") // prints three times Returning values from functions If we need our function to return a value, tuple, or another function, we can specify it by providing ReturnType after ->. For instance, the following example returns String: func functionName() -> String { } Any function that has ReturnType in its definition should have a return keyword with the matching type in its body. Return types can be optionals in Swift, so the function becomes as follows if the return needs to be optional: func functionName() -> String? { } Tuples can be used to provide multiple return values. 
For instance, the following function returns a tuple of the (Int, String) type: func functionName() -> (code: Int, status: String) { } As we are using parentheses for tuples, we should avoid using parentheses for single return value functions. Tuple return types can be optional too, so the syntax becomes as follows: func functionName() -> (code: Int, status: String)? { } This syntax makes the entire tuple optional; if we want to make only status optional, we can define the function as follows: func functionName() -> (code: Int, status: String?) { } In Swift, functions can return functions. The following example presents a function with the return type of a function that takes two Int values and returns Int: func funcName() -> (Int, Int) -> Int {} If we do not expect a function to return any value, tuple, or function, we simply do not provide ReturnType: func functionName() { } We could also explicitly declare it with the Void keyword: func functionName() -> Void { } In functional programming, it is important to have return types in functions. In other words, it is a good practice to avoid functions that have Void as return types. A function with the Void return type typically is a function that changes another entity in the code; otherwise, why would we need to have a function? OK, we might have wanted to log an expression to the console/log file or write data to a database or a file to a filesystem. In these cases, it is also preferable to have a return value or feedback related to the success of the operation. As we try to avoid mutability and stateful programming in functional programming, we can assume that our functions will have returns in different forms. This requirement is in line with the mathematical basis of functional programming. In mathematics, a simple function is defined as follows: y = f(x) or f(x) -> y Here, f is a function that takes x and returns y. Therefore, a function receives at least one parameter and returns at least one value. 
In functional programming, following the same paradigm makes reasoning easier, function composition possible, and code more readable. Summary This article explained function definition and usage in detail by giving examples for parameter and return types. You can also refer to the following books on similar topics: Protocol-Oriented Programming with Swift: https://www.packtpub.com/application-development/protocol-oriented-programming-swift OpenStack Object Storage (Swift) Essentials: https://www.packtpub.com/virtualization-and-cloud/openstack-object-storage-swift-essentials Implementing Cloud Storage with OpenStack Swift: https://www.packtpub.com/virtualization-and-cloud/implementing-cloud-storage-openstack-swift Resources for Article: Further resources on this subject: Introducing the Swift Programming Language [article] Swift for Open Source Developers [article] Your First Swift App [article]

Packt
30 Sep 2016
9 min read

Parallel Computing

In this article, written by Jalem Raj Rohit, author of the book Julia Cookbook, we cover the following recipes: Basic concepts of parallel computing; Data movement; Parallel map and loop operations; Channels. Introduction In this article, you will learn about performing parallel computing and using it to handle big data. Concepts like data movement, sharded arrays, and the map-reduce framework are important to know in order to handle large amounts of data by computing on it using parallelized CPUs. The concepts discussed in this article will help you build good parallel computing and multiprocessing basics, including efficient data handling and code optimization. Basic concepts of parallel computing Parallel computing is a way of dealing with data in a parallel way. This can be done by connecting multiple computers as a cluster and using their CPUs for carrying out the computations. This style of computation is used when handling large amounts of data and also while running complex algorithms over significantly large data. The computations are executed faster due to the availability of multiple CPUs running them in parallel as well as the direct availability of RAM to each of them. Getting ready Julia has built-in support for parallel computing and multiprocessing, so these computations rarely require any external libraries. How to do it… Julia can be started on your local computer using multiple cores of your CPU, which gives us multiple workers for the process. This is how you can fire up Julia in the multiprocessing mode in your terminal. This creates two worker processes on the machine, which means it uses two CPU cores: julia -p 2 The output looks something like this. It might differ for different operating systems and different machines: Now, we will look at the remotecall() function. 
It takes in multiple arguments, the first one being the process to which we want to assign the task. The next argument is the function that we want to execute. The subsequent arguments are the parameters of that function. In this example, we will create a 2 x 2 random matrix and assign it to process number 2. This can be done as follows: task = remotecall(2, rand, 2, 2) The preceding command gives the following output: Now that the remotecall() function for remote referencing has been executed, we will fetch the results of the function through the fetch() function. This can be done as follows: fetch(task) The preceding command gives the following output: Now, to perform some mathematical operations on the generated matrix, we can use the @spawnat macro, which takes in the process number and the expression to evaluate, here a mathematical operation on the result of fetch(). The @spawnat macro actually wraps the expression 5 .+ fetch(task) into an anonymous function and runs it on the second process. This can be done as follows: task2 = @spawnat 2 5 .+ fetch(task) There is also a function that eliminates the need for two different functions, remotecall() and fetch(). The remotecall_fetch() function takes in multiple arguments, the first one being the process to which the task is being assigned. The next argument is the function that you want to be executed. The subsequent arguments are the parameters of the function that you want to execute. Now, we will use the remotecall_fetch() function to fetch an element of the task matrix for a particular index. This can be done as follows: remotecall_fetch(2, getindex, task2, 1, 1) How it works… Julia can be started in the multiprocessing mode by specifying the number of processes needed while starting up the REPL. In this example, we started Julia in a two-process mode. The maximum number of processes depends on the number of cores available in the CPU. 
The remotecall() function helps in selecting a particular process from the running processes in order to run a function or, in fact, any computation for us. The fetch() function is used to fetch the results of the remotecall() function from a common data resource (or the process) for all the running processes. The details of the data source would be covered in the later sections. The results of the fetch() function can also be used for further computations, which can be carried out with the @spawnat macro along with the results of fetch(). This would assign a process for the computation. The remotecall_fetch() function further eliminates the need for the fetch function in case of a direct execution. This has both the remotecall() and fetch() operations built into it. So, it acts as a combination of both the second and third points in this section. Data movement In parallel computing, data movements are quite common and are also a thing to be minimized due to the time and the network overhead due to the movements. In this recipe, we will see how that can be optimized to avoid latency as much as we can. Getting ready To get ready for this recipe, you need to have the Julia REPL started in the multiprocessing mode. This is explained in the Getting ready section of the preceding recipe. How to do it… Firstly, we will see how to do a matrix computation using the @spawn macro, which helps in data movement. So, we construct a matrix of shape 200 x 200 and then try to square it using the @spawn macro. This can be done as follows: mat = rand(200, 200) exec_mat = @spawn mat^2 fetch(exec_mat) The preceding command gives the following output: Now, we will look at an another way to achieve the same. This time, we will use the @spawn macro directly instead of the initialization step. We will discuss the advantages and drawbacks of each method in the How it works… section. 
So, this can be done as follows: mat = @spawn rand(200, 200)^2 fetch(mat) The preceding command gives the following output: How it works… In this example, we try to construct a 200X200 matrix and then used the @spawn macro to spawn a process in the CPU to execute the same for us. The @spawn macro spawns one of the two processes running, and it uses one of them for the computation. In the second example, you learned how to use the @spawn macro directly without an extra initialization part. The fetch() function helps us fetch the results from a common data resource of the processes. More on this will be covered in the following recipes. Parallel maps and loop operations In this recipe, you will learn a bit about the famous Map Reduce framework and why it is one of the most important ideas in the domains of big data and parallel computing. You will learn how to parallelize loops and use reducing functions on them through the several CPUs and machines and the concept of parallel computing, which you learned about in the previous recipes. Getting ready Just like the previous sections, Julia just needs to be running in the multiprocessing mode to follow along the following examples. This can be done through the instructions given in the first section. How to do it… Firstly, we will write a function that takes and adds n random bits. The writing of this function has nothing to do with multiprocessing. So, it has simple Julia functions and loops. This function can be written as follows: Now, we will use the @spawn macro, which we learned previously to run the count_heads() function as separate processes. The count_heads()function needs to be in the same directory for this to work. This can be done as follows: require("count_heads") a = @spawn count_heads(100) b = @spawn count_heads(100) fetch(a) + fetch(b) However, we can use the concept of multi-processing and parallelize the loop directly as well as take the sum. 
The parallelizing part is called mapping, and the addition of the parallelized bits is called reduction. Thus, the process constitutes the famous Map-Reduce framework. This can be made possible using the @parallel macro, as follows: nheads = @parallel (+) for i = 1:200 Int(rand(Bool)) end How it works… The first function is a simple Julia function that adds random bits with every loop iteration. It was created just for the demonstration of Map-Reduce operations. In the second point, we spawn two separate processes for executing the function and then fetch the results of both of them and add them up. However, that is not really a neat way to carry out parallel computation of functions and loops. Instead, the @parallel macro provides a better way to do it, which allows the user to parallelize the loop and then reduce the computations through an operator, which together would be called the Map-Reduce operation. Channels Channels are like the background plumbing for parallel computing in Julia. They are like the reservoirs from where the individual processes access their data from. Getting ready The requisite is similar to the previous sections. This is mostly a theoretical section, so you just need to run your experiments on your own. For that, you need to run your Julia REPL in a multiprocessing mode. How to do it… Channels are shared queues with a fixed length. They are common data reservoirs for the processes which are running. The channels are like common data resources, which multiple readers or workers can access. They can access the data through the fetch() function, which we already discussed in the previous sections. The workers can also write to the channel through the put!() function. This means that the workers can add more data to the resource, which can be accessed by all the workers running a particular computation. Closing a channel after usage is a good practice to avoid data corruption and unnecessary memory usage. 
It can be done using the close() function. Summary In this article we covered the basic concepts of parallel computing and data movement that takes place in the network. We also learned about parallel maps and loop operations along with the famous Map Reduce framework. At the end we got a brief understanding of channels and how individual processes access their data from channels. Resources for Article: Further resources on this subject: More about Julia [article] Basics of Programming in Julia [article] Simplifying Parallelism Complexity in C# [article]
Hiroyuki Vincent
30 Sep 2016
8 min read

Introduction to Neural Networks with Chainer – Part 1

With the increasing popularity of neural networks, or deep learning, companies ranging from smaller start-ups to major ones such as Google have been releasing frameworks for deep learning related tasks. Caffe from the Berkeley Vision and Learning Center (BVLC), Torch and Theano have been around for quite a while. TensorFlow was open sourced by Google in 2015 and has since been expanding its community. Neon from Nervana Systems is a more recent addition with a good reputation for its performance. This three-part post series will introduce you to yet another framework for neural networks called Chainer, which, like most of the previously mentioned frameworks, is based on Python. It has an intuitive interface with a gentle learning curve but hasn't yet been widely adopted outside Japan, where it is being developed. This article is split up into three parts, where this first part aims to explain the characteristics of Chainer and the basic data structures. The second and third parts will help you get started with the actual training. The basic theory of neural networks will not be covered, but if you are familiar with the forward pass, back propagation and gradient descent, and on top of that have some coding experience, you should be able to follow this article. What is Chainer? Chainer is an open source Python-based framework maintained by Preferred Infrastructure/Preferred Networks in Japan. The company behind the framework puts a heavy emphasis on closing the gap between the machine learning research being carried out in academia and the more practical applications of machine learning. They focus on deep learning, IoT and edge-heavy computing, with applications in the automobile manufacturing and healthcare markets, for instance developing autonomous cars and factory robots with grasping capabilities. Why Chainer? Defining and training simple neural networks with Chainer can be done with just a few lines of code. 
It can also scale to larger models with more complex architectures with little effort. It is a framework for basically anyone working with, studying or researching neural networks. There are, however, alternatives, as mentioned in the introduction. This section explains the characteristics of the framework and why you might want to try it out.

One major issue with deep learning related tasks is configuring the hyperparameters. Chainer makes this less of a pain. It comes with many layers, activation functions, loss functions and optimization algorithms in a plug-and-play fashion. With a single line of code or a single function call, those components can be added or removed without affecting the rest of the program. The abstraction and class structure of the framework make it intuitive to learn and to start experimenting with. We will dig deeper into that in the second part of this series.

On top of that, it is well documented. It is actually so well documented that you may stop reading this article right here and jump to the official documentation. Much of the content in this post is drawn from the official documentation, but I will try to complement it with additional details and point out common pitfalls.

GPU Support

NumPy is a Python package commonly used in academia due to its rich interface for manipulating multidimensional arrays, similar to MATLAB. If you are working with neural networks, chances are you are already familiar with NumPy and its methods and operations. Chainer comes with CuPy (chainer.cuda.cupy), a GPU alternative to NumPy. CuPy is a CUDA-based GPU backend for array manipulation that implements a subset of the NumPy interface. Hence, it is possible to write almost generic code for both the CPU and the GPU: you can simply swap the NumPy package for CuPy, and vice versa, in your source code to switch from one to the other. Unfortunately, CuPy lacks some features, such as advanced indexing and numpy.where.
Multi-GPU training, in terms of both model parallelism and data parallelism, is also supported, although it is outside the scope of this series. As we will discover, the most fundamental data structure in Chainer is a NumPy (or CuPy) array wrapper with added functionality.

The Basics of Chainer

Let's write some code to get familiar with the Chainer interface. First, make sure that you have a working Python environment. We will use pip to install Chainer, even though it can also be installed directly from source. The code included in this post was verified with Python 3.5.1 and Chainer 1.8.2.

Installation

Install NumPy and Chainer using pip, which comes with the Python environment.

    pip install numpy
    pip install chainer

Chainer Variables and Variable Differentiation

You are now ready to start writing code. First, let's take a look at the snippet below.

    import numpy as np
    from chainer import Variable

    x_data = np.array([4], dtype=np.float32)
    x = Variable(x_data)
    assert x.data == x_data  # True

    y = x ** 2 + 5 * x + 3
    assert isinstance(y, Variable)  # True

    # Compute the gradient for x and store it in x.grad
    y.backward()

    # y'(x) = 2 * x + 5; y'(4) = 13
    assert x.grad == 13  # True

The most fundamental data structure is the Chainer variable class, chainer.Variable. A variable can be initialized by passing a NumPy array (which must have the datatype numpy.float32) or by applying functions to already instantiated Chainer variables. Think of variables as wrappers around NumPy's N-dimensional arrays. To access the NumPy array from the variable, use the data property.

So what makes them different? Each Chainer variable holds a reference to its creator, unless it is a leaf node such as x in the example above. This means that y can reference x through the functions that created it. That way, the framework maintains a computational graph, which is used for computing the gradients during back propagation. This is exactly what happens when calling the Variable.backward() method on y.
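To make the idea of variables that remember their creators more concrete, here is a minimal sketch in plain Python. The Var class and everything in it are hypothetical and written only for illustration; this is not how chainer.Variable is implemented, and it handles scalars only. It does, however, follow the same principle: each result keeps references to its inputs, and backward() walks those references to accumulate gradients.

```python
class Var:
    """Toy scalar variable that remembers its inputs (hypothetical, not chainer.Variable)."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        # Each entry is (input_var, d(self)/d(input_var)); leaf nodes have none.
        self.parents = parents

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.data + other.data, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.data * other.data,
                   ((self, other.data), (other, self.data)))

    __rmul__ = __mul__

    def __pow__(self, n):
        # Power rule: d(x ** n)/dx = n * x ** (n - 1)
        return Var(self.data ** n, ((self, n * self.data ** (n - 1)),))

    def backward(self, upstream=1.0):
        # Accumulate the incoming gradient, then pass it on to the inputs,
        # scaled by each local derivative (the chain rule).
        self.grad += upstream
        for parent, local_derivative in self.parents:
            parent.backward(upstream * local_derivative)


x = Var(4)
y = x ** 2 + 5 * x + 3
y.backward()
print(y.data, x.grad)
```

The recursion visits every path from y back to x, which is fine for a small expression like this one but would be wasteful for large graphs; real frameworks process each node once in a suitable order.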
The variable is differentiated and the gradient with respect to x is stored in the x variable itself. A pitfall here is that if x were an array with more than one element, y.grad would need to be initialized with an initial error. If, as in the example above, x only contains one element, the error is automatically set to 1.

Principles of Gradient Descent

Using the differentiation mechanism above, you may implement a gradient descent optimization algorithm in the following way. Initialize a weight w, a one-dimensional array with one element, to any value, say 4 as in the previous example, and iteratively minimize the loss function w ** 2. The loss function depends on the parameter w, just as y depended on x. The loss function is obviously at its global minimum when w == 0. This is what we want to achieve with gradient descent. We can get very close by repeating the optimization step using Variable.backward().

    import numpy as np
    from chainer import Variable

    w = Variable(np.array([4], dtype=np.float32))
    learning_rate = 0.1
    max_iters = 100

    for i in range(max_iters):
        loss = w ** 2
        loss.backward()  # Compute w.grad

        # Optimize / Update the parameter using gradient descent
        w.data -= learning_rate * w.grad

        # Reset gradient for the next iteration, w.grad == 0
        w.zerograd()

        print('Iteration: {} Loss: {}'.format(i, loss.data))

In each iteration, the parameter is updated in the direction of the negative gradient to lower the loss. We are performing gradient descent. Note that the gradient is scaled by a learning rate before the parameter is updated, in order to stabilize the optimization. In fact, if the learning rate were removed from this example, the loss would be stuck at 16, since the derivative of the loss function is 2 * w, which would cause w to simply jump back and forth between 4 and -4. With the learning rate, we see that the loss decreases with each iteration.
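The effect of the learning rate can be checked without Chainer at all. The sketch below uses plain Python floats and the analytic derivative 2 * w instead of Variable.backward(); the function name gradient_descent is hypothetical and chosen only for this illustration. A learning rate of 1.0 stands in for "no learning rate".

```python
def gradient_descent(w, learning_rate, steps):
    """Minimize loss(w) = w ** 2; return the final w and the loss seen at each step."""
    losses = []
    for _ in range(steps):
        losses.append(w ** 2)
        grad = 2 * w  # Analytic derivative of w ** 2
        w -= learning_rate * grad
    return w, losses


# With a learning rate of 0.1, each step multiplies w by 0.8
w_final, losses = gradient_descent(4.0, learning_rate=0.1, steps=100)

# With a learning rate of 1.0, w flips between 4 and -4 and the loss never changes
_, stuck = gradient_descent(4.0, learning_rate=1.0, steps=4)

print(losses[:3])
print(stuck)
```

Because each update with a learning rate of 0.1 shrinks w by a factor of 0.8, the loss shrinks by a factor of 0.64 per iteration, which matches the sequence 16, 10.24, 6.55 and so on.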
loss.data, as seen in the output below, is an array with one element since it has the same dimensions as w.data.

    Iteration: 0 Loss: [ 16.]
    Iteration: 1 Loss: [ 10.24000072]
    Iteration: 2 Loss: [ 6.55359983]
    Iteration: 3 Loss: [ 4.19430351]
    Iteration: 4 Loss: [ 2.68435407]
    Iteration: 5 Loss: [ 1.71798646]
    Iteration: 6 Loss: [ 1.09951138]
    Iteration: 7 Loss: [ 0.70368725]
    Iteration: 8 Loss: [ 0.45035988]
    Iteration: 9 Loss: [ 0.2882303]
    ...

Summary

This was a very brief introduction to the framework and its fundamental chainer.Variable class, and to how variable differentiation using computational graphs forms the core concept of the framework. In the second and third parts of this series, we will implement a complete training algorithm using Chainer.

About the Author

Hiroyuki Vincent Yamazaki is a graduate student at KTH, Royal Institute of Technology in Sweden. As part of a double-degree programme, he is currently conducting research in convolutional neural networks at Keio University in Tokyo, partially using Chainer.

Jake Stockwin
29 Sep 2016
5 min read

Getting Started with KeystoneJS

KeystoneJS is a content management framework for node.js. It is an easy-to-use system that does all the hard work of making a website for you. This article works through a simple example to get you started with KeystoneJS.

Initial Setup

KeystoneJS comes paired with a generator to make setup simple. You'll need to have node.js and mongodb installed before you begin. To generate your site, all you need to do is run npm install -g generator-keystone and then yo keystone. You'll be asked a few questions, and after a while your site is ready. After running node keystone, you'll find a site with a ready-made blog, gallery and contact form, but the main feature of KeystoneJS is the admin UI. Navigate to localhost:3000/keystone and sign in with the default credentials, and you'll be able to manage all the content on your site from a user-friendly interface. Take a look around your site and the code so that you're familiar with it; it's also worth having a read through the documentation.

Keystone Models

You now have a site up and running, but what if you need more than just a blog and a gallery? Perhaps you would like a page to display upcoming events. No problem: to achieve this, we create a model. Open the models folder in your file browser and you will be able to see the existing models, User.js for example. We're going to add our own model, so create a new file called Event.js in the models folder. Say our event should have a start and an end time, a name and a description. Then our model will look like this:

    var keystone = require('keystone');
    var Types = keystone.Field.Types;

    var Event = new keystone.List('Event');

    Event.add({
        // Types.Text holds a single string; Types.Name would ask for a first and last name
        name: { type: Types.Text, required: true, index: true },
        description: { type: Types.Textarea },
        start: { type: Types.Datetime },
        end: { type: Types.Datetime }
    });

    Event.register();

Now restart your app.
Under the hood, KeystoneJS is managing all the database schemas for you, and if you sign back in to the admin UI, you'll see that there is now a page to manage your events. All that was required was to create a new model, and Keystone wrote the entire backend for you. This shows the power of Keystone: you don't have to spend your time writing the backend for your site, and are free to focus on the client-facing side of things.

Routes and Templates

We have created our model and are now able to log in to the admin UI and manage our events. However, we still need to display these events to our website viewers. This is done in two parts: a route obtains the data from the database and makes it available to a template, which displays it. First, create the route. Open a new file, routes/views/events.js, and enter the following code:

    var keystone = require('keystone');

    exports = module.exports = function(req, res) {
        var view = new keystone.View(req, res);
        var locals = res.locals;

        // Set locals
        locals.section = 'events';
        // Create the data object before the query callback assigns to it
        locals.data = { events: [] };

        // Load the events
        view.on('init', function(next) {
            var q = keystone.list('Event').model.find();
            q.exec(function(err, results) {
                locals.data.events = results;
                next(err);
            });
        });

        // Render the events template
        view.render('events');
    };

You can now create your template. The events will be available to the template as data.events, because we have set locals.data.events in the route. KeystoneJS lets you choose which template engine to use. The default is jade, so we will use it for the example here, but you can easily adapt the code to any other engine, and if you get stuck, a good place to start is the blog post template.
Templates are stored in templates/views, so create templates/views/events.jade with the following code:

    extends ../layouts/default

    mixin event(event)
        h2= event.name
        p
            if event.start
                | start: #{event._.start.format('MMMM Do, YYYY')}
        p
            if event.end
                | end: #{event._.end.format('MMMM Do, YYYY')}
        p
            if event.description
                | details: #{event.description}

    block content
        .container: .row
            .events
                each event in data.events
                    +event(event)

This is by no means a well-designed page, but it will do for this example. We're almost done, but if you go to /events in your web browser, you'll get a 404 error. That's because we haven't told our route controllers about the new page yet. This is done in routes/index.js, where you just need to add the line app.get('/events', routes.views.events);. This tells your app to send any GET requests for /events to your new route, which in turn renders the new template. You can also add your new events page to your header by simply adding { label: 'Events', key: 'events', href: '/events' }, to the navigation links in routes/middleware.js. The key here should match the res.locals.section set in the route we created.

Conclusion

By simply running yo keystone and adding just over 50 lines of code, we've created an events page to display our events. You can log in to the admin UI and create, update and delete events, and your website will update automatically. This really highlights what KeystoneJS does. We don't have to spend our time configuring all the node modules and writing the backend of our server; KeystoneJS has done all the work for us. This means we can dedicate all our time to making our client-facing website look as good as possible.

About the Author

Jake Stockwin is a third-year mathematics and statistics undergraduate at the University of Oxford, and a novice full-stack developer. He has a keen interest in programming, both in his academic studies and in his spare time. Next year, he plans to write his dissertation on reinforcement learning, an area of machine learning.
Over the past few months, he has designed websites for various clients and has begun developing in Node.js.