An Introduction to Node.js Design Patterns

Packt | 18 Feb 2016 | 27 min read
A design pattern is a reusable solution to a recurring problem; the term is really broad in its definition and can span multiple domains of application. However, it is often associated with the well-known set of object-oriented patterns popularized in the 90s by the book Design Patterns: Elements of Reusable Object-Oriented Software (Pearson Education), written by the almost legendary Gang of Four (GoF): Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. We will often refer to this specific set of patterns as traditional design patterns, or GoF design patterns.

Applying this set of object-oriented design patterns in JavaScript is not as linear and formal as it would be in a class-based object-oriented language. As we know, JavaScript is multi-paradigm, object-oriented, and prototype-based, and has dynamic typing; it treats functions as first-class citizens and allows functional programming styles. These characteristics make JavaScript a very versatile language, which gives tremendous power to the developer, but at the same time causes a fragmentation of programming styles, conventions, techniques, and ultimately the patterns of its ecosystem. There are so many ways to achieve the same result using JavaScript that everybody has their own opinion on the best way to approach a problem. A clear demonstration of this phenomenon is the abundance of frameworks and opinionated libraries in the JavaScript ecosystem; probably, no other language has ever seen so many, especially now that Node.js has given astonishing new possibilities to JavaScript and has created so many new scenarios.

In this context, the traditional design patterns are also affected by the nature of JavaScript. There are so many ways in which they can be implemented that their traditional, strongly object-oriented implementation is not a pattern anymore, and in some cases is not even possible, because JavaScript, as we know, doesn't have real classes or abstract interfaces. What doesn't change, though, is the original idea at the base of each pattern, the problem it solves, and the concepts at the heart of the solution.

In this article, we will see how some of the most important GoF design patterns apply to Node.js and its philosophy, thus rediscovering their importance from another perspective. The design patterns explored in this article are as follows:

- Factory
- Proxy
- Decorator
- Adapter
- Strategy
- State
- Template
- Middleware
- Command

This article assumes that the reader has some notion of how inheritance works in JavaScript. Please also be advised that throughout this article we will often use generic and more intuitive diagrams to describe a pattern in place of standard UML, since many patterns can have an implementation based not only on classes, but also on objects and even functions.

Factory

We begin our journey starting from what is probably the most simple and common design pattern in Node.js: Factory.

A generic interface for creating objects

We already stressed the fact that, in JavaScript, the functional paradigm is often preferred to a purely object-oriented design, for its simplicity, usability, and small surface area. This is especially true when creating new object instances. In fact, invoking a factory, instead of directly creating a new object from a prototype using the new operator or Object.create(), is much more convenient and flexible in several respects.
First and foremost, a factory allows us to separate the object creation from its implementation; essentially, a factory wraps the creation of a new instance, giving us more flexibility and control in the way we do it. Inside the factory, we can create a new instance leveraging closures, using a prototype and the new operator, using Object.create(), or even returning a different instance based on a particular condition. The consumer of the factory is totally agnostic about how the creation of the instance is carried out. The truth is that, by using new, we are binding our code to one specific way of creating an object, while in JavaScript we can have much more flexibility, almost for free. As a quick example, let's consider a simple factory that creates an Image object:

```javascript
function createImage(name) {
  return new Image(name);
}
var image = createImage('photo.jpeg');
```

The createImage() factory might look totally unnecessary; why not instantiate the Image class by using the new operator directly? Something like the following line of code:

```javascript
var image = new Image(name);
```

As we already mentioned, using new binds our code to one particular type of object; in the preceding case, to objects of type Image. A factory instead gives us much more flexibility; imagine that we want to refactor the Image class, splitting it into smaller classes, one for each image format that we support. If we exposed a factory as the only means to create new images, we can simply rewrite it as follows, without breaking any of the existing code:

```javascript
function createImage(name) {
  if(name.match(/\.jpeg$/)) {
    return new JpegImage(name);
  } else if(name.match(/\.gif$/)) {
    return new GifImage(name);
  } else if(name.match(/\.png$/)) {
    return new PngImage(name);
  } else {
    throw new Error('Unsupported format');
  }
}
```

Our factory also allows us to not expose the constructors of the objects it creates, and prevents them from being extended or modified (remember the principle of small surface area?). In Node.js, this can be achieved by exporting only the factory, while keeping each constructor private.

A mechanism to enforce encapsulation

A factory can also be used as an encapsulation mechanism, thanks to closures. Encapsulation refers to the technique of controlling the access to some internal details of an object by preventing external code from manipulating them directly. The interaction with the object happens only through its public interface, isolating the external code from changes in the implementation details of the object. This practice is also referred to as information hiding. Encapsulation is also a fundamental principle of object-oriented design, together with inheritance, polymorphism, and abstraction. As we know, in JavaScript we don't have access level modifiers (for example, we can't declare a private variable), so the only way to enforce encapsulation is through function scopes and closures.
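To make this idea concrete, here is a minimal, hypothetical sketch (not part of the original article) of closure-based information hiding: the internal variable can only be reached through the functions that close over it.

```javascript
// Hypothetical example: `count` lives only inside the closure.
function createCounter() {
  var count = 0;                        // not reachable from the outside
  return {
    increment: function() { return ++count; },
    value: function() { return count; }
  };
}

var counter = createCounter();
counter.increment();
console.log(counter.value());           // 1
console.log(counter.count);             // undefined: the variable stays hidden
```

The same mechanism is what makes a factory a natural place to enforce privacy, as the next example shows.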
A factory makes it straightforward to enforce private variables; consider the following code, for example:

```javascript
function createPerson(name) {
  var privateProperties = {};

  var person = {
    setName: function(name) {
      if(!name) throw new Error('A person must have a name');
      privateProperties.name = name;
    },
    getName: function() {
      return privateProperties.name;
    }
  };

  person.setName(name);
  return person;
}
```

In the preceding code, we leverage closures to create two objects: a person object, which represents the public interface returned by the factory, and a group of privateProperties that are inaccessible from the outside and that can be manipulated only through the interface provided by the person object. For example, in the preceding code, we make sure that a person's name is never empty; this would not be possible to enforce if name were just a property of the person object.

Factories are only one of the techniques that we have for creating private members; in fact, other possible approaches are as follows:

- Defining private variables in a constructor (as recommended by Douglas Crockford: http://javascript.crockford.com/private.html)
- Using conventions, for example, prefixing the name of a property with an underscore "_" or the dollar sign "$" (this, however, does not technically prevent a member from being accessed from the outside)
- Using ES6 WeakMaps (http://fitzgeraldnick.com/weblog/53/)

Building a simple code profiler

Now, let's work on a complete example using a factory. Let's build a simple code profiler, an object with the following properties:

- A start() method that triggers the start of a profiling session
- An end() method to terminate the session and log its execution time to the console

Let's start by creating a file named profiler.js, which will have the following content:

```javascript
function Profiler(label) {
  this.label = label;
  this.lastTime = null;
}

Profiler.prototype.start = function() {
  this.lastTime = process.hrtime();
}

Profiler.prototype.end = function() {
  var diff = process.hrtime(this.lastTime);
  console.log('Timer "' + this.label + '" took ' + diff[0] + ' seconds and '
    + diff[1] + ' nanoseconds.');
}
```

There is nothing fancy in the preceding class; we simply use the default high resolution timer to save the current time when start() is invoked, and then calculate the elapsed time when end() is executed, printing the result to the console.

Now, if we were to use such a profiler in a real-world application to calculate the execution time of different routines, we can easily imagine the huge amount of logging we would generate to the standard output, especially in a production environment. What we might want to do instead is redirect the profiling information to another source, for example a database, or alternatively disable the profiler altogether if the application is running in production mode. It's clear that if we were to instantiate a Profiler object directly by using the new operator, we would need some extra logic in the client code or in the Profiler object itself in order to switch between the different logics. We can instead use a factory to abstract the creation of the Profiler object, so that, depending on whether the application runs in production or development mode, we can return a fully working Profiler object, or alternatively, a mock object with the same interface but with empty methods. Let's do this then: in the profiler.js module, instead of exporting the Profiler constructor, we will export only a function, our factory.
The following is its code:

```javascript
module.exports = function(label) {
  if(process.env.NODE_ENV === 'development') {
    return new Profiler(label);        //[1]
  } else if(process.env.NODE_ENV === 'production') {
    return {                           //[2]
      start: function() {},
      end: function() {}
    }
  } else {
    throw new Error('Must set NODE_ENV');
  }
}
```

The factory that we created abstracts the creation of a profiler object from its implementation:

- If the application is running in development mode, we return a new, fully functional Profiler object
- If instead the application is running in production mode, we return a mock object where the start() and end() methods are empty functions

The nice feature to highlight is that, thanks to JavaScript's dynamic typing, we were able to return an object instantiated with the new operator in one circumstance and a simple object literal in the other (this is also known as duck typing, http://en.wikipedia.org/wiki/Duck_typing). Our factory is doing its job perfectly; we can really create objects in any way that we like inside the factory function, and we can execute additional initialization steps or return a different type of object based on particular conditions, all while isolating the consumer of the object from these details. We can easily understand the power of this simple pattern.

Now, we can play with our profiler; this is a possible use case for the factory that we just created:

```javascript
var profiler = require('./profiler');

function getRandomArray(len) {
  var p = profiler('Generating a ' + len + ' items long array');
  p.start();
  var arr = [];
  for(var i = 0; i < len; i++) {
    arr.push(Math.random());
  }
  p.end();
}

getRandomArray(1e6);
console.log('Done');
```

The p variable contains the instance of our Profiler object, but we don't know how it was created or what its implementation is at this point in the code. If we include the preceding code in a file named profilerTest.js, we can easily test these assumptions. To try the program with profiling enabled, run the following command:

```
export NODE_ENV=development; node profilerTest
```

The preceding command enables the real profiler and prints the profiling information to the console. If we want to try the mock profiler instead, we can run the following command:

```
export NODE_ENV=production; node profilerTest
```

The example that we just presented is just a simple application of the factory function pattern, but it clearly shows the advantages of separating an object's creation from its implementation.

In the wild

As we said, factories are very popular in Node.js. Many packages offer only a factory for creating new instances; some examples are the following:

- Dnode (https://npmjs.org/package/dnode): This is an RPC system for Node.js. If we look into its source code, we will see that its logic is implemented in a class named D; however, this is never exposed to the outside, as the only exported interface is a factory, which allows us to create new instances of the class. You can take a look at its source code at https://github.com/substack/dnode/blob/34d1c9aa9696f13bdf8fb99d9d039367ad873f90/index.js#L7-9.
- Restify (https://npmjs.org/package/restify): This is a framework to build REST APIs that allows us to create new instances of a server using the restify.createServer() factory, which internally creates a new instance of the Server class (which is not exported). You can take a look at its source code at https://github.com/mcavage/node-restify/blob/5f31e2334b38361ac7ac1a5e5d852b7206ef7d94/lib/index.js#L91-116.
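The approach these packages take can be sketched in a few lines. The following is a minimal, hypothetical module (not taken from Dnode or Restify) that keeps its constructor private and exports only a factory:

```javascript
// greeter.js - hypothetical module: the Greeter constructor is never exported,
// so consumers can only obtain instances through the factory.
function Greeter(name) {
  this.name = name;
}

Greeter.prototype.hello = function() {
  return 'Hello ' + this.name;
};

module.exports = function createGreeter(name) {
  return new Greeter(name);
};
```

A consumer would then simply write var createGreeter = require('./greeter') and call createGreeter('Alice'), without ever being able to reach or extend the Greeter constructor.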
Other modules expose both a class and a factory, but document the factory as the main method, or the most convenient way, to create new instances; some examples are as follows:

- http-proxy (https://npmjs.org/package/http-proxy): This is a programmable proxying library, where new instances are created with httpProxy.createProxyServer(options).
- The core Node.js HTTP server: This is where new instances are mostly created using http.createServer(), even though this is essentially a shortcut for new http.Server().
- bunyan (https://npmjs.org/package/bunyan): This is a popular logging library; in its readme file the contributors propose a factory, bunyan.createLogger(), as the main method to create new instances, even though this would be equivalent to running new bunyan().

Some other modules provide a factory to wrap the creation of other components. Popular examples are through2 and from2, which allow us to simplify the creation of new streams using a factory approach, freeing the developer from explicitly using inheritance and the new operator.

Proxy

A proxy is an object that controls access to another object, called the subject. The proxy and the subject have an identical interface, and this allows us to transparently swap one for the other; in fact, the alternative name for this pattern is surrogate. A proxy intercepts all or some of the operations that are meant to be executed on the subject, augmenting or complementing their behavior. The following figure shows the diagrammatic representation:

The preceding figure shows us how the Proxy and the Subject have the same interface and how this is totally transparent to the client, who can use one or the other interchangeably. The Proxy forwards each operation to the subject, enhancing its behavior with additional preprocessing or post-processing.

It's important to observe that we are not talking about proxying between classes; the Proxy pattern involves wrapping actual instances of the subject, thus preserving its state. A proxy is useful in several circumstances; for example, consider the following ones:

- Data validation: The proxy validates the input before forwarding it to the subject
- Security: The proxy verifies that the client is authorized to perform the operation, and it passes the request to the subject only if the outcome of the check is positive
- Caching: The proxy keeps an internal cache so that the operations are executed on the subject only if the data is not yet present in the cache
- Lazy initialization: If the creation of the subject is expensive, the proxy can delay it until it's really necessary
- Logging: The proxy intercepts the method invocations and the relative parameters, recording them as they happen
- Remote objects: A proxy can take an object that is located remotely and make it appear local

Of course, there are many more applications for the Proxy pattern, but these should give us an idea of the extent of its purpose.

Techniques for implementing proxies

When proxying an object, we can decide to intercept all its methods or only some of them, while delegating the rest directly to the subject. There are several ways in which this can be achieved; let's analyze some of them.

Object composition

Composition is the technique whereby an object is combined with another object for the purpose of extending or using its functionality.
In the specific case of the Proxy pattern, a new object with the same interface as the subject is created, and a reference to the subject is stored internally in the proxy in the form of an instance variable or a closure variable. The subject can be injected from the client at creation time or created by the proxy itself. The following is one example of this technique using a pseudo class and a factory:

```javascript
function createProxy(subject) {
  var proto = Object.getPrototypeOf(subject);

  function Proxy(subject) {
    this.subject = subject;
  }
  Proxy.prototype = Object.create(proto);

  //proxied method
  Proxy.prototype.hello = function() {
    return this.subject.hello() + ' world!';
  }

  //delegated method
  Proxy.prototype.goodbye = function() {
    return this.subject.goodbye.apply(this.subject, arguments);
  }

  return new Proxy(subject);
}
```

To implement a proxy using composition, we have to intercept the methods that we are interested in manipulating (such as hello()), while simply delegating the rest of them to the subject (as we did with goodbye()).

The preceding code also shows the particular case where the subject has a prototype and we want to maintain the correct prototype chain, so that executing proxy instanceof Subject will return true; we used pseudo-classical inheritance to achieve this. This is just an extra step, required only if we are interested in maintaining the prototype chain, which can be useful in order to improve the compatibility of the proxy with code initially meant to work with the subject. However, as JavaScript has dynamic typing, most of the time we can avoid using inheritance and use more immediate approaches. For example, an alternative implementation of the proxy presented in the preceding code might just use an object literal and a factory:

```javascript
function createProxy(subject) {
  return {
    //proxied method
    hello: function() {
      return subject.hello() + ' world!';
    },

    //delegated method
    goodbye: function() {
      return subject.goodbye.apply(subject, arguments);
    }
  };
}
```

If we want to create a proxy that delegates most of its methods, it would be convenient to generate these automatically using a library, such as delegates (https://npmjs.org/package/delegates).

Object augmentation

Object augmentation (or monkey patching) is probably the most pragmatic way of proxying individual methods of an object and consists of modifying the subject directly by replacing a method with its proxied implementation; consider the following example:

```javascript
function createProxy(subject) {
  var helloOrig = subject.hello;
  subject.hello = function() {
    return helloOrig.call(this) + ' world!';
  }
  return subject;
}
```

This technique is definitely the most convenient one when we need to proxy only one or a few methods, but it has the drawback of modifying the subject object directly.

A comparison of the different techniques

Composition can be considered the safest way of creating a proxy, because it leaves the subject untouched, without mutating its original behavior. Its only drawback is that we have to manually delegate all the methods, even if we want to proxy only one of them. If needed, we might also have to delegate access to the properties of the subject.

The object properties can be delegated using Object.defineProperty(). Find out more at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/defineProperty.

Object augmentation, on the other hand, modifies the subject, which might not always be what we want, but it does not present the various inconveniences related to delegation.
For this reason, object augmentation is definitely the most pragmatic way to implement proxies in JavaScript, and it's the preferred technique in all those circumstances where modifying the subject is not a big concern. However, there is at least one situation where composition is almost necessary: when we want to control the initialization of the subject, for example, to create it only when needed (lazy initialization).

It is worth pointing out that by using a factory function (createProxy() in our examples), we can shield our code from the technique used to generate the proxy.

Creating a logging Writable stream

To see the Proxy pattern in a real example, we will now build an object that acts as a proxy to a Writable stream, by intercepting all the calls to the write() method and logging a message every time this happens. We will use object composition to implement our proxy; this is how the loggingWritable.js file looks:

```javascript
function createLoggingWritable(writableOrig) {
  var proto = Object.getPrototypeOf(writableOrig);

  function LoggingWritable(subject) {
    this.writableOrig = subject;
  }
  LoggingWritable.prototype = Object.create(proto);

  LoggingWritable.prototype.write = function(chunk, encoding, callback) {
    if(!callback && typeof encoding === 'function') {
      callback = encoding;
      encoding = undefined;
    }
    console.log('Writing ', chunk);
    return this.writableOrig.write(chunk, encoding, function() {
      console.log('Finished writing ', chunk);
      callback && callback();
    });
  };

  LoggingWritable.prototype.on = function() {
    return this.writableOrig.on.apply(this.writableOrig, arguments);
  };

  LoggingWritable.prototype.end = function() {
    return this.writableOrig.end.apply(this.writableOrig, arguments);
  }

  return new LoggingWritable(writableOrig);
}
```

In the preceding code, we created a factory that returns a proxied version of the writable object passed as an argument. We provide an override for the write() method that logs a message to the standard output every time it is invoked and every time the asynchronous operation completes. This is also a good example of the particular case of creating proxies of asynchronous functions, which makes it necessary to proxy the callback as well; this is an important detail to consider in a platform such as Node.js. The remaining methods, on() and end(), are simply delegated to the original writable stream (to keep the code leaner, we are not considering the other methods of the Writable interface).

We can now include a few more lines of code in the loggingWritable.js module to test the proxy that we just created:

```javascript
var fs = require('fs');

var writable = fs.createWriteStream('test.txt');
var writableProxy = createLoggingWritable(writable);

writableProxy.write('First chunk');
writableProxy.write('Second chunk');
writable.write('This is not logged');
writableProxy.end();
```

The proxy did not change the original interface of the stream or its external behavior, but if we run the preceding code, we will now see that every chunk that is written into the stream is transparently logged to the console.

Proxy in the ecosystem – function hooks and AOP

In its numerous forms, Proxy is quite a popular pattern in Node.js and in its ecosystem. In fact, we can find several libraries that allow us to simplify the creation of proxies, most of the time leveraging object augmentation as an implementation approach.
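As a rough sketch of the kind of helper these libraries provide (the hook() function below is purely illustrative and not the API of any specific package), a pre/post execution hook can be attached to a method through object augmentation:

```javascript
// Hypothetical helper: wraps obj[methodName] so that optional pre and post
// functions run around the original method (object augmentation).
function hook(obj, methodName, pre, post) {
  var orig = obj[methodName];
  obj[methodName] = function() {
    if(pre) pre.apply(this, arguments);
    var result = orig.apply(this, arguments);
    if(post) post.apply(this, arguments);
    return result;
  };
  return obj;
}

// Example usage: log around an existing method
var subject = {
  greet: function(name) { return 'Hello ' + name; }
};
hook(subject, 'greet',
  function(name) { console.log('about to greet', name); },
  function(name) { console.log('greeted', name); }
);
console.log(subject.greet('world'));   // prints the two log lines, then 'Hello world'
```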
In the community, this pattern can also be referred to as function hooking or sometimes as Aspect Oriented Programming (AOP), which is actually a common area of application for proxies. As it happens in AOP, these libraries usually allow the developer to set pre or post execution hooks for a specific method (or a set of methods), which let us execute custom code before and after the execution of the advised method, respectively. Sometimes proxies are also called middleware, because, as happens in the middleware pattern, they allow us to preprocess and post-process the input/output of a function. Sometimes, they also allow us to register multiple hooks for the same method using a middleware-like pipeline.

There are several libraries on npm that allow us to implement function hooks with little effort. Among them there are hooks (https://npmjs.org/package/hooks), hooker (https://npmjs.org/package/hooker), and meld (https://npmjs.org/package/meld).

In the wild

Mongoose (http://mongoosejs.com) is a popular Object-Document Mapping (ODM) library for MongoDB. Internally, it uses the hooks package (https://npmjs.org/package/hooks) to provide pre and post execution hooks for the init, validate, save, and remove methods of its Document objects. Find out more in the official documentation at http://mongoosejs.com/docs/middleware.html.

Decorator

Decorator is a structural pattern that consists of dynamically augmenting the behavior of an existing object. It's different from classical inheritance, because the behavior is not added to all the objects of the same class, but only to the instances that are explicitly decorated. Implementation-wise, it is very similar to the Proxy pattern, but instead of enhancing or modifying the behavior of the existing interface of an object, it augments it with new functionalities, as described in the following figure:

In the previous figure, the Decorator object is extending the Component object by adding the methodC() operation. The existing methods are usually delegated to the decorated object without further processing. Of course, if necessary, we can easily combine it with the Proxy pattern, so that the calls to the existing methods can also be intercepted and manipulated.

Techniques for implementing decorators

Although Proxy and Decorator are conceptually two different patterns with different intents, they practically share the same implementation strategies. Let's revise them.

Composition

Using composition, the decorated component is wrapped by a new object that usually inherits from it. The Decorator in this case simply needs to define the new methods, while delegating the existing ones to the original component:

```javascript
function decorate(component) {
  var proto = Object.getPrototypeOf(component);

  function Decorator(component) {
    this.component = component;
  }
  Decorator.prototype = Object.create(proto);

  //new method
  Decorator.prototype.greetings = function() {
    //...
  };

  //delegated method
  Decorator.prototype.hello = function() {
    this.component.hello.apply(this.component, arguments);
  };

  return new Decorator(component);
}
```

Object augmentation

Object decoration can also be achieved by simply attaching new methods directly to the decorated object, as follows:

```javascript
function decorate(component) {
  //new method
  component.greetings = function() {
    //...
  };
  return component;
}
```

The same caveats discussed during the analysis of the Proxy pattern are also valid for Decorator. Let's now practice the pattern with a working example!
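Before the LevelUP example, here is a brief, hypothetical usage sketch of the object augmentation variant above; the component object and the body of greetings() are invented for illustration:

```javascript
// Hypothetical component and decoration: greetings() is the added behavior,
// hello() remains untouched.
function decorate(component) {
  component.greetings = function() {
    return this.hello() + ', nice to meet you!';
  };
  return component;
}

var component = {
  hello: function() { return 'Hello'; }
};

var decorated = decorate(component);
console.log(decorated.hello());        // 'Hello' - existing behavior unchanged
console.log(decorated.greetings());    // 'Hello, nice to meet you!' - new method
```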
Decorating a LevelUP database

Before we start coding the next example, let's spend a few words introducing LevelUP, the module that we are now going to work with.

Introducing LevelUP and LevelDB

LevelUP (https://npmjs.org/package/levelup) is a Node.js wrapper around Google's LevelDB, a key-value store originally built to implement IndexedDB in the Chrome browser, but it's much more than that. LevelDB has been defined by Dominic Tarr as the "Node.js of databases", because of its minimalism and extensibility. Like Node.js, LevelDB provides blazing fast performance and only the most basic set of features, allowing developers to build any kind of database on top of it.

The Node.js community, and in this case Rod Vagg, did not miss the chance to bring the power of this database into Node.js by creating LevelUP. Born as a wrapper for LevelDB, it then evolved to support several kinds of backends, from in-memory stores, to other NoSQL databases such as Riak and Redis, to web storage engines such as IndexedDB and localStorage, allowing us to use the same API on both the server and the client, opening up some really interesting scenarios.

Today, there is a full-fledged ecosystem around LevelUP made of plugins and modules that extend the tiny core to implement features such as replication, secondary indexes, live updates, query engines, and more. Also, complete databases were built on top of LevelUP, including CouchDB clones such as PouchDB (https://npmjs.org/package/pouchdb) and CouchUP (https://npmjs.org/package/couchup), and even a graph database, levelgraph (https://npmjs.org/package/levelgraph), that can work both on Node.js and in the browser! Find out more about the LevelUP ecosystem at https://github.com/rvagg/node-levelup/wiki/Modules.

Implementing a LevelUP plugin

In the next example, we are going to show how we can create a simple plugin for LevelUP using the Decorator pattern, and in particular the object augmentation technique, which is the simplest but nonetheless the most pragmatic and effective way to decorate objects with additional capabilities.

For convenience, we are going to use the level package (http://npmjs.org/package/level), which bundles both levelup and the default adapter called leveldown, which uses LevelDB as the backend.

What we want to build is a plugin for LevelUP that allows us to receive notifications every time an object with a certain pattern is saved into the database. For example, if we subscribe to a pattern such as {a: 1}, we want to receive a notification when objects such as {a: 1, b: 3} or {a: 1, c: 'x'} are saved into the database.

Let's start building our small plugin by creating a new module called levelSubscribe.js. We will then insert the following code:

```javascript
module.exports = function levelSubscribe(db) {
  db.subscribe = function(pattern, listener) {             //[1]
    db.on('put', function(key, val) {                      //[2]
      var match = Object.keys(pattern).every(function(k) { //[3]
        return pattern[k] === val[k];
      });
      if(match) {
        listener(key, val);                                //[4]
      }
    });
  };
  return db;
}
```

That's it for our plugin, and it's extremely simple. Let's see briefly what happens in the preceding code:

1. We decorated the db object with a new method named subscribe(). We simply attached the method directly to the provided db instance (object augmentation).
2. We listen for any put operation performed on the database.
3. We perform a very simple pattern-matching algorithm, which verifies that all the properties in the provided pattern are also available on the data being inserted.
4. If we have a match, we notify the listener.

Let's now create some code, in a new file named levelSubscribeTest.js, to try out our new plugin:

```javascript
var level = require('level');                                 //[1]
var db = level(__dirname + '/db', {valueEncoding: 'json'});

var levelSubscribe = require('./levelSubscribe');             //[2]
db = levelSubscribe(db);

db.subscribe({doctype: 'tweet', language: 'en'},              //[3]
  function(k, val) {
    console.log(val);
  });

db.put('1', {doctype: 'tweet', text: 'Hi', language: 'en'});  //[4]
db.put('2', {doctype: 'company', name: 'ACME Co.'});
```

This is what we did in the preceding code:

1. First, we initialize our LevelUP database, choosing the directory where the files will be stored and the default encoding for the values.
2. Then, we attach our plugin, which decorates the original db object.
3. At this point, we are ready to use the new feature provided by our plugin, the subscribe() method, where we specify that we are interested in all the objects with doctype: 'tweet' and language: 'en'.
4. Finally, we save some values in the database, so that we can see our plugin in action.

This example shows a real application of the Decorator pattern in its most simple implementation: object augmentation. It might look like a trivial pattern, but it has undoubted power if used appropriately.

For simplicity, our plugin works only in combination with put operations, but it can easily be expanded to work with the batch operations as well (https://github.com/rvagg/node-levelup#batch).

In the wild

For more examples of how Decorator is used in the real world, we might want to inspect the code of some more LevelUP plugins:

- level-inverted-index (https://github.com/dominictarr/level-inverted-index): This is a plugin that adds inverted indexes to a LevelUP database, allowing us to perform simple text searches across the values stored in the database
- levelplus (https://github.com/eugeneware/levelplus): This is a plugin that adds atomic updates to a LevelUP database

Summary

To learn more about Node.js, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended:

- Node.js Essentials (https://www.packtpub.com/web-development/nodejs-essentials)
- Node.js Blueprints (https://www.packtpub.com/web-development/nodejs-blueprints)
- Learning Node.js for Mobile Application Development (https://www.packtpub.com/web-development/learning-nodejs-mobile-application-development)
- Mastering Node.js (https://www.packtpub.com/web-development/mastering-nodejs)

Build a First Person Shooter

Packt | 18 Feb 2016 | 29 min read
We will be creating a first person shooter; however, instead of shooting a gun to damage our enemies, we will be shooting a picture in a survival horror environment, similar to the Fatal Frame series of games and the recent indie title DreadOut. To get started on our project, we're first going to look at creating our level or, in this case, our environments, starting with the exterior.

In the game industry, there are two main roles in level creation: the environment artist and the level designer. An environment artist is a person who builds the assets that go into the environment. He/she uses tools such as 3ds Max or Maya to create the model, and then uses other tools such as Photoshop to create textures and normal maps. The level designer is responsible for taking the assets that the environment artist has created and assembling them into an environment for players to enjoy. He/she designs the gameplay elements, creates the scripted events, and tests the gameplay. Typically, a level designer will create environments through a combination of scripting and using a tool that may or may not be in development as the game is being made. In our case, that tool is Unity.

One important thing to note is that most companies have their own definition for different roles. In some companies, a level designer may need to create assets and an environment artist may need to create a level layout. There are also some places that hire someone just to do lighting, or just to place meshes (called a mesher), because they're so good at it.

Project overview

In this article, we take on the role of an environment artist who's been tasked with creating an outdoor environment. We will use assets that I've placed in the example code, as well as assets already provided to us by Unity, for mesh placement. In addition to this, you will also learn some beginner-level design.

Your objectives

This project will be split into a number of tasks. It will be a simple step-by-step process from beginning to end. Here is an outline of our tasks:

- Creating the exterior environment – Terrain
- Beautifying the environment – adding water, trees, and grass
- Building the atmosphere
- Designing the level layout and background

The project setup

At this point, I assume you have a fresh installation of Unity and have started it. You can perform the following steps:

1. With Unity started, navigate to File | New Project.
2. Select a project location of your choice somewhere on your hard drive and ensure that you have Setup defaults for set to 3D.
3. Once completed, click on Create.
4. Here, if you see the Welcome to Unity pop up, feel free to close it, as we won't be using it.

Level design 101 – planning

Now, just because we are going to be diving straight into Unity, I feel it's important to talk a little more about how level design is done in the gaming industry. While you may think a level designer will just jump into the editor and start playing, the truth is you normally need to do a ton of planning ahead of time before you even open up your tool.

Generally, a level design begins with an idea. This can come from anything; maybe you saw a really cool building, or a photo on the Internet gave you a certain feeling; maybe you want to teach the player a new mechanic. Turning this idea into a level is what a level designer does. Taking all of these ideas, the level designer will create a level design document, which will outline exactly what you're trying to achieve with the entire level from start to end.
A level design document will describe everything inside the level, listing all of the possible encounters, puzzles, and so on, which the player will need to complete, as well as any side quests that the player will be able to achieve. To prepare for this, you should include as many references as you can, with maps, images, and movies similar to what you're trying to achieve. If you're working with a team, making this document available on a website or wiki will be a great asset, so that you know exactly what is being done in the level, what the team can use in their levels, and how difficult their encounters can be.

Generally, you'll also want a top-down layout of your level, done either on a computer or with graph paper, with a line showing a player's general route for the level, with encounters and missions planned out. Of course, you don't want to be too tied down to your design document, and it will change as you playtest and work on the level, but the documentation process will help solidify your ideas and give you a firm basis to work from.

For those of you interested in seeing some level design documents, feel free to check out Adam Reynolds (Level Designer on Homefront and Call of Duty: World at War) at http://wiki.modsrepository.com/index.php?title=Level_Design:_Level_Design_Document_Example.

If you want to learn more about level design, I'm a big fan of Beginning Game Level Design by John Feil (previously my teacher) and Marc Scattergood, Cengage Learning PTR. For more of an introduction to all of game design from scratch, check out Level Up!: The Guide to Great Video Game Design by Scott Rogers, Wiley, and The Art of Game Design by Jesse Schell, CRC Press.

For some online resources, Scott has a neat GDC talk called Everything I Learned About Level Design I Learned from Disneyland, which can be found at http://mrbossdesign.blogspot.com/2009/03/everything-i-learned-about-game-design.html, and World of Level Design (http://worldofleveldesign.com/) is a good source for learning about level design, though it does not talk about Unity specifically.

Exterior environment – terrain

When creating exterior environments, we cannot use straight floors for the most part, unless you're creating a highly urbanized area. Our game takes place in a haunted house in the middle of nowhere, so we're going to create a natural landscape. In Unity, the best tool to use to create a natural landscape is the Terrain tool. Unity's terrain system lets us add landscapes, complete with bushes, trees, and fading materials, to our game.

To show how easy it is to use the terrain tool, let's get started. The first thing that we're going to want to do is actually create the terrain we'll be placing for the world. Let's first create a terrain by navigating to GameObject | Create Other | Terrain.

If you are using Unity 4.6 or later, navigate to GameObject | 3D Object | Terrain to create the Terrain.

At this point, you should see the terrain. Right now, it's just a flat plane, but we'll be adding a lot to it to make it shine. If you look to the right with the Terrain object selected, you'll see the Terrain editing tools, which can do the following (from left to right):

- Raise/Lower Height: This option will allow us to raise or lower the height of our terrain up to a certain radius to create hills, rivers, and more.
- Paint Height: If you already know the exact height that a part of your terrain needs to be, this option will allow you to paint a spot to that exact height.
- Smooth Height: This option averages out the area that it is in, and then attempts to smooth out areas and reduce the appearance of abrupt changes.
- Paint Texture: This option allows us to add textures to the surface of our terrain. One of the nice features of this is the ability to lay multiple textures on top of each other.
- Place Trees: This option allows us to paint objects in our environment, which will appear on the surface. Unity attempts to optimize these objects by billboarding distant trees so that we can have dense forests without a horrible frame rate.
- Paint Details: In addition to trees, we can also have small things such as rocks or grass covering the surface of our environment. We can use 2D images to represent individual clumps, using bits of randomization to make it appear more natural.
- Terrain Settings: These are settings that will affect the overall properties of a particular terrain; options such as the size of the terrain and wind can be found here.

By default, the entire terrain is set to be at the bottom, but we want to have some ground above and below us; so first, with the terrain object selected, click on the second button to the left of the terrain component (the Paint Height mode). From here, set the Height value under Settings to 100 and then click on the Flatten button. At this point, you should notice the plane moving up, so now everything is above by default.

Next, we are going to add some interesting shapes to our world with some hills by painting on the surface. With the Terrain object selected, click on the first button to the left of our Terrain component (the Raise/Lower Terrain mode). Once this is completed, you should see a number of different brushes and shapes that you can select from.

Our use of terrain is to create hills in the background of our scene so that it does not seem like the world is completely flat. Under the Settings area, change the Brush Size and Opacity values of your brush to 100 and left-click around the edges of the world to create some hills. You can increase the height of the current hills if you click on top of a previous hill. This is shown in the following screenshot:

When creating hills, it's a good idea to look at them from multiple angles while you're building them, so you can make sure that none are too high or too short. Generally, you want to have taller hills as you go further back, otherwise you cannot see the smaller ones, since they would be blocked.

In the Scene view, to move your camera around, you can use the toolbar in the top right corner, or hold down the right mouse button and drag it in the direction you want the camera to move in, pressing the W, A, S, and D keys to pan. In addition, you can hold down the middle mouse button and drag it to move the camera around. The mouse wheel can be scrolled to zoom in and out from where the camera is.

Even though you should plan out the level ahead of time on something like a piece of graph paper to plan out encounters, you will want to avoid building the level entirely from this top-down view, as the player will not actually see the game from a bird's eye view in the game at all (most likely). Referencing the map from the same perspective as your character will help ensure that the map looks great. To see many different angles at one time, you can use a layout with multiple views of the scene, such as the 4 Split.

Once we have our land done, we now want to create some holes in the ground, which we will fill in with water later.
This will provide a natural barrier to our world that players will know they cannot pass; so we will create a moat by first changing the Brush Size value to 50 and then holding down the Shift key, and left-clicking around the middle of our texture. In this case, it's okay to use the Top view; remember, this will eventually be water to fill in lakes, rivers, and so on, as shown in the following screenshot:

At this point, we have done what is referred to in the industry as "greyboxing": making the level in the engine in the simplest way possible but without artwork (also known as "whiteboxing" or "orangeboxing", depending on the company you're working for). At this point in a traditional studio, you'd spend time playtesting the level and iterating on it before an artist or you takes the time to make it look great. However, for our purposes, we want to create a finished project as soon as possible. When doing your own games, be sure to play your level and have others play your level before you polish it.

For more information on greyboxing, check out http://www.worldofleveldesign.com/categories/level_design_tutorials/art_of_blocking_in_your_map.php. For an example with images of a greybox to the final level, PC Gamer has a nice article available at http://www.pcgamer.com/2014/03/18/building-crown-part-two-layout-design-textures-and-the-hammer-editor/.

This is interesting enough, but being in an all-white world would be quite boring. Thankfully, it's very easy to add textures to everything. However, first we need to have some textures to paint onto the world, and for this instance, we will make use of some of the free assets that Unity provides us with. So, with that in mind, navigate to Window | Asset Store.

The Asset Store option is home to a number of free and commercial assets, created by both Unity and the community, that can be used with Unity to help you create your own projects. While we will not be using any unofficial assets, the Asset Store option may help you in the future to save your time in programming or art asset creation.

At the top right corner, you'll see a search bar; type terrain assets and press Enter. Once there, the first asset you'll see is Terrain Assets, which is released by Unity Technologies for free. Left-click on it and then, once at the menu, click on the Download button.

Once it finishes downloading, you should see the Importing Package dialog box pop up. If it doesn't pop up, click on the Import button where the Download button used to be.

Generally, you'll only want to select the assets that you want to use and uncheck the others. However, since you're exploring the tools, we'll just click on the Import button to place them all. Close the Asset Store screen if it's still open and go back into our Game view. You should notice the new Terrain Assets folder placed in our Assets folder. Double-click on it and then enter the Textures folder. These will be the textures we will be placing in our environment.

Select the Terrain object and then click on the fourth button from the left to select the Paint Texture mode. Here, you'll notice that it looks quite similar to the previous sections we've seen. However, there is a Textures section as well, but as of now, it shows the message No terrain textures defined. So let's fix that. Click on the Edit Textures button and then select Add Texture. You'll see an Add Terrain Texture dialog pop up.
Under the Texture variable, place the Grass (Hill) texture and then click on the Add button. At this point, you should see the entire world change to green if you're far away. If you zoom in, you'll see that the entire terrain now uses the Grass (Hill) texture.

Now, we don't want the entire world to have grass. Next, we will add cliffs around the edges where the water is. To do this, add an additional texture by navigating to Edit Textures... | Add Texture. Select Cliff (Layered Rock) as the texture and then select Add. Now, if you select the terrain, you should see two textures. With the Cliff (Layered Rock) texture selected, paint the edges of the water by clicking and holding the mouse, and modifying the Brush Size value as needed.

We now want to create a path for our player to follow, so we're going to create yet another texture, this time using the GoodDirt material. Since this is a path the player may take, I'm going to change the Brush Size value to 8 and the Opacity value to 30, and use the second brush from the left, which is slightly less faded. Once finished, I'm going to paint in some trails that the player can follow. One thing that you will want to try to do is make sure that the player doesn't have to go too far before having to backtrack, and reward the player for exploration. The following screenshot shows the path:

However, you'll notice that there are two problems with it currently. Firstly, it is too big to fit in with the world, and you can tell that it repeats. To reduce the appearance of texture duplication, we can introduce new materials with a very soft opacity, which we place in patches in areas where there is just plain ground. For example, let's create a new texture with the Grass (Meadow) texture. Change the Brush Size value to 16 and the Opacity value to something really low, such as 6, and then start painting the areas that look too static. Feel free to select the first brush again to have a smoother touch up.

Now, if we zoom into the world as if we were a character there, I can tell that the first grass texture is way too big for the environment, but we can actually change that very easily. Double-click on the texture to change the Size value to (8,8). This will make the texture smaller before it duplicates. It's a good idea to have different textures with different sizes so that the seams of each texture aren't visible to others. The following screenshot shows the size options:

Make the same change for our Dirt texture as well, changing the Size option to (8,8). With this, we already have a level that looks pretty nice! However, that being said, it's just some hills. To really have a quality-looking title, we are going to need to do some additional work to beautify the environment.

Beautifying the environment – adding water, trees, and grass

We now have a base for our environment with the terrain, but we're still missing a lot of the polish that can make the area stand out and look like a quality environment. Let's add some of those details now.

First, let's add water. This time we will use another asset from Unity, but we will not have to go to the Asset Store. Navigate to Assets | Import Package | Water (Basic) and import all of the files included in the package. We will be creating a level for night time, so navigate to Standard Assets | Water Basic and drag-and-drop the Nighttime Simple Water prefab onto the scene.
Once there, set the Position values to (1000, 50, 1000) and the Scale values to (1000, 1, 1000). At this point, you want to repaint your cliff materials to better reflect being next to the water.

Next, let's add some trees to make this forest level come to life. Navigate to Terrain Assets | Trees Ambient-Occlusion and drag-and-drop a tree into your world (I'm using ScotsPineTree). By default, these trees do not contain collision information, so our player could just walk through them. This is actually great for areas that the player will not reach, as we can add more trees without having to do meaningless calculations, but we need to stop the player from walking into these. To do that, we're going to need to add a collider. To do so, navigate to Component | Physics | Capsule Collider and then change the Radius value to 2. You have to use a Capsule Collider in order to have the collision carried over to the terrain.

After this, move our newly created tree into the Assets folder under the Project tab and change its name to CollidingTree. Then, delete the object from the Hierarchy view. With this finished, go back to our Terrain object and then click on the Place Trees mode button. Just like working with painting textures, there are no trees here by default, so navigate to Edit Trees… | Add Tree, add our CollidingTree object created earlier in this step, and then select Add.

Next, under the Settings section, change the Tree Density value to 15 and then, with our new tree selected, paint the areas on the main island that do not have paths on them. Once you've finished placing those trees, raise the Tree Density value to 50 and then paint the areas that are far away from paths to make it less likely that players go that way. You should also enable Create Tree Collider in the terrain's Terrain Collider component.

In our last step to create the environment, let's add some details. The mode next to the Place Trees mode is Paint Details. Next, click on the Edit Details… button and select Add Grass Texture. Select the Grass texture for the Detail Texture option and then click on Add. In the terrain's Settings mode (the one on the far right), change the Detail Distance value to 250, and then paint the grass where there isn't any dirt along the route in the Paint Details mode.

You may not see the results unless you zoom your camera in, which you can do by using the mouse wheel. Don't go too far in though, or the results may not show as well.

This aspect of level creation isn't very difficult, just time consuming. However, it's taking the time to enter these details that really sets a game apart from other games. Generally, you'll want to playtest and make sure your level is fun before performing these actions, but I feel it's important to have an idea of how to do it for your future projects.

Lastly, our current island is very flat, and while that's okay for cities, nature is random. Go back into the Raise/Lower Height tool and gently raise and lower some areas of the level to give the illusion of depth. Do note that your trees and grass will raise and fall with the changes that you make, as shown in the following screenshot:

With this done, let's now add some details to the areas that the player will not be visiting, such as the outer hills. Go into the Place Trees mode and create another tree, but this time select the one without collision, and then place it around the edges of the mountains, as shown in the following screenshot:

At this point, we have a nice exterior shape created with the terrain tools!
If you want to add even more detail to your levels, you can add additional trees and/or materials to the level area as long as it makes sense for them to be there. For more information on the terrain engine that Unity has, please visit http://docs.unity3d.com/Manual/script-Terrain.html.

Creating our player

Now that we have the terrain and its details, it's hard to get a good picture of what the game looks like without being able to see what it looks like down on the surface, so next we will do just that. However, instead of creating our player from scratch as we've done previously, we will make use of the code that Unity has provided us. We will perform the following steps:

1. Start off by navigating to Assets | Import Package | Character Controller. When the Importing Package dialog comes up, we only need to import the files shown in the following screenshot:
2. Now drag-and-drop the First Person Controller prefab under the Prefabs folder in our Project tab into your world, where you want the player to spawn, setting the Y Position value to above 100. If you see yourself fall through the world instead of hitting the ground when you spawn, then increase the Y Position value until you get there.
3. If you open up the First Person Controller object in the Hierarchy tab, you'll notice that it has a Main Camera object already, so delete the Main Camera object that already exists in the world.
4. Right now, if we played the game, you'd see that everything is dark because we don't have any light. For the purposes of demonstration, let's add a directional light by navigating to GameObject | Create Other | Directional Light.
5. Save your scene and hit the Play button to drop into your level.

At this point, we have a playable level that we can explore and move around in!

Building the atmosphere

Now that the base of our world has been created, let's add some effects to make the game even more visually appealing, so it will start to fit in with the survival horror feel that we're going to be giving the game.

The first part of creating the atmosphere is to add something for the sky aside from the light blue color that we currently use by default. To fix this, we will be using a skybox. A skybox is a method of creating backgrounds that make the area seem bigger than it really is, by putting an image in the areas that are currently being filled with the light blue color; it does not move, in the same way that the sky doesn't appear to move to us because it's so far away. The reason why we call a skybox a skybox is because it is made up of six textures that form the inside of a box (one for each side of a cube). Game engines such as Unreal have skydomes, which are the same thing, but they are done with a hemisphere instead of a cube.

We will perform the following steps to build the atmosphere. To add in our skybox, we are going to navigate to Assets | Import Package | Skyboxes. We want our level to display the night, so we'll be using the Starry Night Skybox. Just select the StarryNight Skybox.mat file and the textures inside the Standard Assets/Skyboxes/Textures/StarryNight/ location, and then click on Import. With this file imported, we need to navigate to Edit | Render Settings next.
Once there, we need to set the Skybox Material option to the Starry Night skybox: If you go into the game, you'll notice the level starting to look nicer already with the addition of the skybox, except for the fact that the sky says night while the world says it's daytime. Let's fix that now. Switch to the Game tab so that you can see the changes we'll be making next. While still at the RenderSettings menu, let's turn on the Fog property by clicking on the checkbox with its name and changing the Fog Color value to a black color. You should notice that the surroundings are already turning very dark. Play around with the Fog Density value until you're comfortable with how much the player can see ahead of them; I used 0.005. Fog obscures far away objects, which adds to the atmosphere and saves the rendering power. The denser the fog, the more the game will feel like a horror game. The first game of the Silent Hill franchise used fog to make the game run at an acceptable frame rate due to a large 3D environment it had on early PlayStation hardware. Due to how well it spooked players, it continued to be used in later games even though they could render larger areas with the newer technology. Let's add some lighting tweaks to make the environment that the player is walking in seem more like night. Go into the DirectionalLight properties section and change the Intensity value to 0.05. You'll see the value get darker, as shown in the following screenshot: If for some reason, you'd like to make the world pitch black, you'll need to modify the Ambient Light property to black inside the RenderSettings section. By default, it is dark grey, which will show up even if there are no lights placed in the world. In the preceding example, I increased the Intensity value to make it easier to see the world to make it easier for readers to follow, but in your project, you probably don't want the player to see so far out with such clarity. With this, we now have a believable exterior area at night! Now that we have this basic knowledge, let's add a flashlight so the players can see where they are going. Creating a flashlight Now that our level looks like a dark night, we still want to give our players the ability to see what's in front of them with a flashlight. We will customize the First Person Controller object to fit our needs: Create a spotlight by navigating to GameObject | Create Other | Spotlight. Once created, we are going to make the spotlight a child of the First Person Controller object's Main Camera object by dragging-and-dropping it on top of it. Once a child, change the Transform Position value to (0, -.95, 0). Since positions are relative to your parent's position, this places the light slightly lower than the camera's center, just like a hand holding a flashlight. Now change the Rotation value to (0,0,0) or give it a slight diagonal effect across the scene if you don't want it to look like it's coming straight out: Now, we want the flashlight to reach out into the distance. So we will change the Range value to 1000, and to make the light wider, we will change the Spot Angle value to 45. The effects are shown in the following screenshot: If you have Unity Pro, you can also give shadows to the world based on your lights by setting the Shadow Type property. We now have a flashlight, so the player can focus on a particular area and not worry. 
Walking / flashlight bobbing animation

Now the flashlight looks fine in a screenshot, but if you walk through the world, it will feel very static and unnatural. If a person is actually walking through a forest, there will be a slight bob as they walk, and if someone is actually holding a flashlight, it won't be perfectly stable the entire time because their hand would move. We can solve both of these problems by writing yet another script. We perform the following steps:

1. Create a new folder called Scripts.
2. Inside this folder, create a new C# script called BobbingAnimation.
3. Open the newly created script and use the following code inside it:

using UnityEngine;
using System.Collections;

/// <summary>
/// Allows the attached object to bob up and down through
/// movement or by default.
/// </summary>
public class BobbingAnimation : MonoBehaviour
{
  /// <summary>
  /// The elapsed time.
  /// </summary>
  private float elapsedTime;

  /// <summary>
  /// The starting y offset from the parent.
  /// </summary>
  private float startingY;

  /// <summary>
  /// The controller.
  /// </summary>
  private CharacterController controller;

  /// <summary>
  /// How far up and down the object will travel.
  /// </summary>
  public float magnitude = .2f;

  /// <summary>
  /// How often the object will move up and down.
  /// </summary>
  public float frequency = 10;

  /// <summary>
  /// Do you always want the object to bob up and down, or
  /// only with movement?
  /// </summary>
  public bool alwaysBob = false;

  /// <summary>
  /// Start this instance.
  /// </summary>
  void Start ()
  {
    startingY = transform.localPosition.y;
    controller = GetComponent<CharacterController> ();
  }

  /// <summary>
  /// Update this instance.
  /// </summary>
  void Update ()
  {
    // Only increment elapsedTime if you want the object to bob;
    // keeping it the same will keep it still.
    if(alwaysBob)
    {
      elapsedTime += Time.deltaTime;
    }
    else
    {
      if((Input.GetAxis("Horizontal") != 0.0f) || (Input.GetAxis("Vertical") != 0.0f))
        elapsedTime += Time.deltaTime;
    }

    float yOffset = Mathf.Sin(elapsedTime * frequency) * magnitude;

    // If a CharacterController was found and it is in the air
    // (jumping or falling), skip the bob for this frame.
    if(controller && !controller.isGrounded)
    {
      return;
    }

    // Set our position.
    Vector3 pos = transform.position;

    pos.y = transform.parent.transform.position.y +
            startingY + yOffset;

    transform.position = pos;
  }
}

The preceding code will tweak the object it's attached to so that it bobs up and down whenever the player is moving, that is, whenever the Horizontal or Vertical input axes are non-zero. I've also added a variable called alwaysBob, which, when true, will make the object bob all the time. Math is a game developer's best friend, and here we are using sin (pronounced sine). Taking the sine of an angle gives you the ratio of the length of the side opposite that angle to the length of the hypotenuse of a right-angled triangle. If that didn't make any sense to you, don't worry. The neat feature of sine is that as the number it takes gets larger, it will continuously give us a value between -1 and 1 that goes up and down forever, giving us a smooth, repetitive oscillation. For more information on sine waves, visit http://en.wikipedia.org/wiki/Sine_wave.
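To put numbers on what the script does, the vertical offset computed each frame is simply the line of code above written out as a formula:

yOffset = magnitude * sin(frequency * elapsedTime)

Since a sine wave never leaves the range -1 to 1, the offset never strays more than magnitude units (0.2 by default) above or below the starting height, and with the default frequency of 10, a full up-and-down cycle takes roughly 2 * pi / 10, or about 0.63 seconds. Raise frequency for a faster bob and magnitude for a larger one.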
While we're using the sine wave just for the player's movement and the flashlight, this could be used in a lot of effects, such as having save points/portals bob up and down, or any kind of object you would want to give slight movement or some special FX. Next, attach the BobbingAnimation component to the Main Camera object, leaving all the values at their defaults. After this, attach the BobbingAnimation component to the spotlight as well. With the spotlight selected, turn the Always Bob option on and change the Magnitude value to .05 and the Frequency value to 3. The effects are shown in the following screenshot:

Summary

To learn more about building FPS games, the following book published by Packt Publishing (https://www.packtpub.com/) is recommended:

Building an FPS Game with Unity (https://www.packtpub.com/game-development/building-fps-game-unity)

Resources for Article:

Further resources on this subject:
Mobile Game Design Best Practices [article]
Putting the Fun in Functional Python [article]
Using a collider-based system [article]

Machine learning in practice

Packt
18 Feb 2016
13 min read
In this article, we will learn how we can implement machine learning in practice. To apply the learning process to real-world tasks, we'll use a five-step process. Regardless of the task at hand, any machine learning algorithm can be deployed by following this plan:

1. Data collection: The data collection step involves gathering the learning material an algorithm will use to generate actionable knowledge. In most cases, the data will need to be combined into a single source like a text file, spreadsheet, or database.
2. Data exploration and preparation: The quality of any machine learning project is based largely on the quality of its input data. Thus, it is important to learn more about the data and its nuances during a practice called data exploration. Additional work is required to prepare the data for the learning process. This involves fixing or cleaning so-called "messy" data, eliminating unnecessary data, and recoding the data to conform to the learner's expected inputs.
3. Model training: By the time the data has been prepared for analysis, you are likely to have a sense of what you are capable of learning from the data. The specific machine learning task chosen will inform the selection of an appropriate algorithm, and the algorithm will represent the data in the form of a model.
4. Model evaluation: Because each machine learning model results in a biased solution to the learning problem, it is important to evaluate how well the algorithm learns from its experience. Depending on the type of model used, you might be able to evaluate the accuracy of the model using a test dataset, or you may need to develop measures of performance specific to the intended application.
5. Model improvement: If better performance is needed, it becomes necessary to utilize more advanced strategies to augment the performance of the model. Sometimes, it may be necessary to switch to a different type of model altogether. You may need to supplement your data with additional data or perform additional preparatory work, as in step two of this process.

(For more resources related to this topic, see here.)

After these steps are completed, if the model appears to be performing well, it can be deployed for its intended task. As the case may be, you might utilize your model to provide score data for predictions (possibly in real time), for projections of financial data, to generate useful insight for marketing or research, or to automate tasks such as mail delivery or flying aircraft. The successes and failures of the deployed model might even provide additional data to train your next generation learner.

Types of input data

The practice of machine learning involves matching the characteristics of input data to the biases of the available approaches. Thus, before applying machine learning to real-world problems, it is important to understand the terminology that distinguishes among input datasets. The phrase unit of observation is used to describe the smallest entity with measured properties of interest for a study. Commonly, the unit of observation is in the form of persons, objects or things, transactions, time points, geographic regions, or measurements. Sometimes, units of observation are combined to form units such as person-years, which denote cases where the same person is tracked over multiple years; each person-year comprises a person's data for one year. The unit of observation is related but not identical to the unit of analysis, which is the smallest unit from which the inference is made; the short sketch that follows illustrates the difference.
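As a minimal sketch of that distinction (written in JavaScript, with invented names and numbers), the records below are observed at the person-year level, while the summary we actually reason about is per person, making the person the unit of analysis:

// Unit of observation: person-years, one record per person per year.
// People and visit counts are invented purely for illustration.
var observations = [
  { person: 'Alice', year: 2014, visits: 3 },
  { person: 'Alice', year: 2015, visits: 5 },
  { person: 'Bob', year: 2014, visits: 2 },
  { person: 'Bob', year: 2015, visits: 2 }
];

// Unit of analysis: the person. The person-year observations are
// aggregated into a single total per person before any inference is made.
var totalsPerPerson = {};
observations.forEach(function(row) {
  totalsPerPerson[row.person] = (totalsPerPerson[row.person] || 0) + row.visits;
});

console.log(totalsPerPerson); // { Alice: 8, Bob: 4 }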
Although it is often the case, the observed and analyzed units are not always the same. For example, data observed from people might be used to analyze trends across countries. Datasets that store the units of observation and their properties can be imagined as collections of data consisting of: Examples: Instances of the unit of observation for which properties have been recorded Features: Recorded properties or attributes of examples that may be useful for learning It is the easiest to understand features and examples through real-world cases. To build a learning algorithm to identify spam e-mail, the unit of observation could be e-mail messages, examples would be specific messages, and its features might consist of the words used in the messages. For a cancer detection algorithm, the unit of observation could be patients, the examples might include a random sample of cancer patients, and its features may be the genomic markers from biopsied cells as well as the characteristics of patient such as weight, height, or blood pressure. While examples and features do not have to be collected in any specific form, they are commonly gathered in the matrix format, which means that each example has exactly the same features. The following spreadsheet shows a dataset in the matrix format. In the matrix data, each row in the spreadsheet is an example and each column is a feature. Here, the rows indicate examples of automobiles, while the columns record various each automobile's feature, such as price, mileage, color, and transmission type. Matrix format data is by far the most common form used in machine learning, though other forms are used occasionally in specialized cases: Features also come in various forms. If a feature represents a characteristic measured in numbers, it is unsurprisingly called numeric. Alternatively, if a feature is an attribute that consists of a set of categories, the feature is called categorical or nominal. A special case of categorical variables is called ordinal, which designates a nominal variable to the categories falling in an ordered list. Some examples of ordinal variables include clothing sizes such as small, medium, and large, or a measurement of customer satisfaction on a scale from "not at all happy" to "very happy." It is important to consider what the features represent, as the type and number of features in your dataset will assist in determining an appropriate machine learning algorithm for your task. Types of machine learning algorithms Machine learning algorithms are divided into categories according to their purpose. Understanding the categories of learning algorithms is an essential first step towards using data to drive the desired action. A predictive model is used for tasks that involve, as the name implies, the prediction of one value using other values in the dataset. The learning algorithm attempts to discover and model the relationship between the target feature (the feature being predicted) and the other features. Despite the common use of the word "prediction" to imply forecasting, predictive models need not necessarily foresee events in the future. For instance, a predictive model could be used to predict past events such as the date of a baby's conception using the mother's present-day hormone levels. Predictive models can also be used in real time to control traffic lights during rush hours. 
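Before looking at how such models are trained, here is a tiny, hypothetical JavaScript sketch of a predictive task. Each example carries ordinary features plus the target feature (spam) to be predicted, and the "model" is just a hand-written rule standing in for whatever a real learning algorithm would discover from the data:

// Matrix-format data: each object is an example (an e-mail), each
// property is a feature; 'spam' is the target feature. All values
// are invented for illustration.
var emails = [
  { containsFree: true, containsMeeting: false, numLinks: 7, spam: true },
  { containsFree: false, containsMeeting: true, numLinks: 1, spam: false },
  { containsFree: true, containsMeeting: false, numLinks: 4, spam: true },
  { containsFree: false, containsMeeting: false, numLinks: 0, spam: false }
];

// A stand-in "model": a hard-coded rule relating the other features
// to the target. A real learner would derive a rule like this from
// the training data rather than having it written by hand.
function predictSpam(email) {
  return email.containsFree && email.numLinks > 2;
}

// Comparing predictions against the known target values.
var correct = emails.filter(function(e) {
  return predictSpam(e) === e.spam;
}).length;

console.log(correct + ' of ' + emails.length + ' examples predicted correctly');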
Because predictive models are given clear instruction on what they need to learn and how they are intended to learn it, the process of training a predictive model is known as supervised learning. The supervision does not refer to human involvement, but rather to the fact that the target values provide a way for the learner to know how well it has learned the desired task. Stated more formally, given a set of data, a supervised learning algorithm attempts to optimize a function (the model) to find the combination of feature values that result in the target output. The often used supervised machine learning task of predicting which category an example belongs to is known as classification. It is easy to think of potential uses for a classifier. For instance, you could predict whether: An e-mail message is spam A person has cancer A football team will win or lose An applicant will default on a loan In classification, the target feature to be predicted is a categorical feature known as the class and is divided into categories called levels. A class can have two or more levels, and the levels may or may not be ordinal. Because classification is so widely used in machine learning, there are many types of classification algorithms, with strengths and weaknesses suited for different types of input data. Supervised learners can also be used to predict numeric data such as income, laboratory values, test scores, or counts of items. To predict such numeric values, a common form of numeric prediction fits linear regression models to the input data. Although regression models are not the only type of numeric models, they are, by far, the most widely used. Regression methods are widely used for forecasting, as they quantify in exact terms the association between inputs and the target, including both, the magnitude and uncertainty of the relationship. Since it is easy to convert numbers into categories (for example, ages 13 to 19 are teenagers) and categories into numbers (for example, assign 1 to all males, 0 to all females), the boundary between classification models and numeric prediction models is not necessarily firm. A descriptive model is used for tasks that would benefit from the insight gained from summarizing data in new and interesting ways. As opposed to predictive models that predict a target of interest, in a descriptive model, no single feature is more important than the other. In fact, because there is no target to learn, the process of training a descriptive model is called unsupervised learning. Although it can be more difficult to think of applications for descriptive models—after all, what good is a learner that isn't learning anything in particular—they are used quite regularly for data mining. For example, the descriptive modeling task called pattern discovery is used to identify useful associations within data. Pattern discovery is often used for market basket analysis on retailers' transactional purchase data. Here, the goal is to identify items that are frequently purchased together, such that the learned information can be used to refine marketing tactics. For instance, if a retailer learns that swimming trunks are commonly purchased at the same time as sunscreens, the retailer might reposition the items more closely in the store or run a promotion to "up-sell" customers on associated items. Originally used only in retail contexts, pattern discovery is now starting to be used in quite innovative ways. 
For instance, it can be used to detect patterns of fraudulent behavior, screen for genetic defects, or identify hot spots for criminal activity. The descriptive modeling task of dividing a dataset into homogeneous groups is called clustering. This is sometimes used for segmentation analysis, which identifies groups of individuals with similar behavior or demographic information so that advertising campaigns can be tailored for particular audiences. Although the machine is capable of identifying the clusters, human intervention is required to interpret them. For example, given five different clusters of shoppers at a grocery store, the marketing team will need to understand the differences among the groups in order to create a promotion that best suits each group. This is almost certainly easier than trying to create a unique appeal for each customer. Lastly, a class of machine learning algorithms known as meta-learners is not tied to a specific learning task, but is rather focused on learning how to learn more effectively. A meta-learning algorithm uses the result of some learnings to inform additional learning. This can be beneficial for very challenging problems or when a predictive algorithm's performance needs to be as accurate as possible.

Matching input data to algorithms

The following table lists the general types of machine learning algorithms, each of which may be implemented in several ways. These form the basis on which nearly all of the more advanced methods are built. Although this covers only a fraction of the entire set of machine learning algorithms, learning these methods will provide a sufficient foundation to make sense of any other method you may encounter in the future.

Model                              Learning task
Supervised Learning Algorithms
Nearest Neighbor                   Classification
Naive Bayes                        Classification
Decision Trees                     Classification
Classification Rule Learners       Classification
Linear Regression                  Numeric prediction
Regression Trees                   Numeric prediction
Model Trees                        Numeric prediction
Neural Networks                    Dual use
Support Vector Machines            Dual use
Unsupervised Learning Algorithms
Association Rules                  Pattern detection
k-means clustering                 Clustering
Meta-Learning Algorithms
Bagging                            Dual use
Boosting                           Dual use
Random Forests                     Dual use

To begin applying machine learning to a real-world project, you will need to determine which of the four learning tasks your project represents: classification, numeric prediction, pattern detection, or clustering. The task will drive the choice of algorithm. For instance, if you are undertaking pattern detection, you are likely to employ association rules. Similarly, a clustering problem will likely utilize the k-means algorithm, and numeric prediction will utilize regression analysis or regression trees. For classification, more thought is needed to match a learning problem to an appropriate classifier. In these cases, it is helpful to consider the various distinctions among algorithms, distinctions that will only be apparent by studying each of the classifiers in depth. For instance, within classification problems, decision trees result in models that are readily understood, while the models of neural networks are notoriously difficult to interpret. If you were designing a credit-scoring model, this could be an important distinction, because the law often requires that the applicant must be notified about the reasons he or she was rejected for the loan.
Even if the neural network is better at predicting loan defaults, if its predictions cannot be explained, then it is useless for this application. Although you will sometimes find that these characteristics exclude certain models from consideration. In many cases, the choice of algorithm is arbitrary. When this is true, feel free to use whichever algorithm you are most comfortable with. Other times, when predictive accuracy is primary, you may need to test several algorithms and choose the one that fits the best or use a meta-learning algorithm that combines several different learners to utilize the strengths of each. Summary Machine learning originated at the intersection of statistics, database science, and computer science. It is a powerful tool, capable of finding actionable insight in large quantities of data. Still, caution must be used in order to avoid common abuses of machine learning in the real world. Conceptually, learning involves the abstraction of data into a structured representation and the generalization of this structure into action that can be evaluated for utility. In practical terms, a machine learner uses data containing examples and features of the concept to be learned and summarizes this data in the form of a model, which is then used for predictive or descriptive purposes. These purposes can be grouped into tasks, including classification, numeric prediction, pattern detection, and clustering. Among the many options, machine learning algorithms are chosen on the basis of the input data and the learning task. R provides support for machine learning in the form of community-authored packages. These powerful tools are free to download, but need to be installed before they can be used. Resources for Article:   Further resources on this subject: Introduction to Machine Learning with R [article] Machine Learning with R [article] Spark – Architecture and First Program [article]

Developing Node.js Web Applications

Packt
18 Feb 2016
11 min read
Node.js is a platform that supports various types of applications, but the most popular kind is the development of web applications. Node's style of coding depends on the community to extend the platform through third-party modules; these modules are then built upon to create new modules, and so on. Companies and single developers around the globe are participating in this process by creating modules that wrap the basic Node APIs and deliver a better starting point for application development. (For more resources related to this topic, see here.)

There are many modules to support web application development, but none as popular as the Connect module. The Connect module delivers a set of wrappers around the Node.js low-level APIs to enable the development of rich web application frameworks. To understand what Connect is all about, let's begin with a basic example of a Node web server. In your working folder, create a file named server.js, which contains the following code snippet:

var http = require('http');

http.createServer(function(req, res) {
  res.writeHead(200, {
    'Content-Type': 'text/plain'
  });

  res.end('Hello World');
}).listen(3000);

console.log('Server running at http://localhost:3000/');

To start your web server, use your command-line tool and navigate to your working folder. Then, use the node CLI tool to run the server.js file as follows:

$ node server

Now open http://localhost:3000 in your browser, and you'll see the Hello World response. So how does this work? In this example, the http module is used to create a small web server listening on port 3000. You begin by requiring the http module and using the createServer() method to return a new server object. The listen() method is then used to listen on port 3000. Notice the callback function that is passed as an argument to the createServer() method. The callback function gets called whenever there's an HTTP request sent to the web server. The server object will then pass the req and res arguments, which contain the information and functionality needed to send back an HTTP response. The callback function then performs the following two steps:

1. First, it calls the writeHead() method of the response object. This method is used to set the response HTTP headers. In this example, it sets the Content-Type header value to text/plain. If you were responding with HTML, you would replace text/plain with text/html.
2. Then, it calls the end() method of the response object. This method is used to finalize the response. The end() method takes a single string argument that it will use as the HTTP response body.

Another common way of writing this is to add a write() method before the end() method and then call the end() method, as follows:

res.write('Hello World');
res.end();

This simple application illustrates the Node coding style, where low-level APIs are used to simply achieve certain functionality. While this is a nice example, running a full web application using the low-level APIs will require you to write a lot of supplementary code to support common requirements. Fortunately, a company called Sencha has already created this scaffolding code for you in the form of a Node module called Connect.

Meet the Connect module

Connect is a module built to support interception of requests in a more modular approach. In the first web server example, you learned how to build a simple web server using the http module; the short sketch that follows gives a feel for how much manual work even a couple of extra routes would take with nothing but that low-level API.
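As a rough sketch (the /hello and /time paths are just invented examples), handling even a couple of routes with nothing but the http module means checking every URL, method, and not-found case by hand:

var http = require('http');

http.createServer(function(req, res) {
  // Each route and method combination is checked manually.
  if (req.method === 'GET' && req.url === '/hello') {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Hello World');
  } else if (req.method === 'GET' && req.url === '/time') {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(new Date().toString());
  } else {
    // Anything we forget to handle needs an explicit fallback.
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Not Found');
  }
}).listen(3000);

console.log('Server running at http://localhost:3000/');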
If you wish to extend this example, you'd have to write code that manages the different HTTP requests sent to your server, handles them properly, and responds to each request with the correct response. Connect creates an API exactly for that purpose. It uses a modular component called middleware, which allows you to simply register your application logic to predefined HTTP request scenarios. Connect middleware are basically callback functions, which get executed when an HTTP request occurs. The middleware can then perform some logic, return a response, or call the next registered middleware. While you will mostly write custom middleware to support your application needs, Connect also includes some common middleware to support logging, static file serving, and more. The way a Connect application works is by using an object called dispatcher. The dispatcher object handles each HTTP request received by the server and then decides, in a cascading way, the order of middleware execution. To understand Connect better, take a look at the following diagram: The preceding diagram illustrates two calls made to the Connect application: the first one should be handled by a custom middleware and the second is handled by the static files middleware. Connect's dispatcher initiates the process, moving on to the next handler using the next() method, until it gets to middleware responding with the res.end() method, which will end the request handling. Express is based on Connect's approach, so in order to understand how Express works, we'll begin with creating a Connect application. In your working folder, create a file named server.js that contains the following code snippet: var connect = require('connect'); var app = connect(); app.listen(3000); console.log('Server running at http://localhost:3000/'); As you can see, your application file is using the connect module to create a new web server. However, Connect isn't a core module, so you'll have to install it using NPM. As you already know, there are several ways of installing third-party modules. The easiest one is to install it directly using the npm install command. To do so, use your command-line tool, and navigate to your working folder. Then execute the following command: $ npm install connect NPM will install the connect module inside a node_modules folder, which will enable you to require it in your application file. To run your Connect web server, just use Node's CLI and execute the following command: $ node server Node will run your application, reporting the server status using the console.log() method. You can try reaching your application in the browser by visiting http://localhost:3000. However, you should get a response similar to what is shown in the following screenshot: Connect application's empty response What this response means is that there isn't any middleware registered to handle the GET HTTP request. This means two things: You've successfully managed to install and use the Connect module It's time for you to write your first Connect middleware Connect middleware Connect middleware is just JavaScript function with a unique signature. 
Each middleware function is defined with the following three arguments: req: This is an object that holds the HTTP request information res: This is an object that holds the HTTP response information and allows you to set the response properties next: This is the next middleware function defined in the ordered set of Connect middleware When you have a middleware defined, you'll just have to register it with the Connect application using the app.use() method. Let's revise the previous example to include your first middleware. Change your server.js file to look like the following code snippet: var connect = require('connect'); var app = connect(); var helloWorld = function(req, res, next) { res.setHeader('Content-Type', 'text/plain'); res.end('Hello World'); }; app.use(helloWorld); app.listen(3000); console.log('Server running at http://localhost:3000/'); Then, start your connect server again by issuing the following command in your command-line tool: $ node server Try visiting http://localhost:3000 again. You will now get a response similar to that in the following screenshot: Connect application's response Congratulations, you've just created your first Connect middleware! Let's recap. First, you added a middleware function named helloWorld(), which has three arguments: req, res, and next. In your middleware, you used the res.setHeader() method to set the response Content-Type header and the res.end() method to set the response text. Finally, you used the app.use() method to register your middleware with the Connect application. Understanding the order of Connect middleware One of Connect's greatest features is the ability to register as many middleware functions as you want. Using the app.use() method, you'll be able to set a series of middleware functions that will be executed in a row to achieve maximum flexibility when writing your application. Connect will then pass the next middleware function to the currently executing middleware function using the next argument. In each middleware function, you can decide whether to call the next middleware function or stop at the current one. Notice that each Connect middleware function will be executed in first-in-first-out (FIFO) order using the next arguments until there are no more middleware functions to execute or the next middleware function is not called. To understand this better, we will go back to the previous example and add a logger function that will log all the requests made to the server in the command line. To do so, go back to the server.js file and update it to look like the following code snippet: var connect = require('connect'); var app = connect(); var logger = function(req, res, next) { console.log(req.method, req.url); next(); }; var helloWorld = function(req, res, next) { res.setHeader('Content-Type', 'text/plain'); res.end('Hello World'); }; app.use(logger); app.use(helloWorld); app.listen(3000); console.log('Server running at http://localhost:3000/'); In the preceding example, you added another middleware called logger(). The logger() middleware uses the console.log() method to simply log the request information to the console. Notice how the logger() middleware is registered before the helloWorld() middleware. This is important as it determines the order in which each middleware is executed. Another thing to notice is the next() call in the logger() middleware, which is responsible for calling the helloWorld() middleware. 
Removing the next() call would stop the execution of middleware function at the logger() middleware, which means that the request would hang forever as the response is never ended by calling the res.end() method. To test your changes, start your connect server again by issuing the following command in your command-line tool: $ node server Then, visit http://localhost:3000 in your browser and notice the console output in your command-line tool. Mounting Connect middleware As you may have noticed, the middleware you registered responds to any request regardless of the request path. This does not comply with modern web application development because responding to different paths is an integral part of all web applications. Fortunately, Connect middleware supports a feature called mounting, which enables you to determine which request path is required for the middleware function to get executed. Mounting is done by adding the path argument to the app.use() method. To understand this better, let's revisit our previous example. Modify your server.js file to look like the following code snippet: var connect = require('connect'); var app = connect(); var logger = function(req, res, next) { console.log(req.method, req.url); next(); }; var helloWorld = function(req, res, next) { res.setHeader('Content-Type', 'text/plain'); res.end('Hello World'); }; var goodbyeWorld = function(req, res, next) { res.setHeader('Content-Type', 'text/plain'); res.end('Goodbye World'); }; app.use(logger); app.use('/hello', helloWorld); app.use('/goodbye', goodbyeWorld); app.listen(3000); console.log('Server running at http://localhost:3000/'); A few things have been changed in the previous example. First, you mounted the helloWorld() middleware to respond only to requests made to the /hello path. Then, you added another (a bit morbid) middleware called goodbyeWorld() that will respond to requests made to the /goodbye path. Notice how, as a logger should do, we left the logger() middleware to respond to all the requests made to the server. Another thing you should be aware of is that any requests made to the base path will not be responded by any middleware because we mounted the helloWorld() middleware to a specific path. Connect is a great module that supports various features of common web applications. Connect middleware is super simple as it is built with a JavaScript style in mind. It allows the endless extension of your application logic without breaking the nimble philosophy of the Node platform. While Connect is a great improvement over writing your web application infrastructure, it deliberately lacks some basic features you're used to having in other web frameworks. The reason lies in one of the basic principles of the Node community: create your modules lean and let other developers build their modules on top of the module you created. The community is supposed to extend Connect with its own modules and create its own web infrastructures. In fact, one very energetic developer named TJ Holowaychuk, did it better than most when he released a Connect-based web framework known as Express. Summary In this article, you learned about the basic principles of Node.js web applications and discovered the Connect web module. You created your first Connect application and learned how to use middleware functions. 
To learn more about Node.js and Web Development, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Node.js Design Patterns (https://www.packtpub.com/web-development/nodejs-design-patterns) Web Development with MongoDB and Node.js (https://www.packtpub.com/web-development/web-development-mongodb-and-nodejs) Resources for Article: Further resources on this subject: So, what is Node.js?[article] AngularJS[article] So, what is MongoDB?[article]

Understanding Material Design

Packt
18 Feb 2016
22 min read
Material can be thought of as something like smart paper. Like paper, it has surfaces and edges that reflect light and cast shadows, but unlike paper, material has properties that real paper does not, such as its ability to move, change its shape and size, and merge with other material. Despite this seemingly magical behavior, material should be treated like a physical object with a physicality of its own. Material can be seen as existing in a three-dimensional space, and it is this that gives its interfaces a reassuring sense of depth and structure. Hierarchies become obvious when it is instantly clear whether an object is above or below another. Based largely on age-old principles taken from color theory, animation, traditional print design, and physics, material design provides a virtual space where developers can use surface and light to create meaningful interfaces and movement to design intuitive user interactions. (For more resources related to this topic, see here.) Material properties As mentioned in the introduction, material can be thought of as being bound by physical laws. There are things it can do and things it cannot. It can split apart and heal again, and change color and shape, but it cannot occupy the same space as another sheet of material or rotate around two of its axes. We will be dealing with these properties throughout the book, but it is a good idea to begin with a quick look at the things material can and can't do. The third dimension is fundamental when it comes to material. This is what gives the user the illusion that they are interacting with something more tangible than a rectangle of light. The illusion is generated by the widening and softening of shadows beneath material that is closer to the user. Material exists in virtual space, but a space that, nevertheless, represents the real dimensions of a phone or tablet. The x axis can be thought of as existing between the top and bottom of the screen, the y axis between the right and left edges, and the z axis confined to the space between the back of the handset and the glass of the screen. It is for this reason that material should not rotate around the x or y axes, as this would break the illusion of a space inside the phone. The basic laws of the physics of material are outlined, as follows, in the form of a list: All material is 1 dp thick (along the z axis). Material is solid, only one sheet can exist in one place at a time and material cannot pass through other material. For example, if a card needs to move past another, it must move over it. Elevation, or position along the z axis, is portrayed by shadow, with higher objects having wider, softer shadows. The z axis should be used to prompt interaction. For example, an action button rising up toward the user to demonstrate that it can be used to perform some action. Material does not fold or bend. Material cannot appear to rise higher than the screen surface. Material can grow and shrink along both x and y axes. Material can move along any axis. Material can be spontaneously created and destroyed, but this must not be without movement. The arrivals and departures of material components must be animated. For example, a card growing from the point that it was summoned from or sliding off the screen when dismissed. A sheet of material can split apart anywhere along the x or y axes, and join together again with its original partner or with other material. This covers the basic rules of material behavior but we have said nothing of its content. 
If material can be thought of as smart paper, then its content can only be described as smart ink. The rules governing how ink behaves are a little simpler: Material content can be text, imagery, or any other form of visual digital content Content can be of any shape or color and behaves independently from its container material It cannot be displayed beyond the edges of its material container It adds nothing to the thickness (z axis) of the material it is displayed on Setting up a development environment The Android development environment consists mainly of two distinct components: the SDK, which provides the code libraries behind Android and Android Studio, and a powerful code editor that is used for constructing and testing applications for Android phones and tablets, Wear, TV, Auto, Glass, and Cardboard. Both these components can both be downloaded as a single package from http://developer.android.com/sdk/index.html. Installing Android Studio The installation is very straightforward. Run the Android Studio bundle and follow the on-screen instructions, installing HAXM hardware acceleration if prompted, and selecting all SDK components, as shown here: Android Studio is dependent on the Java JDK. If you have not previously installed it, this will be detected while you are installing Android Studio, and you will be prompted to download and install it. If for some reason it does not, it can be found at http://www.oracle.com/technetwork/java/javase/downloads/index.html, from where you should download the latest version. This is not quite the end of the installation process. There are still some SDK components that we will need to download manually before we can build our first app. As we will see next, this is done using the Android SDK Manager. Configuring the Android SDK People often refer to Android versions by name, such as Lollipop, or an identity number, such as 5.1.1. As developers, it makes more sense to use the API level, which in the case of Android 5.1.1 would be API level 22. The SDK provides a platform for every API level since API level 8 (Android 2.2). In this section, we will use the SDK Manager to take a closer look at Android platforms, along with the other tools included in the SDK. Start a new Android Studio project or open an existing one with the minimum SDK at 21 or higher. You can then open the SDK manager from the menu via Tools | Android | SDK Manager or the matching icon on the main toolbar. The Android SDK Manager can also be started as a stand alone program. It can be found in the /Android/sdk directory, as can the Android Virtual Device (AVD) manager. As can be seen in the preceding screenshot, there are really three main sections in the SDK: A Tools folder A collection of platforms An Extras folder All these require a closer look. The Tools directory contains exactly what it says, that is, tools. There are a handful of these but the ones that will concern us are the SDK manager that we are using now, and the AVD manager that we will be using shortly to create a virtual device. Open the Tools folder. You should find the latest revisions of the SDK tools and the SDK Platform-tools already installed. If not, select these items, along with the latest Build-tools, that is, if they too have not been installed. These tools are often revised, and it is well worth it to regularly check the SDK manager for updates. When it comes to the platforms themselves, it is usually enough to simply install the latest one. 
This does not mean that these apps will not work on or be available to devices running older versions, as we can set a minimum SDK level when setting up a project, and along with the use of support libraries, we can bring material design to almost any Android device out there. If you open up the folder for the latest platform, you will see that some items have already been installed. Strictly speaking, the only things you need to install are the SDK platform itself and at least one system image. System images are copies of the hard drives of actual Android devices and are used with the AVD to create emulators. Which images you use will depend on your system and the form factors that you are developing for. In this book, we will be building apps for phones and tablets, so make sure you use one of these at least. Although they are not required to develop apps, the documentation and samples packages can be extremely useful. At the bottom of each platform folder are the Google APIs and corresponding system images. Install these if you are going to include Google services, such as Maps and Cloud, in your apps. You will also need to install the Google support libraries from the Extras directory, and this is what we will cover next. The Extras folder contains various miscellaneous packages with a range of functions. The ones you are most likely to want to download are listed as follows: Android support libraries are invaluable extensions to the SDK that provide APIs that not only facilitate backwards compatibility, but also provide a lot of extra components and functions, and most importantly for us, the design library. As we are developing on Android Studio, we need only install the Android Support Repository, as this contains the Android Support Library and is designed for use with Android. The Google Play services and Google Repository packages are required, along with the Google APIs mentioned a moment ago, to incorporate Google Services into an application. You will most likely need the Google USB Driver if you are intending to test your apps on a real device. How to do this will be explained later in this chapter. The HAXM installer is invaluable if you have a recent Intel processor. Android emulators can be notoriously slow, and this hardware acceleration can make a noticeable difference. Once you have downloaded your selected SDK components, depending on your system and/or project plans, you should have a list of installed packages similar to the one shown next: The SDK is finally ready, and we can start developing material interfaces. All that is required now is a device to test it on. This can, of course, be done on an actual device, but generally speaking, we will need to test our apps on as many devices as possible. Being able to emulate Android devices allows us to do this. Emulating Android devices The AVD allows us to test our designs across the entire range of form factors. There are an enormous number of screen sizes, shapes, and densities around. It is vital that we get to test our apps on as many device configurations as possible. This is actually more important for design than it is for functionality. An app might operate perfectly well on an exceptionally small or narrow screen, but not look as good as we had wanted, making the AVD one of the most useful tools available to us. This section covers how to create a virtual device using the AVD Manager. 
The AVD Manager can be opened from within Android Studio by navigating to Tools | Android | AVD Manager from the menu or the corresponding icon on the toolbar. Here, you should click on the Create Virtual Device... button. The easiest way to create an emulator is to simply pick a device definition from the list of hardware images and keep clicking on Next until you reach Finish. However, it is much more fun and instructive to either clone and edit an existing profile, or create one from scratch. Click on the New Hardware Profile button. This takes you to the Configure Hardware Profile window where you will be able to create a virtual device from scratch, configuring everything from cameras and sensors, to storage and screen resolution. When you are done, click on Finish and you will be returned to the hardware selection screen where your new device will have been added: As you will have seen from the Import Hardware Profiles button, it is possible to download system images for many devices not included with the SDK. Check the developer sections of device vendor's web sites to see which models are available. So far, we have only configured the hardware for our virtual device. We must now select all the software it will use. To do this, select the hardware profile you just created and press Next. In the following window, select one of the system images you installed earlier and press Next again. This takes us to the Verify Configuration screen where the emulator can be fine-tuned. Most of these configurations can be safely left as they are, but you will certainly need to play with the scale when developing for high density devices. It can also be very useful to be able to use a real SD card. Once you click on Finish, the emulator will be ready to run. An emulator can be rotated through 90 degrees with left Ctrl + F12. The menu can be called with F2, and the back button with ESC. Keyboard commands to emulate most physical buttons, such as call, power, and volume, and a complete list can be found at http://developer.android.com/tools/help/emulator.html. Android emulators are notoriously slow, during both loading and operating, even on quite powerful machines. The Intel hardware accelerator we encountered earlier can make a significant difference. Between the two choices offered, the one that you use should depend on how often you need to open and close a particular emulator. More often than not, taking advantage of your GPU is the more helpful of the two. Apart from this built-in assistance, there are a few other things you can do to improve performance, such as setting lower pixel densities, increasing the device's memory, and building the website for lower API levels. If you are comfortable doing so, set up exclusions in your anti-virus software for the Android Studio and SDK directories. There are several third-party emulators, such as Genymotion, that are not only faster, but also behave more like real devices. The slowness of Android emulators is not necessarily a big problem, as most early development needs only one device, and real devices suffer none of the performance issues found on emulators. As we shall see next, real devices can be connected to our development environment with very little effort. Connecting a real device Using an actual physical device to run and test applications does not have the flexibility that emulators provide, but it does have one or two advantages of its own. 
Real devices are faster than any emulator, and you can test features unavailable to a virtual device, such as accessing sensors, and making and receiving calls. There are two steps involved in setting up a real phone or tablet: we need to set the developer options on the handset and configure the USB connection with our development computer.

To enable developer options on your handset, navigate to Settings | About phone. Tap on Build number 7 times to enable Developer options, which will now be available from the previous screen. Open this to enable USB debugging and Allow mock locations. Connect the device to your computer and check that it is connected as a Media device (MTP). Your handset can now be used as a test device.

To configure the USB connection, connect the device to your computer with a USB cable, start Android Studio, and open a project. Depending on your setup, it is quite possible that you are already connected. If not, you can install the Google USB driver by following these steps:

1. From the Windows start menu, open the device manager.
2. Your handset can be found under Other Devices or Portable Devices. Open its Properties window and select the Driver tab.
3. Update the driver with the Google version, which can be found in the sdk\extras\google\usb_driver directory.

An application can be compiled and run from Android Studio by selecting Run 'app' from the Run menu, pressing Shift + F10, or clicking on the green play icon on the toolbar. Once the project has finished building, you will be asked to confirm your choice of device before the app loads and then opens on your handset. With a fully set up development environment and devices to test on, we can now start taking a look at material design, beginning with the material theme that is included as the default in all SDKs with APIs higher than 21.

The material theme

Since API level 21 (Android 5.0), the material theme has been the built-in user interface. It can be utilized and customized, simplifying the building of material interfaces. However, it is more than just a new look; the material theme also provides the automatic touch feedback and transition animations that we associate with material design. To better understand Android themes and how to apply them, we need to understand how Android styles work, and a little about how screen components, such as buttons and text boxes, are defined. Most individual screen components are referred to as widgets or views. Views that contain other views are called view groups, and they generally take the form of a layout, such as the relative layout we will use in a moment. An Android style is a set of graphical properties defining the appearance of a particular screen component. Styles allow us to define everything from font size and background color to padding, elevation, and much more. An Android theme is simply a style applied across a whole screen or application. The best way to understand how this works is to put it into action and apply a style to a working project. This will also provide a great opportunity to become more familiar with Android Studio.

Applying styles

Styles are defined as XML files and are stored in the resources (res) directory of Android Studio projects. So that we can apply different styles to a variety of platforms and devices, they are kept separate from the layout code. To see how this is done, start a new project, selecting a minimum SDK of 21 or higher, and using the blank activity template.
To the left of the editor is the project explorer pane. This is your access point to every branch of your project. Take a look at the activity_main.xml file, which would have been opened in the editor pane when the project was created. At the bottom of the pane, you will see a Text tab and a Design tab. It should be quite clear, from examining these, how the XML code defines a text box (TextView) nested inside a window (RelativeLayout). Layouts can be created in two ways: textually and graphically. Usually, they are built using a combination of both techniques. In the design view, widgets can be dragged and dropped to form layout designs. Any changes made using the graphical interface are immediately reflected in the code, and experimenting with this is a fantastic way to learn how various widgets and layouts are put together. We will return to both these subjects in detail later on in the book, but for now, we will continue with styles and themes by defining a custom style for the text view in our Hello world app. Open the res node in the project explorer; you can then right-click on the values node and select the New | Values resource file from the menu. Call this file my_style and fill it out as follows: <?xml version="1.0" encoding="utf-8"?> <resources>     <style name="myStyle">         <item name="android:layout_width">match_parent</item>         <item name="android:layout_height">wrap_content</item>         <item name="android:elevation">4dp</item>         <item name="android:gravity">center_horizontal</item>         <item name="android:padding">8dp</item>         <item name="android:background">#e6e6e6</item>         <item name="android:textSize">32sp</item>         <item name="android:textColor">#727272</item>     </style> </resources> This style defines several graphical properties, most of which should be self-explanatory with the possible exception of gravity, which here refers to how content is justified within the view. We will cover measurements and units later in the book, but for now, it is useful to understand dp and sp: Density-independent pixel (dp): Android runs on an enormous number of devices, with screen densities ranging from 120 dpi to 480 dpi and more. To simplify the process of developing for such a wide variety, Android uses a virtual pixel unit based on a 160 dpi screen. This allows us to develop for a particular screen size without having to worry about screen density. Scale-independent pixel (sp): This unit is designed to be applied to text. The reason it is scale-independent is because the actual text size on a user's device will depend on their font size settings. To apply the style we just defined, open the activity_main.xml file (from res/layouts, if you have closed it) and edit the TextView node so that it matches this: <TextView     style="@style/myStyle"     android_text="@string/hello_world" /> The effects of applying this style can be seen immediately from the design tab or preview pane, and having seen how styles are applied, we can now go ahead and create a style to customize the material theme palette. Customizing the material theme One of the most useful features of the material theme is the way it can take a small palette made of only a handful of colors and incorporate these colors into every aspect of a UI. Text and cursor colors, the way things are highlighted, and even system features such as the status and navigation bars can be customized to give our apps brand colors and an easily recognizable look. 
The use of color in material design is a topic in itself, and there are strict guidelines regarding color, shade, and text, and these will be covered in detail later in the book. For now, we will just look at how we can use a style to apply our own colors to a material theme. So as to keep our resources separate, and therefore easier to manage, we will define our palette in its own XML file. As we did earlier with the my_style.xml file, create a new values resource file in the values directory and call it colors. Complete the code as shown next: <?xml version="1.0" encoding="utf-8"?> <resources>     <color name="primary">#FFC107</color>     <color name="primary_dark">#FFA000</color>     <color name="primary_light">#FFECB3</color>     <color name="accent">#03A9F4</color>     <color name="text_primary">#212121</color>     <color name="text_secondary">#727272</color>     <color name="icons">#212121</color>     <color name="divider">#B6B6B6</color> </resources> In the gutter to the left of the code, you will see small, colored squares. Clicking on these will take you to a dialog with a color wheel and other color selection tools for quick color editing. We are going to apply our style to the entire app, so rather than creating a separate file, we will include our style in the theme that was set up by the project template wizard when we started the project. This theme is called AppTheme, as can be seen by opening the res/values/styles.xml (v21) file. Edit the code in this file so that it looks like the following: <?xml version="1.0" encoding="utf-8"?> <resources>     <style name="AppTheme" parent="android:Theme.Material.Light">         <item name="android:colorPrimary">@color/primary</item>         <item name="android:colorPrimaryDark">@color/primary_dark</item>         <item name="android:colorAccent">@color/accent</item>         <item name="android:textColorPrimary">@color/text_primary</item>         <item name="android:textColor">@color/text_secondary</item>     </style> </resources> Being able to set key colors, such as colorPrimary and colorAccent, allows us to incorporate our brand colors throughout the app, although so far the only visible changes from the project template are the colors of the status bar and app bar. Try adding radio buttons or text edit boxes to see how the accent color is applied. In the following figure, a timepicker replaces the original text view: The XML for this looks like the following lines: <TimePicker     android:layout_width="wrap_content"     android:layout_height="wrap_content"     android:layout_alignParentBottom="true"     android:layout_centerHorizontal="true" /> For now, it is not necessary to know all the color guidelines. Until we get to them, there is an online material color palette generator at http://www.materialpalette.com/ that lets you try out different palette combinations and download color XML files that can just be cut and pasted into the editor. With a complete and up-to-date development environment constructed, and a way to customize and adapt the material theme, we are now ready to look into how material-specific widgets, such as card views, are implemented. 
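One last practical note before moving on: the theme only takes effect because the manifest points at it. The blank activity template normally generates this for you, so the following fragment is just a sketch of what to look for in AndroidManifest.xml; the other attributes will vary with your project settings:

<application
    android:allowBackup="true"
    android:icon="@mipmap/ic_launcher"
    android:label="@string/app_name"
    android:theme="@style/AppTheme">
    <!-- activities are declared here -->
</application>

If a theme ever appears to have no effect, this android:theme attribute, or a per-activity override of it, is the first thing to check.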
Finally, we have touched on material palettes, and how to customize a theme to utilize our own brand colors across an app. With these basics covered, we can move on to explore material design further, and in the next chapter, we will look at layouts and material components in greater detail. To learn more about material design, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Instant Responsive Web Design (https://www.packtpub.com/web-development/instant-responsive-web-design-instant) Mobile Game Design Essentials (https://www.packtpub.com/game-development/mobile-game-design-essentials) Resources for Article: Further resources on this subject: Speaking Java – Your First Game [article] Metal API: Get closer to the bare metal with Metal API [article] Looking Good – The Graphical Interface [article]
Reactive Programming and the Flux Architecture

Packt
18 Feb 2016
12 min read
Reactive programming, including functional reactive programming as will be discussed later, is a programming paradigm that can be used in multiparadigm languages such as JavaScript, Python, Scala, and many more. It is primarily distinguished from imperative programming, in which a statement does something by way of what the literature about functional and reactive programming calls side effects. Please note, though, that side effects here are not what they are in common English, where all medications have some effects, which are the point of taking the medication, and some other effects are unwanted but are tolerated for the main benefit. For example, Benadryl is taken for the express purpose of reducing symptoms of airborne allergies, and the fact that Benadryl, in a way similar to some other allergy medicines, can also cause drowsiness is (or at least was; now it is also sold as a sleeping aid) a side effect. This is unwelcome but tolerated as the lesser of two evils by people, who would rather be somewhat tired and not bothered by allergies than be alert but bothered by frequent sneezing. What a programmer calls side effects, however, are rarely unwanted extras of this kind. For a programmer, side effects are the primary intended purpose and effect of a statement, often implemented through changes in the stored state for a program. (For more resources related to this topic, see here.) Reactive programming has its roots in the observer pattern, as discussed in Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides's classic book Design Patterns: Elements of Reusable Object-Oriented Software (the authors of this book are commonly called GoF or Gang of Four). In the observer pattern, there is an observable subject. It has a list of listeners, and notifies all of them when it has something to publish. This is somewhat simpler than the publisher/subscriber (PubSub) pattern, as it lacks the potentially intricate filtering of which messages reach which subscriber that is a normal feature of PubSub. Reactive programming has developed a life of its own, a bit like the MVC pattern-turned-buzzword, but it is best taken in connection with the broader context explored in GoF. Reactive programming, including the ReactJS framework (which is explored in this title), is intended to avoid the shared mutable state and be idempotent. This means that, as with RESTful web services, you will get the same result from a function whether you call it once or a hundred times. Pete Hunt, formerly of Facebook—perhaps the face of ReactJS as it now exists—has said that he would rather be predictable than right. If there is a bug in his code, Hunt would rather have the interface fail the same way every single time than go on elaborate hunts for heisenbugs. These are bugs that manifest only in some special and slippery edge cases, and are explored later in this book. ReactJS is called the V of MVC. That is, it is intended for user interface work and has little intention of offering other standard features. But just as the painter Paul Cézanne said of the impressionist painter Claude Monet, "Monet is only an eye, but what an eye!", so about MVC and ReactJS we can say, "ReactJS is only a view, but what a view!" 
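Since the observer pattern is the root of all of this, a minimal sketch of it in plain JavaScript may help fix the idea; the names used here are purely illustrative and are not part of ReactJS or any other framework:

// A bare-bones observable subject: it keeps a list of listeners and
// notifies every one of them when something is published.
function createSubject() {
  var listeners = [];
  return {
    subscribe: function (listener) {
      listeners.push(listener);
    },
    publish: function (value) {
      listeners.forEach(function (listener) {
        listener(value);
      });
    }
  };
}

var unreadCount = createSubject();
unreadCount.subscribe(function (count) { console.log('Badge shows', count); });
unreadCount.subscribe(function (count) { console.log('Title shows', count); });
unreadCount.publish(3); // both listeners are notified

Note that every listener receives every published value; there is no filtering of which messages reach which subscriber, which is exactly the simplification, relative to a full publish/subscribe system, mentioned above.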
In this chapter, we will be covering the following topics: Declarative programming The war on heisenbugs The Flux Architecture From pit of despair to the pit of success A complete UI teardown and rebuild JavaScript as a Domain-specific Language (DSL) Big-Coffee Notation ReactJS, the library explored in this book, was developed by Facebook and made open source in the not-too-distant past. It is shaped by some of Facebook's concerns about making a large-scale site that is safe to debug and work on, and also allowing a large number of programmers to work on different components without having to store brain-bending levels of complexity in their heads. The quotation "Simplicity is the lack of interleaving," which can be found in the videos at http://facebook.github.io/react, is not about how much or how little stuff there is on an absolute scale, but about how many moving parts you need to juggle simultaneously to work on a system (See the section on Big-Coffee Notation for further reflections). Declarative programming Probably, the biggest theoretical advantage of the ReactJS framework is that the programming is declarative rather than imperative. In imperative programming, you specify what steps need to be done; declarative programming is the programming in which you specify what needs to be accomplished without telling how it needs to be done. It may be difficult at first to shift from an imperative paradigm to a declarative paradigm, but once the shift has been made, it is well worth the effort involved to get there. Familiar examples of declarative paradigms, as opposed to imperative paradigms, include both SQL and HTML. An SQL query would be much more verbose if you had to specify how exactly to find records and filter them appropriately, let alone say how indices are to be used, and HTML would be much more verbose if, instead of having an IMG tag, you had to specify how to render an image. Many libraries, for instance, are more declarative than a rolling of your own solution from scratch. With a library, you are more likely to specify only what needs to be done and not—in addition to this—how to do it. ReactJS is not in any sense the only library or framework that is intended to provide a more declarative JavaScript, but this is one of its selling points, along with other better specifics that it offers to help teams work together and be productive. And again, ReactJS has emerged from some of Facebook's efforts in managing bugs and cognitive load while enabling developers to contribute a lot to a large-scale project. The war on Heisenbugs In modern physics, Heisenberg's uncertainty principle loosely says that there is an absolute theoretical limit to how well a particle's position and velocity can be known. Regardless of how good a laboratory's measuring equipment gets, funny things will always happen when you try to pin things down too far. Heisenbugs, loosely speaking, are subtle, slippery bugs that can be very hard to pin down. They only manifest under very specific conditions and may even fail to manifest when one attempts to investigate them (note that this definition is slightly different from the jargon file's narrower and more specific definition at http://www.catb.org/jargon/html/H/heisenbug.html, which specifies that attempting to measure a heisenbug may suppress its manifestation). This motive—of declaring war on heisenbugs—stems from Facebook's own woes and experiences in working at scale and seeing heisenbugs keep popping up. 
One thing that Pete Hunt mentioned, in not a flattering light at all, was a point where Facebook's advertisement system was understood well enough by only two engineers for them to be comfortable modifying it. This is an example of something to avoid. By contrast, Pete Hunt's remark that he would "rather be predictable than right" is a statement that, if a defectively designed lamp can catch fire and burn, he would much, much rather have it catch fire and burn immediately, the same way, every single time, than have it burn only at just the wrong phase of the moon. In the first case, the lamp will fail while the manufacturer is testing, the problem will be noticed and addressed, and lamps will not be shipped out to the public until the defect has been properly addressed. The opposite Heisenbug case is one where the lamp will spark and catch fire under just the wrong conditions, which means that a defect will not be caught until the lamps have shipped and started burning customers' homes down. "Predictable" means "fail the same way, every time, if it's going to fail at all." "Right" means "passes testing successfully, but we don't know whether they're safe to use [probably they aren't]." Now, he ultimately does, in fact, care about being right, but the choices that Facebook has made surrounding React stem from a realization that being predictable is a means to being right. It's not acceptable for a manufacturer to ship something that will always spark and catch fire when a consumer plugs it in. However, being predictable moves the problems to the front and the center, rather than being the occasional result of subtle, hard-to-pin-down interactions that will have unacceptable consequences in some rare circumstances. The choices in Flux and ReactJS are designed to make failures obvious and bring them to the surface, rather than them being manifested only in the nooks and crannies of a software labyrinth. Facebook's war on the shared mutable state is illustrated in the experience that they had regarding a chat bug. The chat bug became an overarching concern for its users. One crucial moment of enlightenment for Facebook came when they announced a completely unrelated feature, and the first comment on this feature was a request to fix the chat; it got 898 likes. Also, they commented that this was one of the more polite requests. The problem was that the indicator for unread messages could have a phantom positive message count when there were no messages available. Things came to a point where people seemed not to care about what improvements or new features Facebook was adding, but just wanted them to fix the phantom message count. And they kept investigating and kept addressing edge cases, but the phantom message count kept on recurring. The solution, besides ReactJS, was found in the flux pattern, or architecture, which is discussed in the next section. After a situation where not too many people felt comfortable making changes, all of a sudden, many more people felt comfortable making changes. These things simplified matters enough that new developers tended not to really need the ramp-up time and treatment that had previously been given. Furthermore, when there was a bug, the more experienced developers could guess with reasonable accuracy what part of the system was the culprit, and the newer developers, after working on a bug, tended to feel confident and have a general sense of how the system worked. 
The Flux Architecture One of the ways in which Facebook, in relation to ReactJS, has declared war on heisenbugs is by declaring war on the mutable state. Flux is an architecture and a pattern, rather than a specific technology, and it can be used (or not used) with ReactJS. It is somewhat like MVC, equivalent to a loose competitor to that approach, but it is very different from a simple MVC variant and is designed to have a pit of success that provides unidirectional data flow like this: from the action to the dispatcher, then to the store, and finally to the view (but some people have said that these two are so different that a direct comparison between Flux and MVC, in terms of trying to identify what part of Flux corresponds to what conceptual hook in MVC, is not really that helpful). Actions are like events—they are fed into a top funnel. Dispatchers go through the funnels and can not only pass actions but also make sure that no additional actions are dispatched until the previous one has completely settled out. Stores have similarities to and differences from models. They are like models in that they keep track of state. They are unlike models in that they have only getters, not setters, which prevents any part of the program that has access to a store from changing its state the way it could through a model's setters. Stores can accept input, but in a very controlled way, and in general a store is not at the mercy of anything possessing a reference to it. A view is what displays the current output based on what is obtained from stores. Stores, compared to models in some respects, have getters but not setters. This helps foster a kind of data flow that is not at the mercy of anyone who has access to a setter. It is possible for events to be percolated as actions, but the dispatcher acts as a traffic cop and ensures that new actions are processed only after the stores are completely settled. This de-escalates the complexity considerably. Flux simplified interactions so that Facebook developers no longer had subtle edge cases and bugs that kept coming back—the chat bug was finally dead and has not come back. Summary We just took a whirlwind tour of some of the theory surrounding reactive programming with ReactJS. This includes declarative programming, one of the selling points of ReactJS that offers something easier to work with at the end than imperative programming. The war on heisenbugs is an overriding concern surrounding decisions made by Facebook, including ReactJS. This takes place through Facebook's declared war on the shared mutable state. The Flux Architecture is used by Facebook with ReactJS to avoid some nasty classes of bugs. To learn more about Reactive Programming and the Flux Architecture, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Reactive Programming with JavaScript (https://www.packtpub.com/application-development/reactive-programming-javascript) Clojure Reactive Programming (https://www.packtpub.com/web-development/clojure-reactive-programming) Resources for Article: Further resources on this subject: The Observer Pattern [article] Concurrency in Practice [article] Introduction to Akka [article]
Test all the things with Python

Packt
18 Feb 2016
20 min read
The first testing tool we're going to look at is called doctest. The name is short for "document testing" or perhaps a "testable document". Either way, it's a literate tool designed to make it easy to write tests in such a way that computers and humans both benefit from them. Ideally, doctest does both: it informs human readers and tells the computer what to expect. Mixing tests and documentation helps us keep the documentation up-to-date with reality, make sure that the tests express the intended behavior, and reuse some of the effort involved in the documentation and test creation. (For more resources related to this topic, see here.) Where doctest performs best The design decisions that went into doctest make it particularly well suited to writing acceptance tests at the integration and system testing levels. This is because doctest mixes human-only text with examples that both humans and computers can read. This structure doesn't support or enforce any of the formalizations of testing, but it conveys information beautifully and it still provides the computer with the ability to say that works or that doesn't work. As an added bonus, it is about the easiest way to write tests you'll ever see. In other words, a doctest file is a truly excellent program specification that you can have the computer check against your actual code any time you want. API documentation also benefits from being written as doctests and checked alongside your other tests. You can even include doctests in your docstrings. The basic idea you should be getting from all this is that doctest is ideal for uses where humans and computers will both benefit from reading them. The doctest language Like program source code, doctest tests are written in plain text. The doctest module extracts the tests and ignores the rest of the text, which means that the tests can be embedded in human-readable explanations or discussions. This is the feature that makes doctest suitable for uses such as program specifications. Example – creating and running a simple doctest We are going to create a simple doctest file, to show the fundamentals of using the tool. Perform the following steps: Open a new text file in your editor, and name it test.txt. Insert the following text into the file: This is a simple doctest that checks some of Python's arithmetic >>> 2 + 2 4 >>> 3 * 3 10 We can now run the doctest. At the command prompt, change to the directory where you saved test.txt. Type the following command: $ python3 -m doctest test.txt When the test is run, doctest prints a failure report; a sketch of what that report looks like is shown at the end of this example. Result – three times three does not equal ten You just wrote a doctest file that describes a couple of arithmetic operations, and ran it to check whether Python behaved as the tests said it should. You ran the tests by telling Python to execute doctest on the file containing the tests. In this case, Python's behavior differed from the tests because, according to the tests, three times three equals ten. However, Python disagrees on that. As doctest expected one thing and Python did something different, doctest presented you with a nice little error report showing where to find the failed test, and how the actual result differed from the expected result. At the bottom of the report is a summary showing how many tests failed in each file tested, which is helpful when you have more than one file containing tests. 
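The screenshot of that report is not reproduced here, but the text doctest prints looks roughly like the following; the exact line number and counts depend on how you laid out the file:

**********************************************************************
File "test.txt", line 4, in test.txt
Failed example:
    3 * 3
Expected:
    10
Got:
    9
**********************************************************************
1 items had failures:
   1 of   2 in test.txt
***Test Failed*** 1 failures.

By default, doctest says nothing at all about examples that pass. If you would also like to see the passing examples listed as they run, the module accepts a -v (verbose) switch, as in python3 -m doctest -v test.txt, which echoes each example with a Trying:/Expecting: pair and an ok line when it matches.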
The syntax of doctests You might have already figured it out from looking at the previous example: doctest recognizes tests by looking for sections of text that look like they've been copied and pasted from a Python interactive session. Anything that can be expressed in Python is valid within a doctest. Lines that start with a >>> prompt are sent to a Python interpreter. Lines that start with a ... prompt are sent as continuations of the code from the previous line, allowing you to embed complex block statements into your doctests. Finally, any lines that don't start with >>> or ..., up to the next blank line or >>> prompt, represent the output expected from the statement. The output appears as it would in an interactive Python session, including both the return value and anything printed to the console. If you don't have any output lines, doctest assumes it to mean that the statement is expected to have no visible result on the console, which usually means that it returns None. The doctest module ignores anything in the file that isn't part of a test, which means that you can put explanatory text, HTML, line-art diagrams, or whatever else strikes your fancy in between your tests. We took advantage of this in the previous doctest to add an explanatory sentence before the test itself. Example – a more complex test Add the following code to your test.txt file, separated from the existing code by at least one blank line: Now we're going to take some more of doctest's syntax for a spin.   >>> import sys >>> def test_write(): ...     sys.stdout.write("Hello\n") ...     return True >>> test_write() Hello True Now take a moment to consider before running the test. Will it pass or fail? Should it pass or fail? Result – five tests run? Just as we discussed before, run the test using the following command: python3 -m doctest test.txt You should see a result much like the previous one. Because we added the new tests to the same file containing the tests from before, we still see the notification that three times three does not equal 10. Now, though, we also see that five tests were run, which means our new tests ran and were successful. Why five tests? As far as doctest is concerned, we added the following three tests to the file: The first one says that, when we import sys, nothing visible should happen The second test says that, when we define the test_write function, nothing visible should happen The third test says that, when we call the test_write function, Hello and True should appear on the console, in that order, on separate lines Since all three of these tests pass, doctest doesn't bother to say much about them. All it did was increase the number of tests reported at the bottom from two to five. Expecting exceptions That's all well and good for testing that things work as expected, but it is just as important to make sure that things fail when they're supposed to fail. Put another way: sometimes your code is supposed to raise an exception, and you need to be able to write tests that check that behavior as well. Fortunately, doctest follows nearly the same principle in dealing with exceptions as it does with everything else; it looks for text that looks like a Python interactive session. This means it looks for text that looks like a Python exception report and traceback, and matches it against any exception that gets raised. The doctest module does handle exceptions a little differently from the way it handles other things. It doesn't just match the text precisely and report a failure if it doesn't match. 
Exception tracebacks tend to contain many details that are not relevant to the test, but that can change unexpectedly. The doctest module deals with this by ignoring the traceback entirely: it's only concerned with the first line, Traceback (most recent call last):, which tells it that you expect an exception, and the part after the traceback, which tells it which exception you expect. The doctest module only reports a failure if one of these parts does not match. This is helpful for a second reason as well: manually figuring out what the traceback will look like, when you're writing your tests, would require a significant amount of effort and would gain you nothing. It's better to simply omit them. Example – checking for an exception This is yet another test that you can add to test.txt, this time testing some code that ought to raise an exception. Insert the following text into your doctest file, as always separated by at least one blank line: Here we use doctest's exception syntax to check that Python is correctly enforcing its grammar. The error is a missing ) on the def line. >>> def faulty(: ...     yield from [1, 2, 3, 4, 5] Traceback (most recent call last): SyntaxError: invalid syntax The test is supposed to raise an exception, so it will fail if it doesn't raise the exception or if it raises the wrong exception. Make sure that you have your mind wrapped around this: if the test code executes successfully, the test fails, because it expected an exception. Run the tests using the following command: python3 -m doctest test.txt Result – success at failing The code contains a syntax error, which means this raises a SyntaxError exception, which in turn means that the example behaves as expected; this signifies that the test passes. When dealing with exceptions, it is often desirable to be able to use a wildcard matching mechanism. The doctest provides this facility through its ellipsis directive that we'll discuss shortly. Expecting blank lines The doctest uses the first blank line after >>> to identify the end of the expected output, so what do you do when the expected output actually contains a blank line? The doctest handles this situation by matching a line that contains only the text <BLANKLINE> in the expected output with a real blank line in the actual output. Controlling doctest behavior with directives Sometimes, the default behavior of doctest makes writing a particular test inconvenient. For example, doctest might look at a trivial difference between the expected and real outputs and wrongly conclude that the test has failed. This is where doctest directives come to the rescue. Directives are specially formatted comments that you can place after the source code of a test and that tell doctest to alter its default behavior in some way. A directive comment begins with # doctest:, after which comes a comma-separated list of options that either enable or disable various behaviors. To enable a behavior, write a + (plus symbol) followed by the behavior name. To disable a behavior, write a - (minus symbol) followed by the behavior name. We'll take a look at several directives in the following sections. Ignoring part of the result It's fairly common that only part of the output of a test is actually relevant to determining whether the test passes. By using the +ELLIPSIS directive, you can make doctest treat the text ... (called an ellipsis) in the expected output as a wildcard that will match any text in the output. 
When you use an ellipsis, doctest will scan until it finds text matching whatever comes after the ellipsis in the expected output, and continue matching from there. This can lead to surprising results such as an ellipsis matching against a 0-length section of the actual output, or against multiple lines. For this reason, it needs to be used thoughtfully. Example – ellipsis test drive We're going to use the ellipsis in a few different tests to get a better feel for how it works. As an added bonus, these tests also show the use of doctest directives. Add the following code to your test.txt file: Next up, we're exploring the ellipsis. >>> sys.modules # doctest: +ELLIPSIS {...'sys': <module 'sys' (built-in)>...}   >>> 'This is an expression that evaluates to a string' ... # doctest: +ELLIPSIS 'This is ... a string'   >>> 'This is also a string' # doctest: +ELLIPSIS 'This is ... a string'   >>> import datetime >>> datetime.datetime.now().isoformat() # doctest: +ELLIPSIS '...-...-...T...:...:...' Result – ellipsis elides The tests all pass, where they would all fail without the ellipsis. The first and last tests, in which we checked for the presence of a specific module in sys.modules and confirmed a specific formatting while ignoring the contents of a string, demonstrate the kind of situation where ellipsis is really useful, because it lets you focus on the part of the output that is meaningful and ignore the rest of the output. The middle tests demonstrate how different outputs can match the same expected result when ellipsis is in play. Look at the last test. Can you imagine any output that wasn't an ISO-formatted time stamp, but that would match the example anyway? Remember that the ellipsis can match any amount of text. Ignoring white space Sometimes, white space (spaces, tabs, newlines, and their ilk) is more trouble than it's worth. Maybe you want to be able to break a single line of expected output across several lines in your test file, or maybe you're testing a system that uses lots of white space but doesn't convey any useful information with it. The doctest gives you a way to "normalize" white space, turning any sequence of white space characters, in both the expected output and in the actual output, into a single space. It then checks whether these normalized versions match. Example – invoking normality We're going to write a couple of tests that demonstrate how whitespace normalization works. Insert the following code into your doctest file: Next, a demonstration of whitespace normalization. >>> [1, 2, 3, 4, 5, 6, 7, 8, 9] ... # doctest: +NORMALIZE_WHITESPACE [1, 2, 3,  4, 5, 6,  7, 8, 9]   >>> sys.stdout.write("This text\n contains weird     spacing.\n") ... # doctest: +NORMALIZE_WHITESPACE This text contains weird spacing. 39 Result – white space matches any other white space Both of these tests pass, in spite of the fact that the result of the first one has been wrapped across multiple lines to make it easy for humans to read, and the result of the second one has had its strange newlines and indentations left out, also for human convenience. Notice how one of the tests inserts extra whitespace in the expected output, while the other one ignores extra whitespace in the actual output? When you use +NORMALIZE_WHITESPACE, you gain a lot of flexibility with regard to how things are formatted in the text file. You may have noted the value 39 on the last line of the last example. Why is that there? 
It's because the write() method returns the number of bytes that were written, which in this case happens to be 39. If you're trying this example in an environment that maps ASCII characters to more than one byte, you will see a different number here; this will cause the test to fail until you change the expected number of bytes. Skipping an example On some occasions, doctest will recognize some text as an example to be checked, when in truth you want it to be simply text. This situation is rarer than it might at first seem, because usually there's no harm in letting doctest check everything it can. In fact, usually it's very helpful to have doctest check everything it can. For those times when you want to limit what doctest checks, though, there's the +SKIP directive. Example – humans only Append the following code to your doctest file: Now we're telling doctest to skip a test   >>> 'This test would fail.' # doctest: +SKIP If it were allowed to run. Result – it looks like a test, but it's not Before we added this last example to the file, doctest reported thirteen tests when we ran the file through it. After adding this code, doctest still reports thirteen tests. Adding the skip directive to the code completely removed it from consideration by doctest. It's not a test that passes, nor a test that fails. It's not a test at all. The other directives There are a number of other directives that can be issued to doctest, should you find the need. They're not as broadly useful as the ones already mentioned, but the time might come when you require one or more of them. The full documentation for all of the doctest directives can be found at http://docs.python.org/3/library/doctest.html#doctest-options. The remaining directives of doctest in the Python 3.4 version are as follows: DONT_ACCEPT_TRUE_FOR_1: This makes doctest differentiate between boolean values and numbers DONT_ACCEPT_BLANKLINE: This removes support for the <BLANKLINE> feature IGNORE_EXCEPTION_DETAIL: This makes doctest only care that an exception is of the expected type Strictly speaking, doctest supports several other options that can be set using the directive syntax, but they don't make any sense as directives, so we'll ignore them here. The execution scope of doctest tests When doctest is running the tests from text files, all the tests from the same file are run in the same execution scope. This means that, if you import a module or bind a variable in one test, that module or variable is still available in later tests. We took advantage of this fact several times in the tests written so far in this article: the sys module was only imported once, for example, although it was used in several tests. This behavior is not necessarily beneficial, because tests need to be isolated from each other. We don't want them to contaminate each other because, if a test depends on something that another test does, or if it fails because of something that another test does, these two tests are in some sense combined into one test that covers a larger section of your code. You don't want that to happen, because then knowing which test has failed doesn't give you as much information about what went wrong and where it happened. So, how can we give each test its own execution scope? There are a few ways to do it. One would be to simply place each test in its own file, along with whatever explanatory text that is needed. 
This works well in terms of functionality, but running the tests can be a pain unless you have a tool to find and run all of them for you. Another problem with this approach is that this breaks the idea that the tests contribute to a human-readable document. Another way to give each test its own execution scope is to define each test within a function, as follows: >>> def test1(): ...     import frob ...     return frob.hash('qux') >>> test1() 77 By doing this, the only thing that ends up in the shared scope is the test function (named test1 here). The frob module and any other names bound inside the function are isolated with the caveat that things that happen inside imported modules are not isolated. If the frob.hash() method changes a state inside the frob module, that state will still be changed if a different test imports the frob module again. The third way is to exercise caution with the names you create, and be sure to set them to known values at the beginning of each test section. In many ways this is the easiest approach, but this is also the one that places the most burden on you, because you have to keep track of what's in the scope. Why does doctest behave in this way, instead of isolating tests from each other? The doctest files are intended not just for computers to read, but also for humans. They often form a sort of narrative, flowing from one thing to the next. It would break the narrative to be constantly repeating what came before. In other words, this approach is a compromise between being a document and being a test framework, a middle ground that works for both humans and computers. Check your understanding Once you've decided on your answers to these questions, check them by writing a test document and running it through doctest: How does doctest recognize the beginning of a test in a document? How does doctest know when a test continues to further lines? How does doctest recognize the beginning and end of the expected output of a test? How would you tell doctest that you want to break the expected output across several lines, even though that's not how the test actually outputs it? Which parts of an exception report are ignored by doctest? When you assign a variable in a test file, which parts of the file can actually see that variable? Why do we care what code can see the variables created by a test? How can we make doctest not care what a section of output contains? Exercise – English to doctest Time to stretch your wings a bit. I'm going to give you a description of a single function in English. Your job is to copy the description into a new text file, and then add tests that describe all the requirements in a way that the computer can understand and check. Try to make the doctests so that they're not just for the computer. Good doctests tend to clarify things for human readers as well. By and large, this means that you present them to human readers as examples interspersed with the text. Without further ado, here is the English description: The fib(N) function takes a single integer as its only parameter N. If N is 0 or 1, the function returns 1. If N is less than 0, the function raises a ValueError. Otherwise, the function returns the sum of fib(N – 1) and fib(N – 2). The returned value will never be less than 1. A naïve implementation of this function would get very slow as N increased. I'll give you a hint and point out that the last sentence about the function being slow, isn't really testable. 
As computers get faster, any test you write that depends on an arbitrary definition of "slow" will eventually fail. Also, there's no good way to test the difference between a slow function and a function stuck in an infinite loop, so there's not much point in trying. If you find yourself needing to do that, it's best to back off and try a different solution. Not being able to tell whether a function is stuck or just slow is called the halting problem by computer scientists. We know that it can't be solved unless we someday discover a fundamentally better kind of computer. Faster computers won't do the trick, and neither will quantum computers, so don't hold your breath. The next-to-last sentence also provides some difficulty, since to test it completely would require running every positive integer through the fib() function, which would take forever (except that the computer will eventually run out of memory and force Python to raise an exception). How do we deal with this sort of thing, then? The best solution is to check whether the condition holds true for a random sample of viable inputs. The random.randrange() and random.choice() functions in the Python standard library make that fairly easy to do. Summary We learned the syntax of doctest, and went through several examples describing how to use it. After that, we took a real-world specification for the AVL tree, and examined how to formalize it as a set of doctests, so that we could use it to automatically check the correctness of an implementation. Specifically, we covered doctest's default syntax and the directives that alter it, how to write doctests in text files, how to write doctests in Python docstrings, and what it feels like to use doctest to turn a specification into tests. If, you want to learn more about Python Testing then you can refer the following books: Expert Python Programming: https://www.packtpub.com/application-development/expert-python-programming Python Testing Cookbook: https://www.packtpub.com/application-development/python-testing-cookbook Resources for Article:   Further resources on this subject: Façade Pattern – Being Adaptive with Façade [article] Predicting Sports Winners with Decision Trees and pandas [article] Gradient Descent at Work [article]
Understanding PHP basics

Packt
17 Feb 2016
27 min read
In this article by Antonio Lopez Zapata, the author of the book Learning PHP 7, you need to understand not only the syntax of the language, but also its grammatical rules, that is, when and why to use each element of the language. Luckily, for you, some languages come from the same root. For example, Spanish and French are romance languages as they both evolved from spoken Latin; this means that these two languages share a lot of rules, and learning Spanish if you already know French is much easier. (For more resources related to this topic, see here.) Programming languages are quite the same. If you already know another programming language, it will be very easy for you to go through this chapter. If it is your first time though, you will need to understand from scratch all the grammatical rules, so it might take some more time. But fear not! We are here to help you in this endeavor. In this chapter, you will learn about these topics: PHP in web applications Control structures Functions PHP in web applications Even though the main purpose of this chapter is to show you the basics of PHP, doing so in a reference-manual way is not interesting enough. If we were to copy paste what the official documentation says, you might as well go there and read it by yourself. Instead, let's not forget the main purpose of this book and your main goal—to write web applications with PHP. We will show you how can you apply everything you are learning as soon as possible, before you get too bored. In order to do that, we will go through the journey of building an online bookstore. At the very beginning, you might not see the usefulness of it, but that is just because we still haven't seen all that PHP can do. Getting information from the user Let's start by building a home page. In this page, we are going to figure out whether the user is looking for a book or just browsing. How do we find this out? The easiest way right now is to inspect the URL that the user used to access our application and extract some information from there. Save this content as your index.php file: <?php $looking = isset($_GET['title']) || isset($_GET['author']); ?> <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Bookstore</title> </head> <body> <p>You lookin'? <?php echo (int) $looking; ?></p> <p>The book you are looking for is</p> <ul> <li><b>Title</b>: <?php echo $_GET['title']; ?></li> <li><b>Author</b>: <?php echo $_GET['author']; ?></li> </ul> </body> </html> And now, access http://localhost:8000/?author=Harper Lee&title=To Kill a Mockingbird. You will see that the page is printing some of the information that you passed on to the URL. For each request, PHP stores in an array—called $_GET- all the parameters that are coming from the query string. Each key of the array is the name of the parameter, and its associated value is the value of the parameter. So, $_GET contains two entries: $_GET['author'] contains Harper Lee and $_GET['title'] contains To Kill a Mockingbird. On the first highlighted line, we are assigning a Boolean value to the $looking variable. If either $_GET['title'] or $_GET['author'] exists, this variable will be true; otherwise, false. Just after that, we close the PHP tag and then we start printing some HTML, but as you can see, we are actually mixing HTML with PHP code. Another interesting line here is the second highlighted line. We are printing the content of $looking, but before that, we cast the value. Casting means forcing PHP to transform a type of value to another one. 
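A few quick, standalone casts, not part of the bookstore code, show what this looks like in practice; you can paste them into any PHP file and run it to see the results:

<?php
var_dump((int) true);      // int(1)
var_dump((int) false);     // int(0)
var_dump((string) false);  // string(0) "": an empty string, which is why we cast to int before echoing
var_dump((int) '7 books'); // int(7): the cast keeps the leading numeric part of the string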
Casting a Boolean to an integer means that the resultant value will be 1 if the Boolean is true or 0 if the Boolean is false. As $looking is true since $_GET contains valid keys, the page shows 1. If we try to access the same page without sending any information as in http://localhost:8000, the browser will say "You lookin'? 0". Depending on the settings of your PHP configuration, you will see two notice messages complaining that you are trying to access keys of the array that do not exist. Casting versus type juggling We already knew that when PHP needs a specific type of variable, it will try to transform it, which is called type juggling. But PHP is quite flexible, so sometimes, you have to be the one specifying the type that you need. When printing something with echo, PHP tries to transform everything it gets into strings. Since the string version of the false Boolean is an empty string, this would not be useful for our application. Casting the Boolean to an integer first assures that we will see a value, even if it is just "0". HTML forms HTML forms are one of the most popular ways to collect information from users. They consist of a series of fields called inputs in the HTML world and a final submit button. In HTML, the form tag contains two attributes: action, which points to where the form will be submitted, and method, which specifies which HTTP method the form will use—GET or POST. Let's see how it works. Save the following content as login.html and go to http://localhost:8000/login.html: <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Bookstore - Login</title> </head> <body> <p>Enter your details to login:</p> <form action="authenticate.php" method="post"> <label>Username</label> <input type="text" name="username" /> <label>Password</label> <input type="password" name="password" /> <input type="submit" value="Login"/> </form> </body> </html> This form contains two fields, one for the username and one for the password. You can see that they are identified by the name attribute. If you try to submit this form, the browser will show you a Page Not Found message, as it is trying to access http://localhost:8000/authenticate.php and the web server cannot find it. Let's create it then: <?php $submitted = !empty($_POST); ?> <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Bookstore</title> </head> <body> <p>Form submitted? <?php echo (int) $submitted; ?></p> <p>Your login info is</p> <ul> <li><b>username</b>: <?php echo $_POST['username']; ?></li> <li><b>password</b>: <?php echo $_POST['password']; ?></li> </ul> </body> </html> As with $_GET, $_POST is an array that contains the parameters received by POST. In this piece of code, we are first asking whether that array is not empty—note the ! operator. Afterwards, we just display the information received, just as in index.php. Note that the keys of the $_POST array are the values for the name argument of each input field. Control structures So far, our files have been executed line by line. Due to that, we are getting some notices on some scenarios, such as when the array does not contain what we are looking for. Would it not be nice if we could choose which lines to execute? Control structures to the rescue! A control structure is like a traffic diversion sign. It directs the execution flow depending on some predefined conditions. There are different control structures, but we can categorize them into conditionals and loops. 
A conditional allows us to choose whether to execute a statement or not. A loop will execute a statement as many times as you need. Let's take a look at each one of them. Conditionals A conditional evaluates a Boolean expression, that is, something that evaluates to either true or false. If the expression is true, it will execute everything inside its block of code. A block of code is a group of statements enclosed by {}. Let's see how it works: <?php echo "Before the conditional."; if (4 > 3) { echo "Inside the conditional."; } if (3 > 4) { echo "This will not be printed."; } echo "After the conditional."; In this piece of code, we are using two conditionals. A conditional is defined by the keyword if followed by a Boolean expression in parentheses and by a block of code. If the expression is true, it will execute the block; otherwise, it will skip it. You can increase the power of conditionals by adding the keyword else. This tells PHP to execute a block of code if the previous conditions were not satisfied. Let's see an example: if (2 > 3) { echo "Inside the conditional."; } else { echo "Inside the else."; } This will execute the code inside else as the condition of if was not satisfied. Finally, you can also add an elseif keyword followed by another condition and block of code to continue asking PHP for more conditions. You can add as many elseif clauses as you need after if. If you add else, it has to be the last one of the chain of conditions. Also keep in mind that as soon as PHP finds a condition that resolves to true, it will stop evaluating the rest of the conditions: <?php if (4 > 5) { echo "Not printed"; } elseif (4 > 4) { echo "Not printed"; } elseif (4 == 4) { echo "Printed."; } elseif (4 > 2) { echo "Not evaluated."; } else { echo "Not evaluated."; } if (4 == 4) { echo "Printed"; } In this last example, the first condition that evaluates to true is the one that is highlighted. After that, PHP does not evaluate any more conditions until a new if starts. With this knowledge, let's try to clean up a bit of our application, executing statements only when needed. Copy this code to your index.php file: <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Bookstore</title> </head> <body> <p> <?php if (isset($_COOKIE['username'])) { echo "You are " . $_COOKIE['username']; } else { echo "You are not authenticated."; } ?> </p> <?php if (isset($_GET['title']) && isset($_GET['author'])) { ?> <p>The book you are looking for is</p> <ul> <li><b>Title</b>: <?php echo $_GET['title']; ?></li> <li><b>Author</b>: <?php echo $_GET['author']; ?></li> </ul> <?php } else { ?> <p>You are not looking for a book?</p> <?php } ?> </body> </html> In this new code, we are mixing conditionals and HTML code in two different ways. The first one opens a PHP tag and adds an if-else clause that will print whether we are authenticated or not with echo. No HTML is merged within the conditionals, which makes it clear. The second option—the second highlighted block—shows an uglier solution, but this is sometimes necessary. When you have to print a lot of HTML code, echo is not that handy, and it is better to close the PHP tag; print all the HTML you need and then open the tag again. You can do that even inside the code block of an if clause, as you can see in the code. Mixing PHP and HTML If you feel like the last file we edited looks rather ugly, you are right. Mixing PHP and HTML is confusing, and you have to avoid it by all means. 
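One simple discipline that helps is to do all the PHP work at the top of the file and leave the markup below with nothing but echo statements. The following is only a sketch of that idea rather than a change to the bookstore code; the variable names are illustrative:

<?php
// Gather and prepare everything first...
$looking = isset($_GET['title']) || isset($_GET['author']);
$title = isset($_GET['title']) ? $_GET['title'] : 'unknown';
$author = isset($_GET['author']) ? $_GET['author'] : 'unknown';
?>
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Bookstore</title>
</head>
<body>
    <!-- ...so that the markup only has to print values -->
    <p>Are you looking for a book? <?php echo (int) $looking; ?></p>
    <p><b>Title</b>: <?php echo $title; ?></p>
    <p><b>Author</b>: <?php echo $author; ?></p>
</body>
</html>

Even this small separation keeps most of the logic out of the markup and makes both halves easier to read.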
Let's edit our authenticate.php file too, as it is trying to access $_POST entries that might not be there. The new content of the file would be as follows: <?php $submitted = isset($_POST['username']) && isset($_POST['password']); if ($submitted) { setcookie('username', $_POST['username']); } ?> <!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Bookstore</title> </head> <body> <?php if ($submitted): ?> <p>Your login info is</p> <ul> <li><b>username</b>: <?php echo $_POST['username']; ?></li> <li><b>password</b>: <?php echo $_POST['password']; ?></li> </ul> <?php else: ?> <p>You did not submit anything.</p> <?php endif; ?> </body> </html> This code also contains conditionals, which we already know. We are setting a variable to know whether we've submitted a login or not, and we set the cookie if we have. However, the highlighted lines show you a new way of including conditionals with HTML. This syntax tries to be more readable when working with HTML code, avoiding the use of {} and instead using : and endif. Both syntaxes are correct, and you should use the one that you consider more readable in each case. Switch-case Another control structure similar to if-else is switch-case. This structure evaluates only one expression and executes the block depending on its value. Let's see an example: <?php switch ($title) { case 'Harry Potter': echo "Nice story, a bit too long."; break; case 'Lord of the Rings': echo "A classic!"; break; default: echo "Dunno that one."; break; } The switch case takes an expression; in this case, a variable. It then defines a series of cases. When the case matches the current value of the expression, PHP executes the code inside it. As soon as PHP finds break, it will exit switch-case. In case none of the cases are suitable for the expression, if there is a default case, PHP will execute it, but this is optional. You also need to know that breaks are mandatory if you want to exit switch-case. If you do not specify any, PHP will keep on executing statements, even if it encounters a new case. Let's see a similar example but without breaks: <?php $title = 'Twilight'; switch ($title) { case 'Harry Potter': echo "Nice story, a bit too long."; case 'Twilight': echo 'Uh...'; case 'Lord of the Rings': echo "A classic!"; default: echo "Dunno that one."; } If you test this code in your browser, you will see that it is printing "Uh...A classic!Dunno that one.". PHP found that the second case is valid so it executes its content. But as there are no breaks, it keeps on executing until the end. This might be the desired behavior sometimes, but not usually, so we need to be careful when using it! Loops Loops are control structures that allow you to execute certain statements several times—as many times as you need. You might use them in several different scenarios, but the most common one is when interacting with arrays. For example, imagine you have an array with elements but you do not know what is in it. You want to print all its elements so you loop through all of them. There are four types of loops. Each of them has its own use cases, but in general, you can transform one type of loop into another. Let's see them closely. While While is the simplest of the loops. It executes a block of code until the expression to evaluate returns false. Let's see one example: <?php $i = 1; while ($i < 4) { echo $i . " "; $i++; } Here, we are defining a variable with the value 1. Then, we have a while clause in which the expression to evaluate is $i < 4. 
This loop will execute the content of the block of code until that expression is false. As you can see, inside the loop we are incrementing the value of $i by 1 each time, so after three iterations, the loop will end. Check out the output of that script, and you will see "1 2 3". The last value printed is 3, so by that time, $i was 3. After that, we increased its value to 4, so when the while was evaluating whether $i < 4, the result was false. Whiles and infinite loops One of the most common problems with while loops is creating an infinite loop. If the code inside the while does not update any of the variables considered in the while expression so that it can become false at some point, PHP will never exit the loop! For This is the most complex of the four loops. For defines an initialization expression, an exit condition, and the end of the iteration expression. When PHP first encounters the loop, it executes what is defined as the initialization expression. Then, it evaluates the exit condition, and if it resolves to true, it enters the loop. After executing everything inside the loop, it executes the end of the iteration expression. Once this is done, it will evaluate the end condition again, going through the loop code and the end of iteration expression until it evaluates to false. As always, an example will help clarify this: <?php for ($i = 1; $i < 10; $i++) { echo $i . " "; } The initialization expression is $i = 1 and is executed only the first time. The exit condition is $i < 10, and it is evaluated at the beginning of each iteration. The end of the iteration expression is $i++, which is executed at the end of each iteration. This example prints numbers from 1 to 9. Another more common usage of the for loop is with arrays: <?php $names = ['Harry', 'Ron', 'Hermione']; for ($i = 0; $i < count($names); $i++) { echo $names[$i] . " "; } In this example, we have an array of names. As it is defined as a list, its keys will be 0, 1, and 2. The loop initializes the $i variable to 0, and it will iterate until the value of $i is not less than the number of elements in the array, which is 3. On the first iteration, $i is 0, on the second it will be 1, and on the third it will be 2. When $i is 3, it will not enter the loop as the exit condition evaluates to false. On each iteration, we are printing the content of the $i position of the array; hence, the result of this code will be all three names in the array. Be careful with exit conditions It is very common to set an exit condition that is not exactly what we need, especially with arrays. Remember that arrays start with 0 if they are a list, so an array of 3 elements will have entries 0, 1, and 2. Defining the exit condition as $i <= count($array) will cause an error in your code, as when $i is 3, it also satisfies the exit condition and the loop will try to access the key 3, which does not exist. Foreach The last, but not least, type of loop is foreach. This loop is exclusive to arrays, and it allows you to iterate an array entirely, even if you do not know its keys. There are two options for the syntax, as you can see in these examples: <?php $names = ['Harry', 'Ron', 'Hermione']; foreach ($names as $name) { echo $name . " "; } foreach ($names as $key => $name) { echo $key . " -> " . $name . " "; } The foreach loop accepts an array; in this case, $names. It specifies a variable, which will contain the value of the entry of the array. You can see that we do not need to specify any end condition, as PHP will know when the array has been iterated. 
Optionally, you can specify a variable that will contain the key of each iteration, as in the second loop. Foreach loops are also useful with maps, where the keys are not necessarily numeric. The order in which PHP will iterate the array will be the same order in which you used to insert the content in the array. Let's use some loops in our application. We want to show the available books in our home page. We have the list of books in an array, so we will have to iterate all of them with a foreach loop, printing some information from each one. Append the following code to the body tag in index.php: <?php endif; $books = [ [ 'title' => 'To Kill A Mockingbird', 'author' => 'Harper Lee', 'available' => true, 'pages' => 336, 'isbn' => 9780061120084 ], [ 'title' => '1984', 'author' => 'George Orwell', 'available' => true, 'pages' => 267, 'isbn' => 9780547249643 ], [ 'title' => 'One Hundred Years Of Solitude', 'author' => 'Gabriel Garcia Marquez', 'available' => false, 'pages' => 457, 'isbn' => 9785267006323 ], ]; ?> <ul> <?php foreach ($books as $book): ?> <li> <i><?php echo $book['title']; ?></i> - <?php echo $book['author']; ?> <?php if (!$book['available']): ?> <b>Not available</b> <?php endif; ?> </li> <?php endforeach; ?> </ul> The highlighted code shows a foreach loop using the : notation, which is better when mixing it with HTML. It iterates all the $books arrays, and for each book, it will print some information as a HTML list. Also note that we have a conditional inside a loop, which is perfectly fine. Of course, this conditional will be executed for each entry in the array, so you should keep the block of code of your loops as simple as possible. Functions A function is a reusable block of code that, given an input, performs some actions and optionally returns a result. You already know several predefined functions, such as empty, in_array, or var_dump. These functions come with PHP so you do not have to reinvent the wheel, but you can create your own very easily. You can define functions when you identify portions of your application that have to be executed several times or just to encapsulate some functionality. Function declaration Declaring a function means to write it down so that it can be used later. A function has a name, takes arguments, and has a block of code. Optionally, it can define what kind of value is returning. The name of the function has to follow the same rules as variable names; that is, it has to start by a letter or underscore and can contain any letter, number, or underscore. It cannot be a reserved word. Let's see a simple example: function addNumbers($a, $b) { $sum = $a + $b; return $sum; } $result = addNumbers(2, 3); Here, the function's name is addNumbers, and it takes two arguments: $a and $b. The block of code defines a new variable $sum that is the sum of both the arguments and then returns its content with return. In order to use this function, you just need to call it by its name, sending all the required arguments, as shown in the highlighted line. PHP does not support overloaded functions. Overloading refers to the ability of declaring two or more functions with the same name but different arguments. As you can see, you can declare the arguments without knowing what their types are, so PHP would not be able to decide which function to use. Another important thing to note is the variable scope. We are declaring a $sum variable inside the block of code, so once the function ends, the variable will not be accessible any more. 
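As a quick, minimal sketch of that scope rule (reusing the addNumbers function from above; the exact notice text depends on your PHP version), trying to read $sum after the call fails:

<?php
function addNumbers($a, $b) {
    // $sum only exists while this function is running
    $sum = $a + $b;
    return $sum;
}

$result = addNumbers(2, 3);
var_dump($result); // int(5)
var_dump($sum);    // "Undefined variable" notice/warning, then NULL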
This means that the scope of variables declared inside the function is just the function itself. Furthermore, if you had a $sum variable declared outside the function, it would not be affected at all since the function cannot access that variable unless we send it as an argument. Function arguments A function gets information from outside via arguments. You can define any number of arguments—including 0. These arguments need at least a name so that they can be used inside the function, and there cannot be two arguments with the same name. When invoking the function, you need to send the arguments in the same order as we declared them. A function may contain optional arguments; that is, you are not forced to provide a value for those arguments. When declaring the function, you need to provide a default value for those arguments, so in case the user does not provide a value, the function will use the default one: function addNumbers($a, $b, $printResult = false) { $sum = $a + $b; if ($printResult) { echo 'The result is ' . $sum; } return $sum; } $sum1 = addNumbers(1, 2); $sum1 = addNumbers(3, 4, false); $sum1 = addNumbers(5, 6, true); // it will print the result This new function takes two mandatory arguments and an optional one. The default value is false, and is used as a normal value inside the function. The function will print the result of the sum if the user provides true as the third argument, which happens only the third time that the function is invoked. For the first two times, $printResult is set to false. The arguments that the function receives are just copies of the values that the user provided. This means that if you modify these arguments inside the function, it will not affect the original values. This feature is known as sending arguments by a value. Let's see an example: function modify($a) { $a = 3; } $a = 2; modify($a); var_dump($a); // prints 2 We are declaring the $a variable with the value 2, and then we are calling the modify method, sending $a. The modify method modifies the $a argument, setting its value to 3. However, this does not affect the original value of $a, which reminds to 2 as you can see in the var_dump function. If what you want is to actually change the value of the original variable used in the invocation, you need to pass the argument by reference. To do that, you add & in front of the argument when declaring the function: function modify(&$a) { $a = 3; } Now, after invoking the modify function, $a will be always 3. Arguments by value versus by reference PHP allows you to do it, and in fact, some native functions of PHP use arguments by reference—remember the array sorting functions; they did not return the sorted array; instead, they sorted the array provided. But using arguments by reference is a way of confusing developers. Usually, when someone uses a function, they expect a result, and they do not want their provided arguments to be modified. So, try to avoid it; people will be grateful! The return statement You can have as many return statements as you want inside your function, but PHP will exit the function as soon as it finds one. This means that if you have two consecutive return statements, the second one will never be executed. Still, having multiple return statements can be useful if they are inside conditionals. Add this function inside your functions.php file: function loginMessage() { if (isset($_COOKIE['username'])) { return "You are " . 
$_COOKIE['username']; } else { return "You are not authenticated."; } } Let's use it in your index.php file by replacing the highlighted content—note that to save some tees, I replaced most of the code that was not changed at all with //…: //... <body> <p><?php echo loginMessage(); ?></p> <?php if (isset($_GET['title']) && isset($_GET['author'])): ?> //... Additionally, you can omit the return statement if you do not want the function to return anything. In this case, the function will end once it reaches the end of the block of code. Type hinting and return types With the release of PHP7, the language allows developers to be more specific about what functions get and return. You can—always optionally—specify the type of argument that the function needs, for example, type hinting, and the type of result the function will return—return type. Let's first see an example: <?php declare(strict_types=1); function addNumbers(int $a, int $b, bool $printSum): int { $sum = $a + $b; if ($printSum) { echo 'The sum is ' . $sum; } return $sum; } addNumbers(1, 2, true); addNumbers(1, '2', true); // it fails when strict_types is 1 addNumbers(1, 'something', true); // it always fails This function states that the arguments need to be an integer, and Boolean, and that the result will be an integer. Now, you know that PHP has type juggling, so it can usually transform a value of one type to its equivalent value of another type, for example, the string 2 can be used as integer 2. To stop PHP from using type juggling with the arguments and results of functions, you can declare the strict_types directive as shown in the first highlighted line. This directive has to be declared on the top of each file, where you want to enforce this behavior. The three invocations work as follows: The first invocation sends two integers and a Boolean, which is what the function expects. So, regardless of the value of strict_types, it will always work. The second invocation sends an integer, a string, and a Boolean. The string has a valid integer value, so if PHP was allowed to use type juggling, the invocation would resolve to just normal. But in this example, it will fail because of the declaration on top of the file. The third invocation will always fail as the something string cannot be transformed into a valid integer. Let's try to use a function within our project. In our index.php file, we have a foreach loop that iterates the books and prints them. The code inside the loop is kind of hard to understand as it is mixing HTML with PHP, and there is a conditional too. Let's try to abstract the logic inside the loop into a function. First, create the new functions.php file with the following content: <?php function printableTitle(array $book): string { $result = '<i>' . $book['title'] . '</i> - ' . $book['author']; if (!$book['available']) { $result .= ' <b>Not available</b>'; } return $result; } This file will contain our functions. The first one, printableTitle, takes an array representing a book and builds a string with a nice representation of the book in HTML. The code is the same as before, just encapsulated in a function. Now, index.php will have to include the functions.php file and then use the function inside the loop. Let's see how this is done: <?php require_once 'functions.php' ?> <!DOCTYPE html> <html lang="en"> //... ?> <ul> <?php foreach ($books as $book): ?> <li><?php echo printableTitle($book); ?> </li> <?php endforeach; ?> </ul> //... Well, now our loop looks way cleaner, right? 
Also, if we need to print the title of the book somewhere else, we can reuse the function instead of duplicating code! Summary In this article, we went through all the basics of procedural PHP while writing simple examples in order to practice them. You now know how to use variables and arrays with control structures and functions and how to get information from HTTP requests among others. Resources for Article: Further resources on this subject: Getting started with Modernizr using PHP IDE[article] PHP 5 Social Networking: Implementing Public Messages[article] Working with JSON in PHP jQuery[article]

Hands On with Docker Swarm

Packt
17 Feb 2016
16 min read
In this article, we will be taking a look at Docker Swarm. With Docker Swarm, you can create and manage Docker clusters. Swarm can be used to disperse containers across multiple hosts. It also has the ability to know how to scale containers as well. In this article, we will be covering the following topics: Installing Docker Swarm The Docker Swarm components Docker Swarm usage The Docker Swarm commands The Docker Swarm topics (For more resources related to this topic, see here.) Docker Swarm install Let's get things started by the typical way of installing Docker Swarm. Docker Swarm is only available for Linux and Mac OS X. The installation process for both is the same. Let's take a look at how we install Docker Swarm. Installation Ensure that you already have Docker installed, either through the curl command on Linux or through Docker Toolbox on Mac OS X. Once you have the Docker daemon installed, installing Docker Swarm will be simple: $ docker pull swarm One command and you are up and running. That's it! Docker Swarm components What components are involved with Docker Swarm? Let's take a look at the three components of Docker Swarm: Swarm Swarm manager Swarm host Swarm Docker Swarm is the container that runs on each Swarm host. Swarm uses a unique token for each cluster to be able to join the cluster. The Swarm container itself is the one that communicates on behalf of that Docker host to the other Docker hosts that are running Docker Swarm as well as the Docker Swarm manager. Swarm manager The Swarm manager is the host that is the central management point for all the Swarm hosts. The Swarm manager is where you issue all your commands to control nodes. You can switch between the nodes, join nodes, remove nodes, and manipulate the hosts. Swarm host Swarm hosts, which we saw earlier as the Docker hosts, are those that run the Docker containers. The Swarm host is managed from the Swarm manager. The preceding figure is an illustration of all the Docker Swarm components. We see that the Docker Swarm manager talks to each Swarm host that is running the Swarm container. Docker Swarm usage Let's now take look at Swarm usage and how we can do the following tasks: Creating a cluster Joining nodes Removing nodes Managing nodes Creating a cluster Let's start by creating the cluster, which starts with a Swarm manager. We first need a token that can be used to join all the nodes to the cluster: $ docker run --rm swarm create 85b335f95e9a37b679e2ea9e6ad8d6361 We can now use that token to create our Swarm manager: $ docker-machine create -d virtualbox --swarm --swarm-master --swarm-discovery token://85b335f95e9a37b679e2ea9e6ad8d6361 swarm-master Creating VirtualBox VM... Creating SSH key... Starting VirtualBox VM... Starting VM... To see how to connect Docker to this machine, run docker-machine env swarm-master. The swarm-master node is now in VirtualBox. We can see this machine by doing as follows: $ docker-machine ls NAME ACTIVE DRIVER STATE URL SWARM swarm-master virtualbox Running tcp://192.168.99.101:2376 swarm-master (master) Now, let's point Docker Machine at the new Swarm master. 
The earlier output we saw when we created the Swarm master tells us how to point to the node: $ docker-machine env swarm-master export DOCKER_TLS_VERIFY="1" export DOCKER_HOST="tcp://192.168.99.102:2376" export DOCKER_CERT_PATH="/Users/spg14/.docker/machine/machines/swarm-master" export DOCKER_MACHINE_NAME="swarm-master" # Run this command to configure your shell: # eval "$(docker-machine env swarm-master)" Upon running the previous command, we are told to run the following command to point to the Swarm master: $ eval "$(docker-machine env swarm-master)" Now, if we look at what machines are on our host, we can see that we have the swarm-master host as well. It is set to ACTIVE, which means that we can now run commands against this host: $ docker-machine ls NAME ACTIVE DRIVER STATE URL SWARM swarm-master * virtualbox Running tcp://192.168.99.101:2376 swarm-master (master Joining nodes Again using the token, which we got from the earlier commands, used to create the Swarm manager, we need that same token to join nodes to that cluster: $ docker-machine create -d virtualbox --swarm --swarm-discovery token://85b335f95e9a37b679e2ea9e6ad8d6361 swarm-node1 Now, if we look at the machines on our system, we can see that they are both part of the same Swarm: $ docker-machine ls NAME ACTIVE DRIVER STATE URL SWARM swarm-master * virtualbox Running tcp://192.168.99.102:2376 swarm-master(master) swarm-node1 virtualbox Running tcp://192.168.99.103:2376 swarm-master Listing nodes First, ensure you are pointing at the Swarm master: $ docker-machine ls NAME ACTIVE DRIVER STATE URL SWARM swarm-master * virtualbox Running tcp://192.168.99.102:2376 swarm-master(master) swarm-node1 virtualbox Running tcp://192.168.99.103:2376 swarm-master Now, we can see what machines are joined to this cluster based off the token used to join them all together: $ docker run --rm swarm list token://85b335f95e9a37b679e2ea9e6ad8d6361 192.168.99.102:2376 192.168.99.103:2376 Managing a cluster Let's see how we can do some management of all of the cluster nodes we are creating. So, there are two ways you can go about managing these Swarm hosts and the containers on each host that you are creating. But first, you need to know some information about them, so we will turn to our Docker Machine command again: $ docker-machine ls NAME ACTIVE DRIVER STATE URL SWARM swarm-master * virtualbox Running tcp://192.168.99.102:2376 swarm-master(master) swarm-node1 virtualbox Running tcp://192.168.99.103:2376 swarm-master You can switch to each Swarm host like we have seen earlier by doing something similar to the following—changing the values—and by following the instructions from the output of the command: $ docker-machine env <Node_Name> But this is a lot of tedious work. There is another way we can manage these hosts and see what is going on inside them. Let's take a look at how we can do it. From the previous docker-machine ls command, we see that we are currently pointing at the swarm-master node. So, any Docker commands we issue would go against this host. 
But, if we run the following, we can get information on the swarm-node1 node: $ docker -H tcp://192.168.99.103:2376 info Containers: 1 Images: 8 Storage Driver: aufs Root Dir: /mnt/sda1/var/lib/docker/aufs Backing Filesystem: tmpfs Dirs: 10 Dirperm1 Supported: true Execution Driver: native-0.2 Logging Driver: json-file Kernel Version: 4.0.9-boot2docker Operating System: Boot2Docker 1.8.2 (TCL 6.4); master : aba6192 - Thu Sep 10 20:58:17 UTC 2015 CPUs: 1 Total Memory: 996.2 MiB Name: swarm-node1 ID: SDEC:4RXZ:O3VL:PEPC:FYWM:IGIK:CFM5:UXPS:U4S5:PNQD:5ULK:TSCE Debug mode (server): true File Descriptors: 18 Goroutines: 29 System Time: 2015-09-16T09:32:27.67035212Z EventsListeners: 1 Init SHA1: Init Path: /usr/local/bin/docker Docker Root Dir: /mnt/sda1/var/lib/docker Labels: provider=virtualbox So, we can see the information on this host such as the number of containers, the numbers of images on the host, as well as information about the CPU, memory, and so on. We can see from the earlier information that one container is running. Let's take a look at what is running on the swarm-node1 host: $ docker -H tcp://192.168.99.103:2376 ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 12a400424c87 swarm:latest "/swarm join --advert" 17 hours ago Up 17 hours 2375/tcp swarm-agent Now, you can use any of the Docker commands using this method against any Swarm host that is listed in the output of your docker-machine ls output. The Docker Swarm commands Now, let's take a look at some Docker Swarm-specific commands that we can use. Let's revert to the ever-so-helpful—the help switch on the Docker Swarm command: $ docker run --rm swarm --help Usage: swarm [OPTIONS] COMMAND [arg...] A Docker-native clustering system Version: 0.4.0 (d647d82) Options: --debug debug mode [$DEBUG] --log-level, -l "info" Log level (options: debug, info, warn, error, fatal, panic) --help, -h show help --version, -v print the version Commands: create, c Create a cluster list, l List nodes in a cluster manage, m Manage a docker cluster join, j join a docker cluster help, h Shows a list of commands or help for one command Using TLS Let's take a look at the options you can use for Docker Swarm as well as the commands that are associated with it. Options Looking over the options from the preceding output, we can see the --debug and --log level switches. The other two are straightforward, as one will just print out the help information and the other one will print out the version number that we can see in the previous output. The options are used after each of the following subcommands of Docker Swarm. For example: $ docker run --rm swarm list --debug $ docker run --rm swarm manage --debug $ docker run --rm swarm create --debug list We looked at the Swarm list command before: $ docker run --rm swarm list token://85b335f95e9a37b679e2ea9e6ad8d6361 192.168.99.102:2376 192.168.99.103:2376 But there is also a switch that we can tack onto the list command and that is the --timeout switch: $ docker run --rm swarm list --timeout 20s token://85b335f95e9a37b679e2ea9e6ad8d6361 This will allow more time to find the nodes that are a part of Swarm. It could take time for the hosts to check, depending upon things such as network latency or if they are running in different parts of the globe. create We have seen how we can create a Swarm cluster as well. What this command actually does is it gives us the token that we need to create the cluster and join all the nodes to it. 
There are no other switches that can be used with this command as we have seen with other commands: $ docker run --rm swarm create 85b335f95e9a37b679e2ea9e6ad8d6361 manage We can manage a cluster with the manage subcommand in Docker Swarm. An example of this command would look like the following, replacing the information to align with your IP address and Swarm token: $ docker run --rm swarm manage -H tcp://192.168.99.104:2376 token://85b335f95e9a37b679e2ea9e6ad8d6361 The Docker Swarm topics There are three advanced topics we will take a look at in this section: Discovery services Advanced scheduling The Docker Swarm API Discovery services You can also use services such as etcd, ZooKeeper, consul, and many others to automatically add nodes to your Swarm cluster as well as do other things such as list the nodes or manage them. Let's take a look at consul and how you can use it. This will be the same for each discovery service that you might use. It just involves switching out the word consul with the discovery service you are using. On each node, you will need to do something different in how you join the machines. Earlier, we did something like this: $ docker-machine create -d virtualbox --swarm --swarm-discovery token://85b335f95e9a37b679e2ea9e6ad8d6361 swarm-node1 Now, we would do something similar to the following (based upon the discovery service you are using): $ docker-machine create -d virtualbox --swarm join --advertise=<swarm-node1_ip:2376> consul://<consul_ip> swarm-node1 You can now start manage on your laptop or the system that you will be using as the Swarm manager. Before, we would run something like this: $ docker run --rm swarm manage -H tcp://192.168.99.104:2376 token://85b335f95e9a37b679e2ea9e6ad8d6361 Now, we run this with regards to discovery services: $ docker run --rm swarm manage -H tcp://192.168.99.104:2376 consul://<consul_ip> We can also list the nodes in this cluster as well as the discovery service: $ docker run --rm swarm list -H tcp://192.168.99.104:2376 consul://<consul_ip> You can easily switch out consul for another discovery service such as etcd or ZooKeeper; the format will still be the same: $ docker-machine create -d virtualbox --swarm join --advertise=<swarm-node1_ip:2376> etcd://<etcd_ip> swarm-node1 $ docker-machine create -d virtualbox --swarm join --advertise=<swarm-node1_ip:2376> zk://<zookeeper_ip> swarm-node1 Advanced scheduling What is advanced scheduling with regards to Docker Swarm? Docker Swarm allows you to rank nodes within your cluster. It provides three different strategies to do this. These can be used by specifying them with the --strategy switch with the swarm manage command: spread binpack random spread and binpack use the same strategy to rank your nodes. They are ranked based off of the node's available RAM and CPU as well as the number of containers that it has running on it. spread will rank the host with less containers higher than a container with more containers (assuming that the memory and CPU values are the same). spread does what the name implies; it will spread the nodes across multiple hosts. By default, spread is used with regards to scheduling. binpack will try to pack as many containers on as few hosts as possible to keep the number of Swarm hosts to a minimal. random will do just that—it will randomly pick a Swarm host to place a node on. The Swarm scheduler comes with a few filters that can be used as well. These can be assigned with the --filter switch with the swarm manage command. 
These filters can be used to assign nodes to hosts. There are five filters that are associated with it: constraint: There are three types of constraints that can be assigned to nodes: storage=: This is used if you want to specify a node that is put on a host and has SSD drives in it region=: This is used if you want to set a region; mostly used for cloud computing such as AWS or Microsoft Azure environment=: This can set a node to be put into production, development, or other created environments affinity: This filter is used to create attractions between containers. This means that you can specify a filter name and then have all those containers run on the same node. port: The port filter finds a host that has the open port needed for the node to run; it then assigns the node to that host. So, if you have a MySQL instance and need port 3306 open, it will find a host that has port 3306 open and assign the node to that host for operation. dependency: The dependency filter schedules nodes to run on the same host based off of three dependencies: --volumes-from=dependency --link=dependency:<alias> --net=container:dependency health: The health filter is pretty straightforward. It will prevent the scheduling of nodes to run on unhealthy hosts. The Swarm API Before we dive into the Swarm API, let's first make sure you understand what an API is. An API is defined as an application programming interface. An API consists of routines, protocols, and tools to build applications. Think of an API as the bricks used to build a wall. This allows you to put the wall together using those bricks. What APIs allow you to do is code in the environment you are comfortable in and reach into other environments to do the work you need. So, if you are used to coding in Python, you can still use Python to do all your work while using the Swarm API to do the work in Swarm that you would like done. 
For example, if you wanted to create a container, you would use the following in your code: POST /containers/create HTTP/1.1 Content-Type: application/json { "Hostname": "", "Domainname": "", "User": "", "AttachStdin": false, "AttachStdout": true, "AttachStderr": true, "Tty": false, "OpenStdin": false, "StdinOnce": false, "Env": null, "Cmd": [ "date" ], "Entrypoint": "", "Image": "ubuntu", "Labels": { "com.example.vendor": "Acme", "com.example.license": "GPL", "com.example.version": "1.0" }, "Mounts": [ { "Source": "/data", "Destination": "/data", "Mode": "ro,Z", "RW": false } ], "WorkingDir": "", "NetworkDisabled": false, "MacAddress": "12:34:56:78:9a:bc", "ExposedPorts": { "22/tcp": {} }, "HostConfig": { "Binds": ["/tmp:/tmp"], "Links": ["redis3:redis"], "LxcConf": {"lxc.utsname":"docker"}, "Memory": 0, "MemorySwap": 0, "CpuShares": 512, "CpuPeriod": 100000, "CpusetCpus": "0,1", "CpusetMems": "0,1", "BlkioWeight": 300, "MemorySwappiness": 60, "OomKillDisable": false, "PortBindings": { "22/tcp": [{ "HostPort": "11022" }] }, "PublishAllPorts": false, "Privileged": false, "ReadonlyRootfs": false, "Dns": ["8.8.8.8"], "DnsSearch": [""], "ExtraHosts": null, "VolumesFrom": ["parent", "other:ro"], "CapAdd": ["NET_ADMIN"], "CapDrop": ["MKNOD"], "RestartPolicy": { "Name": "", "MaximumRetryCount": 0 }, "NetworkMode": "bridge", "Devices": [], "Ulimits": [{}], "LogConfig": { "Type": "json-file", "Config": {} }, "SecurityOpt": [""], "CgroupParent": "" } } You would use the preceding example to create a container; but there are also other things you can do such as inspect containers, get the logs from a container, attach to a container, and much more. Simply put, if you can do it through the command line, there is more than likely something in the API that can be used to tie into to do it through the programming language you are using. The Docker documentation states that the Swarm API is mostly compatible with the Docker Remote API. Now we could list them out in this section. But seeing that the list could change as things could be added into the Docker Swarm API or removed, I believe, it's best to refer to the link to the Swarm API documentation here instead of listing them out, so the information is not outdated: https://docs.docker.com/swarm/api/swarm-api/ The Swarm cluster example We will now go through an example of how to set up a Docker Swarm cluster: # Create a new Docker host with Docker Machine $ docker-machine create --driver virtualbox swarm # Point to the new Docker host $ eval "$(docker-machine env swarm)" # Generate a Docker Swarm Discovery Token $ docker run swarm create # Launch the Swarm Manager $ docker-machine create --driver virtualbox --swarm --swarm-master --swarm-discovery token://<DISCOVERY_TOKEN> swarm-master # Launch a Swarm node $ docker-machine create --driver virtualbox --swarm --swarm-discovery token://<DISCOVERY_TOKEN> swarm_node-01 # Launch another Swarm node $ docker-machine create --driver virtualbox --swarm --swarm-discovery token://<DISCOVERY_TOKEN> swarm_node-02 # Point to our Swarm Manager $ eval "$(docker-machine env swarm-master)" # Execute 'docker info' command to view information about your envionment $ docker info # Execute 'docker ps -a'; will show you all the containers running as well as how they are joined to the same Swarm cluster $ docker ps -a # Run simple test $ docker run hello-world # You can then execute the 'docker ps -a' command again to see what node it ran on $ docker ps -a # You will want to look at the column labeled 'NAMES'. 
If you continue to re-run the 'docker run hello-world' command/container you will see it will run on a different Swarm node Summary In this article, we took a dive into Docker Swarm. We took a look at how to install Docker Swarm and the Docker Swarm components; these are what make up Docker Swarm. We took a look at how to use Docker Swarm; joining, listing, and managing Swarm nodes. We reviewed the Swarm commands and how to use them. We also covered some advanced Docker Swarm topics such as advanced scheduling for your jobs, discovery services to discover new containers to add to Docker Swarm, and the Docker Swarm API that you can use to tie your own code to perform the Swarm commands. Resources for Article: Further resources on this subject: Introduction to Docker[article] Docker in Production[article] Speeding Vagrant Development With Docker[article]

Introducing Object Oriented Programming with TypeScript

Packt
17 Feb 2016
13 min read
In this article, we will see how to group our functions in reusable components, such as classes or modules. This article will cover the following topics: SOLID principles Classes Association, aggregation, and composition Inheritance (For more resources related to this topic, see here.) SOLID principles In the early days of software development, developers used to write code with procedural programming languages. In procedural programming languages, the programs follow a top-to-bottom approach and the logic is wrapped with functions. New styles of computer programming, such as modular programming or structured programming, emerged when developers realized that procedural computer programs could not provide them with the desired level of abstraction, maintainability, and reusability. The development community created a series of recommended practices and design patterns to improve the level of abstraction and reusability of procedural programming languages, but some of these guidelines required a certain level of expertise. In order to facilitate adherence to these guidelines, a new style of computer programming known as object-oriented programming (OOP) was created. Developers quickly noticed some common OOP mistakes and came up with five rules that every OOP developer should follow to create a system that is easy to maintain and extend over time. These five rules are known as the SOLID principles. SOLID is an acronym introduced by Michael Feathers, which stands for the following principles: Single responsibility principle (SRP): This principle states that a software component (function, class, or module) should focus on one unique task (have only one responsibility). Open/closed principle (OCP): This principle states that software entities should be designed with application growth (new code) in mind (should be open to extension), but the application growth should require the fewer possible number of changes to the existing code (be closed for modification). Liskov substitution principle (LSP): This principle states that we should be able to replace a class in a program with another class as long as both classes implement the same interface. After replacing the class, no other changes should be required, and the program should continue to work as it did originally. Interface segregation principle (ISP): This principle states that we should split interfaces that are very large (general-purpose interfaces) into smaller and more specific ones (many client-specific interfaces) so that clients will only need to know about the methods that are of interest to them. Dependency inversion principle (DIP): This principle states that entities should depend on abstractions (interfaces) as opposed to depending on concretion (classes). In this article, we will see how to write TypeScript code that adheres to these principles so that our applications are easy to maintain and extend over time. Classes In this section, we will look at some details and OOP concepts through examples. Let's start by declaring a simple class: class Person { public name : string; public surname : string; public email : string; constructor(name : string, surname : string, email : string){ this.email = email; this.name = name; this.surname = surname; } greet() { alert("Hi!"); } } var me : Person = new Person("Remo", "Jansen", "remo.jansen@wolksoftware.com"); We use classes to represent the type of an object or entity. A class is composed of a name, attributes, and methods. 
The class in the preceding example is named Person and contains three attributes or properties (name, surname, and email) and two methods (constructor and greet). Class attributes are used to describe the object's characteristics, while class methods are used to describe its behavior. A constructor is a special method used by the new keyword to create instances (also known as objects) of our class. We have declared a variable named me, which holds an instance of the Person class. The new keyword uses the Person class's constructor to return an object whose type is Person. A class should adhere to the single responsibility principle (SRP). The Person class in the preceding example represents a person, including all their characteristics (attributes) and behaviors (methods). Now let's add some email as validation logic: class Person { public name : string; public surname : string; public email : string; constructor(name : string, surname : string, email : string) { this.surname = surname; this.name = name; if(this.validateEmail(email)) { this.email = email; } else { throw new Error("Invalid email!"); } } validateEmail() { var re = /S+@S+.S+/; return re.test(this.email); } greet() { alert("Hi! I'm " + this.name + ". You can reach me at " + this.email); } } When an object doesn't follow the SRP and it knows too much (has too many properties) or does too much (has too many methods), we say that the object is a God object. The Person class here is a God object because we have added a method named validateEmail that is not really related to the Person class's behavior. Deciding which attributes and methods should or should not be part of a class is a relatively subjective decision. If we spend some time analyzing our options, we should be able to find a way to improve the design of our classes. We can refactor the Person class by declaring an Email class, responsible for e-mail validation, and use it as an attribute in the Person class: class Email { public email : string; constructor(email : string){ if(this.validateEmail(email)) { this.email = email; } else { throw new Error("Invalid email!"); } } validateEmail(email : string) { var re = /S+@S+.S+/; return re.test(email); } } Now that we have an Email class, we can remove the responsibility of validating the emails from the Person class and update its email attribute to use the type Email instead of string: class Person { public name : string; public surname : string; public email : Email; constructor(name : string, surname : string, email : Email){ this.email = email; this.name = name; this.surname = surname; } greet() { alert("Hi!"); } } Making sure that a class has a single responsibility makes it easier to see what it does and how we can extend/improve it. We can further improve our Person and Email classes by increasing the level of abstraction of our classes. For example, when we use the Email class, we don't really need to be aware of the existence of the validateEmail method; so this method could be invisible from outside the Email class. As a result, the Email class would be much simpler to understand. When we increase the level of abstraction of an object, we can say that we are encapsulating the object's data and behavior. Encapsulation is also known as information hiding. For example, the Email class allows us to use emails without having to worry about e-mail validation because the class will deal with it for us. 
We can make this clearer by using access modifiers (public or private) to flag as private all the class attributes and methods that we want to abstract from the use of the Email class: class Email { private email : string; constructor(email : string){ if(this.validateEmail(email)) { this.email = email; } else { throw new Error("Invalid email!"); } } private validateEmail(email : string) { var re = /\S+@\S+\.\S+/; return re.test(email); } get():string { return this.email; } } We can then simply use the Email class without needing to explicitly perform any kind of validation: var email = new Email("remo.jansen@wolksoftware.com"); Interfaces The feature that we will miss the most when developing large-scale web applications with JavaScript is probably interfaces. We have seen that following the SOLID principles can help us to improve the quality of our code, and writing good code is a must when working on a large project. The problem is that if we attempt to follow the SOLID principles with JavaScript, we will soon realize that without interfaces, we will never be able to write SOLID OOP code. Fortunately for us, TypeScript features interfaces. Traditionally, in OOP, we say that a class can extend another class and implement one or more interfaces. An interface can extend one or more interfaces but cannot extend a class. Wikipedia's definition of interfaces in OOP is as follows: In object-oriented languages, the term interface is often used to define an abstract type that contains no data or code, but defines behaviors as method signatures. Implementing an interface can be understood as signing a contract. The interface is a contract, and when we sign it (implement it), we must follow its rules. The interface rules are the signatures of the methods and properties, and we must implement them. We will see many examples of interfaces later in this article. In TypeScript, interfaces don't strictly follow this definition. The two main differences are that in TypeScript: An interface can extend another interface or class An interface can define data and behaviors as opposed to only behaviors Association, aggregation, and composition In OOP, classes can have some kind of relationship with each other. Now, we will take a look at the three different types of relationships between classes. Association We call association those relationships whose objects have an independent lifecycle and where there is no ownership between the objects. Let's take an example of a teacher and student. Multiple students can associate with a single teacher, and a single student can associate with multiple teachers, but both have their own lifecycles (both can be created and deleted independently); so when a teacher leaves the school, we don't need to delete any students, and when a student leaves the school, we don't need to delete any teachers. Aggregation We call aggregation those relationships whose objects have an independent lifecycle, but there is ownership, and child objects cannot belong to another parent object. Let's take an example of a cell phone and a cell phone battery. A single battery can belong to a phone, but if the phone stops working, and we delete it from our database, the phone battery will not be deleted because it may still be functional. So in aggregation, while there is ownership, objects have their own lifecycle.
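To make the aggregation idea more concrete, here is a minimal TypeScript sketch of the cell phone and battery example; the class and attribute names are ours, added purely for illustration:

class Battery {
    public serialNumber : string;
    constructor(serialNumber : string) {
        this.serialNumber = serialNumber;
    }
}

class CellPhone {
    // The phone owns a battery, but the battery is created outside
    // and keeps existing even if the phone object is discarded.
    public battery : Battery;
    constructor(battery : Battery) {
        this.battery = battery;
    }
}

var battery = new Battery("BAT-001");
var phone = new CellPhone(battery);

phone = null;                 // we no longer need the phone...
alert(battery.serialNumber);  // ...but the battery is still usable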
Composition We use the term composition to refer to relationships whose objects don't have an independent lifecycle, and if the parent object is deleted, all child objects will also be deleted. Let's take an example of the relationship between questions and answers. Single questions can have multiple answers, and answers cannot belong to multiple questions. If we delete questions, answers will automatically be deleted. Objects with a dependent life cycle (answers, in the example) are known as weak entities. Sometimes, it can be a complicated process to decide if we should use association, aggregation, or composition. This difficulty is caused in part because aggregation and composition are subsets of association, meaning they are specific cases of association. Inheritance One of the most fundamental object-oriented programming features is its capability to extend existing classes. This feature is known as inheritance and allows us to create a new class (child class) that inherits all the properties and methods from an existing class (parent class). Child classes can include additional properties and methods not available in the parent class. Let's return to our previously declared Person class. We will use the Person class as the parent class of a child class named Teacher: class Person { public name : string; public surname : string; public email : Email; constructor(name : string, surname : string, email : Email){ this.name = name; this.surname = surname; this.email = email; } greet() { alert("Hi!"); } } This example is included in the companion source code. Once we have a parent class in place, we can extend it by using the reserved keyword extends. In the following example, we declare a class called Teacher, which extends the previously defined Person class. This means that Teacher will inherit all the attributes and methods from its parent class: class Teacher extends Person { teach() { alert("Welcome to class!"); } } Note that we have also added a new method named teach to the class Teacher. If we create instances of the Person and Teacher classes, we will be able to see that both instances share the same attributes and methods with the exception of the teach method, which is only available for the instance of the Teacher class: var teacher = new Teacher("remo", "jansen", new Email("remo.jansen@wolksoftware.com")); var me = new Person("remo", "jansen", new Email("remo.jansen@wolksoftware.com")); me.greet(); teacher.greet(); me.teach(); // Error : Property 'teach' does not exist on type 'Person' teacher.teach(); Sometimes, we will need a child class to provide a specific implementation of a method that is already provided by its parent class. We can use the reserved keyword super for this purpose. Imagine that we want to add a new attribute to list the teacher's subjects, and we want to be able to initialize this attribute through the teacher constructor. We will use the super keyword to explicitly reference the parent class constructor inside the child class constructor. We can also use the super keyword when we want to extend an existing method, such as greet. This OOP language feature that allows a subclass or child class to provide a specific implementation of a method that is already provided by its parent classes is known as method overriding.
class Teacher extends Person { public subjects : string[]; constructor(name : string, surname : string, email : Email, subjects : string[]){ super(name, surname, email); this.subjects = subjects; } greet() { super.greet(); alert("I teach " + this.subjects); } teach() { alert("Welcome to Maths class!"); } } var teacher = new Teacher("remo", "jansen", new Email("remo.jansen@wolksoftware.com"), ["math", "physics"]); We can declare a new class that inherits from a class that is already inheriting from another. In the following code snippet, we declare a class called SchoolPrincipal that extends the Teacher class, which extends the Person class: class SchoolPrincipal extends Teacher { manageTeachers() { alert("We need to help students to get better results!"); } } If we create an instance of the SchoolPrincipal class, we will be able to access all the properties and methods from its parent classes (SchoolPrincipal, Teacher, and Person): var principal = new SchoolPrincipal("remo", "jansen", new Email("remo.jansen@wolksoftware.com"), ["math", "physics"]); principal.greet(); principal.teach(); principal.manageTeachers(); It is not recommended to have too many levels in the inheritance tree. A class situated too deeply in the inheritance tree will be relatively complex to develop, test, and maintain. Unfortunately, we don't have a specific rule that we can follow when we are unsure whether we should increase the depth of the inheritance tree (DIT). We should use inheritance in such a way that it helps us to reduce the complexity of our application and not the opposite. We should try to keep the DIT between 0 and 4 because a value greater than 4 would compromise encapsulation and increase complexity. Summary In this article, we saw how to work with classes, and interfaces in depth. We were able to reduce the complexity of our application by using techniques such as encapsulation and inheritance. To learn more about TypeScript, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: TypeScript Essentials (https://www.packtpub.com/web-development/typescript-essentials) Mastering TypeScript(https://www.packtpub.com/web-development/mastering-typescript) Resources for Article: Further resources on this subject: Writing SOLID JavaScript code with TypeScript [article] Introduction to TypeScript [article] An Introduction to Mastering JavaScript Promises and Its Implementation in Angular.js [article]

Putting the Fun in Functional Python

Packt
17 Feb 2016
21 min read
Functional programming defines a computation using expressions and evaluation—often encapsulated in function definitions. It de-emphasizes or avoids the complexity of state change and mutable objects. This tends to create programs that are more succinct and expressive. In this article, we'll introduce some of the techniques that characterize functional programming. We'll identify some of the ways to map these features to Python. Finally, we'll also address some ways in which the benefits of functional programming accrue when we use these design patterns to build Python applications. Python has numerous functional programming features. It is not a purely functional programming language. It offers enough of the right kinds of features that it confers to the benefits of functional programming. It also retains all optimization power available from an imperative programming language. We'll also look at a problem domain that we'll use for many of the examples in this book. We'll try to stick closely to Exploratory Data Analysis (EDA) because its algorithms are often good examples of functional programming. Furthermore, the benefits of functional programming accrue rapidly in this problem domain. Our goal is to establish some essential principles of functional programming. We'll focus on Python 3 features in this book. However, some of the examples might also work in Python 2. (For more resources related to this topic, see here.) Identifying a paradigm It's difficult to be definitive on what fills the universe of programming paradigms. For our purposes, we will distinguish between just two of the many programming paradigms: Functional programming and Imperative programming. One important distinguishing feature between these two is the concept of state. In an imperative language, like Python, the state of the computation is reflected by the values of the variables in the various namespaces. The values of the variables establish the state of a computation; each kind of statement makes a well-defined change to the state by adding or changing (or even removing) a variable. A language is imperative because each statement is a command, which changes the state in some way. Our general focus is on the assignment statement and how it changes state. Python has other statements, such as global or nonlocal, which modify the rules for variables in a particular namespace. Statements like def, class, and import change the processing context. Other statements like try, except, if, elif, and else act as guards to modify how a collection of statements will change the computation's state. Statements like for and while, similarly, wrap a block of statements so that the statements can make repeated changes to the state of the computation. The focus of all these various statement types, however, is on changing the state of the variables. Ideally, each statement advances the state of the computation from an initial condition toward the desired final outcome. This "advances the computation" assertion can be challenging to prove. One approach is to define the final state, identify a statement that will establish this final state, and then deduce the precondition required for this final statement to work. This design process can be iterated until an acceptable initial state is derived. In a functional language, we replace state—the changing values of variables—with a simpler notion of evaluating functions. Each function evaluation creates a new object or objects from existing objects. 
Since a functional program is a composition of a function, we can design lower-level functions that are easy to understand, and we will design higher-level compositions that can also be easier to visualize than a complex sequence of statements. Function evaluation more closely parallels mathematical formalisms. Because of this, we can often use simple algebra to design an algorithm, which clearly handles the edge cases and boundary conditions. This makes us more confident that the functions work. It also makes it easy to locate test cases for formal unit testing. It's important to note that functional programs tend to be relatively succinct, expressive, and efficient when compared to imperative (object-oriented or procedural) programs. The benefit isn't automatic; it requires a careful design. This design effort is often easier than functionally similar procedural programming. Subdividing the procedural paradigm We can subdivide imperative languages into a number of discrete categories. In this section, we'll glance quickly at the procedural versus object-oriented distinction. What's important here is to see how object-oriented programming is a subset of imperative programming. The distinction between procedural and object-orientation doesn't reflect the kind of fundamental difference that functional programming represents. We'll use code examples to illustrate the concepts. For some, this will feel like reinventing a wheel. For others, it provides a concrete expression of abstract concepts. For some kinds of computations, we can ignore Python's object-oriented features and write simple numeric algorithms. For example, we might write something like the following to get the range of numbers: s = 0 for n in range(1, 10): if n % 3 == 0 or n % 5 == 0: s += n print(s) We've made this program strictly procedural, avoiding any explicit use of Python's object features. The program's state is defined by the values of the variables s and n. The variable, n, takes on values such that 1 ≤ n < 10. As the loop involves an ordered exploration of values of n, we can prove that it will terminate when n == 10. Similar code would work in C or Java using their primitive (non-object) data types. We can exploit Python's Object-Oriented Programming (OOP) features and create a similar program: m = list() for n in range(1, 10): if n % 3 == 0 or n % 5 == 0: m.append(n) print(sum(m)) This program produces the same result but it accumulates a stateful collection object, m, as it proceeds. The state of the computation is defined by the values of the variables m and n. The syntax of m.append(n) and sum(m) can be confusing. It causes some programmers to insist (wrongly) that Python is somehow not purely Object-oriented because it has a mixture of the function()and object.method() syntax. Rest assured, Python is purely Object-oriented. Some languages, like C++, allow the use of primitive data type such as int, float, and long, which are not objects. Python doesn't have these primitive types. The presence of prefix syntax doesn't change the nature of the language. To be pedantic, we could fully embrace the object model, the subclass, the list class, and add a sum method: class SummableList(list): def sum( self ): s= 0 for v in self.__iter__(): s += v return s If we initialize the variable, m, with the SummableList() class instead of the list() method, we can use the m.sum() method instead of the sum(m) method. This kind of change can help to clarify the idea that Python is truly and completely object-oriented. 
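For instance, here is a minimal sketch of the earlier accumulation loop rewritten to use this subclass (our own illustration, not part of the original listing):

m = SummableList()
for n in range(1, 10):
    if n % 3 == 0 or n % 5 == 0:
        m.append(n)
print(m.sum())  # prints 23, just as sum(m) did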
The use of prefix function notation is purely syntactic sugar. All three of these examples rely on variables to explicitly show the state of the program. They rely on the assignment statements to change the values of the variables and advance the computation toward completion. We can insert the assert statements throughout these examples to demonstrate that the expected state changes are implemented properly. The point is not that imperative programming is broken in some way. The point is that functional programming leads to a change in viewpoint, which can, in many cases, be very helpful. We'll show a function view of the same algorithm. Functional programming doesn't make this example dramatically shorter or faster. Using the functional paradigm In a functional sense, the sum of the multiples of 3 and 5 can be defined in two parts: The sum of a sequence of numbers A sequence of values that pass a simple test condition, for example, being multiples of three and five The sum of a sequence has a simple, recursive definition: def sum(seq): if len(seq) == 0: return 0 return seq[0] + sum(seq[1:]) We've defined the sum of a sequence in two cases: the base case states that the sum of a zero length sequence is 0, while the recursive case states that the sum of a sequence is the first value plus the sum of the rest of the sequence. Since the recursive definition depends on a shorter sequence, we can be sure that it will (eventually) devolve to the base case. The + operator on the last line of the preceeding example and the initial value of 0 in the base case characterize the equation as a sum. If we change the operator to * and the initial value to 1, it would just as easily compute a product. Similarly, a sequence of values can have a simple, recursive definition, as follows: def until(n, filter_func, v): if v == n: return [] if filter_func(v): return [v] + until( n, filter_func, v+1 ) else: return until(n, filter_func, v+1) In this function, we've compared a given value, v, against the upper bound, n. If v reaches the upper bound, the resulting list must be empty. This is the base case for the given recursion. There are two more cases defined by the given filter_func() function. If the value of v is passed by the filter_func() function, we'll create a very small list, containing one element, and append the remaining values of the until() function to this list. If the value of v is rejected by the filter_func() function, this value is ignored and the result is simply defined by the remaining values of the until() function. We can see that the value of v will increase from an initial value until it reaches n, assuring us that we'll reach the base case soon. Here's how we can use the until() function to generate the multiples of 3 or 5. First, we'll define a handy lambda object to filter values: mult_3_5= lambda x: x%3==0 or x%5==0 (We will use lambdas to emphasize succinct definitions of simple functions. Anything more complex than a one-line expression requires the def statement.) We can see how this lambda works from the command prompt in the following example: >>> mult_3_5(3) True >>> mult_3_5(4) False >>> mult_3_5(5) True This function can be used with the until() function to generate a sequence of values, which are multiples of 3 or 5. The until() function for generating a sequence of values works as follows: >>> until(10, lambda x: x%3==0 or x%5==0, 0) [0, 3, 5, 6, 9] We can use our recursive sum() function to compute the sum of this sequence of values. 
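Putting the two recursive definitions together, we can check the result interactively; it matches the value 23 that the hybrid version later in this section produces:

>>> sum(until(10, mult_3_5, 0))
23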
The various functions, such as sum(), until(), and mult_3_5() are defined as simple recursive functions. The values are computed without restoring to use intermediate variables to store state. We'll return to the ideas behind this purely functional recursive function definition in several places. It's important to note here that many functional programming language compilers can optimize these kinds of simple recursive functions. Python can't do the same optimizations. Using a functional hybrid We'll continue this example with a mostly functional version of the previous example to compute the sum of the multiples of 3 and 5. Our hybrid functional version might look like the following: print( sum(n for n in range(1, 10) if n%3==0 or n%5==0) ) We've used nested generator expressions to iterate through a collection of values and compute the sum of these values. The range(1, 10) method is an iterable and, consequently, a kind of generator expression; it generates a sequence of values . The more complex expression, n for n in range(1, 10) if n%3==0 or n%5==0, is also an iterable expression. It produces a set of values . A variable, n, is bound to each value, more as a way of expressing the contents of the set than as an indicator of the state of the computation. The sum() function consumes the iterable expression, creating a final object, 23. The bound variable doesn't change once a value is bound to it. The variable, n, in the loop is essentially a shorthand for the values available from the range() function. The if clause of the expression can be extracted into a separate function, allowing us to easily repurpose this with other rules. We could also use a higher-order function named filter() instead of the if clause of the generator expression. As we work with generator expressions, we'll see that the bound variable is at the blurry edge of defining the state of the computation. The variable, n, in this example isn't directly comparable to the variable, n, in the first two imperative examples. The for statement creates a proper variable in the local namespace. The generator expression does not create a variable in the same way as a for statement does: >>> sum( n for n in range(1, 10) if n%3==0 or n%5==0 ) 23 >>> n Traceback (most recent call last): File "<stdin>", line 1, in <module> NameError: name 'n' is not defined Because of the way Python uses namespaces, it might be possible to write a function that can observe the n variable in a generator expression. However, we won't. Our objective is to exploit the functional features of Python, not to detect how those features have an object-oriented implementation under the hood. Looking at object creation In some cases, it might help to look at intermediate objects as a history of the computation. What's important is that the history of a computation is not fixed. When functions are commutative or associative, then changes to the order of evaluation might lead to different objects being created. This might have performance improvements with no changes to the correctness of the results. Consider this expression: >>> 1+2+3+4 10 We are looking at a variety of potential computation histories with the same result. Because the + operator is commutative and associative, there are a large number of candidate histories that lead to the same result. Of the candidate sequences, there are two important alternatives, which are as follows: >>> ((1+2)+3)+4 10 >>> 1+(2+(3+4)) 10 In the first case, we fold in values working from left to right. 
This is the way Python works implicitly. Intermediate objects 3 and 6 are created as part of this evaluation. In the second case, we fold from right-to-left. In this case, intermediate objects 7 and 9 are created. In the case of simple integer arithmetic, the two results have identical performance; there's no optimization benefit. When we work with something like the list append, we might see some optimization improvements when we change the association rules. Here's a simple example: >>> import timeit >>> timeit.timeit("((([]+[1])+[2])+[3])+[4]") 0.8846941249794327 >>> timeit.timeit("[]+([1]+([2]+([3]+[4])))") 1.0207440659869462 In this case, there's some benefit in working from left to right. What's important for functional design is the idea that the + operator (or add() function) can be used in any order to produce the same results. The + operator has no hidden side effects that restrict the way this operator can be used. The stack of turtles When we use Python for functional programming, we embark down a path that will involve a hybrid that's not strictly functional. Python is not Haskell, OCaml, or Erlang. For that matter, our underlying processor hardware is not functional; it's not even strictly object-oriented—CPUs are generally procedural. All programming languages rest on abstractions, libraries, frameworks and virtual machines. These abstractions, in turn, may rely on other abstractions, libraries, frameworks and virtual machines. The most apt metaphor is this: the world is carried on the back of a giant turtle. The turtle stands on the back of another giant turtle. And that turtle, in turn, is standing on the back of yet another turtle. It's turtles all the way down.                                                                                                             – Anonymous
There's no practical end to the layers of abstractions. More importantly, the presence of abstractions and virtual machines doesn't materially change our approach to designing software to exploit the functional programming features of Python.

Even within the functional programming community, there are more pure and less pure functional programming languages. Some languages make extensive use of monads to handle stateful things like filesystem input and output. Other languages rely on a hybridized environment that's similar to the way we use Python. We write software that's generally functional with carefully chosen procedural exceptions.

Our functional Python programs will rely on the following three stacks of abstractions:

Our applications will be functions—all the way down—until we hit the objects
The underlying Python runtime environment that supports our functional programming is objects—all the way down—until we hit the turtles
The libraries that support Python are a turtle on which Python stands

The operating system and hardware form their own stack of turtles. These details aren't relevant to the problems we're going to solve.

A classic example of functional programming

As part of our introduction, we'll look at a classic example of functional programming. This is based on the classic paper Why Functional Programming Matters by John Hughes. The article appeared in a book called Research Topics in Functional Programming, edited by D. Turner, published by Addison-Wesley in 1990. Here's a link to the paper Why Functional Programming Matters: http://www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf

This discussion of functional programming in general is profound. There are several examples given in the paper. We'll look at just one: the Newton-Raphson algorithm for locating the roots of a function. In this case, the function is the square root. It's important because many versions of this algorithm rely on explicit state managed via loops. Indeed, the Hughes paper provides a snippet of the Fortran code that emphasizes stateful, imperative processing.

The backbone of this approximation is the calculation of the next approximation from the current approximation. The next_() function takes x, an approximation to sqrt(n), and calculates a next value that brackets the proper root. Take a look at the following example:

def next_(n, x):
    return (x+n/x)/2

This function computes a series of values a1 = (a0 + n/a0)/2, a2 = (a1 + n/a1)/2, and so on. The distance between the values is halved each time, so they'll quickly converge on a value a such that a = (a + n/a)/2, which means a*a = n, that is, a = sqrt(n). We don't want to call the method next() because this name would collide with a built-in function. We call it the next_() method so that we can follow the original presentation as closely as possible. Here's how the function looks when used in the command prompt:

>>> n= 2
>>> f= lambda x: next_(n, x)
>>> a0= 1.0
>>> [ round(x,4) for x in (a0, f(a0), f(f(a0)), f(f(f(a0))),) ]
[1.0, 1.5, 1.4167, 1.4142]

We've defined the f() method as a lambda that will converge on sqrt(2). We started with 1.0 as the initial value for a0. Then we evaluated a sequence of recursive evaluations: a1 = f(a0), a2 = f(f(a0)), and so on. We evaluated these functions using a list comprehension so that we could round off each value. This makes the output easier to read and easier to use with doctest. The sequence appears to converge rapidly on sqrt(2), which is approximately 1.4142.
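The (x + n/x)/2 update rule is simply the general Newton-Raphson step, x - f(x)/f'(x), applied to f(x) = x*x - n. The following sketch is mine rather than the paper's, and newton_step() is just an illustrative name; it confirms that the generic step reproduces the same sequence:

def newton_step(f, df, x):
    # One generic Newton-Raphson step: move to x - f(x)/f'(x)
    return x - f(x)/df(x)

n = 2
f = lambda x: x*x - n     # the positive root of f is sqrt(n)
df = lambda x: 2*x

x = 1.0
for _ in range(4):
    x = newton_step(f, df, x)
print(round(x, 4))        # 1.4142, matching the next_() iteration above

A little algebra shows why: x - (x*x - n)/(2*x) simplifies to (x + n/x)/2, which is exactly the body of next_().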
We can write a function, which will (in principle) generate an infinite sequence of values converging on the proper square root: def repeat(f, a): yield a for v in repeat(f, f(a)): yield v This function will generate approximations using a function, f(), and an initial value, a. If we provide the next_() function defined earlier, we'll get a sequence of approximations to the square root of the n argument. The repeat() function expects the f() function to have a single argument, however, our next_() function has two arguments. We can use a lambda object, lambda x: next_(n, x), to create a partial version of the next_() function with one of two variables bound. The Python generator functions can't be trivially recursive, they must explicitly iterate over the recursive results, yielding them individually. Attempting to use a simple return repeat(f, f(a)) will end the iteration, returning a generator expression instead of yielding the sequence of values. We have two ways to return all the values instead of returning a generator expression, which are as follows: We can write an explicit for loop as follows: for x in some_iter: yield x. We can use the yield from statement as follows: yield from some_iter. Both techniques of yielding the values of a recursive generator function are equivalent. We'll try to emphasize yield from. In some cases, however, the yield with a complex expression will be more clear than the equivalent mapping or generator expression. Of course, we don't want the entire infinite sequence. We will stop generating values when two values are so close to each other that we can call either one the square root we're looking for. The common symbol for the value, which is close enough, is the Greek letter Epsilon, ε, which can be thought of as the largest error we will tolerate. In Python, we'll have to be a little clever about taking items from an infinite sequence one at a time. It works out well to use a simple interface function that wraps a slightly more complex recursion. Take a look at the following code snippet: def within(ε, iterable): def head_tail(ε, a, iterable): b= next(iterable) if abs(a-b) <= ε: return b return head_tail(ε, b, iterable) return head_tail(ε, next(iterable), iterable) We've defined an internal function, head_tail(), which accepts the tolerance, ε, an item from the iterable sequence, a, and the rest of the iterable sequence, iterable. The next item from the iterable bound to a name b. If , then the two values that are close enough together that we've found the square root. Otherwise, we use the b value in a recursive invocation of the head_tail() function to examine the next pair of values. Our within() function merely seeks to properly initialize the internal head_tail() function with the first value from the iterable parameter. Some functional programming languages offer a technique that will put a value back into an iterable sequence. In Python, this might be a kind of unget() or previous() method that pushes a value back into the iterator. Python iterables don't offer this kind of rich functionality. We can use the three functions next_(), repeat(), and within() to create a square root function, as follows: def sqrt(a0, ε, n): return within(ε, repeat(lambda x: next_(n,x), a0)) We've used the repeat() function to generate a (potentially) infinite sequence of values based on the next_(n,x) function. Our within() function will stop generating values in the sequence when it locates two values with a difference less than ε. 
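Here is a quick interactive check of the composed pipeline. It is my own sketch, not part of the original text, and it assumes the next_(), repeat(), within(), and sqrt() definitions above are in scope:

>>> round(sqrt(1.0, .0001, 2), 4)
1.4142
>>> round(sqrt(1.0, .0001, 3), 4)
1.7321

Because repeat() is a generator, within() pulls approximations from the (potentially) infinite sequence one at a time and stops as soon as two neighbors differ by no more than ε; this laziness is the point Hughes makes with this example.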
When we use this version of the sqrt() method, we need to provide an initial seed value, a0, and an ε value. An expression like sqrt(1.0, .0001, 3) will start with an approximation of 1.0 and compute the value of sqrt(3) to within 0.0001. For most applications, the initial a0 value can be 1.0. However, the closer it is to the actual square root, the more rapidly this method converges.

The original example of this approximation algorithm was shown in the Miranda language. It's easy to see that there are few profound differences between Miranda and Python. The biggest difference is Miranda's ability to cons a value back into an iterable, doing a kind of unget. This parallelism between Miranda and Python gives us confidence that many kinds of functional programming can be easily done in Python.

Summary

We've looked at programming paradigms with an eye toward distinguishing the functional paradigm from two common imperative paradigms in detail. For more information kindly take a look at the following books, also by Packt Publishing:

Learning Python (https://www.packtpub.com/application-development/learning-python)
Mastering Python (https://www.packtpub.com/application-development/mastering-python)
Mastering Object-oriented Python (https://www.packtpub.com/application-development/mastering-object-oriented-python)

Resources for Article:

Further resources on this subject:

Saying Hello to Unity and Android [article]
Using Specular in Unity [article]
Unity 3.x Scripting-Character Controller versus Rigidbody [article]

article-image-openstack-performance-availability
Packt
17 Feb 2016
21 min read
Save for later

OpenStack Performance, Availability

Packt
17 Feb 2016
21 min read
In this article by Tony Campbell, author of the book Troubleshooting OpenStack, we will cover some of the chronic issues that might be early signs of trouble. This article is more about prevention and aims to help you avoid emergency troubleshooting as much as possible. (For more resources related to this topic, see here.)

Database

Many OpenStack services make heavy use of a database. Production deployments will typically use MySQL or Postgres as the backend database server. As you may have learned, a failing or misconfigured database will quickly lead to trouble in your OpenStack cluster. Database problems can also present more subtle concerns that may grow into huge problems if neglected.

Availability

The database server can become a single point of failure if it is not deployed in a highly available configuration. OpenStack does not require a high availability installation of your database, and as a result, many installations may skip this step. However, production deployments of OpenStack should take care to ensure that their database can survive the failure of a single database server.

MySQL with Galera Cluster

For installations using the MySQL database engine, there are several options for clustering your installation. One popular method is to leverage Galera Cluster (http://galeracluster.com/). Galera Cluster for MySQL leverages synchronous replication and provides a multi-master cluster, which offers high availability for your OpenStack databases.

Postgres

Installations that use the Postgres database engine have several options for high availability, load balancing, and replication. These options include block device replication with DRBD, log shipping, Master-Standby replication based on triggers, statement-based replication, and asynchronous multi-master replication. For details, refer to the Postgres High Availability Guide (http://www.postgresql.org/docs/current/static/high-availability.html).

Performance

Database performance is one of those metrics that can degrade over time. For an administrator who does not pay attention to small problems in this area, these can eventually become large problems. A wise administrator will regularly monitor the performance of their database and will constantly be on the lookout for slow queries, high database loads, and other indications of trouble.

MySQL

There are several options for monitoring your MySQL server, some of which are commercial and many others that are open source. Administrators should evaluate the options available and select a solution that fits their current set of tools and operating environment. There are several performance metrics you will want to monitor.

Show Status

The MySQL SHOW STATUS statement can be executed from the mysql command prompt. The output of this statement is server status information with over 300 variables reported. To narrow down the information, you can leverage a LIKE clause on the variable_name to display the sections you are interested in.
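Before looking at the raw output, note that a monitoring script can poll a handful of these counters on a schedule and feed them into whatever alerting system you already run. The following is only a minimal sketch, not part of the original article; it assumes the PyMySQL driver is installed, and the credentials and the chosen counters are placeholders:

import pymysql

# Connection details are placeholders; use a read-only monitoring account.
conn = pymysql.connect(host="localhost", user="monitor", password="secret")

WATCHED = ("Threads_connected", "Slow_queries", "Aborted_connects")

with conn.cursor() as cur:
    # SHOW GLOBAL STATUS returns (Variable_name, Value) rows.
    cur.execute("SHOW GLOBAL STATUS")
    status = dict(cur.fetchall())

for name in WATCHED:
    print(name, status.get(name))

conn.close()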
Here is an abbreviated list of the output instances returned by SHOW STATUS: mysql> SHOW STATUS; +------------------------------------------+-------------+ | Variable_name                            | Value       | +------------------------------------------+-------------+ | Aborted_clients                          | 29          | | Aborted_connects                         | 27          | | Binlog_cache_disk_use                    | 0           | | Binlog_cache_use                         | 0           | | Binlog_stmt_cache_disk_use               | 0           | | Binlog_stmt_cache_use                    | 0           | | Bytes_received                           | 614         | | Bytes_sent                               | 33178       | Mytop Mytop is a command-line utility inspired by the Linux top command. Mytop retrieves data from the MySql SHOW PROCESSLIST command and the SHOW STATUS command. Data from these commands is refreshed, processed, and displayed in the output of the Mytop command. The Mytop output includes a header, which contains summary data followed by a thread section. Mytop header section Here is an example of the header output from the Mytop command: MySQL on localhost (5.5.46)                                                                                                                    load 1.01 0.85 0.79 4/538 23573 up 5+02:19:24 [14:35:24]  Queries: 3.9M     qps:    9 Slow:     0.0         Se/In/Up/De(%):    49/00/08/00  Sorts:     0 qps now:   10 Slow qps: 0.0  Threads:   30 (   1/   4) 40/00/12/00  Cache Hits: 822.0 Hits/s:  0.0 Hits now:   0.0  Ratio:  0.0%  Ratio now:  0.0%  Key Efficiency: 97.3%  Bps in/out:  1.7k/ 3.1k   Now in/out:  1.0k/ 3.9k As demonstrated in the preceding output, the header section for the Mytop command includes the following information: The hostname and MySQL version The server load The MySQL server uptime The total number of queries The average number of queries Slow queries The percentage of Select, Insert, Update, and Delete queries Queries per second Threads Cache hits Key efficiency Mytop thread section They Mytop thread section will list as many threads as it can display. 
The threads are ordered by the Time column, which displays the threads' idle time:        Id      User         Host/IP         DB       Time    Cmd    State Query                                                                                                                                --      ----         -------         --       ----    ---    ----- ----------                                                                                                                         3461   neutron  174.143.201.98    neutron   5680  Sleep                                                                                                                                            3477    glance  174.143.201.98     glance   1480  Sleep                                                                                                                                            3491      nova  174.143.201.98     nova      880  Sleep                                                                                                                                             3512      nova  174.143.201.98    nova      281  Sleep                                                                                                                                             3487  keystone  174.143.201.98   keystone        280  Sleep                                                                                                                                             3489    glance  174.143.201.98     glance        280  Sleep                                                                                                                                            3511  keystone  174.143.201.98   keystone        280  Sleep                                                                                                                                            3513   neutron  174.143.201.98    neutron        280  Sleep                                                                                                                                             3505  keystone  174.143.201.98   keystone        279  Sleep                                                                                                                                             3514  keystone  174.143.201.98   keystone        141  Sleep                                                                                                                                            ... The Mytop thread section displays the ID of each thread followed by the user and host. Finally, this section will display the database, idle time, and state or command query. Mytop will allow you to keep an eye on the performance of your MySql database server. Percona toolkit The Percona Toolkit is a very useful set of command-line tools for performing MySQL operations and system tasks. The toolkit can be downloaded from Percona at https://www.percona.com/downloads/percona-toolkit/. The output from these tools can be fed into your monitoring system allowing you to effectively monitor your MyQL installation. Postgres Like MySQL, the Postgres database also has a series of tools, which can be leveraged to monitor database performance. In addition to standard Linux troubleshooting tools, such as top and ps, Postgres also offers its own collection of statistics. The PostgreSQL statistics collector The statistics collector in Postgres allows you to collect data about server activity. 
The statistics collected in this tool are varied and may be helpful for troubleshooting or system monitoring. In order to leverage the statistics collector, you must turn on the functionality in the postgresql.conf file. The settings are commented out by default in the RUNTIME STATISTICS section of the configuration file. Uncomment the lines in the Query/Index Statistics Collector subsection.

#------------------------------------------------------------------------------
# RUNTIME STATISTICS
#------------------------------------------------------------------------------

# - Query/Index Statistics Collector -

track_activities = on
track_counts = on
track_io_timing = off
track_functions = none                 # none, pl, all
track_activity_query_size = 1024       # (change requires restart)
update_process_title = on
stats_temp_directory = 'pg_stat_tmp'

Once the statistics collector is configured, restart the database server or execute a pg_ctl reload for the configuration to take effect. Once the collector has been configured, there will be a series of views created and named with the prefix "pg_stat". These views can be queried for relevant statistics in the Postgres database server.

Database backups

A diligent operator will ensure that a backup of the database for each OpenStack project is created. Since most OpenStack services make heavy use of the database for persisting things such as states and metadata, a corruption or loss of data can render your OpenStack cloud unusable. Current database backups can help rescue you from this fate.

MySQL users can use the mysqldump utility to back up all of the OpenStack databases:

mysqldump --opt --all-databases > all_openstack_dbs.sql

Similarly, Postgres users can back up all OpenStack databases with a command similar to the following:

pg_dumpall > all_openstack_dbs.sql

Your cadence for backups will depend on your environment and tolerance for data corruption or loss. You should store these backups in a safe place and occasionally perform test restores from the data to ensure that they work as expected.

Monitoring

Monitoring is often your early warning system that something is going wrong in your cluster. Your monitoring system can also be a rich source of information when the time comes to troubleshoot issues with the cluster. There are multiple options available for monitoring OpenStack. Many of your current application monitoring platforms will handle OpenStack just as well as any other Linux system. Regardless of the tool you select for monitoring, there are several parts of OpenStack that you should focus on.

Resource monitoring

OpenStack is typically deployed on a series of Linux servers. Monitoring the resources on those servers is essential. A set-it-and-forget-it attitude is a recipe for disaster. The things you may want to monitor on your host servers include the following:

CPU
Disk
Memory
Log file size
Network I/O
Database
Message broker

OpenStack quotas

OpenStack operators have the option to set usage quotas for each tenant/project. As an administrator, it is helpful to monitor a project's usage as it pertains to these quotas. Once users reach a quota, they may not be able to deploy additional resources. Users may misinterpret this as an error in the system and report it to you. By keeping an eye on the quotas, you can proactively warn users as they reach their thresholds, or you can decide to increase the quotas as appropriate. Some of the services have client commands that can be used to retrieve quota statistics.
As an example, we demonstrate the nova absolute-limits command here: nova absolute-limits +--------------------+------+-------+ | Name               | Used | Max   | +--------------------+------+-------+ | Cores              | 1    | 20    | | FloatingIps        | 0    | 10    | | ImageMeta          | -    | 128   | | Instances          | 1    | 10    | | Keypairs           | -    | 100   | | Personality        | -    | 5     | | Personality Size   | -    | 10240 | | RAM                | 512  | 51200 | | SecurityGroupRules | -    | 20    | | SecurityGroups     | 1    | 10    | | Server Meta        | -    | 128   | | ServerGroupMembers | -    | 10    | | ServerGroups       | 0    | 10    | +--------------------+------+-------+ The absolute-limits command in Nova is nice because it displays the project’s current usage alongside the quota maximum, making it easy to notice that a project/tenant is coming close to the limit. RabbitMQ RabbitMQ is the default message broker used in OpenStack installations. However, if it is installed as is out the box, it can become a single point of failure. Administrators should consider clustering RabbitMQ and activating mirrored queues. Summary OpenStack is the leading open source software for running private clouds. Its popularity has grown exponentially since it was founded by Rackspace and NASA. The output of this engaged community is staggering, resulting in plenty of new features finding their way into OpenStack with each release. The project is at a size now where no one can truly know the details of each service. When working with such a complex project, it is inevitable that you will run into problems, bugs, errors, issues, and plain old trouble. Resources for Article:   Further resources on this subject: Concepts for OpenStack [article] Implementing OpenStack Networking and Security [article] Using the OpenStack Dashboard [article]

article-image-probabilistic-graphical-models
Packt
17 Feb 2016
6 min read
Save for later

Probabilistic Graphical Models

Packt
17 Feb 2016
6 min read
Probabilistic graphical models, or simply graphical models as we will refer to them in this article, are models that use the representation of a graph to describe the conditional independence relationships between a series of random variables. This topic has received an increasing amount of attention in recent years and probabilistic graphical models have been successfully applied to tasks ranging from medical diagnosis to image segmentation. In this article, we'll present some of the necessary background that will pave the way to understanding the most basic graphical model, the Naïve Bayes classifier. We will then look at a slightly more complicated graphical model, known as the Hidden Markov Model, or HMM for short. To get started in this field, we must first learn about graphs. (For more resources related to this topic, see here.) A Little Graph Theory Graph theory is a branch of mathematics that deals with mathematical objects known as graphs. Here, a graph does not have the everyday meaning that we are more used to talking about, in the sense of a diagram or plot with an x and y axis. In graph theory, a graph consists of two sets. The first is a set of vertices, which are also referred to as nodes. We typically use integers to label and enumerate the vertices. The second set consists of edges between these vertices. Thus, a graph is nothing more than a description of some points and the connections between them. The connections can have a direction so that an edge goes from the source or tail vertex to the target or head vertex. In this case, we have a directed graph. Alternatively, the edges can have no direction, so that the graph is undirected. A common way to describe a graph is via the adjacency matrix. If we have V vertices in the graph, an adjacency matrix is a V×V matrix whose entries are 0 if the vertex represented by the row number is not connected to the vertex represented by the column number. If there is a connection, the entry is 1. With undirected graphs, both nodes at each edge are connected to each other so the adjacency matrix is symmetric. For directed graphs, a vertex vi is connected to a vertex vj via an edge (vi,vj); that is, an edge where vi is the tail and vj is the head. Here is an example adjacency matrix for a graph with seven nodes: > adjacency_m 1 2 3 4 5 6 7 1 0 0 0 0 0 1 0 2 1 0 0 0 0 0 0 3 0 0 0 0 0 0 1 4 0 0 1 0 1 0 1 5 0 0 0 0 0 0 0 6 0 0 0 1 1 0 1 7 0 0 0 0 1 0 0 This matrix is not symmetric, so we know that we are dealing with a directed graph. The first 1 value in the first row of the matrix denotes the fact that there is an edge starting from vertex 1 and ending on vertex 6. When the number of nodes is small, it is easy to visualize a graph. We simply draw circles to represent the vertices and lines between them to represent the edges. For directed graphs, we use arrows on the lines to denote the directions of the edges. It is important to note that we can draw the same graph in an infinite number of different ways on the page. This is because the graph tells us nothing about the positioning of the nodes in space; we only care about how they are connected to each other. Here are two different but equally valid ways to draw the graph described by the adjacency matrix we just saw: Two vertices are said to be connected with each other if there is an edge between them (taking note of the order when talking about directed graphs). 
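The session above holds the adjacency matrix in R. Purely as an illustration (this snippet is not part of the original article), the same seven-vertex matrix can be written down and inspected in a few lines of Python:

# Adjacency matrix for the seven-vertex directed graph described above;
# rows are source vertices 1..7 and columns are target vertices 1..7.
A = [
    [0, 0, 0, 0, 0, 1, 0],   # 1 -> 6
    [1, 0, 0, 0, 0, 0, 0],   # 2 -> 1
    [0, 0, 0, 0, 0, 0, 1],   # 3 -> 7
    [0, 0, 1, 0, 1, 0, 1],   # 4 -> 3, 5, 7
    [0, 0, 0, 0, 0, 0, 0],   # 5 has no outgoing edges
    [0, 0, 0, 1, 1, 0, 1],   # 6 -> 4, 5, 7
    [0, 0, 0, 0, 1, 0, 0],   # 7 -> 5
]

# List every directed edge as a (tail, head) pair.
edges = [(i + 1, j + 1)
         for i, row in enumerate(A)
         for j, value in enumerate(row) if value]
print(edges)

# An undirected graph would have a symmetric adjacency matrix.
print(all(A[i][j] == A[j][i] for i in range(7) for j in range(7)))   # False

The asymmetry confirms what the text notes: this particular matrix describes a directed graph.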
If we can move from vertex vi to vertex vj by starting at the first vertex and finishing at the second vertex, by moving on the graph along the edges and passing through an arbitrary number of graph vertices, then these intermediate edges form a path between these two vertices. Note that this definition requires that all the vertices and edges along the path are distinct from each other (with the possible exception of the first and last vertex). For example, in our graph, vertex 6 can be reached from vertex 2 by a path leading through vertex 1. Sometimes, there can be many such possible paths through the graph, and we are often interested in the shortest path, which moves through the fewest number of intermediary vertices. We can define the distance between two nodes in the graph as the length of the shortest path between them. A path that begins and ends at the same vertex is known as a cycle. A graph that does not have any cycles in it is known as an acyclic graph. If an acyclic graph has directed edges, it is known as a directed acyclic graph, which is often abbreviated as a DAG. There are many excellent references on graph theory available. One such reference which is available online, is Graph Theory, Reinhard Diestel, Springer. This landmark reference is now in its 4th edition and can be found at http://diestel-graph-theory.com/. It might not seem obvious at first, but it turns out that a large number of real world situations can be conveniently described using graphs. For example, the network of friendships on social media sites, such as Facebook, or followers on Twitter, can be represented as graphs. On Facebook, the friendship relation is reciprocal, and so the graph is undirected. On Twitter, the follower relation is not, and so the graph is directed. Another graph is the network of websites on the Web, where links from one web page to the next form directed edges. Transport networks, communication networks, and electricity grids can be represented as graphs. For the predictive modeler, it turns out that a special class of models known as probabilistic graphical models, or graphical models for short, are models that involve a graph structure. In a graphical model, the nodes represent random variables and the edges in between represent the dependencies between them. Before we can go into further detail, we'll need to take a short detour in order to visit Bayes' Theorem, a classic theorem in statistics that despite its simplicity has implications both profound and practical when it comes to statistical inference and prediction. Summary In this article, we learned that graphs are consist of nodes and edges. We also learned the way of describing a graph is via the adjacency matrix. For more information on graphical models, you can refer to the books published by Packt (https://www.packtpub.com/): Mastering Predictive Analytics with Python (https://www.packtpub.com/big-data-and-business-intelligence/mastering-predictive-analytics-python) R Graphs Cookbook Second Edition (https://www.packtpub.com/big-data-and-business-intelligence/r-graph-cookbook-%E2%80%93-second-edition) Resources for Article: Further resources on this subject: Data Analytics[article] Big Data Analytics[article] Learning Data Analytics with R and Hadoop[article]
article-image-understanding-docker
Packt
17 Feb 2016
26 min read
Save for later

Understanding Docker

Packt
17 Feb 2016
26 min read
This article will cover the Docker basics that you should already have a pretty good handle on. But if you don't already have the required knowledge at this point, this article will help give you the basics. (For more resources related to this topic, see here.) In this article, we're going to review the following higher level topics with subtopics in each section: Understanding Docker Docker versus typical VMs The Dockerfile and its function Docker networking/linking Docker installers/installation Types of installers and how they operate Controlling your Docker daemon The Kitematic GUI Docker commands Useful commands for Docker, Docker images, and Docker containers Understanding Docker In this section, we will be covering the structure of Docker and the flow of what happens behind the scenes in this world. We will also take a look at Dockerfile and all the magic it can do. Lastly, in this section, we will look at the Docker networking/linking. Difference between Docker and typical VMs First, we must know what exactly Docker is and does. Docker is a container management system that helps easily manage Linux Containers (LXC) in an easier and universal fashion. This lets you create images in virtual environments on your laptop and run commands or operations against them. The actions you do to the containers that you run in these environments locally on your own machine will be the same commands or operations you run against them when they are running in your production environment. This helps in not having to do things differently when you go from a development environment like that on your local machine to a production environment on your server. Now, let's take a look at the differences between Docker containers and the typical virtual machine environments. In the following illustration, we can see the typical Docker setup on the right-hand side versus the typical VM setup on the left-hand side: This illustration gives us a lot of insight into the biggest key benefit of Docker; and that is its no need for a full operating system every time we need to bring up a new container, which cuts down on the overall size of containers. Docker relies on using the host OS's Linux kernel (since almost all the versions of Linux use the standard kernel models) for the OS it was built upon, such as Red Hat, CentOS, Ubuntu, and so on. For this reason, you can have almost any Linux OS as your host operating system (Ubuntu in the previous illustration) and be able to layer other OSes on top of the host. For example, in the earlier illustration, we could have Red Hat running for one app (the one on the left) and Debian running for the other app (the one on the right), but there would never be a need to actually install Red Hat or Debian on the host. Thus, another benefit of Docker is the size of images when they are born. They are not built with the largest piece: the kernel or the operating system. This makes them incredibly small, compact, and easy to ship. Dockerfile Next, let's take a look at the most important file pertaining to Docker: Dockerfile. Dockerfile is the core file that contains instructions to be performed when an image is built. For example, in an Ubuntu-based system, if you want to install the Apache package, you would first do an apt-get update followed by an apt-get install -y apache2. These would be the type of instructions you would find inside a typical Dockerfile. 
Items such as commands, calls to other scripts, setting environmental variables, adding files, and setting permissions can all be done via Dockerfile. Dockerfile is also where you specify what image is to be used as your base image for the build. Let's take a look at a very basic Dockerfile and then go over the individual pieces that make one up and what they all do: FROM ubuntu:latest MAINTAINER Scott P. Gallagher <email@somewhere.com> RUN apt-get update && apt-get install -y apache2 ADD 000-default.conf /etc/apache2/sites-available/ RUN chown root:root /etc/apache2/sites-available/000-default.conf EXPOSE 80 CMD ["/usr/sbin/apache2ctl", "-D", "FOREGROUND"] These are the typical items you would find in a basic Dockerfile. The first line states the image we want to start off with when we build the container. In this example, we will be using Ubuntu; the item after the colon can be called if you want a specific version of it. In this case, I am just going to say use the latest version of Ubuntu; but you will also specify trusty, precise, raring, and so on. The second line is the line that is relevant to the maintainer of Dockerfile. In this case, I just have my information in there; well, at least, my name is there. This is for people to contact you if they have any questions or find any errors in your file. Typically, most people just include their name and e-mail address. The next line is a typical line you will see while pulling updates and packages in a Ubuntu environment. You might think they should be separate and wonder why they should be put on the same line separated by &&. Well, in the Dockerfile, it helps by only having to run one process to encompass the entire line. If you were to split it into separate lines, it would have to run one process, finish the process, then start the next process, and finish it. With this, it helps speed up the process by pairing the processes together. They still run one after another, but with more efficiency. The next two lines complement each other. The first adds your custom configurations to the path you specified and changes the owner and group owner to the root user. The EXPOSE line will expose the ports to anything external to the container and to the host it is running on. (This will, by default, expose the container externally beyond the host, unless the firewall is enabled and protecting it.) The last line is the command that is run when the container is launched. This particular command in a Dockerfile should only be used once. If it is used more than once, the last CMD in the Dockerfile will be launched upon the container that is running. This also helps emphasize the one process per container rule. The idea is to spread out the processes so that each process runs in its own container, thus the value of the containers will become more understandable. Essentially, something that runs in the foreground, such as the earlier command to keep the Apache running in the foreground. If we were to use CMD ["service apache2 start"], the container would start and then immediately stop. There is nothing to keep the container running. You can also have other instructions, such as ENV to specify the environmental variables that users can pass upon runtime. These are typically used and are useful while using shell scripts to perform actions such as specifying a database to be created in MySQL or setting permission databases. 
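Once the Dockerfile is saved, you turn it into an image with docker build and start a container from it with docker run. Purely as an illustration (not part of the original article), those two steps can be scripted from Python with the standard library; the my-apache2 tag is just a placeholder:

import subprocess

# Build an image from the Dockerfile in the current directory and tag it.
subprocess.run(["docker", "build", "-t", "my-apache2:latest", "."], check=True)

# Run the freshly built image in the background, mapping host port 8080 to
# the EXPOSEd port 80 inside the container.
subprocess.run(["docker", "run", "-d", "-p", "8080:80", "my-apache2:latest"],
               check=True)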
Docker networking/linking Another important aspect that needs to be understood is how Docker containers are networked or linked together. The way they are networked or linked together highlights another important and large benefit of Docker. When a container is created, it creates a bridge network adapter for which it is assigns an address; it is through these network adapters that the communication flows when you link containers together. Docker doesn't have the need to expose ports to link containers. Let's take a look at it with the help of the following illustration: In the preceding illustration, we can see that the typical VM has to expose ports for others to be able to communicate with each other. This can be dangerous if you don't set up your firewalls or, in this case with MySQL, your MySQL permissions correctly. This can also cause unwanted traffic to the open ports. In the case of Docker, you can link your containers together, so there is no need to expose the ports. This adds security to your setup, as there is now a secure connection between your containers. We've looked at the differences between Docker and typical VMs, as well as the Dockerfile structure and the components that make up the file. We also looked at how Docker containers are linked together for security purposes as opposed to typical VMs. Now, let's review the installers for Docker and the structure behind the installation once they are installed, manipulating them to ensure they are operating correctly. Docker installers/installation Installers are one of the first pieces you need to get up and running with Docker on both your local machine as well as your server environments. Let's first take a look at what environments you can install Docker in: Apple OS X (Mac) Windows Linux (various Linux flavors) Cloud (AWS, DigitalOcean, Microsoft Azure, and so on) Types of installers With the various types of installers listed earlier, there are different ways Docker actually operates on the operating system. Docker natively runs on Linux; so if you are using Linux, then it's pretty straightforward how Docker runs right on your system. However, if you are using Windows or Mac OS X, then it operates a little differently, since it relies on using Linux. With these operating systems, they need Linux in some sort of way, thus enters the virtual machine needed to run the Linux part that Docker operates on, which is called boot2docker. The installers for both Windows and Mac OS X are bundled with the boot2docker package alongside the virtual machine software that, by default, is the Oracle VirtualBox. Now, it is worthwhile to note that Docker recently moved away from offering boot2docker. But, I feel, it is important to understand the boot2docker terms and commands in case you run across anyone running the previous version of the Docker installer. This will help you understand what is going on and move forward to the new installer(s). Currently, they are offering up Docker Toolbox that, like the name implies, includes a lot of items that the installer will install for you. The installers for each OS contain different applications with regards to Docker such as: Docker Toolbox piece Mac OS X Windows Docker Client X X Docker Machine X X Docker Compose X   Docker Kitematic X X VirtualBox X X First, let's take a look at the older style commands of boot2docker. Then, we will take a look at the new commands or application that you can use to achieve these outcomes. 
Controlling the Docker VM (boot2docker) Now, there are ways to run boot2docker on different VM software. But to start off, VirtualBox is the best and easiest way to operate boot2docker: $ boot2docker Usage: boot2docker [<options>] {help|init|up|ssh|save|down|poweroff|reset|restart|config|status|info|ip|shellinit|delete|download|upgrade|version} [<args>] Now, after we have installed Docker on Linux, OS X, or Windows, how do we go about controlling this virtual machine in the events when we need to start it up, restart it, or even shut it down? This is where the boot2docker command-line parameters come into play. As you can see in the earlier illustration, there are a lot of options you can use for your boot2docker instance. The options you will use mostly are up, down, poweroff, restart, status, ip, upgrade, and version. Some of these commands you will use mostly to troubleshoot items when you are trying to see why the Docker commands might hang, or when you run into any other issues with your boot2docker virtual machine. You can see what each command does by executing the following command: $ boot2docker help The most useful command that I have found while troubleshooting is the boot2docker status command: $ boot2docker status Another useful boot2docker command is: $ boot2docker version This command will help see what version of boot2docker you are currently running. This is helpful in knowing when to use the boot2docker upgrade command. The last command we will look at with respect to boot2docker is the boot2docker ip command. This command is very useful when you need to know what IP address is to be used to access the machines you have been running on a particular host: $ boot2docker ip 192.168.59.103 As you can see, the earlier command gives us the IP address of the boot2docker client running on my OS X machine inside VirtualBox. By using this IP, I can now access the containers I might have been running using the IP address alongside any of the open ports I have exposed. Docker Machine – the new boot2docker So, with boot2docker on its way out, there needs to be a new way to do what boot2docker does. This being said, enter Docker Machine. With Docker Machine, you can do the same things you did with boot2docker, but now in Machine. The following table shows the commands you used in boot2docker and what they are now in Machine: Command boot2docker Docker Machine command boot2docker docker-machine help boot2docker help docker-machine help status boot2docker status docker-machine status version boot2docker version docker-machine sionus i ip boot2docker ip docker-machine ip Kitematic Now that we have covered all the basics of controlling your boot2docker VM, let's take a look at another way you can run Docker containers on your local machine. Let's take a look at Kitematic. Kitematic is a recent addition to the Docker portfolio. Up until now, everything we have done has been command line-based. With Kitematic, you can manage your Docker containers through a GUI. Kitematic can be used either on Windows or OS X, just not on Linux; besides who needs a GUI on Linux anyways! Kitematic, just like boot2docker, operates on a VM defaulting to VirtualBox. Pictures are worth a thousand words, so let's take a look at some screenshots of Kitematic: The previous screenshot depicts what you will see when you launch Kitematic for the first time. After you start running the containers, they will show up on the left-hand side column. You can manipulate and get information about them through the GUI. 
You can search for prebuilt images on the Docker Hub and click on the CREATE button once you have found the one you want to use or test. In the preceding screenshot, we have created and are running the hello-world-nginx image inside Kitematic. We can now use the STOP, RESTART, and EXEC commands against the container as well as view the settings of the running container. In the following screenshot, we can go to settings and view what ports are exposed from the container to the outside: In the following screenshot, you can see that you can use your login credentials to log in to the Docker Hub and view the repositories you have created and pushed there: The Docker commands We have covered the types of installers and what they can be run on. We have also seen how to control the Docker VM that gets created for you and how to use Kitematic. Let's look at some Docker commands that you should be familiar with already. We will start with some common commands and then take a peek at the commands that are used for the Docker images. We will then take a dive into the commands that are used for the containers. The first command we will be taking a look at will be one of the most useful commands not only in Docker but in any command-line utility you use—the help command. It is run simply by executing the command as follows: $ docker help The earlier command will give you a full list of all the Docker commands at your disposal and a brief description of what each command does. For further help with a particular command, you can run the following: $ docker COMMAND --help You will then receive additional information on using the command, such as the switches, arguments, and descriptions of the arguments. Similar to the boot2docker version command we ran earlier, there is also a version command for the Docker daemon: $ docker version Now, this command will give us a little bit more information than the boot2docker command output, as follows: Client version: 1.7.0 Client API version: 1.19 Go version (client): go1.4.2 Git commit (client): 0baf609 OS/Arch (client): darwin/amd64 Server version: 1.7.0 Server API version: 1.19 Go version (server): go1.4.2 Git commit (server): 0baf609 OS/Arch (server): linux/amd64 This is helpful when you want to see the version of the Docker daemon you may be running to see if you need/want to upgrade. The Docker images Next, let's take a dive into the Docker images. You will learn how to view the images you currently have that you can run, search for images on the Docker Hub, and pull them down to your environment, so you can run them. Let's first take a look at the docker images command. Upon running the command, we will get an output similar to the following output: REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE ubuntu 14.10 ab57dbafeeea 11 days ago 194.5 MB ubuntu trusty 6d4946999d4f 11 days ago 188.3 MB ubuntu latest 6d4946999d4f 11 days ago 188.3 MB Your output will differ based on whether you have any images at all in your Docker environment or upon what images you do have. There are a few important pieces you need to understand from the output you see. Let's go over the columns and what is contained in each. The first column you see is the REPOSITORY column; this column contains the name of the repository as it exists in the Docker Hub. 
If you were to have a repository that was from someone's user account, it may show up as follows: REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE scottpgallagher/mysql latest 57df9c7989a1 9 weeks ago 321.7 MB The next column, the TAG column, will show you different versions of a repository. As you can see in the preceding example with the Ubuntu repository, there are tag names for the different versions. So, if you want to specify a particular version of a repository in your Dockerfile (as we saw earlier), you are able to. This is useful, so you're not always reliant on having to use the latest version of an operating system and can use the one your application supports the best. It can also help you do backward compatibility testing for your application. The next column is labeled IMAGE ID and it is based on a unique 64 hexadecimal digit string of characters. The image ID simplifies this down to the first 12 digits for easier viewing. Imagine if you had to view all 64 bits on one line! You will learn when to use this unique image ID for later tasks. The last two columns are pretty straightforward; the first being the creation date for the repository, followed by the virtual size of the image. The size is very important as you want to keep or use images that are very small in size if you plan to be moving them around a lot. The smaller the image, the faster is the load time; and who doesn't like it faster? Searching for the Docker images Okay, so let's look at how we can search for the images that are in the Docker Hub using the Docker commands. The command we will be looking at is docker search. With the docker search command, you can search based on the different criteria you are looking for. For example, we can search for all the images with the term ubuntu in them and see what all is available. Here is what we would get back in our results; it would go as follows: $ docker search ubuntu We would get back our results: NAME DESCRIPTION STARS OFFICIAL AUTOMATED ubuntu Ubuntu is a Debian-based Linux operating s... 1835 [OK] ubuntu-upstart Upstart is an event-based replacement for ... 26 [OK] tutum/ubuntu Ubuntu image with SSH access. For the root... 25 [OK] torusware/speedus-ubuntu Always updated official Ubuntu docker imag... 25 [OK] ubuntu-debootstrap debootstrap --variant=minbase --components... 10 [OK] rastasheep/ubuntu-sshd Dockerized SSH service, built on top of of... 4 [OK] maxexcloo/ubuntu Docker base image built on Ubuntu with Sup... 2 [OK] nuagebec/ubuntu Simple always updated Ubuntu docker images... 2 [OK] nimmis/ubuntu This is a docker images different LTS vers... 1 [OK] alsanium/ubuntu Ubuntu Core image for Docker 1 [OK] Based on these results, we can now decipher some information. We can see the name of the repository, a reduced description, how many people have starred and think it is a good repository, whether it's an official repository; which means it's been approved by the Docker team, as well as if it's an automated build. An automated build is typically a Docker image that is built automatically when a Git repository it is linked to is updated. The code gets updated, the web hook is called, and a new Docker image is built in the Docker Hub. If we find an image we want to use, we can simply pull it using its repository name with the docker pull command, as follows: $ docker pull tutum/ubuntu The image will be downloaded and show up in our list when we perform the docker images command we ran earlier. 
We now know how to search for Docker images and pull them down to our machine. What if we want to get rid of them? That's where the docker rmi command comes into play. With the docker rmi command, you can remove unwanted images from your machine(s). So, let's take look at the images we currently have on our machine with the docker images command. We will get the following: REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE ubuntu 14.10 ab57dbafeeea 11 days ago 194.5 MB ubuntu trusty 6d4946999d4f 11 days ago 188.3 MB ubuntu latest 6d4946999d4f 11 days ago 188.3 MB We can see that we have duplicate images here taking up space. We can see this by looking at the image ID and seeing the exact image ID for both ubuntu:trusty and ubuntu:latest. We now know that ubuntu:trusty is the latest Ubuntu image, so there is no need to keep them both around. Let's free up some space by removing ubuntu:trusty and just keeping ubuntu:latest. We do this by using the docker rmi command, as follows: $ docker rmi ubuntu:trusty If you issue the docker images command now, you will see that ubuntu:trusty no longer shows up in your images list and has been removed. Now, you can remove machines based on their image ID as well. But be careful while you do so; in this scenario, not only will you remove ubuntu:trusty, but you will also remove ubuntu:latest as they have the same image ID. Manipulating the Docker images We have gone over the images and know how to obtain and manipulate them in some ways. Next, we are going to take a look at what it takes to fire them up and manipulate them. This is the part where the images become containers! Let's first go over the basics of the docker run command and how to run containers. We will cover some basic docker run items in this article. So, let's just look at how to get images up, running, and turned into containers. The most basic way to run a container is as follows: $ docker run -i -t <image_name>:<tag> /bin/bash Upon closer inspection of the earlier command, we start off with the docker run command, followed by two switches: -i and -t. The -i gives us an interactive shell into the running container, the -t will allocate a pseudo-tty that, while using interactive processes, must be used together with the -i switch. You can also use switches together; for example, -it is commonly used for these two switches. This will help you test the container to see how it operates before running it as a daemon. Once you are comfortable with your container, you can test how it operates in the daemon mode: $ docker run -d <image_name>:<tag> If the container is set up correctly and has an entry point setup, you should be able to see the running container by issuing the docker ps command. You will see something similar to the following: $ docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES cc1fefcfa098 ubuntu:14.10 "/bin/bash" 3 seconds ago Up 3 seconds boring_mccarthy Based on the earlier command, we get a lot of other important information indicating that the container is running. We can see the container ID, the image name that is running, the command that is running to keep the image alive, when the container started, its current status, if any ports were exposed they would be listed here, as well as the name given to the container. Now, these names are random, unless it is specified otherwise by the --name= switch. 
You can also expose ports on your containers by using the -p switch, as follows:

$ docker run -d -p <host_port>:<container_port> <image>:<tag>
$ docker run -d -p 8080:80 ubuntu:14.10

This will run the ubuntu:14.10 container in the daemonized mode, exposing port 8080 on the Docker host and mapping it to port 80 on the running container:

CONTAINER ID   IMAGE          COMMAND       CREATED         STATUS         PORTS                  NAMES
55cfdcb6beb6   ubuntu:14.10   "/bin/bash"   2 seconds ago   Up 2 seconds   0.0.0.0:8080->80/tcp   babbage

Now, there will come a time when containers don't want to behave. For this, you can see the issues you have by using the docker logs command. The command is very straightforward: you specify the container whose logs you want to see. For this command, you need to use the container ID or the name of the container from the docker ps output:

$ docker logs 55cfdcb6beb6

Or:

$ docker logs babbage

You can also get this ID when you first initiate the docker run command:

$ docker run -d ubuntu:14.10 /bin/bash
da92261485db98c7463fffadb43e3f684ea9f47949f287f92408fd0f3e4f2bad

Stopping containers

Now, let's take a look at how we can stop these containers. For various reasons, we would want to do this. There are a few commands we could use: docker kill, docker stop, docker pause, and docker unpause. Let's cover them briefly, as they are fairly straightforward. First, let's look at the difference between docker kill and docker stop. The docker kill command will do just that: kill the container immediately. For a graceful shutdown of the container, you would want to use the docker stop command. Mostly, when you are testing, you will be using docker kill. When you're in your production environments, you will want to use docker stop to ensure you don't corrupt any data you might have in the Docker volumes. The commands are used exactly like the docker logs command, where you can use the container ID, the random name given to the container, or the one you might have specified with the --name= switch.

Now, let's take a dive into how we can execute some commands, view information on our running containers, and manipulate them in a small sense. The first thing we want to take a look at, which will make things a little easier with the upcoming commands, is the docker rename command. With the docker rename command, we can change the name that has been randomly generated for the container. When we performed the docker run command, a random name was assigned to our container; most times, these names are fine. But if you are looking for an easy way to manage the containers, a name can sometimes be easier to remember. For this, you can use the docker rename command as follows:

$ docker rename <current_container_name> <new_container_name>

Now that we have an easily recognizable and memorable name, let's take a peek inside our containers with the docker stats and docker top commands, taking them in order:

$ docker stats <container_name>
CONTAINER   CPU %   MEM USAGE/LIMIT     MEM %   NET I/O
web1        0.00%   1.016 MB/2.099 GB   0.05%   0 B/0 B

The other command, docker top, provides a list of all running processes inside the container.
Again, we can use the name of the container to pull the information:

$ docker top <container_name>

We will receive an output similar to the following, based on what processes are running inside the container:

UID    PID    PPID   C   STIME   TTY     TIME       CMD
root   8057   1380   0   13:02   pts/0   00:00:00   /bin/bash

We can see who is running the process (in this case, the root user), the command being run (in this case, /bin/bash), as well as other information that might be useful.

Lastly, let's cover how we can remove containers. In the same way we removed images earlier with the docker rmi command, we can use the docker rm command to remove unwanted containers. This is useful if you want to reuse a name you provided to a container:

$ docker rm <container_name>

Summary

In this article, we have gone over the basics of what Docker is and how it compares to typical virtual machines. We looked at the Dockerfile structure and the networking and linking of containers. We went over the installers, how they operate on different operating systems, and how to control them through the command line. We briefly looked at the latest Docker addition, Kitematic, for those interested in a GUI version for Windows or OS X. Then, we took a small but deep dive into the basic Docker commands to get you started.

Resources for Article:

Further resources on this subject:

Introduction to Docker [article]
Docker in Production [article]
Speeding Vagrant Development With Docker [article]
Developing a Basic Site with Node.js and Express

Packt
17 Feb 2016
21 min read
In this article, we will continue with the Express framework. It's one of the most popular frameworks available and is certainly a pioneering one. Express is still widely used and several developers use it as a starting point.

(For more resources related to this topic, see here.)

Getting acquainted with Express

Express (http://expressjs.com/) is a web application framework for Node.js. It is built on top of Connect (http://www.senchalabs.org/connect/), which means that it implements middleware architecture. In the previous chapter, when exploring Node.js, we discovered the benefit of such a design decision: the framework acts as a plugin system. Thus, we can say that Express is suitable for not only simple but also complex applications because of its architecture. We may use only some of the popular types of middleware or add a lot of features and still keep the application modular.

In general, most projects in Node.js perform two functions: run a server that listens on a specific port, and process incoming requests. Express is a wrapper for these two functionalities. The following is basic code that runs the server:

var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(1337, '127.0.0.1');
console.log('Server running at http://127.0.0.1:1337/');

This is an example extracted from the official documentation of Node.js. As shown, we use the native module http and run a server on the port 1337. There is also a request handler function, which simply sends the Hello world string to the browser. Now, let's implement the same thing but with the Express framework, using the following code:

var express = require('express');
var app = express();
app.get("/", function(req, res, next) {
  res.send("Hello world");
}).listen(1337);
console.log('Server running at http://127.0.0.1:1337/');

It's pretty much the same thing. However, we don't need to specify the response headers or add a new line at the end of the string because the framework does it for us. In addition, we have a bunch of middleware available, which will help us process the requests easily. Express is like a toolbox. We have a lot of tools to do the boring stuff, allowing us to focus on the application's logic and content. That's what Express is built for: saving time for the developer by providing ready-to-use functionalities.

Installing Express

There are two ways to install Express. We'll start with the simple one and then proceed to the more advanced technique. The simpler approach generates a template, which we may use to start writing the business logic directly. In some cases, this can save us time. From another viewpoint, if we are developing a custom application, we need to use custom settings. We can also use the boilerplate, which we get with the advanced technique; however, it may not work for us.

Using package.json

Express is like every other module. It has its own place in the packages register. If we want to use it, we need to add the framework in the package.json file. The ecosystem of Node.js is built on top of the Node Package Manager. It uses the JSON file to find out what we need and installs it in the current directory.
So, the content of our package.json file looks like the following code:

{
  "name": "projectname",
  "description": "description",
  "version": "0.0.1",
  "dependencies": {
    "express": "3.x"
  }
}

These are the required fields that we have to add. To be more accurate, we have to say that the mandatory fields are name and version. However, it is always good to add descriptions to our modules, particularly if we want to publish our work in the registry, where such information is extremely important. Otherwise, the other developers will not know what our library is doing. Of course, there are a bunch of other fields, such as contributors, keywords, or development dependencies, but we will stick to limited options so that we can focus on Express.

Once we have our package.json file placed in the project's folder, we have to call npm install in the console. By doing so, the package manager will create a node_modules folder and will store Express and its dependencies there. At the end of the command's execution, we will see something like the following screenshot:

The first line shows us the installed version, and the following lines are the modules that Express depends on. Now, we are ready to use Express. If we type require('express'), Node.js will start looking for that library inside the local node_modules directory. Since we are not using absolute paths, this is normal behavior. If we miss running the npm install command, we will be prompted with Error: Cannot find module 'express'.

Using a command-line tool

There is a command-line instrument called express-generator. Once we run npm install -g express-generator, we will be able to use it like every other command in our terminal. If you use the framework in several projects, you will notice that some things are repeated. We can even copy and paste them from one application to another, and this is perfectly fine. We may even end up with our own boilerplate and can always start from there. The command-line version of Express does the same thing. It accepts a few arguments and, based on them, creates a skeleton to start from. This can be very handy in some cases and will definitely save some time. Let's have a look at the available arguments:

-h, --help: This outputs usage information.
-V, --version: This shows the version of Express.
-e, --ejs: This argument adds the EJS template engine support. Normally, we need a library to deal with our templates. Writing pure HTML is not very practical. The default engine is set to Jade.
-H, --hogan: This argument enables Hogan (another template engine).
-c, --css: If we want to use a CSS preprocessor, this option lets us use LESS (short for Leaner CSS) or Stylus. The default is plain CSS.
-f, --force: This forces Express to operate on a nonempty directory.

Let's try to generate an Express application skeleton with LESS as a CSS preprocessor. We use the following command:

express --css less myapp

A new myapp folder is created with the file structure, as seen in the following screenshot:

We still need to install the dependencies, so cd myapp && npm install is required. We will skip the explanation of the generated directories for now and will move to the created app.js file.
It starts with initializing the module dependencies, as follows:

var express = require('express');
var path = require('path');
var favicon = require('static-favicon');
var logger = require('morgan');
var cookieParser = require('cookie-parser');
var bodyParser = require('body-parser');
var routes = require('./routes/index');
var users = require('./routes/users');
var app = express();

Our framework is express, and path is a native Node.js module. The middleware are favicon, logger, cookieParser, and bodyParser. The routes and users are custom-made modules, placed in local folders of the project. Similarly, as in the Model-View-Controller (MVC) pattern, these are the controllers for our application. Immediately after, an app variable is created; this represents the Express library. We use this variable to configure our application. The script continues by setting some key-value pairs. The next code snippet defines the path to our views and the default template engine:

app.set('views', path.join(__dirname, 'views'));
app.set('view engine', 'jade');

The framework uses the methods set and get to define the internal properties. In fact, we may use these methods to define our own variables. If the value is a Boolean, we can replace set and get with enable and disable. For example, see the following code:

app.set('color', 'red');
app.get('color'); // red
app.enable('isAvailable');

The next code adds middleware to the framework. We can see the code as follows:

app.use(favicon());
app.use(logger('dev'));
app.use(bodyParser.json());
app.use(bodyParser.urlencoded());
app.use(cookieParser());
app.use(require('less-middleware')({ src: path.join(__dirname, 'public') }));
app.use(express.static(path.join(__dirname, 'public')));

The first middleware serves the favicon of our application. The second is responsible for the output in the console. If we remove it, we will not get information about the incoming requests to our server. The following is a simple output produced by logger:

GET / 200 554ms - 170b
GET /stylesheets/style.css 200 18ms - 110b

The json and urlencoded middleware are related to the data sent along with the request. We need them because they convert the information into an easy-to-use format. There is also a middleware for the cookies. It populates the request object, so we later have access to the required data. The generated app uses LESS as a CSS preprocessor, and we need to configure it by setting the directory containing the .less files. Eventually, we define our static resources, which should be delivered by the server. These are just a few lines, but we've configured the whole application. We may remove or replace some of the modules, and the others will continue working.

The next code in the file maps two defined routes to two different handlers, as follows:

app.use('/', routes);
app.use('/users', users);

If the user tries to open a missing page, Express still processes the request by forwarding it to the error handler, as follows:

app.use(function(req, res, next) {
  var err = new Error('Not Found');
  err.status = 404;
  next(err);
});

The framework suggests two types of error handling: one for the development environment and another for the production server. The difference is that the second one hides the stack trace of the error, which should be visible only to the developers of the application.
As we can see in the following code, we are checking the value of the env property and handling the error differently:

// development error handler
if (app.get('env') === 'development') {
  app.use(function(err, req, res, next) {
    res.status(err.status || 500);
    res.render('error', {
      message: err.message,
      error: err
    });
  });
}

// production error handler
app.use(function(err, req, res, next) {
  res.status(err.status || 500);
  res.render('error', {
    message: err.message,
    error: {}
  });
});

At the end, the app.js file exports the created Express instance, as follows:

module.exports = app;

To run the application, we need to execute node ./bin/www. The code requires app.js and starts the server, which by default listens on port 3000:

#!/usr/bin/env node
var debug = require('debug')('my-application');
var app = require('../app');

app.set('port', process.env.PORT || 3000);

var server = app.listen(app.get('port'), function() {
  debug('Express server listening on port ' + server.address().port);
});

The process.env declaration provides access to variables defined in the current development environment. If there is no PORT setting, Express uses 3000 as the value. The required debug module uses a similar approach to find out whether it has to show messages in the console.

Managing routes

The input of our application is the routes. The user visits our page at a specific URL and we have to map this URL to specific logic. In the context of Express, this can be done easily, as follows:

var controller = function(req, res, next) {
  res.send("response");
}
app.get('/example/url', controller);

We even have control over the HTTP method, that is, we are able to catch POST, PUT, or DELETE requests. This is very handy if we want to retain the address path but apply different logic. For example, see the following code:

var getUsers = function(req, res, next) {
  // ...
}
var createUser = function(req, res, next) {
  // ...
}
app.get('/users', getUsers);
app.post('/users', createUser);

The path is still the same, /users, but if we make a POST request to that URL, the application will try to create a new user. Otherwise, if the method is GET, it will return a list of all the registered members. There is also a method, app.all, which we can use to handle all the method types at once. We can see this method in the following code snippet:

app.all('/', serverHomePage);

There is something interesting about the routing in Express. We may pass not just one but many handlers. This means that we can create a chain of functions that correspond to one URL. For example, if we need to know whether the user is logged in, there is a module for that. We can add another method that validates the current user and attaches a variable to the request object, as follows:

var isUserLogged = function(req, res, next) {
  req.userLogged = Validator.isCurrentUserLogged();
  next();
}
var getUser = function(req, res, next) {
  if(req.userLogged) {
    res.send("You are logged in. Hello!");
  } else {
    res.send("Please log in first.");
  }
}
app.get('/user', isUserLogged, getUser);

The Validator class is a class that checks the current user's session; a minimal sketch of such a module is shown below. The idea is simple: we add another handler, which acts as an additional middleware. After performing the necessary actions, we call the next function, which passes the flow to the next handler, getUser. Because the request and response objects are the same for all the middleware, we have access to the userLogged variable. This is what makes Express really flexible.
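The Validator module itself is never defined in the text. The following is only a hypothetical sketch of what it could look like, assuming the session object is handed to it before the check is performed; the names Validator, init, and isCurrentUserLogged are illustrative and are not part of Express or of the original example:

// validator.js: a hypothetical session checker used by the isUserLogged middleware
var session = null;

module.exports = {
  // called once per request, passing the request's session object
  init: function(currentSession) {
    session = currentSession;
  },
  // returns true only when the session exists and is flagged as logged in
  isCurrentUserLogged: function() {
    return !!(session && session.loggedIn === true);
  }
};

With such a sketch, the isUserLogged middleware would call Validator.init(req.session) before asking isCurrentUserLogged(); the point is simply that the middleware delegates the session check to a separate module instead of doing it inline.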
There are a lot of great features available, but they are optional. At the end of this chapter, we will make a simple website that implements the same logic.

Handling dynamic URLs and the HTML forms

The Express framework also supports dynamic URLs. Let's say we have a separate page for every user in our system. The address of those pages looks like the following:

/user/45/profile

Here, 45 is the unique number of the user in our database. It's of course normal to use one route handler for this functionality. We can't really define different functions for every user. The problem can be solved by using the following syntax:

var getUser = function(req, res, next) {
  res.send("Show user with id = " + req.params.id);
}
app.get('/user/:id/profile', getUser);

The route is actually like a regular expression with variables inside. Later, that variable is accessible in the req.params object. We can have more than one variable. Here is a slightly more complex example:

var getUser = function(req, res, next) {
  var userId = req.params.id;
  var actionToPerform = req.params.action;
  res.send("User (" + userId + "): " + actionToPerform);
}
app.get('/user/:id/profile/:action', getUser);

If we open http://localhost:3000/user/451/profile/edit, we see User (451): edit as a response. This is how we can get a nice-looking, SEO-friendly URL.

Of course, sometimes we need to pass data via the GET or POST parameters. We may have a request like http://localhost:3000/user?action=edit. To parse it easily, we need to use the native url module, which has a few helper functions for parsing URLs:

var getUser = function(req, res, next) {
  var url = require('url');
  var url_parts = url.parse(req.url, true);
  var query = url_parts.query;
  res.send("User: " + query.action);
}
app.get('/user', getUser);

Once the module parses the given URL, our GET parameters are stored in the .query object. The POST variables are a bit different. We need a new middleware to handle that. Thankfully, Express has one, which is as follows:

app.use(express.bodyParser());
var getUser = function(req, res, next) {
  res.send("User: " + req.body.action);
}
app.post('/user', getUser);

The express.bodyParser() middleware populates the req.body object with the POST data. Of course, we have to change the HTTP method from .get to .post or .all. If we want to read cookies in Express, we may use the cookieParser middleware. Similar to the body parser, it should also be installed and added to the package.json file. The following example sets up the middleware and demonstrates its usage:

var cookieParser = require('cookie-parser');
app.use(cookieParser('optional secret string'));
app.get('/', function(req, res, next){
  var prop = req.cookies.propName;
});

Returning a response

Our server accepts requests, does some stuff, and finally, sends the response to the client's browser. This can be HTML, JSON, XML, or binary data, among others. As we know, by default, every middleware in Express accepts two objects, request and response. The response object has methods that we can use to send an answer to the client. Every response should have a proper content type or length. Express simplifies the process by providing functions to set the HTTP headers and send content to the browser. In most cases, we will use the .send method, as follows:

res.send("simple text");

When we pass a string, the framework sets the Content-Type header to text/html. It's good to know that if we pass an object or array, the content type is application/json.
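To make that last point concrete, here is a small sketch that responds with an object instead of a string, so Express serializes it to JSON and sets the application/json content type for us. The route path and the object's fields are made up purely for illustration:

var express = require('express');
var app = express();

// Responding with an object: Express serializes it to JSON
// and sets Content-Type: application/json automatically.
app.get('/api/user', function(req, res, next) {
  res.send({ id: 451, name: "John", action: "edit" });
});

app.listen(3000);

Opening http://localhost:3000/api/user in the browser should show the JSON representation of the object rather than an HTML page.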
If we develop an API, the response status code is probably going to be important for us. With Express, we are able to set it as in the following code snippet:

res.send(404, 'Sorry, we cannot find that!');

It's even possible to respond with a file from our hard disk. If we don't use the framework, we will need to read the file, set the correct HTTP headers, and send the content. However, Express offers the .sendfile method, which wraps all these operations as follows:

res.sendfile(__dirname + "/images/photo.jpg");

Again, the content type is set automatically; this time it is based on the filename's extension.

When building websites or applications with a user interface, we normally need to serve HTML. Sure, we can write it manually in JavaScript, but it's good practice to use a template engine. This means we save everything in external files and the engine reads the markup from there. It populates them with some data and, at the end, provides ready-to-show content. In Express, the whole process is summarized in one method, .render. However, to work properly, we have to instruct the framework regarding which template engine to use. We already talked about this in the beginning of this chapter. The following two lines of code set the path to our views and the template engine:

app.set('views', path.join(__dirname, 'views'));
app.set('view engine', 'jade');

Let's say we have the following template (/views/index.jade):

h1= title
p Welcome to #{title}

Express provides a method to serve templates. It accepts the path to the template, the data to be applied, and a callback. To render the previous template, we should use the following code:

res.render("index", {title: "Page title here"});

The HTML produced looks as follows:

<h1>Page title here</h1><p>Welcome to Page title here</p>

If we pass a third parameter, a function, we will have access to the generated HTML. However, it will not be sent as a response to the browser.

The example: a logging system

We've seen the main features of Express. Now let's build something real. The next few pages present a simple website where users can read the content only if they are logged in. Let's start by setting up the application. We are going to use Express' command-line instrument. It should be installed using npm install -g express-generator. We create a new folder for the example, navigate to it via the terminal, and execute express --css less site. A new directory, site, will be created. If we go there and run npm install, Express will download all the required dependencies. As we saw earlier, by default, we have two routes and two controllers. To simplify the example, we will use only the first one: app.use('/', routes). Let's change the views/index.jade file content to the following markup:

doctype html
html
  head
    title= title
    link(rel='stylesheet', href='/stylesheets/style.css')
  body
    h1= title
    hr
    p That's a simple application using Express.

Now, if we run node ./bin/www and open http://127.0.0.1:3000, we will see the page. Jade uses indentation to parse our template, so we should not mix tabs and spaces. Otherwise, we will get an error.

Next, we need to protect our content. We check whether the current user has a session created; if not, a login form is shown. It's the perfect time to create a new middleware. To use sessions in Express, we install an additional module: express-session.
We need to open our package.json file and add the following line of code:

"express-session": "~1.0.0"

Once we do that, a quick run of npm install will bring the module into our application. All we have to do is use it. The following code goes into app.js:

var session = require('express-session');
app.use(session({ secret: 'app', cookie: { maxAge: 60000 }}));
var verifyUser = function(req, res, next) {
  if(req.session.loggedIn) {
    next();
  } else {
    res.send("show login form");
  }
}
app.use('/', verifyUser, routes);

Note that we changed the original app.use('/', routes) line. The session middleware is initialized and added to Express. The verifyUser function is called before the page rendering. It uses the req.session object and checks whether there is a loggedIn variable defined and whether its value is true. If we run the script again, we will see that the show login form text is shown for every request. It's like this because no code sets the session exactly the way we want it. We need a form where users can type their username and password. We will process the result of the form and, if the credentials are correct, the loggedIn variable will be set to true. Let's create a new Jade template, /views/login.jade:

doctype html
html
  head
    title= title
    link(rel='stylesheet', href='/stylesheets/style.css')
  body
    h1= title
    hr
    form(method='post')
      label Username:
      br
      input(type='text', name='username')
      br
      label Password:
      br
      input(type='password', name='password')
      br
      input(type='submit')

Instead of sending just a text with res.send("show login form");, we should render the new template, as follows:

res.render("login", {title: "Please log in."});

We choose POST as the method for the form. So, we need to add the middleware that populates the req.body object with the user's data, as follows:

app.use(bodyParser());

We then process the submitted username and password as follows:

var verifyUser = function(req, res, next) {
  if(req.session.loggedIn) {
    next();
  } else {
    var username = "admin", password = "admin";
    if(req.body.username === username &&
       req.body.password === password) {
      req.session.loggedIn = true;
      res.redirect('/');
    } else {
      res.render("login", {title: "Please log in."});
    }
  }
}

The valid credentials are set to admin/admin. In a real application, we may need to access a database or get this information from another place. It's not really a good idea to place the username and password in the code; however, for our little experiment, it is fine (a sketch of reading them from environment variables appears at the end of this section). The previous code checks whether the passed data matches our predefined values. If everything is correct, it sets the session, after which the user is forwarded to the home page.

Once you log in, you should be able to log out. Let's add a link for that just after the content on the index page (views/index.jade):

a(href='/logout') logout

Once users click on this link, they will be forwarded to a new page. We just need to create a handler for the new route, remove the session, and forward them to the index page where the login form is shown again. Here is what our logging-out handler looks like:

// in app.js
var logout = function(req, res, next) {
  req.session.loggedIn = false;
  res.redirect('/');
}
app.all('/logout', logout);

Setting loggedIn to false is enough to make the session invalid. The redirect sends users to the same content page they came from. However, this time, the content is hidden and the login form pops up.
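As promised above, here is one way to avoid hard-coding the credentials. This is only a minimal sketch, assuming the values are supplied through the APP_USERNAME and APP_PASSWORD environment variables (names invented for this example); it is not part of the original application:

// Read the expected credentials from the environment instead of the source code.
// Falling back to admin/admin keeps the little experiment working locally.
var username = process.env.APP_USERNAME || "admin";
var password = process.env.APP_PASSWORD || "admin";

var verifyUser = function(req, res, next) {
  if(req.session.loggedIn) {
    next();
  } else if(req.body.username === username && req.body.password === password) {
    req.session.loggedIn = true;
    res.redirect('/');
  } else {
    res.render("login", {title: "Please log in."});
  }
}

The server could then be started with something like APP_USERNAME=jane APP_PASSWORD=secret node ./bin/www, mirroring how the generated skeleton already reads process.env.PORT.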
Summary

In this article, we learned about one of the most widely used Node.js frameworks, Express. We discussed its fundamentals, how to set it up, and its main characteristics. The middleware architecture, which we mentioned in the previous chapter, is the base of the library and gives us the power to write complex but, at the same time, flexible applications. The example we used was a simple one. We required a valid session to provide page access. However, it illustrates the usage of the body parser middleware and the process of registering new routes. We also updated the Jade templates and saw the results in the browser.

For more information on Node.js, refer to the following URLs:

https://www.packtpub.com/web-development/instant-nodejs-starter-instant
https://www.packtpub.com/web-development/learning-nodejs-net-developers
https://www.packtpub.com/web-development/nodejs-essentials

Resources for Article:

Further resources on this subject:

Writing a Blog Application with Node.js and AngularJS [article]
Testing in Node and Hapi [article]
Learning Node.js for Mobile Application Development [article]