
WildFly – the Basics

Packt
21 Jul 2015
6 min read
In this article, Luigi Fugaro, author of the book WildFly Cookbook, explains that JBoss.org is a huge community in which people all over the world develop, test, and document pieces of code. It hosts a lot of projects, not just JBoss AS, which is now WildFly. To mention a few: Infinispan, Undertow, PicketLink, Arquillian, HornetQ, RESTEasy, AeroGear, and Vert.x. For a complete list of all projects, visit http://www.jboss.org/projects/.

Setting marketing reasons aside, as there is no preferred project, the community wanted to change the name of the JBoss AS project to something that would not collide with the community name. There was also another reason: the Red Hat-supported version is named JBoss Enterprise Application Platform (EAP), which was another point in favor of replacing the JBoss AS name.

Software prerequisites

WildFly runs on top of the Java platform. It needs at least Java Runtime Environment (JRE) version 1.7 to run (any further reference to versions 1.7 and 7 should be considered equal; the same applies to 1.8 and 8), but it also works perfectly with the latest JRE version 8. As we will also need to compile and build Java web applications, we will need the Java Development Kit (JDK), which provides the necessary tools to work with Java source code. In the JDK panorama, we can find the Oracle JDK, developed and maintained by Oracle, and OpenJDK, which relies on contributions from the community.

Nevertheless, since April 2015, Oracle no longer posts updates of Java SE 7 to its public download sites, as mentioned at http://www.oracle.com/technetwork/java/javase/downloads/eol-135779.html. Also, bear in mind that Java Critical Patch Updates are released on a quarterly basis; therefore, for reasons of stability and feature support, we will use the Oracle JDK 8, which is freely available for download at http://www.oracle.com/technetwork/java/javase/downloads/index.html.

At the time of writing, the latest stable Oracle JDK is version 1.8.0_31 (8u31). From here on, every reference to the Java Virtual Machine (JVM), Java, JRE, and JDK is intended to mean Oracle JDK 1.8.0_31. To keep things simple, if you don't mind, use that same version.

In addition to the JDK, we will need Apache Maven 3, which is a build tool for Java projects. It is freely available for download at http://maven.apache.org/download.cgi. A generic download link can be found at http://www.us.apache.org/dist/maven/maven-3/3.2.5/binaries/apache-maven-3.2.5-bin.tar.gz.

Downloading and installing WildFly

In this article, you will learn how to get and install WildFly. As always in the open source world, you can do the same thing in different ways: WildFly can be installed using your preferred software manager, or by downloading the bundle provided by the http://wildfly.org/ site. As with the JDK, we will choose the second way.

Getting ready

Just open your favorite browser and point it to http://wildfly.org/downloads/. You should see WildFly's download page. Now download the latest version into our WFC folder.

How to do it…

Once the download is complete, open a terminal and extract its contents into the WFC folder by executing the following command:

    $ cd ~/WFC && tar zxf wildfly-9.0.0.Beta2.tar.gz

The preceding command first points to our WildFly Cookbook folder and then extracts the WildFly archive into it.
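Before moving on, it is worth double-checking that the prerequisites from the previous section are actually in place. This quick check is not part of the original recipe and assumes the Oracle JDK 8 and Maven 3 archives have already been installed and added to your PATH:

```bash
$ java -version     # should report a 1.8.0_xx version
$ javac -version    # confirms the JDK (compiler), not just the JRE, is present
$ mvn -version      # should report Apache Maven 3.x
```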
Listing our WFC folder, we should find the newly created WildFly folder named wildfly-9.0.0.Beta2. To better remember and handle WildFly's installation directory, rename it to wildfly, as follows:

    $ cd ~/WFC && mv wildfly-9.0.0.Beta2 wildfly

By the way, WildFly can also be installed using YUM, Fedora's traditional software manager. In a production environment, you would not place the WildFly installation directory in the home folder of a specific user; rather, you would place it in a different path, depending on the context you are working in.

Now we need to create the JBOSS_HOME environment variable, which is used by WildFly itself as the base directory when it starts up (in a future release, this will be updated to WILDFLY_HOME). We will also create the WILDFLY_HOME environment variable, which we will use throughout the whole book to reference WildFly's installation directory. Thus, open the .bash_profile file in your home folder with your favorite text editor and add the following directives:

    export JBOSS_HOME=~/WFC/wildfly
    export WILDFLY_HOME=$JBOSS_HOME

For the changes to take effect, you can either log out and log back in, or just issue the following command:

    $ source ~/.bash_profile

Your .bash_profile file should now contain the two export directives shown above.

Understanding the WildFly directory overview

Now that we have finished installing WildFly, let's look into its folders.

How to do it…

Open your terminal and run the following commands:

    $ cd $WILDFLY_HOME
    $ pwd && ls -la

The output of these commands lists WildFly's top-level folders.

How it works…

WildFly's folders on the filesystem are outlined in the following list:

- appclient: Configuration files, deployment content, and writable areas used by the application client container run from this installation.
- bin: The startup scripts, startup configuration files, and various command-line utilities, such as Vault, add-user, and the Java diagnostic report, available for Unix and Windows environments.
- bin/client: A client JAR for use by non-Maven-based clients.
- docs/schema: XML schema definition files.
- docs/examples/configs: Example configuration files representing specific use cases.
- domain: Configuration files, deployment content, and writable areas used by the domain-mode processes run from this installation.
- modules: WildFly is based on a modular class-loading architecture; the various modules used by the server are stored here.
- standalone: Configuration files, deployment content, and writable areas used by the single standalone server run from this installation.
- welcome-content: The default welcome page content.

In the preceding list, note in particular the domain and standalone folders, which determine the mode in which WildFly will run: standalone or domain. From here on, whenever mentioned, WildFly's home is intended as $WILDFLY_HOME. (A quick smoke test of the installation is sketched at the end of this article.)

Summary

In this article, we covered the software prerequisites of WildFly, downloading and installing WildFly, and an overview of WildFly's folder structure.

Resources for Article:

Further resources on this subject:
- Various subsystem configurations [article]
- Creating a JSF composite component [article]
- Introducing PrimeFaces [article]
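As a final smoke test, which is not part of the original recipe and assumes the default standalone configuration, you can boot the server and confirm it answers on its default HTTP port:

```bash
$ cd $WILDFLY_HOME/bin
$ ./standalone.sh            # boots WildFly in standalone mode
# In a second terminal, the welcome page should answer on port 8080:
$ curl -I http://localhost:8080/
# Stop the server with Ctrl+C when you are done
```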


More about Julia

Packt
21 Jul 2015
28 min read
In this article by Malcolm Sherrington, author of the book Mastering Julia, we ask: why write a book on Julia when the language has not yet reached the v1.0 stage? That was the first question that needed to be addressed when deciding on the contents and philosophy behind the book.

At the time, Julia was at v0.2; it is now soon to achieve a stable v0.4, and the blueprint for v0.5 is already being touted. There were some common misconceptions that I wished to address:

- It is a language designed for geeks
- Its main attribute, possibly its only one, is its speed
- It is a scientific language, primarily a MATLAB clone
- It is not as easy to use as alternatives such as Python and R
- There is not enough library support to tackle enterprise solutions

In fact, none of these apply to Julia. True, it is a relatively young programming language. The initial design work on the Julia project began at MIT in August 2009, and by February 2012 it had become open source. It is largely the work of three developers: Stefan Karpinski, Jeff Bezanson, and Viral Shah.

Initially, Julia was envisaged by its designers as a scientific language fast enough to remove the need to model in an interactive language and subsequently redevelop in a compiled language, such as C or Fortran. To achieve this, Julia code is transformed to the underlying machine code of the computer using the Low Level Virtual Machine (LLVM) compilation system, at the time itself a new project. This was a masterstroke. LLVM is now the basis of a variety of systems: the Apple C compiler (clang) uses it, Google's V8 JavaScript engine and Mozilla's Rust language use it, and Python is attempting to achieve significant increases in speed with its numba module. In Julia, LLVM always works; there are no exceptions, because it has to.

When launched, the community itself possibly saw Julia as a replacement for MATLAB, but that proved not to be the case. The syntax of Julia is similar to MATLAB's, so much so that anyone competent in the latter can easily learn Julia, but it is a much richer language with many significant differences. The task of the book was to focus on these. In particular, my target audience was the data scientist and the programmer-analyst, but there is sufficient material for the "jobbing" C++ and Java programmer.

Julia's features

The Julia programming language is free and open source (MIT licensed), and the source is available on GitHub. To the veteran programmer, it has a look and feel similar to MATLAB. Blocks created by for, while, and if statements are all terminated by end rather than by endfor, endwhile, and endif, or by using the familiar {} style syntax. However, it is not a MATLAB clone, and sources written for MATLAB will not run on Julia.
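To make the block syntax concrete, here is a trivial illustration (my own sketch, not an example from the book):

```julia
# for and if blocks are both closed with `end`
for i in 1:5
    if iseven(i)
        println("$i is even")
    else
        println("$i is odd")
    end
end
```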
The following are some of Julia's features:

- Designed for parallelism and distributed computation (multicore and cluster)
- C functions called directly (no wrappers or special APIs needed)
- Powerful shell-like capabilities for managing other processes
- Lisp-like macros and other metaprogramming facilities
- User-defined types that are as fast and compact as built-ins
- An LLVM-based, just-in-time (JIT) compiler that allows Julia to approach, and often match, the performance of C/C++
- An extensive mathematical function library (written in Julia)
- Integrated, mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, FFTs, and string processing

Julia's core is implemented in C and C++, its parser in Scheme, and the LLVM compiler framework is used for JIT generation of machine code. The standard library is written in Julia itself, using Node.js's libuv library for efficient, cross-platform I/O. Julia has a rich language of types for constructing and describing objects, which can also optionally be used to make type declarations. The ability to define function behavior across many combinations of argument types via multiple dispatch is a key cornerstone of the language design.

Julia can utilize code in other programming languages by directly calling routines written in C or Fortran and stored in shared libraries or DLLs. This is a feature of the language syntax. In addition, it is possible to interact with Python via the PyCall package, and this is used in the implementation of the IJulia programming environment.

A quick look at some Julia

To get a feel for programming in Julia, let's look at an example that uses random numbers to price an Asian derivative on the options market. A share option is the right to purchase a specific stock at a nominated price sometime in the future. The person granting the option is called the grantor, and the person who has the benefit of the option is the beneficiary. At the time the option matures, the beneficiary may choose to exercise the option if it is in his/her interest; the grantor is then obliged to complete the contract.

The following snippet is part of the calculation; it computes a single trial and uses the Winston package to display the trajectory:

    using Winston;
    S0  = 100;      # Spot price
    K   = 102;      # Strike price
    r   = 0.05;     # Risk free rate
    q   = 0.0;      # Dividend yield
    v   = 0.2;      # Volatility
    tma = 0.25;     # Time to maturity
    T   = 100;      # Number of time steps
    dt  = tma/T;    # Time increment

    S = zeros(Float64,T)
    S[1] = S0;
    dW = randn(T)*sqrt(dt);
    [ S[t] = S[t-1] * (1 + (r - q - 0.5*v*v)*dt + v*dW[t] + 0.5*v*v*dW[t]*dW[t]) for t=2:T ]

    x = linspace(1, T, length(S));
    p = FramedPlot(title = "Random Walk, drift 5%, volatility 2%")
    add(p, Curve(x,S,color="red"))
    display(p)

We plot a single track, so we only compute one vector S of T elements. The stochastic variance dW is computed in a single vectorized statement, and the track S is computed using a list comprehension. The array x is created using linspace to define a linear abscissa for the plot. The Winston package produces the display, which requires only three statements: defining the plot space, adding a curve to it, and displaying the plot.

Generating Julia sets

Both the Mandelbrot set and the Julia set (for a given constant z0) are the sets of all complex numbers z for which the iteration z → z*z + z0 does not diverge to infinity. The Mandelbrot set consists of those z0 for which the Julia set is connected.
We create a file jset.jl whose contents define the function to generate a Julia set:

    function juliaset(z, z0, nmax::Int64)
        for n = 1:nmax
            if abs(z) > 2 (return n-1) end
            z = z^2 + z0
        end
        return nmax
    end

Here z and z0 are complex values and nmax is the number of trials to make before returning. If the modulus of the complex number z gets above 2, then it can be shown that it will increase without limit. The function returns the number of iterations until the modulus test succeeds, or else nmax.

We will also write a second file pgmfile.jl to handle displaying the Julia set:

    function create_pgmfile(img, outf::String)
        s = open(outf, "w")
        write(s, "P5\n")
        n, m = size(img)
        write(s, "$m $n 255\n")
        for i=1:n, j=1:m
            p = img[i,j]
            write(s, uint8(p))
        end
        close(s)
    end

It is quite easy to create a simple disk file using the portable bitmap (netpbm) format. This consists of a "magic" number P1 to P6, followed on the next line by the image height, width, and a maximum color value, which must be greater than 0 and less than 65536; all of these are ASCII values, not binary. Then follow the image values (height x width), which may be ASCII for P1, P2, P3, or binary for P4, P5, P6. There are three different types of portable bitmap: B/W (P1/P4), grayscale (P2/P5), and colour (P3/P6).

The function create_pgmfile() creates a binary grayscale file (magic number = P5) from an image matrix where the values are written as Uint8. Notice that the for loop defines the indices i, j in a single statement with, correspondingly, only one end statement. The image matrix is output in column order, which matches the way it is stored in Julia.

So the main program looks like this:

    include("jset.jl")
    include("pgmfile.jl")
    h = 400; w = 800;
    M = Array(Int64, h, w);
    c0 = -0.8+0.16im;
    pgm_name = "juliaset.pgm";

    t0 = time();
    for y=1:h, x=1:w
        c = complex((x-w/2)/(w/2), (y-h/2)/(w/2))
        M[y,x] = juliaset(c, c0, 256)
    end
    t1 = time();
    create_pgmfile(M, pgm_name);
    print("Written $pgm_name\nFinished in $(t1-t0) seconds.\n");

This is how the previous code works:

- We define an array M of type Int64 to hold the return values from the juliaset function.
- The constant c0 is arbitrary; different values of c0 will produce different Julia sets.
- The starting complex number is constructed from the (x,y) coordinates and scaled to the half width.
- We "cheat" a little by defining the maximum number of iterations as 256. Because we are writing byte values (Uint8), a value which remains bounded will be 256, and since overflow values wrap around, it will be output as 0 (black).

The script defines a main program in a function jmain():

    julia> jmain()
    Written juliaset.pgm
    Finished in 0.458 seconds   # => (on my laptop)

Julia type system and multiple dispatch

Julia is not an object-oriented language, so when we speak of objects, they are a different sort of data structure from those in traditional O-O languages. Julia does not allow types to have methods, so it is not possible to create subtypes which inherit methods. While this might seem restrictive, it does permit methods to use a multiple-dispatch call structure rather than the single-dispatch system employed in object-oriented languages. Coupled with Julia's system of types, multiple dispatch is extremely powerful. Moreover, it is a more logical approach for data scientists and scientific programmers, and if for no other reason, exposing this to you, the analyst/programmer, is a reason to use Julia.

A function is an object that maps a tuple of arguments to a return value.
In the case where the arguments are not valid, the function should handle the situation cleanly by catching the error and handling it, or throw an exception. When a function is applied to its argument tuple, it selects the appropriate method, and this process is called dispatch. In traditional object-oriented languages, the method chosen is based only on the object type, and this paradigm is termed single dispatch. With Julia, the combination of all of a function's arguments determines which method is chosen; this is the basis of multiple dispatch.

To the scientific programmer, this all seems very natural. It makes little sense in most circumstances for one argument to be more important than the others. In Julia, all functions and operators, which are also functions, use multiple dispatch. The method chosen depends on the types of all the operands. For example, looking at the methods of the power operator (^):

    julia> methods(^)
    # 43 methods for generic function "^":
    ^(x::Bool,y::Bool) at bool.jl:41
    ^(x::BigInt,y::Bool) at gmp.jl:314
    ^(x::Integer,y::Bool) at bool.jl:42
    ...
    ^(A::Array{T,2},p::Number) at linalg/dense.jl:180
    ^(::MathConst{:e},x::AbstractArray{T,2}) at constants.jl:87

We can see that there are 43 methods for ^, and the file and line where each method is defined are given too. Because any untyped argument is designated as type Any, it is possible to define a set of function methods such that there is no unique most specific method applicable to some combinations of arguments:

    julia> pow(a,b::Int64) = a^b;
    julia> pow(a::Float64,b) = a^b;
    Warning: New definition pow(Float64,Any) at /Applications/JuliaStudio.app/Contents/Resources/Console/Console.jl:1
    is ambiguous with: pow(Any,Int64) at /Applications/JuliaStudio.app/Contents/Resources/Console/Console.jl:1.
    To fix, define pow(Float64,Int64) before the new definition.

A call of pow(3.5, 2) could be handled by either function. In this case, they would give the same result, but only because of the function bodies, and Julia can't know that.

Working with Python

The ability of Julia to call code written in other languages is one of its main strengths. From its inception, Julia had to play "catch-up", and a key feature is that it makes calling code written in C, and by implication Fortran, very easy. The code to be called must be available as a shared library rather than just a standalone object file. There is zero overhead in the call, meaning that it is reduced to a single machine instruction in the LLVM compilation.

In addition, Python modules can be accessed via the PyCall package, which provides a @pyimport macro that mimics a Python import statement. This imports a Python module and provides Julia wrappers for all of its functions and constants, including automatic conversion of types between Julia and Python. This work has led to the creation of an IJulia kernel for the IPython IDE, which is now a principal component of the Jupyter project. In PyCall, type conversions are automatically performed for numeric, boolean, string, and IO stream types, plus all tuples, arrays, and dictionaries of these types.

    julia> using PyCall
    julia> @pyimport scipy.optimize as so
    julia> @pyimport scipy.integrate as si
    julia> so.ridder(x -> x*cos(x), 1, pi);    # => 1.570796326795
    julia> si.quad(x -> x*sin(x), 1, pi)[1];   # => 2.840423974650

In the preceding commands, the Python optimize and integrate modules are imported, and functions in those modules are called from the Julia REPL.
One difference imposed on the package is that calls using the Python object notation are not possible from Julia, so these are referenced using an array-style notation, po[:atr] rather than po.atr, where po is a PyObject and atr is an attribute.

It is also easy to use the Python matplotlib module to display simple (and complex) graphs:

    @pyimport matplotlib.pyplot as plt
    x = linspace(0,2*pi,1000);
    y = sin(3*x + 4*cos(2*x));
    plt.plot(x, y, color="red", linewidth=2.0, linestyle="--")
    1-element Array{Any,1}:
    PyObject <matplotlib.lines.Line2D object at 0x0000000027652358>
    plt.show()

Notice that keyword arguments can also be passed, such as the color, line width, and line style shown here.

Simple statistics using dataframes

Julia implements various approaches for handling data held on disk. These may be "normal" files such as text files, CSV and other delimited files, or SQL and NoSQL databases. There is also an implementation of dataframe support similar to that provided in R and via the pandas module in Python.

The following looks at Apple share prices from 2000 onwards, using a CSV file which provides opening, closing, high, and low prices together with trading volumes over the day:

    using DataFrames, StatsBase
    aapl = readtable("AAPL-Short.csv");

    naapl = size(aapl)[1]
    m1 = int(mean((aapl[:Volume])));   # => 6306547

The data is read into a DataFrame, and we can estimate the mean volume (m1); for the volume, it makes more sense to cast the result as an integer. We can also compute a weighted mean by creating a weighting vector:

    using Match
    wts = zeros(naapl);
    for i in 1:naapl
        dt = aapl[:Date][i]
        wts[i] = @match dt begin
            r"^2000" => 1.0
            r"^2001" => 2.0
            r"^2002" => 4.0
            _        => 0.0
        end
    end;

    wv = WeightVec(wts);
    m2 = int(mean(aapl[:Volume], wv));   # => 6012863

As well as computing weighted statistical metrics, it is possible to "trim" off the outliers from each end of the data. Returning to the closing prices:

    mean(aapl[:Close]);           # => 37.1255
    mean(aapl[:Close], wv);       # => 26.9944
    trimmean(aapl[:Close], 0.1);  # => 34.3951

trimmean() is the trimmed mean; here 5 percent is taken from each end.

    std(aapl[:Close]);            # => 34.1186
    skewness(aapl[:Close])        # => 1.6911
    kurtosis(aapl[:Close])        # => 1.3820

As well as second moments such as the standard deviation, StatsBase provides a generic moments() function and specific instances based on these, such as skewness (third) and kurtosis (fourth). It is also possible to produce some summary statistics:

    summarystats(aapl[:Close])

    Summary Stats:
    Mean:          37.125505
    Minimum:       13.590000
    1st Quartile:  17.735000
    Median:        21.490000
    3rd Quartile:  31.615000
    Maximum:      144.190000

The first and third quartiles relate to the 25 percent and 75 percent percentiles; for finer granularity, we can use the percentile() function:

    percentile(aapl[:Close],5);    # => 14.4855
    percentile(aapl[:Close],95);   # => 118.934

MySQL access using PyCall

We have seen previously that Python can be used for plotting via the PyPlot package, which interfaces with matplotlib. In fact, the ability to easily call Python modules is a very powerful feature in Julia, and we can use it as an alternative method for connecting to databases. Any database which can be manipulated by Python is also available to Julia. In particular, since the DBD driver for MySQL is not fully DBI compliant, let's look at this approach to running some queries. Our current MySQL setup already has the Chinook dataset loaded, so we will execute a query to list the Genre table.
In Python, we will first need to download the MySQL Connector module. For Anaconda, this needs to be the source (independent) distribution rather than a binary package, with the installation performed using the setup.py file. The query (in Python) to list the Genre table would be:

    import mysql.connector as mc
    cnx = mc.connect(user="malcolm", password="mypasswd")
    csr = cnx.cursor()
    qry = "SELECT * FROM Chinook.Genre"
    csr.execute(qry)
    for vals in csr:
        print(vals)

    (1, u'Rock')
    (2, u'Jazz')
    (3, u'Metal')
    (4, u'Alternative & Punk')
    (5, u'Rock And Roll')
    ...

    csr.close()
    cnx.close()

We can execute the same in Julia by using PyCall with the mysql.connector module, and the form of the code is remarkably similar:

    using PyCall
    @pyimport mysql.connector as mc

    cnx = mc.connect(user="malcolm", password="mypasswd");
    csr = cnx[:cursor]()
    query = "SELECT * FROM Chinook.Genre"
    csr[:execute](query)

    for vals in csr
        id    = vals[1]
        genre = vals[2]
        @printf "ID: %2d, %s\n" id genre
    end

    ID: 1, Rock
    ID: 2, Jazz
    ID: 3, Metal
    ID: 4, Alternative & Punk
    ID: 5, Rock And Roll
    ...

    csr[:close]()
    cnx[:close]()

Note that the form of the call is a little different from the corresponding Python method: since Julia is not object-oriented, the methods of a Python object are referenced as symbols in an array-style notation. For example, the Python csr.execute(qry) routine is called in Julia as csr[:execute](qry). Also be aware that although Python arrays are zero-based, this is translated to one-based by PyCall, so the first value is referenced as vals[1].

Scientific programming with Julia

Julia was originally developed as a replacement for MATLAB with a focus on scientific programming. There are modules concerned with linear algebra, signal processing, mathematical calculus, optimization problems, and stochastic simulation. The following is a subject dear to my heart: the solution of differential equations.

Differential equations are those which involve terms for the rates of change of variates as well as the variates themselves. They arise naturally in a number of fields, notably dynamics. When the changes are with respect to one independent variable, often time, the systems are called ordinary differential equations; if more than a single independent variable is involved, they are termed partial differential equations.

Julia supports the solution of ordinary differential equations through a couple of packages, ODE and Sundials. The former (ODE) consists of routines written solely in Julia, whereas Sundials is a wrapper package around a shared library. ODE exports a set of adaptive solvers; adaptive meaning that the "step" size of the algorithm changes algorithmically to reduce the error estimate below a certain threshold. The calls take the form odeXY, where X is the order of the solver and Y the error control:

- ode23: Third order adaptive solver with third order error control
- ode45: Fourth order adaptive solver with fifth order error control
- ode78: Seventh order adaptive solver with eighth order error control

To solve an explicit ODE defined as a vectorized set of equations dy/dt = F(t,y), all routines have the same basic form: (tout, yout) = odeXY(F, y0, tspan).

As an example, I will look at a linear three-species food chain model, where the lowest-level prey x is preyed upon by a mid-level species y, which, in turn, is preyed upon by a top-level predator z. This is an extension of the Lotka-Volterra system from two to three species.
Examples might be three-species ecosystems such as mouse-snake-owl, vegetation-rabbits-foxes, and worm-sparrow-falcon.

    x' = a*x - b*x*y
    y' = -c*y + d*x*y - e*y*z
    z' = -f*z + g*y*z        # for a, b, c, d, e, f, g > 0

Here a, b, c, d are as in the two-species Lotka-Volterra equations, and:

- e represents the effect of predation on species y by species z
- f represents the natural death rate of the predator z in the absence of prey
- g represents the efficiency and propagation rate of the predator z in the presence of prey

This translates to the following set of equations:

    x[1] = p[1]*x[1] - p[2]*x[1]*x[2]
    x[2] = -p[3]*x[2] + p[4]*x[1]*x[2] - p[5]*x[2]*x[3]
    x[3] = -p[6]*x[3] + p[7]*x[2]*x[3]

It is slightly over-specified, since one of the parameters can be removed by rescaling the timescale. We define the function F as follows:

    function F(t,x,p)
        d1 =  p[1]*x[1] - p[2]*x[1]*x[2]
        d2 = -p[3]*x[2] + p[4]*x[1]*x[2] - p[5]*x[2]*x[3]
        d3 = -p[6]*x[3] + p[7]*x[2]*x[3]
        [d1, d2, d3]
    end

This takes the time range, the vectors of the independent variables and of the coefficients, and returns a vector of the derivative estimates:

    p = ones(7);              # Choose all parameters as 1.0
    x0 = [0.5, 1.0, 2.0];     # Setup the initial conditions
    tspan = [0.0:0.1:10.0];   # and the time range

Solve the equations by calling the ode23 routine. This returns a matrix of the solutions in columnar order, which we extract and display using Winston:

    (t,x) = ODE.ode23((t,x) -> F(t,x,p), x0, tspan);

    n = length(t);
    y1 = zeros(n); [y1[i] = x[i][1] for i = 1:n];
    y2 = zeros(n); [y2[i] = x[i][2] for i = 1:n];
    y3 = zeros(n); [y3[i] = x[i][3] for i = 1:n];

    using Winston
    plot(t,y1,"b.",t,y2,"g-.",t,y3,"r--")

The plot shows the trajectories of the three species over time.

Graphics with Gadfly

Julia now provides a vast array of graphics packages. The "popular" ones may be thought of as Winston, PyPlot, and Gadfly, and there is also an interface to the increasingly popular online system Plot.ly. Gadfly is a large and complex package and provides great flexibility in the range and breadth of the visualizations possible in Julia. It is equivalent to the R module ggplot2 and, similarly, is based on the seminal work The Grammar of Graphics by Leland Wilkinson. The package was written by Daniel Jones, and the source on GitHub contains numerous examples of visual displays together with the accompanying code. An entire text could be devoted just to Gadfly, so I can only point out some of the main features here and encourage the reader interested in print-standard graphics in Julia to refer to the online website at http://gadflyjl.org.

The standard call is to the plot() function, which creates a graph on the display device via a browser, either directly or under the control of IJulia if that is being used as an IDE. It is possible to assign the result of plot() to a variable and invoke it using display(). In this way, output can be written to files, including SVG, SVGJS/D3, PDF, and PNG:

    dd = plot(x = rand(10), y = rand(10));
    draw(SVG("random-pts.svg", 15cm, 12cm), dd);

Notice that when writing to a backend, a display size is provided; this can be specified in units of cm or inch. Gadfly works well with the C libraries cairo, pango, and fontconfig installed. It will produce SVG and SVGJS graphics on its own, but for PNG, PostScript (PS), and PDF, cairo is required. Also, complex text layouts are more accurate when pango and fontconfig are available.
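As a taster, and as my own minimal sketch rather than an example from the book, plot() can also be handed functions directly over a numeric range:

```julia
using Gadfly
# Plot two functions over the range 0 to 25 and write the result to an SVG file
p = plot([sin, cos], 0, 25)
draw(SVG("sin-cos.svg", 15cm, 10cm), p)
```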
The plot() call can operate on three different data sources:

- Dataframes
- Functions and expressions
- Arrays and collections

Unless otherwise specified, the type of graph produced is a scatter diagram. The ability to work directly with data frames is especially useful. To illustrate this, let's look at the GCSE result set. Recall that this is available as part of the RDatasets suite of source data:

    using Gadfly, RDatasets, DataFrames;
    set_default_plot_size(20cm, 12cm);
    mlmf = dataset("mlmRev","Gcsemv")
    df = mlmf[complete_cases(mlmf), :]

After extracting the data, we need to operate on rows which do not have any NA values, so we use the complete_cases() function to create a subset of the original data.

    names(df)
    5-element Array{Symbol,1}:   # => [ :School, :Student, :Gender, :Written, :Course ]

If we wish to view the data values for the exam and course work results and at the same time differentiate between boys and girls, this can be displayed by:

    plot(df, x="Course", y="Written", color="Gender")

The JuliaGPU community group

Many Julia modules build on the work of other authors working within the same field of study, and these have been classified as community groups (http://julialang.org/community). Probably the most prolific is the statistics group, JuliaStats (http://juliastats.github.io).

One of the main themes in my professional career has been working with hardware to speed up the computing process. In my work on satellite data, I worked with the STAR-100 array processor and, once back in the UK, used Silicon Graphics for 3D rendering of medical data. Currently, I am interested in using NVIDIA GPUs in financial scenarios and risk calculations. Much of this work has been coded in C, with domain-specific languages to program the ancillary hardware. It is now possible to do much of this in Julia with packages contained in the JuliaGPU group. This has routines for both CUDA and OpenCL, at present covering:

- Basic runtime: CUDA.jl, CUDArt.jl, OpenCL.jl
- BLAS integration: CUBLAS.jl, CLBLAS
- FFT operations: CUFFT.jl, CLFFT.jl

The CU*-style routines only apply to NVIDIA cards and require the CUDA SDK to be installed, whereas the CL* functions can be used with a variety of GPUs. CLFFT and CLBLAS do require some additional libraries to be present, but we can use OpenCL as is. The following is output from a Lenovo Z50 laptop with an i7 processor and both Intel and NVIDIA graphics chips:

    julia> using OpenCL
    julia> OpenCL.devices()
    OpenCL.Platform(Intel(R) HD Graphics 4400)
    OpenCL.Platform(Intel(R) Core(TM) i7-4510U CPU)
    OpenCL.Platform(GeForce 840M on NVIDIA CUDA)

To do some calculations, we need to define a kernel to be loaded on the GPU.
The following multiplies two 1024x1024 matrices of Gaussian random numbers:

    import OpenCL
    const cl = OpenCL

    const kernel_source = """
    __kernel void mmul(
        const int Mdim,
        const int Ndim,
        const int Pdim,
        __global float* A,
        __global float* B,
        __global float* C)
    {
        int k;
        int i = get_global_id(0);
        int j = get_global_id(1);
        float tmp;
        if ((i < Ndim) && (j < Mdim)) {
            tmp = 0.0f;
            for (k = 0; k < Pdim; k++)
                tmp += A[i*Ndim + k] * B[k*Pdim + j];
            C[i*Ndim+j] = tmp;
        }
    }
    """

The kernel is expressed as a string, and the OpenCL DSL has a C-like syntax:

    const ORDER = 1024;    # Order of the square matrices A, B and C
    const TOL   = 0.001;   # Tolerance used in floating point comps
    const COUNT = 3;       # Number of runs

    sizeN = ORDER * ORDER;
    h_A = float32(randn(sizeN));    # Fill array with random numbers
    h_B = float32(randn(sizeN));    # --- ditto ---
    h_C = Array(Float32, sizeN);    # Array to hold the results

    ctx   = cl.Context(cl.devices()[3]);
    queue = cl.CmdQueue(ctx, :profile);

    d_a = cl.Buffer(Float32, ctx, (:r,:copy), hostbuf = h_A);
    d_b = cl.Buffer(Float32, ctx, (:r,:copy), hostbuf = h_B);
    d_c = cl.Buffer(Float32, ctx, :w, length(h_C));

We now create the OpenCL context and some data space on the GPU for the three arrays d_a, d_b, and d_c. Then we copy the data in the host arrays h_A and h_B to the device and load the kernel onto the GPU:

    prg  = cl.Program(ctx, source=kernel_source) |> cl.build!
    mmul = cl.Kernel(prg, "mmul");

The following loop runs the kernel COUNT times to give an accurate estimate of the elapsed time for the operation. This includes the cl.copy!() operation, which copies the results back from the device to the host (Julia) program:

    for i in 1:COUNT
        fill!(h_C, 0.0);
        global_range = (ORDER, ORDER);
        mmul_ocl = mmul[queue, global_range];
        evt = mmul_ocl(int32(ORDER), int32(ORDER), int32(ORDER), d_a, d_b, d_c);
        run_time = evt[:profile_duration] / 1e9;
        cl.copy!(queue, h_C, d_c);
        mflops = 2.0 * ORDER^3 / (1000000.0 * run_time);
        @printf "%10.8f seconds at %9.5f MFLOPS\n" run_time mflops
    end

    0.59426405 seconds at 3613.686 MFLOPS
    0.59078856 seconds at 3634.957 MFLOPS
    0.57401651 seconds at 3741.153 MFLOPS

This compares with the figures for running the same calculation natively, without the GPU:

    7.060888678 seconds at 304.133 MFLOPS

That is, using the GPU gives a roughly 12-fold increase in the performance of the matrix calculation.

Summary

This article has introduced some of the main features which set Julia apart from other similar programming languages. I began with a quick look at some Julia code by developing a trajectory used in estimating the price of a financial option, which was displayed graphically. Continuing with the graphics theme, we presented some code to generate a Julia set and to write it to disk as a PGM-formatted file. The type system and the use of multiple dispatch were discussed next; this is a major difference for the user between Julia and object-oriented languages such as R and Python, and it is central to what gives Julia the power to generate fast machine-level code via LLVM compilation. We then turned to a series of topics from the Julia armory:

- Working with Python: The ability to call C and Fortran seamlessly has been a central feature of Julia since its initial development; the addition of interoperability with Python has opened up a new series of possibilities, leading to the development of the IJulia interface and its integration into the Jupyter project.
- Simple statistics using DataFrames: As an example of working with data, we highlighted the Julia implementation of data frames by looking at Apple share prices and applying some simple statistics.
- MySQL access using PyCall: This returned to another use of Python interoperability to illustrate an unconventional method of interfacing to a MySQL database.
- Scientific programming with Julia: The solution of ordinary differential equations was presented here by looking at the Lotka-Volterra equations, but unusually we developed a solution for the three-species model.
- Graphics with Gadfly: Julia has a wide range of options when developing data visualizations. Gadfly is one of the "heavyweights"; a dataset containing UK GCSE results was extracted from the RDatasets.jl package, and the comparison between written and course work results was displayed using Gadfly, categorized by gender.

Finally, we showcased the work of the Julia community groups by looking at an example from the JuliaGPU group, utilizing the OpenCL package to check on the set of supported devices. We then selected an NVIDIA GeForce chip in order to execute a simple kernel which multiplied a pair of matrices on the GPU. This was timed against native Julia code in order to illustrate the speedup gained from parallelizing matrix operations.

Resources for Article:

Further resources on this subject:
- Pricing the Double-no-touch option [article]
- Basics of Programming in Julia [article]
- SQL Server Analysis Services - Administering and Monitoring Analysis Services [article]


Deployment and Maintenance

Packt
20 Jul 2015
21 min read
In this article by Sandro Pasquali, author of Deploying Node.js, we will learn about the following:

- Automating the deployment of applications, including a look at the differences between continuous integration, delivery, and deployment
- Using Git to track local changes and trigger deployment actions via webhooks when appropriate
- Using Vagrant to synchronize your local development environment with a deployed production server
- Provisioning a server with Ansible

Note that application deployment is a complex topic with many dimensions that are often considered within unique sets of needs. This article is intended as an introduction to some of the technologies and themes you will encounter. Also, note that scaling issues are part and parcel of deployment.

Using GitHub webhooks

At the most basic level, deployment involves automatically validating, preparing, and releasing new code into production environments. One of the simplest ways to set up a deployment strategy is to trigger releases whenever changes are committed to a Git repository, through the use of webhooks. Paraphrasing the GitHub documentation, webhooks provide a way for notifications to be delivered to an external web server whenever certain actions occur on a repository.

In this section, we'll use GitHub webhooks to create a simple continuous deployment workflow, adding more realistic checks and balances. We'll build a local development environment that lets developers work with a clone of the production server code, make changes, and see the results of those changes immediately. As this local development build uses the same repository as the production build, the build process for a chosen environment is simple to configure, and multiple production and/or development boxes can be created with no special effort.

The first step is to create a GitHub (www.github.com) account if you don't already have one. Basic accounts are free and easy to set up. Now, let's look at how GitHub webhooks work.

Enabling webhooks

Create a new folder and insert the following package.json file:

    {
      "name": "express-webhook",
      "main": "server.js",
      "dependencies": {
        "express": "~4.0.0",
        "body-parser": "^1.12.3"
      }
    }

This ensures that Express 4.x is installed and includes the body-parser package, which is used to handle POST data. Next, create a basic server called server.js:

    var express = require('express');
    var app = express();
    var bodyParser = require('body-parser');
    var port = process.env.PORT || 8082;

    app.use(bodyParser.json());

    app.get('/', function(req, res) {
      res.send('Hello World!');
    });

    app.post('/webhook', function(req, res) {
      // We'll add this next
    });

    app.listen(port);
    console.log('Express server listening on port ' + port);

Enter the folder you've created, and build and run the server with npm install; npm start. Visit localhost:8082/ and you should see "Hello World!" in your browser.

Whenever any file changes in a given repository, we want GitHub to push information about the change to /webhook. So, the first step is to create a GitHub repository for the Express server mentioned in the code. Go to your GitHub account and create a new repository with the name 'express-webhook'.

Once the repository is created, enter your local repository folder and run the following commands:

    git init
    git add .
    git commit -m "first commit"
    git remote add origin git@github.com:<your username>/express-webhook

You should now have a new GitHub repository and a local linked version.
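Nothing has been pushed yet at this point. As a quick sanity check, which is my addition rather than part of the original walkthrough, and assuming master is the default branch as it is throughout this article, you can verify the remote and do the initial push now:

```bash
$ git remote -v              # origin should point at your express-webhook repository
$ git push -u origin master  # initial push of the committed files
```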
The next step is to configure this repository to broadcast the push event. Navigate to the following URL:

    https://github.com/<your_username>/express-webhook/settings

From here, navigate to Webhooks & Services | Add webhook (you may need to enter your password again). This is where you set up webhooks. Note that the push event is already set as default, and, if asked, you'll want to disable SSL verification for now.

GitHub needs a target URL to use POST on change events. If you have your local repository in a location that is already web-accessible, enter that now, remembering to append the /webhook route, as in http://www.example.com/webhook. If you are building on a local machine or on another limited network, you'll need to create a secure tunnel that GitHub can use. A free service to do this can be found at http://localtunnel.me/. Follow the instructions on that page and use the custom URL provided to configure your webhook. Other good forwarding services can be found at https://forwardhq.com/ and https://meetfinch.com/.

Now that webhooks are enabled, the next step is to test the system by triggering a push event. Create a new file called readme.md (add whatever you'd like to it), save it, and then run the following commands:

    git add readme.md
    git commit -m "testing webhooks"
    git push origin master

This will push changes to your GitHub repository. Return to the Webhooks & Services section for the express-webhook repository on GitHub. You should see that GitHub noticed your push and attempted to deliver information about the changes to the webhook endpoint you set, but the delivery failed as we haven't configured the /webhook route yet; that's to be expected. Inspect the failed delivery payload by clicking on the last attempt; you should see a large JSON file. In that payload, you'll find something like this:

    "committer": {
      "name": "Sandro Pasquali",
      "email": "spasquali@gmail.com",
      "username": "sandro-pasquali"
    },
    "added": ["readme.md"],
    "removed": [],
    "modified": []

It should now be clear what sort of information GitHub will pass along whenever a push event happens. You can now configure the /webhook route in the demonstration Express server to parse this data and do something with that information, such as sending an e-mail to an administrator. For example, use the following code:

    app.post('/webhook', function(req, res) {
      console.log(req.body);
    });

The next time your webhook fires, the entire JSON payload will be displayed. Let's take this to another level, breaking down the autopilot application to see how webhooks can be used to create a build/deploy system.

Implementing a build/deploy system using webhooks

To demonstrate how to build a webhook-powered deployment system, we're going to use a starter kit for application development. Go ahead and fork the repository at https://github.com/sandro-pasquali/autopilot.git. You now have a copy of the autopilot repository, which includes scaffolding for common Gulp tasks, tests, an Express server, and a deploy system that we're now going to explore. The autopilot application implements special features depending on whether you are running it in production or in development. While autopilot is a little too large and complex to fully document here, we're going to take a look at how the major components of the system are designed and implemented so that you can build your own or augment existing systems.
Here's what we will examine:

- How to create webhooks on GitHub programmatically
- How to catch and read webhook payloads
- How to use payload data to clone, test, and integrate changes
- How to use PM2 to safely manage and restart servers when code changes

If you haven't already forked the autopilot repository, do that now. Clone the autopilot repository onto a server or someplace else where it is web-accessible. Follow the instructions on how to connect and push to the fork you've created on GitHub, and get familiar with how to pull and push changes, commit changes, and so on.

PM2 delivers a basic deploy system that you might consider for your project (https://github.com/Unitech/PM2/blob/master/ADVANCED_README.md#deployment).

Install the cloned autopilot repository with npm install; npm start. Once npm has installed the dependencies, an interactive CLI application will lead you through the configuration process. Just hit the Enter key for all the questions, which will set defaults for a local development build (we'll build in production later). Once the configuration is complete, a new development server process controlled by PM2 will have been spawned. You'll see it listed in the PM2 manifest under autopilot-dev.

You will make changes in the /source directory of this development build. When you eventually have a production server in place, you will git push your local changes to the autopilot repository on GitHub, triggering a webhook. GitHub will POST information about the change to an Express route that we will define on our server, which will trigger the build process. The build runner will pull your changes from GitHub into a temporary directory, install, build, and test the changes, and, if all is well, it will replace the relevant files in your deployed repository. At this point, PM2 will restart, and your changes will be immediately available. Schematically, the flow runs from a local git push, to a GitHub webhook, to the build runner, and finally to the deployed repository and a PM2 restart.

To create webhooks on GitHub programmatically, you will need to create an access token. We're going to use the Node library at https://github.com/mikedeboer/node-github to access GitHub. We'll use this package to create hooks on GitHub using the access token you've just created. Once you have an access token, creating a webhook is easy:

    var GitHubApi = require("github");
    var github = new GitHubApi({ version: "3.0.0" });   // create the client instance

    github.authenticate({
      type: "oauth",
      token: <your token>
    });

    github.repos.createHook({
      "user": <your github username>,
      "repo": <github repo name>,
      "name": "web",
      "secret": <any secret string>,
      "active": true,
      "events": ["push"],
      "config": {
        "url": "http://yourserver.com/git-webhook",
        "content_type": "json"
      }
    }, function(err, resp) {
      ...
    });

Autopilot performs this on startup, removing the need for you to manually create a hook.

Now, we are listening for changes. As we saw previously, GitHub will deliver a payload indicating what has been added, what has been deleted, and what has changed. The next step for the autopilot system is to integrate these changes. It is important to remember that, when you use webhooks, you do not have control over how often GitHub will send changesets; if more than one person on your team can push, there is no predicting when those pushes will happen. The autopilot system uses Redis to manage a queue of requests, executing them in order. You will need to manage multiple changes in some way (a minimal in-memory sketch of the idea follows below). For now, let's look at a straightforward way to build, test, and integrate changes.
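Before diving into autopilot's build runner, here is the queueing idea in miniature. This is my own sketch rather than autopilot's actual code (which uses Redis); enqueueBuild and runBuild are hypothetical placeholder names:

```js
// Minimal sketch: run at most one build at a time, in arrival order.
var queue = [];
var building = false;

function enqueueBuild(changeset) {
  queue.push(changeset);
  next();
}

function next() {
  if (building || queue.length === 0) {
    return;
  }
  building = true;
  runBuild(queue.shift(), function(err) {
    if (err) {
      console.error('Build failed:', err);
    }
    building = false;
    next(); // pick up the next queued changeset, if any
  });
}

// runBuild stands in for the clone/install/test/integrate steps described below.
function runBuild(changeset, done) {
  setTimeout(done, 1000); // placeholder for the real work
}
```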
In your code bundle, visit autopilot/swanson/push.js. This is a process runner, forked by buildQueue.js in the same folder. The following information is passed to it:

- The URL of the GitHub repository that we will clone
- The directory to clone that repository into (<temp directory>/<commit hash>)
- The changeset
- The location of the production repository that will be changed

Go ahead and read through the code. Using a few shell scripts, we will clone the changed repository and build it using the same commands you're used to: npm install, npm test, and so on. If the application builds without errors, we need only run through the changeset and replace the old files with the changed files.

The final step is to restart our production server so that the changes reach our users. Here is where the real power of PM2 comes into play. When the autopilot system is run in production, PM2 creates a cluster of servers (similar to the Node cluster module). This is important, as it allows us to restart the production server incrementally. As we restart one server node in the cluster with the newly pushed content, the other nodes continue to serve the old content. This is essential to keeping a zero-downtime production running. Hopefully, the autopilot implementation will give you a few ideas on how to improve this process and customize it to your own needs.

Synchronizing local and deployed builds

One of the most important (and often difficult) parts of the deployment process is ensuring that the environment an application is being developed, built, and tested within perfectly simulates the environment that application will be deployed into. In this section, you'll learn how to emulate, or virtualize, the environment your deployed application will run within using Vagrant. After demonstrating how this setup can simplify your local development process, we'll use Ansible to provision a remote instance on DigitalOcean.

Developing locally with Vagrant

For a long while, developers would work directly on running servers or cobble together their own version of the production environment locally, often writing ad hoc scripts and tools to smoothen their development process. This is no longer necessary in a world of virtual machines. In this section, we will learn how to use Vagrant to emulate a production environment within your development environment, giving you a realistic box on which to test code for production and isolating your development process from your local machine processes.

By definition, Vagrant is used to create a virtual box emulating a production environment. So, we need to install Vagrant, a virtual machine, and a machine image. Finally, we'll need to write the configuration and provisioning scripts for our environment. Go to http://www.vagrantup.com/downloads and install the right Vagrant version for your box. Do the same with VirtualBox here at https://www.virtualbox.org/wiki/Downloads.

You now need to add a box to run. For this example, we're going to use CentOS 7.0, but you can choose whichever you'd prefer. Create a new folder for this project, enter it, and run the following command:

    vagrant box add chef/centos-7.0

Usefully, the creators of Vagrant, HashiCorp, provide a search service for Vagrant boxes at https://atlas.hashicorp.com/boxes/search. You will be prompted to choose your virtual environment provider; select virtualbox. All relevant files and machines will now be downloaded. Note that these boxes are very large and may take time to download.
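Once the download completes, you can confirm the box is available locally (a quick check, not part of the original text):

```bash
$ vagrant box list
```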
You'll now create a configuration file for Vagrant called Vagrantfile. As with npm, the init command quickly sets up a base file. Additionally, we'll need to inform Vagrant of the box we'll be using:

    vagrant init chef/centos-7.0

Vagrantfile is written in Ruby and defines the Vagrant environment. Open it up now and scan it. There is a lot of commentary, and it makes a useful read. Note the config.vm.box = "chef/centos-7.0" line, which was inserted during the initialization process. Now you can start Vagrant:

    vagrant up

If everything went as expected, your box has been booted within VirtualBox. To confirm that your box is running, use the following command:

    vagrant ssh

If you see a prompt, you've just set up a virtual machine. You'll see that you are in the typical home directory of a CentOS environment.

To destroy your box, run vagrant destroy. This deletes the virtual machine by cleaning up captured resources. However, the next vagrant up command will need to do a lot of work to rebuild. If you simply want to shut down your machine, use vagrant halt.

Vagrant is useful as a virtualized, production-like environment for developers to work within. To that end, it must be configured to emulate a production environment. In other words, your box must be provisioned by telling Vagrant how it should be configured and what software should be installed whenever vagrant up is run. One strategy for provisioning is to create a shell script that configures our server directly and point the Vagrant provisioning process to that script. Add the following line to Vagrantfile:

    config.vm.provision "shell", path: "provision.sh"

Now, create that file with the following contents in the folder hosting Vagrantfile:

    # install nvm
    curl https://raw.githubusercontent.com/creationix/nvm/v0.24.1/install.sh | bash
    # restart your shell with nvm enabled
    source ~/.bashrc
    # install the latest Node.js
    nvm install 0.12
    # ensure server default version
    nvm alias default 0.12

Destroy any running Vagrant boxes. Run Vagrant again, and you will notice in the output the execution of the commands in our provisioning shell script. When this has completed, enter your Vagrant box as the root user (Vagrant boxes are automatically assigned the root password "vagrant"):

    vagrant ssh
    su

You will see that Node v0.12.x is installed:

    node -v

It's standard to allow password-less sudo for the Vagrant user. Run visudo and add the following line to the sudoers configuration file:

    vagrant ALL=(ALL) NOPASSWD: ALL

Typically, when you are developing applications, you'll be modifying files in a project directory. You might bind a directory in your Vagrant box to a local code editor and develop that way. Vagrant offers a simpler solution. Within your VM, there is a /vagrant folder that maps to the folder that Vagrantfile exists within, and these two folders are automatically synced. So, if you add the server.js file to the right folder on your local machine, that file will also show up in your VM's /vagrant folder.

Go ahead and create a new test file either in your local folder or in your VM's /vagrant folder. You'll see that the file is synchronized to both locations, regardless of where it was originally created.

Let's clone our express-webhook repository from earlier in this article into our Vagrant box.
Add the following lines to provision.sh:

    # install various packages, particularly for git
    yum groupinstall "Development Tools" -y
    yum install gettext-devel openssl-devel perl-CPAN perl-devel zlib-devel -y
    yum install git -y

    # Move to shared folder, clone and start server
    cd /vagrant
    git clone https://github.com/sandro-pasquali/express-webhook
    cd express-webhook
    npm i; npm start

Add the following to Vagrantfile, which will map port 8082 on the Vagrant box (a guest port representing the port our hosted application listens on) to port 8000 on our host machine:

    config.vm.network "forwarded_port", guest: 8082, host: 8000

Now, we need to restart the Vagrant box (loading this new configuration) and re-provision it:

    vagrant reload
    vagrant provision

This will take a while as yum installs various dependencies. When provisioning is complete, you should see this as the last line:

    ==> default: Express server listening on port 8082

Remembering that we bound the guest port 8082 to the host port 8000, go to your browser and navigate to localhost:8000. You should see "Hello World!" displayed. Also note that in our provisioning script, we cloned into the (shared) /vagrant folder. This means the clone of express-webhook should be visible in the current folder, which will allow you to work on the more easily accessible codebase, knowing it will be automatically synchronized with the version on your Vagrant box.

Provisioning with Ansible

Configuring your machines by hand, as we've done previously, doesn't scale well. For one, it can be overly difficult to set and manage environment variables. Also, writing your own provisioning scripts is error-prone and no longer necessary given the existence of provisioning tools such as Ansible.

With Ansible, we can define server environments using an organized syntax rather than ad hoc scripts, making it easier to distribute and modify configurations. Let's recreate the provision.sh script developed earlier using Ansible playbooks.

Playbooks are Ansible's configuration, deployment, and orchestration language. They can describe a policy you want your remote systems to enforce or a set of steps in a general IT process. Playbooks are expressed in the YAML format (a human-readable data serialization language). To start with, we're going to change Vagrantfile's provisioner to Ansible. First, create the following subdirectories in your Vagrant folder:

    provisioning
    common
    tasks

These will be explained as we proceed through the Ansible setup. Next, create the following configuration file and name it ansible.cfg:

    [defaults]
    roles_path = provisioning
    log_path = ./ansible.log

This indicates that Ansible roles can be found in the /provisioning folder, and that we want to keep a provisioning log in ansible.log. Roles are used to organize tasks and other functions into reusable files; these will be explained shortly. Modify the config.vm.provision definition to the following:

    config.vm.provision "ansible" do |ansible|
      ansible.playbook = "provisioning/server.yml"
      ansible.verbose = "vvvv"
    end

This tells Vagrant to defer to Ansible for provisioning instructions, and that we want the provisioning process to be verbose: we want to get feedback while the provisioning step is running. Also, we can see that the playbook definition, provisioning/server.yml, is expected to exist.
Create that file now:

---
- hosts: all
  sudo: yes
  roles:
    - common
  vars:
    env:
      user: 'vagrant'
    nvm:
      version: '0.24.1'
      node_version: '0.12'
    build:
      repo_path: 'https://github.com/sandro-pasquali'
      repo_name: 'express-webhook'

Playbooks can contain very complex rules. This simple file indicates that we are going to provision all available hosts using a single role called common. In more complex deployments, an inventory of IP addresses could be set under hosts, but, here, we just want to use a general setting for our one server. Additionally, the provisioning step will be provided with certain environment variables following the forms env.user, nvm.node_version, and so on. These variables will come into play when we define the common role, which will be to provision our Vagrant server with the programs necessary to build, clone, and deploy express-webhook. Finally, we assert that Ansible should run as an administrator (sudo) by default—this is necessary for the yum package manager on CentOS.

We're now ready to define the common role. With Ansible, folder structures are important and are implied by the playbook. In our case, Ansible expects the role location (./provisioning, as defined in ansible.cfg) to contain the common folder (reflecting the common role given in the playbook), which itself must contain a tasks folder containing a main.yml file. These last two naming conventions are specific and required. The final step is creating the main.yml file in provisioning/common/tasks. First, we replicate the yum package loaders (see the file in your code bundle for the full list):

---
- name: Install necessary OS programs
  yum: name={{ item }} state=installed
  with_items:
    - autoconf
    - automake
    ...
    - git

Here, we see a few benefits of Ansible. A human-readable description of yum tasks is provided to a looping structure that will install every item in the list. Next, we run the nvm installer, which simply executes the auto-installer for nvm:

- name: Install nvm
  sudo: no
  shell: "curl https://raw.githubusercontent.com/creationix/nvm/v{{ nvm.version }}/install.sh | bash"

Note that, here, we're overriding the playbook's sudo setting. This can be done on a per-task basis, which gives us the freedom to move between different permission levels while provisioning. We are also able to execute shell commands while at the same time interpolating variables:

- name: Update .bashrc
  sudo: no
  lineinfile: >
    dest="/home/{{ env.user }}/.bashrc"
    line="source /home/{{ env.user }}/.nvm/nvm.sh"

Ansible provides extremely useful tools for file manipulation, and we will see here a very common one—updating the .bashrc file for a user. The lineinfile directive makes the addition of aliases, among other things, straightforward. The remainder of the commands follow a similar pattern to implement, in a structured way, the provisioning directives we need for our server. All the files you will need are in your code bundle in the vagrant/with_ansible folder. Once you have them installed, run vagrant up to see Ansible in action.

One of the strengths of Ansible is the way it handles contexts. When you start your Vagrant build, you will notice that Ansible gathers facts, as shown in the following screenshot:

Simply put, Ansible analyzes the context it is working in and only executes what is necessary to execute. If one of your tasks has already been run, the next time you try vagrant provision, that task will not run again. This is not true for shell scripts!
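The same caveat applies to raw shell commands inside a playbook: Ansible cannot know whether an arbitrary command has already done its work. A common way to make such a task idempotent is to guard it with a creates argument, so it is skipped once its result exists. Here is a sketch based on the nvm task above (the args/creates form shown assumes a reasonably recent Ansible release; check the shell module documentation for your version):

- name: Install nvm
  sudo: no
  shell: "curl https://raw.githubusercontent.com/creationix/nvm/v{{ nvm.version }}/install.sh | bash"
  args:
    creates: "/home/{{ env.user }}/.nvm/nvm.sh"   # skip if nvm is already installed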
In this way, editing playbooks and reprovisioning does not waste time redoing work that has already been done. Ansible is a powerful tool that can be used for provisioning and for much more complex deployment tasks. One of its great strengths is that it runs remotely: unlike most other tools, Ansible uses SSH to connect to remote servers and run operations, so there is no need to install it on your production boxes. You are encouraged to browse the Ansible documentation at http://docs.ansible.com/index.html to learn more.
Summary
In this article, you learned how to deploy a local build into a production-ready environment, and the Git webhook was demonstrated as a way of creating a continuous integration environment.
Resources for Article: Further resources on this subject: Node.js Fundamentals [Article] API with MongoDB and Node.js [Article] So, what is Node.js? [Article]

Getting Started with Nginx

Packt
20 Jul 2015
10 min read
In this article by the author, Valery Kholodkov, of the book, Nginx Essentials, we learn to start digging a bit deeper into Nginx, we will quickly go through most common distributions that contain prebuilt packages for Nginx. Installing Nginx Before you can dive into specific features of Nginx, you need to learn how to install Nginx on your system. It is strongly recommended that you use prebuilt binary packages of Nginx if they are available in your distribution. This ensures best integration of Nginx with your system and reuse of best practices incorporated into the package by the package maintainer. Prebuilt binary packages of Nginx automatically maintain dependencies for you and package maintainers are usually fast to include security patches, so you don't get any complaints from security officers. In addition to that, the package usually provides a distribution-specific startup script, which doesn't come out of the box. Refer to your distribution package directory to find out if you have a prebuilt package for Nginx. Prebuilt Nginx packages can also be found under the download link on the official Nginx.org site. Installing Nginx on Ubuntu The Ubuntu Linux distribution contains a prebuilt package for Nginx. To install it, simply run the following command: $ sudo apt-get install nginx The preceding command will install all the required files on your system, including the logrotate script and service autorun scripts. The following table describes the Nginx installation layout that will be created after running this command as well as the purpose of the selected files and folders: Description Path/Folder Nginx configuration files /etc/nginx Main configuration file /etc/nginx/nginx.conf Virtual hosts configuration files (including default one) /etc/nginx/sites-enabled Custom configuration files /etc/nginx/conf.d Log files (both access and error log) /var/log/nginx Temporary files /var/lib/nginx Default virtual host files /usr/share/nginx/html Default virtual host files will be placed into /usr/share/nginx/html. Please keep in mind that this directory is only for the default virtual host. For deploying your web application, use folders recommended by Filesystem Hierarchy Standard (FHS). Now you can start the Nginx service with the following command: $ sudo service nginx start This will start Nginx on your system. Alternatives The prebuilt Nginx package on Ubuntu has a number of alternatives. Each of them allows you to fine tune the Nginx installation for your system. Installing Nginx on Red Hat Enterprise Linux or CentOS/Scientific Linux Nginx is not provided out of the box in Red Hat Enterprise Linux or CentOS/Scientific Linux. Instead, we will use the Extra Packages for Enterprise Linux (EPEL) repository. EPEL is a repository that is maintained by Red Hat Enterprise Linux maintainers, but contains packages that are not a part of the main distribution for various reasons. You can read more about EPEL at https://fedoraproject.org/wiki/EPEL. To enable EPEL, you need to download and install the repository configuration package: For RHEL or CentOS/SL 7, use the following link: http://download.fedoraproject.org/pub/epel/7/x86_64/repoview/epel-release.html For RHEL/CentOS/SL 6 use the following link: http://download.fedoraproject.org/pub/epel/6/i386/repoview/epel-release.html If you have a newer/older RHEL version, please take a look at the How can I use these extra packages? 
section in the original EPEL wiki at the following link: https://fedoraproject.org/wiki/EPEL Now that you are ready to install Nginx, use the following command: # yum install nginx The preceding command will install all the required files on your system, including the logrotate script and service autorun scripts. The following table describes the Nginx installation layout that will be created after running this command and the purpose of the selected files and folders: Description Path/Folder Nginx configuration files /etc/nginx Main configuration file /etc/nginx/nginx.conf Virtual hosts configuration files (including default one) /etc/nginx/conf.d Custom configuration files /etc/nginx/conf.d Log files (both access and error log) /var/log/nginx Temporary files /var/lib/nginx Default virtual host files /usr/share/nginx/html Default virtual host files will be placed into /usr/share/nginx/html. Please keep in mind that this directory is only for the default virtual host. For deploying your web application, use folders recommended by FHS. By default, the Nginx service will not autostart on system startup, so let's enable it. Refer to the following table for the commands corresponding to your CentOS version: Function Cent OS 6 Cent OS 7 Enable Nginx startup at system startup chkconfig nginx on systemctl enable nginx Manually start Nginx service nginx start systemctl start nginx Manually stop Nginx service nginx stop systemctl start nginx Installing Nginx from source files Traditionally, Nginx is distributed in the source code. In order to install Nginx from the source code, you need to download and compile the source files on your system. It is not recommended that you install Nginx from the source code. Do this only if you have a good reason, such as the following scenarios: You are a software developer and want to debug or extend Nginx You feel confident enough to maintain your own package A package from your distribution is not good enough for you You want to fine-tune your Nginx binary. In either case, if you are planning to use this way of installing for real use, be prepared to sort out challenges such as dependency maintenance, distribution, and application of security patches. In this section, we will be referring to the configuration script. Configuration script is a shell script similar to one generated by autoconf, which is required to properly configure the Nginx source code before it can be compiled. This configuration script has nothing to do with the Nginx configuration file that we will be discussing later. Downloading the Nginx source files The primary source for Nginx for an English-speaking audience is Nginx.org. Open https://nginx.org/en/download.html in your browser and choose the most recent stable version of Nginx. Download the chosen archive into a directory of your choice (/usr/local or /usr/src are common directories to use for compiling software): $ wget -q http://nginx.org/download/nginx-1.7.9.tar.gz Extract the files from the downloaded archive and change to the directory corresponding to the chosen version of Nginx: $ tar xf nginx-1.7.9.tar.gz$ cd nginx-1.7.9 To configure the source code, we need to run the ./configure script included in the archive: $ ./configurechecking for OS+ Linux 3.13.0-36-generic i686checking for C compiler ... found+ using GNU C compiler[...] This script will produce a lot of output and, if successful, will generate a Makefile file for the source files. 
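Run without arguments, the script prepares a default build. The same script also accepts options if you need to adjust the build; for example, a build with an explicit installation prefix and the SSL and stub status modules enabled might be configured as follows (a sketch; run ./configure --help to see the options your Nginx version supports, and note that the SSL module needs the OpenSSL development package discussed in the Troubleshooting section below):

$ ./configure \
    --prefix=/usr/local/nginx \
    --with-http_ssl_module \
    --with-http_stub_status_module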
Notice that we showed the non-privileged user prompt $ instead of the root # in the previous command lines. You are encouraged to configure and compile software as a regular user and only install as root. This will prevent a lot of problems related to access restriction while working with the source code. Troubleshooting The troubleshooting step, although very simple, has a couple of common pitfalls. The basic installation of Nginx requires the presence of OpenSSL and Perl-compatible Regex (PCRE) developer packages in order to compile. If these packages are not properly installed or not installed in locations where the Nginx configuration script is able to locate them, the configuration step might fail. Then, you have to choose between disabling the affected Nginx built-in modules (rewrite or SSL, installing required packages properly, or pointing the Nginx configuration script to the actual location of those packages if they are installed. Building Nginx You can build the source files now using the following command: $ make You'll see a lot of output on compilation. If build is successful, you can install the Nginx file on your system. Before doing that, make sure you escalate your privileges to the super user so that the installation script can install the necessary files into the system areas and assign necessary privileges. Once successful, run the make install command: # make install The preceding command will install all the necessary files on your system. The following table lists all locations of the Nginx files that will be created after running this command and their purposes: Description Path/Folder Nginx configuration files /usr/local/nginx/conf Main configuration file /usr/local/nginx/conf/nginx.conf Log files (both access and error log) /usr/local/nginx/logs Temporary files /usr/local/nginx Default virtual host files /usr/local/nginx/html Unlike installations from prebuilt packages, installation from source files does not harness Nginx folders for the custom configuration files or virtual host configuration files. The main configuration file is also very simple in its nature. You have to take care of this yourself. Nginx must be ready to use now. To start Nginx, change your working directory to the /usr/local/nginx directory and run the following command: # sbin/nginx This will start Nginx on your system with the default configuration. Troubleshooting This stage works flawlessly most of the time. A problem can occur in the following situations: You are using nonstandard system configuration. Try to play with the options in the configuration script in order to overcome the problem. You compiled in third-party modules and they are out of date or not maintained. Switch off third-party modules that break your build or contact the developer for assistance. Copying the source code configuration from prebuilt packages Occasionally you might want to amend Nginx binary from a prebuilt packages with your own changes. In order to do that you need to reproduce the build tree that was used to compile Nginx binary for the prebuilt package. But how would you know what version of Nginx and what configuration script options were used at the build time? Fortunately, Nginx has a solution for that. Just run the existing Nginx binary with the -V command-line option. Nginx will print the configure-time options. 
This is shown in the following: $ /usr/sbin/nginx -Vnginx version: nginx/1.4.6 (Ubuntu)built by gcc 4.8.2 (Ubuntu 4.8.2-19ubuntu1)TLS SNI support enabledconfigure arguments: --with-cc-opt='-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2' --with-ld-opt='-Wl,-Bsymbolic-functions -Wl,-z,relro' … Using the output of the preceding command, reproduce the entire build environment, including the Nginx source tree of the corresponding version and modules that were included into the build. Here, the output of the Nginx -V command is trimmed for simplicity. In reality, you will be able to see and copy the entire command line that was passed to the configuration script at the build time. You might even want to reproduce the version of the compiler used in order to produce a binary-identical Nginx executable file (we will discuss this later when discussing how to troubleshoot crashes). Once this is done, run the ./configure script of your Nginx source tree with options from the output of the -V option (with necessary alterations) and follow the remaining steps of the build procedure. You will get an altered Nginx executable on the objs/ folder of the source tree. Summary Here, you learned how to install Nginx from a number of available sources, the structure of Nginx installation and the purpose of various files, the elements and structure of the Nginx configuration file, and how to create a minimal working Nginx configuration file. You also learned about some best practices for Nginx configuration.

Sprites, Camera, Actions!

Packt
20 Jul 2015
20 min read
In this article by, Stephen Haney, author of the book Game Development with Swift, we will focus on building great gameplay experiences while SpriteKit performs the mechanical work of the game loop. To draw an item to the screen, we create a new instance of a SpriteKit node. These nodes are simple; we attach a child node to our scene, or to existing nodes, for each item we want to draw. Sprites, particle emitters, and text labels are all considered nodes in SpriteKit. The topics in this article include: Drawing your first sprite Animation: movement, scaling, and rotation Working with textures Organizing art into texture atlases For this article, you need to first install Xcode, and then create a project. The project automatically creates the GameScene.swift file as the default file to store the scene of your new game. (For more resources related to this topic, see here.) Drawing your first sprite It is time to write some game code – fantastic! Open your GameScene.swift file and find the didMoveToView function. Recall that this function fires every time the game switches to this scene. We will use this function to get familiar with the SKSpriteNode class. You will use SKSpriteNode extensively in your game, whenever you want to add a new 2D graphic entity. The term sprite refers to a 2D graphic or animation that moves around the screen independently from the background. Over time, the term has developed to refer to any game object on the screen in a 2D game. We will create and draw your first sprite in this article: a happy little bee. Building a SKSpriteNode class Let's begin by drawing a blue square to the screen. The SKSpriteNode class can draw both texture graphics and solid blocks of color. It is often helpful to prototype your new game ideas with blocks of color before you spend time with artwork. To draw the blue square, add an instance of SKSpriteNode to the game: override func didMoveToView(view: SKView) {// Instantiate a constant, mySprite, instance of SKSpriteNode// The SKSpriteNode constructor can set color and size// Note: UIColor is a UIKit class with built-in color presets// Note: CGSize is a type we use to set node sizeslet mySprite = SKSpriteNode(color: UIColor.blueColor(), size:CGSize(width: 50, height: 50))// Assign our sprite a position in points, relative to its// parent node (in this case, the scene)mySprite.position = CGPoint(x: 300, y: 300)// Finally, we need to add our sprite node into the node tree.// Call the SKScene's addChild function to add the node// Note: In Swift, 'self' is an automatic property// on any type instance, exactly equal to the instance itself// So in this instance, it refers to the GameScene instanceself.addChild(mySprite)} Go ahead and run the project. You should see a similar small blue square appear in your simulator: Swift allows you to define variables as constants, which can be assigned a value only once. For best performance, use let to declare constants whenever possible. Declare your variables with var when you need to alter the value later in your code. Adding animation to your Toolkit Before we dive back in to sprite theory, we should have some fun with our blue square. SpriteKit uses action objects to move sprites around the screen. Consider this example: if our goal is to move the square across the screen, we must first create a new action object to describe the animation. Then, we instruct our sprite node to execute the action. I will illustrate this concept with many examples in the article. 
For now, add this code in the didMoveToView function, below the self.addChild(mySprite) line: // Create a new constant for our action instance// Use the moveTo action to provide a goal position for a node// SpriteKit will tween to the new position over the course of the// duration, in this case 5 secondslet demoAction = SKAction.moveTo(CGPoint(x: 100, y: 100),duration: 5)// Tell our square node to execute the action!mySprite.runAction(demoAction) Run the project. You will see our blue square slide across the screen towards the (100,100) position. This action is re-usable; any node in your scene can execute this action to move to the (100,100) position. As you can see, SpriteKit does a lot of the heavy lifting for us when we need to animate node properties. Inbetweening, or tweening, uses the engine to animate smoothly between a start frame and an end frame. Our moveTo animation is a tween; we provide the start frame (the sprite's original position) and the end frame (the new destination position). SpriteKit generates the smooth transition between our values. Let's try some other actions. The SKAction.moveTo function is only one of many options. Try replacing the demoAction line with this code: let demoAction = SKAction.scaleTo(4, duration: 5) Run the project. You will see our blue square grow to four times its original size. Sequencing multiple animations We can execute actions together simultaneously or one after the each other with action groups and sequences. For instance, we can easily scale our sprite larger and spin it at the same time. Delete all of our action code so far and replace it with this code: // Scale up to 4x initial scalelet demoAction1 = SKAction.scaleTo(4, duration: 5)// Rotate 5 radianslet demoAction2 = SKAction.rotateByAngle(5, duration: 5)// Group the actionslet actionGroup = SKAction.group([demoAction1, demoAction2])// Execute the group!mySprite.runAction(actionGroup) When you run the project, you will see a spinning, growing square. Terrific! If you want to run these actions in sequence (rather than at the same time) change SKAction.group to SKAction.sequence: // Group the actions into a sequencelet actionSequence = SKAction.sequence([demoAction1, demoAction2])// Execute the sequence!mySprite.runAction(actionSequence) Run the code and watch as your square first grows and then spins. Good. You are not limited to two actions; we can group or sequence as many actions together as we need. We have only used a few actions so far; feel free to explore the SKAction class and try out different action combinations before moving on. Recapping your first sprite Congratulations, you have learned to draw a non-textured sprite and animate it with SpriteKit actions. Next, we will explore some important positioning concepts, and then add game art to our sprites. Before you move on, make sure your didMoveToView function matches with mine, and your sequenced animation is firing properly. 
Here is my code up to this point: override func didMoveToView(view: SKView) {// Instantiate a constant, mySprite, instance of SKSpriteNodelet mySprite = SKSpriteNode(color: UIColor.blueColor(), size:CGSize(width: 50, height: 50))// Assign our sprite a positionmySprite.position = CGPoint(x: 300, y: 300)// Add our sprite node into the node treeself.addChild(mySprite)// Scale up to 4x initial scalelet demoAction1 = SKAction.scaleTo(CGFloat(4), duration: 2)// Rotate 5 radianslet demoAction2 = SKAction.rotateByAngle(5, duration: 2)// Group the actions into a sequencelet actionSequence = SKAction.sequence([demoAction1,demoAction2])// Execute the sequence!mySprite.runAction(actionSequence)} The story on positioning SpriteKit uses a grid of points to position nodes. In this grid, the bottom left corner of the scene is (0,0), with a positive X-axis to the right and a positive Y-axis to the top. Similarly, on the individual sprite level, (0,0) refers to the bottom left corner of the sprite, while (1,1) refers to the top right corner. Alignment with anchor points Each sprite has an anchorPoint property, or an origin. The anchorPoint property allows you to choose which part of the sprite aligns to the sprite's overall position. The default anchor point is (0.5,0.5), so a new SKSpriteNode centers perfectly on its position. To illustrate this, let us examine the blue square sprite we just drew on the screen. Our sprite is 50 pixels wide and 50 pixels tall, and its position is (300,300). Since we have not modified the anchorPoint property, its anchor point is (0.5,0.5). This means the sprite will be perfectly centered over the (300,300) position on the scene's grid. Our sprite's left edge begins at 275 and the right edge terminates at 325. Likewise, the bottom starts at 275 and the top ends at 325. The following diagram illustrates our block's position on the grid: Why do we prefer centered sprites by default? You may think it simpler to position elements by their bottom left corner with an anchorPoint property setting of (0,0). However, the centered behavior benefits us when we scale or rotate sprites: When we scale a sprite with an anchorPoint property of (0,0) it will only expand up the y-axis and out the x-axis. Rotation actions will swing the sprite in wide circles around its bottom left corner. A centered sprite, with the default anchorPoint property of (0.5, 0.5), will expand or contract equally in all directions when scaled and will spin in place when rotated, which is usually the desired effect. There are some cases when you will want to change an anchor point. For instance, if you are drawing a rocket ship, you may want the ship to rotate around the front nose of its cone, rather than its center. Adding textures and game art You may want to take a screenshot of your blue box for your own enjoyment later. I absolutely love reminiscing over old screenshots of my finished games when they were nothing more than simple colored blocks sliding around the screen. Now it is time to move past that stage and attach some fun artwork to our sprite. Downloading the free assets I am providing a downloadable pack for all of the art assets. I recommend you use these assets so you will have everything you need for our demo game. Alternatively, you are certainly free to create your own art for your game if you prefer. These assets come from an outstanding public domain asset pack from Kenney Game Studio. I am providing a small subset of the asset pack that we will use in our game. 
Download the game art from this URL: http://www.thinkingswiftly.com/game-development-with-swift/assets More exceptional art If you like the art, you can download over 16,000 game assets in the same style for a small donation at http://kenney.itch.io/kenney-donation. I do not have an affiliation with Kenney; I just find it admirable that he has released so much public domain artwork for indie game developers. As CC0 assets, you can copy, modify, and distribute the art, even for commercial purposes, all without asking permission. You can read the full license here: https://creativecommons.org/publicdomain/zero/1.0/ Drawing your first textured sprite Let us use some of the graphics you just downloaded. We will start by creating a bee sprite. We will add the bee texture to our project, load the image onto a SKSpriteNode class, and then size the node for optimum sharpness on retina screens. Add the bee image to your project We need to add the image files to our Xcode project before we can use them in the game. Once we add the images, we can reference them by name in our code; SpriteKit is smart enough to find and implement the graphics. Follow these steps to add the bee image to the project: Right-click on your project in the project navigator and click on Add Files to "Pierre Penguin Escapes the Antarctic" (or the name of your game). Refer to this screenshot to find the correct menu item: Browse to the asset pack you downloaded and locate the bee.png image inside the Enemies folder. Check Copy items if needed, then click Add. You should now see bee.png in your project navigator. Loading images with SKSpriteNode It is quite easy to draw images to the screen with SKSpriteNode. Start by clearing out all of the code we wrote for the blue square inside the didMoveToView function in GameScene.swift. Replace didMoveToView with this code: override func didMoveToView(view: SKView) {// set the scene's background to a nice sky blue// Note: UIColor uses a scale from 0 to 1 for its colorsself.backgroundColor = UIColor(red: 0.4, green: 0.6, blue:0.95, alpha: 1.0);// create our bee sprite nodelet bee = SKSpriteNode(imageNamed: "bee.png")// size our bee nodebee.size = CGSize(width: 100, height: 100)// position our bee nodebee.position = CGPoint(x: 250, y: 250)// attach our bee to the scene's node treeself.addChild(bee)} Run the project and witness our glorious bee – great work! Designing for retina You may notice that our bee image is quite blurry. To take advantage of retina screens, assets need to be twice the pixel dimensions of their node's size property (for most retina screens), or three times the node size for the iPhone 6 Plus. Ignore the height for a moment; our bee node is 100 points wide but the PNG file is only 56 pixels wide. The PNG file needs to be 300 pixels wide to look sharp on the iPhone 6 Plus, or 200 pixels wide to look sharp on 2x retina devices. SpriteKit will automatically resize textures to fit their nodes, so one approach is to create a giant texture at the highest retina resolution (three times the node size) and let SpriteKit resize the texture down for lower density screens. However, there is a considerable performance penalty, and older devices can even run out of memory and crash from the huge textures. The ideal asset approach These double- and triple-sized retina assets can be confusing to new iOS developers. To solve this issue, Xcode normally lets you provide three image files for each texture. For example, our bee node is currently 100 points wide and 100 points tall. 
In a perfect world, you would provide the following images to Xcode: Bee.png (100 pixels by 100 pixels) Bee@2x.png (200 pixels by 200 pixels) Bee@3x.png (300 pixels by 300 pixels) However, there is currently an issue that prevents 3x textures from working correctly with texture atlases. Texture atlases group textures together and increase rendering performance dramatically (we will implement our first texture atlas in the next section). I hope that Apple will upgrade texture atlases to support 3x textures in Swift 2. For now, we need to choose between texture atlases and 3x assets for the iPhone 6 Plus. My solution for now In my opinion, texture atlases and their performance benefits are key features of SpriteKit. I will continue using texture atlases, thus serving 2x images to the iPhone 6 Plus (which still looks fairly sharp). This means that we will not be using any 3x assets. Further simplifying matters, Swift only runs on iOS7 and higher. The only non-retina devices that run iOS7 are the aging iPad 2 and iPad mini 1st generation. If these older devices are important for your finished games, you should create both standard and 2x images for your games. Otherwise, you can safely ignore non-retina assets with Swift. This means that we will only use double-sized images. The images in the downloadable asset bundle forgo the 2x suffix, since we are only using this size. Once Apple updates texture atlases to use 3x assets, I recommend that you switch to the methodology outlined in The ideal asset approach section for your games. Hands-on with retina in SpriteKit Our bee image illustrates how this all works: Because we set an explicit node size, SpriteKit automatically resizes the bee texture to fit our 100-point wide, 100-point tall sized node. This automatic size-to-fit is very handy, but notice that we have actually slightly distorted the aspect ratio of the image. If we do not set an explicit size, SpriteKit sizes the node (in points) to the match texture's dimensions (in pixels). Go ahead and delete the line that sets the size for our bee node and re-run the project. SpriteKit maintains the aspect ratio automatically, but the smaller bee is still fuzzy. That is because our new node is 56 points by 48 points, matching our PNG file's pixel dimensions of 56 pixels by 48 pixels . . . yet our PNG file needs to be 112 pixels by 96 pixels for a sharp image at this node size on 2x retina screens. We want a smaller bee anyway, so we will resize the node rather than generate larger artwork in this case. Set the size property of your bee node, in points, to half the size of the texture's pixel resolution: // size our bee in points:bee.size = CGSize(width: 28, height: 24) Run the project and you will see a smaller, crystal sharp bee, as in this screenshot: Great! The important concept here is to design your art files at twice the pixel resolution of your node point sizes to take advantage of 2x retina screens, or three times the point sizes to take full advantage of the iPhone 6 Plus. Now we will look at organizing and animating multiple sprite frames. Organizing your assets We will quickly overrun our project navigator with image files if we add all our textures as we did with our bee. Luckily, Xcode provides several solutions. Exploring Images.xcassets We can store images in an .xcassets file and refer to them easily from our code. This is a good place for our background images: Open Images.xcassets from your project navigator. 
We do not need to add any images here now but, in the future, you can drag image files directly into the image list, or right-click, then Import. Notice that the SpriteKit demo's spaceship image is stored here. We do not need it anymore, so we can right-click on it and choose Removed Selected Items to delete it. Collecting art into texture atlases We will use texture atlases for most of our in-game art. Texture atlases organize assets by collecting related artwork together. They also increase performance by optimizing all of the images inside each atlas as if they were one texture. SpriteKit only needs one draw call to render multiple images out of the same texture atlas. Plus, they are very easy to use! Follow these steps to build your bee texture atlas: We need to remove our old bee texture. Right-click on bee.png in the project navigator and choose Delete, then Move to Trash. Using Finder, browse to the asset pack you downloaded and locate the Enemies folder. Create a new folder inside Enemies and name it bee.atlas. Locate the bee.png and bee_fly.png images inside Enemies and copy them into your new bee.atlas folder. You should now have a folder named bee.atlas containing the two bee PNG files. This is all you need to do to create a new texture atlas – simply place your related images into a new folder with the .atlas suffix. Add the atlas to your project. In Xcode, right-click on the project folder in the project navigator and click Add Files…, as we did earlier for our single bee texture. Find the bee.atlas folder and select the folder itself. Check Copy items if needed, then click Add. The texture atlas will appear in the project navigator. Good work; we organized our bee assets into one collection and Xcode will automatically create the performance optimizations mentioned earlier. Updating our bee node to use the texture atlas We can actually run our project right now and see the same bee as before. Our old bee texture was bee.png, and a new bee.png exists in the texture atlas. Though we deleted the standalone bee.png, SpriteKit is smart enough to find the new bee.png in the texture atlas. We should make sure our texture atlas is working, and that we successfully deleted the old individual bee.png. In GameScene.swift, change our SKSpriteNode instantiation line to use the new bee_fly.png graphic in the texture atlas: // create our bee sprite// notice the new image name: bee_fly.pnglet bee = SKSpriteNode(imageNamed: "bee_fly.png") Run the project again. You should see a different bee image, its wings held lower than before. This is the second frame of the bee animation. Next, we will learn to animate between the two frames to create an animated sprite. Iterating through texture atlas frames We need to study one more texture atlas technique: we can quickly flip through multiple sprite frames to make our bee come alive with motion. We now have two frames of our bee in flight; it should appear to hover in place if we switch back and forth between these frames. Our node will run a new SKAction to animate between the two frames. 
Update your didMoveToView function to match mine (I removed some older comments to save space): override func didMoveToView(view: SKView) {self.backgroundColor = UIColor(red: 0.4, green: 0.6, blue:0.95, alpha: 1.0)// create our bee sprite// Note: Remove all prior arguments from this line:let bee = SKSpriteNode()bee.position = CGPoint(x: 250, y: 250)bee.size = CGSize(width: 28, height: 24)self.addChild(bee)// Find our new bee texture atlaslet beeAtlas = SKTextureAtlas(named:"bee.atlas")// Grab the two bee frames from the texture atlas in an array// Note: Check out the syntax explicitly declaring beeFrames// as an array of SKTextures. This is not strictly necessary,// but it makes the intent of the code more readable, so I// chose to include the explicit type declaration here:let beeFrames:[SKTexture] = [beeAtlas.textureNamed("bee.png"),beeAtlas.textureNamed("bee_fly.png")]// Create a new SKAction to animate between the frames oncelet flyAction = SKAction.animateWithTextures(beeFrames,timePerFrame: 0.14)// Create an SKAction to run the flyAction repeatedlylet beeAction = SKAction.repeatActionForever(flyAction)// Instruct our bee to run the final repeat action:bee.runAction(beeAction)} Run the project. You will see our bee flap its wings back and forth – cool! You have learned the basics of sprite animation with texture atlases. We will create increasingly complicated animations using this same technique later also. For now, pat yourself on the back. The result may seem simple, but you have unlocked a major building block towards your first SpriteKit game! Putting it all together First, we learned how to use actions to move, scale, and rotate our sprites. Then, we explored animating through multiple frames, bringing our sprite to life. Let us now combine these techniques to fly our bee back and forth across the screen, flipping the texture at each turn. Add this code at the bottom of the didMoveToView function, beneath the bee.runAction(beeAction) line: // Set up new actions to move our bee back and forth:let pathLeft = SKAction.moveByX(-200, y: -10, duration: 2)let pathRight = SKAction.moveByX(200, y: 10, duration: 2)// These two scaleXTo actions flip the texture back and forth// We will use these to turn the bee to face left and rightlet flipTextureNegative = SKAction.scaleXTo(-1, duration: 0)let flipTexturePositive = SKAction.scaleXTo(1, duration: 0)// Combine actions into a cohesive flight sequence for our beelet flightOfTheBee = SKAction.sequence([pathLeft,flipTextureNegative, pathRight, flipTexturePositive])// Last, create a looping action that will repeat foreverlet neverEndingFlight =SKAction.repeatActionForever(flightOfTheBee)// Tell our bee to run the flight path, and away it goes!bee.runAction(neverEndingFlight) Run the project. You will see the bee flying back and forth, flapping its wings. You have officially learned the fundamentals of animation in SpriteKit! We will build on this knowledge to create a rich, animated game world for our players. Summary You have gained foundational knowledge of sprites, nodes, and actions in SpriteKit and already taken huge strides towards your first game with Swift. You configured your project for landscape orientation, drew your first sprite, and then made it move, spin, and scale. You added a bee texture to your sprite, created an image atlas, and animated through the frames of flight. Terrific work! 
Resources for Article: Further resources on this subject: Network Development with Swift [Article] Installing OpenStack Swift [Article] Flappy Swift [Article]

Writing to Cassandra from HDFS using a Hadoop Map Reduce Job

Manu Mukerji
17 Jul 2015
5 min read
In this post I am going to walk through how to setup a Map Reduce Job that lets you write to Cassandra. Use cases covered here will include streaming analytics into Cassandra. I am assuming you have a Cassandra cluster and Hadoop cluster available before we start, even single instances or localhost will suffice. The code used for this example is available at https://github.com/manum/mr-cassandra. Let’s create the Cassandra Keyspace and Table we are going to use. You can run the following in cqlsh (the command line utility that lets you talk to Cassandra). The table keytable only has one column in it called key; it is where we will store the data. CREATE KEYSPACE keytest WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3 }; CREATE TABLE keytable ( key varchar, PRIMARY KEY (key) ); Here is what it will look like after it has run: cqlsh> USE keytest; cqlsh:keytest> select * from keytable; key ---------- test1234 (1 rows) We can start by looking at CassandraHelper.java and CassandraTester.java. CassandraHelper Methods: getSession(): retrieves the current session object so that no additional ones are created public Session getSession() { LOG.info("Starting getSession()"); if (this.session == null && (this.cluster == null || this.cluster.isClosed())) { LOG.info("Cluster not started or closed"); } else if (this.session.isClosed()) { LOG.info("session is closed. Creating a session"); this.session = this.cluster.connect(); } return this.session; } createConnection(String): pass the host for the Cassandra server public void createConnection(String node) { this.cluster = Cluster.builder().addContactPoint(node).build(); Metadata metadata = cluster.getMetadata(); System.out.printf("Connected to cluster: %sn",metadata.getClusterName()); for ( Host host : metadata.getAllHosts() ) { System.out.printf("Datatacenter: %s; Host: %s; Rack: %sn", host.getDatacenter(), host.getAddress(), host.getRack()); } this.session = cluster.connect(); this.prepareQueries(); } closeConnection(): closes the connection after everything is completed. public void closeConnection() { cluster.close(); } prepareQueries(): This method prepares queries that are optimized on the server side. It is recommended to use prepared queries in cases where you are running the same query often or where the query does not change but the data might, i.e. when doing several inserts. private void prepareQueries() { LOG.info("Starting prepareQueries()"); this.preparedStatement = this.session.prepare(this.query); } addKey(String): Method to add data to the cluster, it also has try catch blocks to catch exceptions and tell you what is occurring. 
public void addKey(String key) { Session session = this.getSession(); if(key.length()>0) { try { session.execute(this.preparedStatement.bind(key) ); //session.executeAsync(this.preparedStatement.bind(key)); } catch (NoHostAvailableException e) { System.out.printf("No host in the %s cluster can be contacted to execute the query.n", session.getCluster()); Session.State st = session.getState(); for ( Host host : st.getConnectedHosts() ) { System.out.println("In flight queries::"+st.getInFlightQueries(host)); System.out.println("open connections::"+st.getOpenConnections(host)); } } catch (QueryExecutionException e) { System.out.println("An exception was thrown by Cassandra because it cannot " + "successfully execute the query with the specified consistency level."); } catch (IllegalStateException e) { System.out.println("The BoundStatement is not ready."); } } } CassandraTester: This class has a void main method in which you need to provide the host you want to connect to and it will result in writing the value "test1234" into Cassandra. MapReduceExample.java is the interesting file here. It has a Mapper Class, Reducer Class and a main method to initialize the job. Under the Mapper you will find setup() and cleanup() methods - called automatically by the Map Reduce framework for setup and cleanup operations - which you will use to connect to Cassandra and for cleaning up the connection afterwards. I modified the standard word count example for this so the program now counts lines instead and will write them all to Cassandra. The output of the reducer is basically lines and count. To run this example here is what you need to do: Clone the repo from https://github.com/manum/mr-cassandra Run mvn install to create a jar in the target/ folder scp the jar to your Hadoop cluster Copy over the test input (For this test I used the entire works of Shakespeare all-shakespeare.txt in git) To run the jar use the following command hadoop jar mr_cassandra-0.0.1-SNAPSHOT-jar-with-dependencies.jar com.example.com.mr_cassandra.MapReduceExample /user/ubuntu/all-shakespeare.txt /user/ubuntu/output/ If you run the above steps, it should kick off the job. After the job is complete go to cqlsh and run select * from keytable limit 10; cqlsh:keytest> select * from keytable limit 10; key ---------------------------------------------------------------- REGANtGood sir, no more; these are unsightly tricks: KINGtWe lost a jewel of her; and our esteem ROSALINDtAy, but when? tNow leaves him. tThy brother by decree is banished: DUCHESS OF YORKtI had a Richard too, and thou didst kill him; JULIETtWho is't that calls? is it my lady mother? ARTHURtO, save me, Hubert, save me! my eyes are out tFull of high feeding, madly hath broke loose tSwift-winged with desire to get a grave, (10 rows) cqlsh:keytest> About the author Manu Mukerji has a background in cloud computing and big data, handling billions of transactions per day in real time. He enjoys building and architecting scalable, highly available data solutions, and has extensive experience working in online advertising and social media. Twitter: @next2manu LinkedIn: https://www.linkedin.com/in/manumukerji/

Getting Started with Apache Spark

Packt
17 Jul 2015
7 min read
In this article by Rishi Yadav, the author of Spark Cookbook, we will cover the following recipes: Installing Spark from binaries Building the Spark source code with Maven (For more resources related to this topic, see here.) Introduction Apache Spark is a general-purpose cluster computing system to process big data workloads. What sets Spark apart from its predecessors, such as MapReduce, is its speed, ease-of-use, and sophisticated analytics. Apache Spark was originally developed at AMPLab, UC Berkeley, in 2009. It was made open source in 2010 under the BSD license and switched to the Apache 2.0 license in 2013. Toward the later part of 2013, the creators of Spark founded Databricks to focus on Spark's development and future releases. Talking about speed, Spark can achieve sub-second latency on big data workloads. To achieve such low latency, Spark makes use of the memory for storage. In MapReduce, memory is primarily used for actual computation. Spark uses memory both to compute and store objects. Spark also provides a unified runtime connecting to various big data storage sources, such as HDFS, Cassandra, HBase, and S3. It also provides a rich set of higher-level libraries for different big data compute tasks, such as machine learning, SQL processing, graph processing, and real-time streaming. These libraries make development faster and can be combined in an arbitrary fashion. Though Spark is written in Scala, and this book only focuses on recipes in Scala, Spark also supports Java and Python. Spark is an open source community project, and everyone uses the pure open source Apache distributions for deployments, unlike Hadoop, which has multiple distributions available with vendor enhancements. The following figure shows the Spark ecosystem: The Spark runtime runs on top of a variety of cluster managers, including YARN (Hadoop's compute framework), Mesos, and Spark's own cluster manager called standalone mode. Tachyon is a memory-centric distributed file system that enables reliable file sharing at memory speed across cluster frameworks. In short, it is an off-heap storage layer in memory, which helps share data across jobs and users. Mesos is a cluster manager, which is evolving into a data center operating system. YARN is Hadoop's compute framework that has a robust resource management feature that Spark can seamlessly use. Installing Spark from binaries Spark can be either built from the source code or precompiled binaries can be downloaded from http://spark.apache.org. For a standard use case, binaries are good enough, and this recipe will focus on installing Spark using binaries. Getting ready All the recipes in this book are developed using Ubuntu Linux but should work fine on any POSIX environment. Spark expects Java to be installed and the JAVA_HOME environment variable to be set. In Linux/Unix systems, there are certain standards for the location of files and directories, which we are going to follow in this book. The following is a quick cheat sheet: Directory Description /bin Essential command binaries /etc Host-specific system configuration /opt Add-on application software packages /var Variable data /tmp Temporary files /home User home directories How to do it... At the time of writing this, Spark's current version is 1.4. Please check the latest version from Spark's download page at http://spark.apache.org/downloads.html. Binaries are developed with a most recent and stable version of Hadoop. 
To use a specific version of Hadoop, the recommended approach is to build from sources, which will be covered in the next recipe. The following are the installation steps: Open the terminal and download binaries using the following command: $ wget http://d3kbcqa49mib13.cloudfront.net/spark-1.4.0-bin-hadoop2.4.tgz Unpack binaries: $ tar -zxf spark-1.4.0-bin-hadoop2.4.tgz Rename the folder containing binaries by stripping the version information: $ sudo mv spark-1.4.0-bin-hadoop2.4 spark Move the configuration folder to the /etc folder so that it can be made a symbolic link later: $ sudo mv spark/conf/* /etc/spark Create your company-specific installation directory under /opt. As the recipes in this book are tested on infoobjects sandbox, we are going to use infoobjects as directory name. Create the /opt/infoobjects directory: $ sudo mkdir -p /opt/infoobjects Move the spark directory to /opt/infoobjects as it's an add-on software package: $ sudo mv spark /opt/infoobjects/ Change the ownership of the spark home directory to root: $ sudo chown -R root:root /opt/infoobjects/spark Change permissions of the spark home directory, 0755 = user:read-write-execute group:read-execute world:read-execute: $ sudo chmod -R 755 /opt/infoobjects/spark Move to the spark home directory: $ cd /opt/infoobjects/spark Create the symbolic link: $ sudo ln -s /etc/spark conf Append to PATH in .bashrc: $ echo "export PATH=$PATH:/opt/infoobjects/spark/bin" >> /home/hduser/.bashrc Open a new terminal. Create the log directory in /var: $ sudo mkdir -p /var/log/spark Make hduser the owner of the Spark log directory. $ sudo chown -R hduser:hduser /var/log/spark Create the Spark tmp directory: $ mkdir /tmp/spark Configure Spark with the help of the following command lines: $ cd /etc/spark$ echo "export HADOOP_CONF_DIR=/opt/infoobjects/hadoop/etc/hadoop">> spark-env.sh$ echo "export YARN_CONF_DIR=/opt/infoobjects/hadoop/etc/Hadoop">> spark-env.sh$ echo "export SPARK_LOG_DIR=/var/log/spark" >> spark-env.sh$ echo "export SPARK_WORKER_DIR=/tmp/spark" >> spark-env.sh Building the Spark source code with Maven Installing Spark using binaries works fine in most cases. For advanced cases, such as the following (but not limited to), compiling from the source code is a better option: Compiling for a specific Hadoop version Adding the Hive integration Adding the YARN integration Getting ready The following are the prerequisites for this recipe to work: Java 1.6 or a later version Maven 3.x How to do it... 
The following are the steps to build the Spark source code with Maven: Increase MaxPermSize for heap: $ echo "export _JAVA_OPTIONS="-XX:MaxPermSize=1G"" >> /home/hduser/.bashrc Open a new terminal window and download the Spark source code from GitHub: $ wget https://github.com/apache/spark/archive/branch-1.4.zip Unpack the archive: $ gunzip branch-1.4.zip Move to the spark directory: $ cd spark Compile the sources with these flags: Yarn enabled, Hadoop version 2.4, Hive enabled, and skipping tests for faster compilation: $ mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -DskipTests clean package Move the conf folder to the etc folder so that it can be made a symbolic link: $ sudo mv spark/conf /etc/ Move the spark directory to /opt as it's an add-on software package: $ sudo mv spark /opt/infoobjects/spark Change the ownership of the spark home directory to root: $ sudo chown -R root:root /opt/infoobjects/spark Change the permissions of the spark home directory 0755 = user:rwx group:r-x world:r-x: $ sudo chmod -R 755 /opt/infoobjects/spark Move to the spark home directory: $ cd /opt/infoobjects/spark Create a symbolic link: $ sudo ln -s /etc/spark conf Put the Spark executable in the path by editing .bashrc: $ echo "export PATH=$PATH:/opt/infoobjects/spark/bin" >> /home/hduser/.bashrc Create the log directory in /var: $ sudo mkdir -p /var/log/spark Make hduser the owner of the Spark log directory: $ sudo chown -R hduser:hduser /var/log/spark Create the Spark tmp directory: $ mkdir /tmp/spark Configure Spark with the help of the following command lines: $ cd /etc/spark$ echo "export HADOOP_CONF_DIR=/opt/infoobjects/hadoop/etc/hadoop">> spark-env.sh$ echo "export YARN_CONF_DIR=/opt/infoobjects/hadoop/etc/Hadoop">> spark-env.sh$ echo "export SPARK_LOG_DIR=/var/log/spark" >> spark-env.sh$ echo "export SPARK_WORKER_DIR=/tmp/spark" >> spark-env.sh Summary In this article, we learned what Apache Spark is, how we can install Spark from binaries, and how to build Spark source code with Maven. Resources for Article: Further resources on this subject: Big Data Analysis (R and Hadoop) [Article] YARN and Hadoop [Article] Hadoop and SQL [Article]

Speeding up Gradle builds for Android

Packt
16 Jul 2015
7 min read
In this article by Kevin Pelgrims, the author of the book, Gradle for Android, we will cover a few tips and tricks that will help speed up the Gradle builds. A lot of Android developers that start using Gradle complain about the prolonged compilation time. Builds can take longer than they do with Ant, because Gradle has three phases in the build lifecycle that it goes through every time you execute a task. This makes the whole process very configurable, but also quite slow. Luckily, there are several ways to speed up Gradle builds. Gradle properties One way to tweak the speed of a Gradle build is to change some of the default settings. You can enable parallel builds by setting a property in a gradle.properties file that is placed in the root of a project. All you need to do is add the following line: org.gradle.parallel=true Another easy win is to enable the Gradle daemon, which starts a background process when you run a build the first time. Any subsequent builds will then reuse that background process, thus cutting out the startup cost. The process is kept alive as long as you use Gradle, and is terminated after three hours of idle time. Using the daemon is particularly useful when you use Gradle several times in a short time span. You can enable the daemon in the gradle.properties file like this: org.gradle.daemon=true In Android Studio, the Gradle daemon is enabled by default. This means that after the first build from inside the IDE, the next builds are a bit faster. If you build from the command-line interface; however, the Gradle daemon is disabled, unless you enable it in the properties. To speed up the compilation itself, you can tweak parameters on the Java Virtual Machine (JVM). There is a Gradle property called jvmargs that enables you to set different values for the memory allocation pool for the JVM. The two parameters that have a direct influence on your build speed are Xms and Xmx. The Xms parameter is used to set the initial amount of memory to be used, while the Xmx parameter is used to set a maximum. You can manually set these values in the gradle.properties file like this: org.gradle.jvmargs=-Xms256m -Xmx1024m You need to set the desired amount and a unit, which can be k for kilobytes, m for megabytes, and g for gigabytes. By default, the maximum memory allocation (Xmx) is set to 256 MB, and the starting memory allocation (Xms) is not set at all. The optimal settings depend on the capabilities of your computer. The last property you can configure to influence build speed is org.gradle.configureondemand. This property is particularly useful if you have complex projects with several modules, as it tries to limit the time spent in the configuration phase, by skipping modules that are not required for the task that is being executed. If you set this property to true, Gradle will try to figure out which modules have configuration changes and which ones do not, before it runs the configuration phase. This is a feature that will not be very useful if you only have an Android app and a library in your project. If you have a lot of modules that are loosely coupled, though, this feature can save you a lot of build time. System-wide Gradle properties If you want to apply these properties system-wide to all your Gradle-based projects, you can create a gradle.properties file in the .gradle folder in your home directory. On Microsoft Windows, the full path to this directory is %UserProfile%.gradle, on Linux and Mac OS X it is ~/.gradle. 
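Putting the properties discussed so far together, such a gradle.properties file could look like the following (a sketch; the memory values in particular should be tuned to your machine and project):

org.gradle.parallel=true
org.gradle.daemon=true
org.gradle.jvmargs=-Xms256m -Xmx1024m
org.gradle.configureondemand=true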
It is a good practice to set these properties in your home directory, rather than at the project level. The reason for this is that you usually want to keep memory consumption down on build servers, where the build time is of less importance.

Android Studio

The Gradle properties you can change to speed up the compilation process are also configurable in the Android Studio settings. To find the compiler settings, open the Settings dialog, and then navigate to Build, Execution, Deployment | Compiler. On that screen, you can find settings for parallel builds, JVM options, configure on demand, and so on. These settings only show up for Gradle-based Android modules. Configuring these settings from Android Studio is easier than configuring them manually in the build configuration file, and the settings dialog makes it easy to find properties that influence the build process.

Profiling

If you want to find out which parts of the build are slowing the process down, you can profile the entire build process. You can do this by adding the --profile flag whenever you execute a Gradle task. When you provide this flag, Gradle creates a profiling report, which can tell you which parts of the build process are the most time consuming. Once you know where the bottlenecks are, you can make the necessary changes. The report is saved as an HTML file in your module in build/reports/profile.

For a report generated after executing the build task on a multimodule project, the profiling report shows an overview of the time spent in each phase while executing the task. Below that summary is an overview of how much time Gradle spent on the configuration phase for each module. There are two more sections in the report: the Dependency Resolution section shows how long it took to resolve dependencies, per module, and the Task Execution section contains an extremely detailed task execution overview. This overview has the timing for every single task, ordered by execution time from high to low.

Jack and Jill

If you are willing to use experimental tools, you can enable Jack and Jill to speed up builds. Jack (Java Android Compiler Kit) is a new Android build toolchain that compiles Java source code directly to the Android Dalvik executable (dex) format. It has its own .jack library format and takes care of packaging and shrinking as well. Jill (Jack Intermediate Library Linker) is a tool that can convert .aar and .jar files to .jack libraries. These tools are still quite experimental, but they were made to improve build times and to simplify the Android build process. It is not recommended to start using Jack and Jill for production versions of your projects, but they are made available so that you can try them out. To be able to use Jack and Jill, you need to use build tools version 21.1.1 or higher, and the Android plugin for Gradle version 1.0.0 or higher. Enabling Jack and Jill is as easy as setting one property in the defaultConfig block:

android {
    buildToolsVersion '22.0.1'
    defaultConfig {
        useJack = true
    }
}

You can also enable Jack and Jill on a certain build type or product flavor.
This way, you can continue using the regular build toolchain and have an experimental build on the side:

android {
    productFlavors {
        regular {
            useJack = false
        }
        experimental {
            useJack = true
        }
    }
}

As soon as you set useJack to true, minification and obfuscation will not go through ProGuard anymore, but you can still use the ProGuard rules syntax to specify certain rules and exceptions. Use the same proguardFiles method that we mentioned before, when talking about ProGuard.

Summary

This article showed different ways to speed up builds: we first saw how to tweak the settings that configure Gradle and the JVM, then how to detect the parts that are slowing the process down, and finally we learned about the Jack and Jill toolchain.

Resources for Article:

Further resources on this subject:
Android Tech Page
Android Virtual Device Manager
Apache Maven and m2eclipse
Saying Hello to Unity and Android

Understanding mutability and immutability in Python, C#, and JavaScript

Packt
14 Jul 2015
15 min read
In this article by Gastón C. Hillar, author of the book, Learning Object-Oriented Programming, you will learn the concept of mutability and immutability in programming languages such as Python, C#, and JavaScript.

What's the difference?

By default, any instance field or attribute works like a variable; therefore, we can change its value. When we create an instance of a class that defines many public instance fields, we are creating a mutable object, that is, an object that can change its state. For example, let's think about a class named MutableVector3D that represents a mutable 3D vector with three public instance fields: X, Y, and Z. We can create a new MutableVector3D instance and initialize the X, Y, and Z attributes. Then, we can call the Sum method with the delta values for X, Y, and Z as arguments. The delta values specify the difference between the existing value and the new or desired value. So, for example, if we specify a positive value of 20 in the deltaX parameter, it means that we want to add 20 to the X value. The following lines show pseudocode in a neutral programming language that creates a new MutableVector3D instance called myMutableVector, initialized with values for the X, Y, and Z fields. Then, the code calls the Sum method with the delta values for X, Y, and Z as arguments, as shown in the following code:

myMutableVector = new MutableVector3D instance with X = 30, Y = 50 and Z = 70
myMutableVector.Sum(deltaX: 20, deltaY: 30, deltaZ: 15)

The initial values for the myMutableVector field are 30 for X, 50 for Y, and 70 for Z. The Sum method changes the values of all the three fields; therefore, the object state mutates as follows:

myMutableVector.X mutates from 30 to 30 + 20 = 50
myMutableVector.Y mutates from 50 to 50 + 30 = 80
myMutableVector.Z mutates from 70 to 70 + 15 = 85

The values for the myMutableVector field after the call to the Sum method are 50 for X, 80 for Y, and 85 for Z. We can say this method mutated the object's state; therefore, myMutableVector is a mutable object: an instance of a mutable class.

Mutability is very important in object-oriented programming. In fact, whenever we expose fields and/or properties, we will create a class that will generate mutable instances. However, sometimes a mutable object can become a problem. In certain situations, we want to prevent objects from changing their state. For example, when we work with concurrent code, an object that cannot change its state solves many concurrency problems and avoids potential bugs. For example, we can create an immutable version of the previous MutableVector3D class to represent an immutable 3D vector. The new ImmutableVector3D class has three read-only properties: X, Y, and Z. Thus, there are only three getter methods without setter methods, and we cannot change the values of the underlying internal fields: m_X, m_Y, and m_Z. We can create a new ImmutableVector3D instance and initialize the underlying internal fields (m_X, m_Y, and m_Z) through the X, Y, and Z attributes. Then, we can call the Sum method with the delta values for X, Y, and Z as arguments. The following lines show the pseudocode in a neutral programming language that creates a new ImmutableVector3D instance called myImmutableVector, which is initialized with values for X, Y, and Z as arguments.
Then, the pseudocode calls the Sum method with the delta values for X, Y, and Z as arguments: myImmutableVector = new ImmutableVector3D instance with X = 30, Y = 50 and Z = 70 myImmutableSumVector = myImmutableVector.Sum(deltaX: 20, deltaY: 30, deltaZ: 15) However, this time the Sum method returns a new instance of the ImmutableVector3D class with the X, Y, and Z values initialized to the sum of X, Y, and Z and the delta values for X, Y, and Z. So, myImmutableSumVector is a new ImmutableVector3D instance initialized with X = 50, Y = 80, and Z = 85. The call to the Sum method generated a new instance and didn't mutate the existing object. The immutable version adds an overhead as compared with the mutable version because it's necessary to create a new instance of a class as a result of calling the Sum method. The mutable version just changed the values for the attributes and it wasn't necessary to generate a new instance. Obviously, the immutable version has a memory and a performance overhead. However, when we work with the concurrent code, it makes sense to pay for the extra overhead to avoid potential issues caused by mutable objects. Using methods to add behaviors to classes in Python So far, we have added instance methods to classes and used getter and setter methods combined with decorators to define properties. Now, we want to generate a class to represent the mutable version of a 3D vector. We will use properties with simple getter and setter methods for x, y, and z. The sum public instance method receives the delta values for x, y, and z and mutates an object, that is, the setter method changes the values of x, y, and z. Here is the initial code of the MutableVector3D class: class MutableVector3D: def __init__(self, x, y, z): self.__x = x self.__y = y self.__z = z def sum(self, delta_x, delta_y, delta_z): self.__x += delta_x self.__y += delta_y self.__z += delta_z @property def x(self): return self.__x @x.setter def x(self, x): self.__x = x @property def y(self): return self.__y @y.setter def y(self, y): self.__y = y @property def z(self): return self.__z @z.setter def z(self, z): self.__z = z It's a very common requirement to generate a 3D vector with all the values initialized to 0, that is, x = 0, y = 0, and z = 0. A 3D vector with these values is known as an origin vector. We can add a class method to the MutableVector3D class named origin_vector to generate a new instance of the class initialized with all the values initialized to 0. It's necessary to add the @classmethod decorator before the class method header. Instead of receiving self as the first argument, a class method receives the current class; the parameter name is usually named cls. The following code defines the origin_vector class method: @classmethod def origin_vector(cls): return cls(0, 0, 0) The preceding method returns a new instance of the current class (cls) with 0 as the initial value for the three elements. The class method receives cls as the only argument; therefore, it will be a parameterless method when we call it because Python includes a class as a parameter under the hood. The following command calls the origin_vector class method to generate a 3D vector, calls the sum method for the generated instance, and prints the values for the three elements: mutableVector3D = MutableVector3D.origin_vector() mutableVector3D.sum(5, 10, 15) print(mutableVector3D.x, mutableVector3D.y, mutableVector3D.z) Now, we want to generate a class to represent the immutable version of a 3D vector. 
In this case, we will use read-only properties for x, y, and z. The sum public instance method receives the delta values for x, y, and z and returns a new instance of the same class with the values of x, y, and z initialized with the results of the sum. Here is the code of the ImmutableVector3D class: class ImmutableVector3D: def __init__(self, x, y, z): self.__x = x self.__y = y self.__z = z def sum(self, delta_x, delta_y, delta_z): return type(self)(self.__x + delta_x, self.__y + delta_y, self.__z + delta_z) @property def x(self): return self.__x @property def y(self): return self.__y @property def z(self): return self.__z @classmethod def equal_elements_vector(cls, initial_value): return cls(initial_value, initial_value, initial_value) @classmethod def origin_vector(cls): return cls.equal_elements_vector(0) Note that the sum method uses type(self) to generate and return a new instance of the current class. In this case, the origin_vector class method returns the results of calling the equal_elements_vector class method with 0 as an argument. Remember that the cls argument refers to the actual class. The equal_elements_vector class method receives an initial_value argument for all the elements of the 3D vector, creates an instance of the actual class, and initializes all the elements with the received unique value. The origin_vector class method demonstrates how we can call another class method in a class method. The following command calls the origin_vector class method to generate a 3D vector, calls the sum method for the generated instance, and prints the values for the three elements of the new instance returned by the sum method: vector0 = ImmutableVector3D.origin_vector() vector1 = vector0.sum(5, 10, 15) print(vector1.x, vector1.y, vector1.z) As explained previously, we can change the values of the private attributes; therefore, the ImmutableVector3D class isn't 100 percent immutable. However, we are all adults and don't expect the users of a class with read-only properties to change the values of private attributes hidden under difficult to access names. Using methods to add behaviors to classes in C# So far, we have added instance methods to a class in C#, and used getter and setter methods to define properties. Now, we want to generate a class to represent the mutable version of a 3D vector in C#. We will use auto-implemented properties for X, Y, and Z. The public Sum instance method receives the delta values for X, Y, and Z (deltaX, deltaY, and deltaZ) and mutates the object, that is, the method changes the values of X, Y, and Z. The following shows the initial code of the MutableVector3D class: class MutableVector3D { public double X { get; set; } public double Y { get; set; } public double Z { get; set; } public void Sum(double deltaX, double deltaY, double deltaZ) { this.X += deltaX; this.Y += deltaY; this.Z += deltaZ; } public MutableVector3D(double x, double y, double z) { this.X = x; this.Y = y; this.Z = z; } } It's a very common requirement to generate a 3D vector with all the values initialized to 0, that is, X = 0, Y = 0, and Z = 0. A 3D vector with these values is known as an origin vector. We can add a class method to the MutableVector3D class named OriginVector to generate a new instance of the class initialized with all the values initialized to 0. Class methods are also known as static methods in C#. It's necessary to add the static keyword after the public access modifier before the class method name. 
The following commands define the OriginVector static method: public static MutableVector3D OriginVector() { return new MutableVector3D(0, 0, 0); } The preceding method returns a new instance of the MutableVector3D class with 0 as the initial value for all the three elements. The following code calls the OriginVector static method to generate a 3D vector, calls the Sum method for the generated instance, and prints the values for all the three elements on the console output: var mutableVector3D = MutableVector3D.OriginVector(); mutableVector3D.Sum(5, 10, 15); Console.WriteLine(mutableVector3D.X, mutableVector3D.Y, mutableVector3D.Z) Now, we want to generate a class to represent the immutable version of a 3D vector. In this case, we will use read-only properties for X, Y, and Z. We will use auto-generated properties with private set. The Sum public instance method receives the delta values for X, Y, and Z (deltaX, deltaY, and deltaZ) and returns a new instance of the same class with the values of X, Y, and Z initialized with the results of the sum. The code for the ImmutableVector3D class is as follows: class ImmutableVector3D { public double X { get; private set; } public double Y { get; private set; } public double Z { get; private set; } public ImmutableVector3D Sum(double deltaX, double deltaY, double deltaZ) { return new ImmutableVector3D ( this.X + deltaX, this.Y + deltaY, this.Z + deltaZ); } public ImmutableVector3D(double x, double y, double z) { this.X = x; this.Y = y; this.Z = z; } public static ImmutableVector3D EqualElementsVector(double initialValue) { return new ImmutableVector3D(initialValue, initialValue, initialValue); } public static ImmutableVector3D OriginVector() { return ImmutableVector3D.EqualElementsVector(0); } } In the new class, the Sum method returns a new instance of the ImmutableVector3D class, that is, the current class. In this case, the OriginVector static method returns the results of calling the EqualElementsVector static method with 0 as an argument. The EqualElementsVector class method receives an initialValue argument for all the elements of the 3D vector, creates an instance of the actual class, and initializes all the elements with the received unique value. The OriginVector static method demonstrates how we can call another static method in a static method. The following code calls the OriginVector static method to generate a 3D vector, calls the Sum method for the generated instance, and prints all the values for the three elements of the new instance returned by the Sum method on the console output: var vector0 = ImmutableVector3D.OriginVector(); var vector1 = vector0.Sum(5, 10, 15); Console.WriteLine(vector1.X, vector1.Y, vector1.Z); C# doesn't allow users of the ImmutableVector3D class to change the values of X, Y, and Z properties. The code doesn't compile if you try to assign a new value to any of these properties. Thus, we can say that the ImmutableVector3D class is 100 percent immutable. Using methods to add behaviors to constructor functions in JavaScript So far, we have added methods to a constructor function that produced instance methods in a generated object. In addition, we used getter and setter methods combined with local variables to define properties. Now, we want to generate a constructor function to represent the mutable version of a 3D vector. We will use properties with simple getter and setter methods for x, y, and z. 
The sum public instance method receives the delta values for x, y, and z and mutates an object, that is, the method changes the values of x, y, and z. The following code shows the initial code of the MutableVector3D constructor function: function MutableVector3D(x, y, z) { var _x = x; var _y = y; var _z = z; Object.defineProperty(this, 'x', { get: function(){ return _x; }, set: function(val){ _x = val; } }); Object.defineProperty(this, 'y', { get: function(){ return _y; }, set: function(val){ _y = val; } }); Object.defineProperty(this, 'z', { get: function(){ return _z; }, set: function(val){ _z = val; } }); this.sum = function(deltaX, deltaY, deltaZ) { _x += deltaX; _y += deltaY; _z += deltaZ; } } It's a very common requirement to generate a 3D vector with all the values initialized to 0, that is, x = 0, y = 0, and, z = 0. A 3D vector with these values is known as an origin vector. We can add a function to the MutableVector3D constructor function named originVector to generate a new instance of a class with all the values initialized to 0. The following code defines the originVector function: MutableVector3D.originVector = function() { return new MutableVector3D(0, 0, 0); }; The method returns a new instance built in the MutableVector3D constructor function with 0 as the initial value for all the three elements. The following code calls the originVector function to generate a 3D vector, calls the sum method for the generated instance, and prints all the values for all the three elements: var mutableVector3D = MutableVector3D.originVector(); mutableVector3D.sum(5, 10, 15); console.log(mutableVector3D.x, mutableVector3D.y, mutableVector3D.z); Now, we want to generate a constructor function to represent the immutable version of a 3D vector. In this case, we will use read-only properties for x, y, and z. In this case, we will use the ImmutableVector3D.prototype property to define the sum method. The method receives the values of delta for x, y, and z, and returns a new instance with the values of x, y, and z initialized with the results of the sum. The following code shows the ImmutableVector3D constructor function and the additional code that defines all the other methods: function ImmutableVector3D(x, y, z) { var _x = x; var _y = y; var _z = z; Object.defineProperty(this, 'x', { get: function(){ return _x; } }); Object.defineProperty(this, 'y', { get: function(){ return _y; } }); Object.defineProperty(this, 'z', { get: function(){ return _z; } }); } ImmutableVector3D.prototype.sum = function(deltaX, deltaY, deltaZ) { return new ImmutableVector3D( this.x + deltaX, this.y + deltaY, this.z + deltaZ); }; ImmutableVector3D.equalElementsVector = function(initialValue) { return new ImmutableVector3D(initialValue, initialValue, initialValue); }; ImmutableVector3D.originVector = function() { return ImmutableVector3D.equalElementsVector(0); }; Again, note that the preceding code defines the sum method in the ImmutableVector3D.prototype method. This method will be available to all the instances generated in the ImmutableVector3D constructor function. The sum method generates and returns a new instance of ImmutableVector3D. In this case, the originVector method returns the results of calling the equalElementsVector method with 0 as an argument. The equalElementsVector method receives an initialValue argument for all the elements of the 3D vector, creates an instance of the actual class, and initializes all the elements with the received unique value. 
The originVector method demonstrates how we can call another function defined in the constructor function. The following code calls the originVector method to generate a 3D vector, calls the sum method for the generated instance, and prints the values for all the three elements of the new instance returned by the sum method: var vector0 = ImmutableVector3D.originVector(); var vector1 = vector0.sum(5, 10, 15); console.log(vector1.x, vector1.y, vector1.z); Summary In this article, you learned the concept of mutability and immutability in the programming languages such as, Python, C#, and JavaScript and used methods to add behaviors in each of the programming language. So, what’s next? Continue Learning Object-Oriented Programming with Gastón’s book – find out more here.

REST APIs for social network data using py2neo

Packt
14 Jul 2015
20 min read
In this article written by Sumit Gupta, author of the book Building Web Applications with Python and Neo4j, we will discuss and develop RESTful APIs for performing CRUD and search operations over our social network data, using the Flask-RESTful extension and the py2neo extension—Object-Graph Mapping (OGM). Let's move forward to first quickly talk about the OGM and then develop full-fledged REST APIs over our social network data. (For more resources related to this topic, see here.)

ORM for graph databases

py2neo – OGM

We discussed py2neo in Chapter 4, Getting Python and Neo4j to Talk. In this section, we will talk about one of the py2neo extensions that provides high-level APIs for dealing with the underlying graph database as objects and their relationships. Object-Graph Mapping (http://py2neo.org/2.0/ext/ogm.html) is one of the popular extensions of py2neo and provides the mapping of Neo4j graphs in the form of objects and relationships. It provides functionality and features similar to the Object Relational Model (ORM) available for relational databases. py2neo.ext.ogm.Store(graph) is the base class which exposes all operations with respect to graph data models. Following are the important methods of Store which we will be using in the upcoming section for mutating our social network data:

Store.delete(subj): It deletes a node from the underlying graph along with its associated relationships. subj is the entity that needs to be deleted. It raises an exception in case the provided entity is not linked to the server.
Store.load(cls, node): It loads the data from the database node into cls, which is the entity defined by the data model.
Store.load_related(subj, rel_type, cls): It loads all the nodes related to subj by the relationship defined by rel_type into cls and then returns the cls object.
Store.load_indexed(index_name, key, value, cls): It queries the legacy index, loads all the nodes that are mapped by the key-value pair, and returns the associated object.
Store.relate(subj, rel_type, obj, properties=None): It defines the relationship between two nodes, where subj and obj are two nodes connected by rel_type. By default, all relationships point towards the right node.
Store.save(subj, node=None): It saves and creates a given entity/node (subj) in the graph database. The second argument is of type Node, which if given will not create a new node and will change the already existing node.
Store.save_indexed(index_name, key, value, subj): It saves the given entity into the graph and also creates an entry into the given index for future reference.

Refer to http://py2neo.org/2.0/ext/ogm.html#py2neo.ext.ogm.Store for the complete list of methods exposed by the Store class. Let's move on to the next section where we will use the OGM for mutating our social network data model. OGM supports Neo4j version 1.9, so features of Neo4j 2.0 and above, such as labels, are not supported.

Social network application with Flask-RESTful and OGM

In this section, we will develop a full-fledged application for mutating our social network data and will also talk about the basics of Flask-RESTful and OGM.

Creating object model

Perform the following steps to create the object model and CRUD/search functions for our social network data: Our social network data contains two kinds of entities—Person and Movies.
So as a first step let's create a package model and within the model package let's define a module SocialDataModel.py with two classes—Person and Movie: class Person(object):    def __init__(self, name=None,surname=None,age=None,country=None):        self.name=name        self.surname=surname        self.age=age        self.country=country   class Movie(object):    def __init__(self, movieName=None):        self.movieName=movieName Next, let's define another package operations and two python modules ExecuteCRUDOperations.py and ExecuteSearchOperations.py. The ExecuteCRUDOperations module will contain the following three classes: DeleteNodesRelationships: It will contain one method each for deleting People nodes and Movie nodes and in the __init__ method, we will establish the connection to the graph database. class DeleteNodesRelationships(object):    '''    Define the Delete Operation on Nodes    '''    def __init__(self,host,port,username,password):        #Authenticate and Connect to the Neo4j Graph Database        py2neo.authenticate(host+':'+port, username, password)        graph = Graph('http://'+host+':'+port+'/db/data/')        store = Store(graph)        #Store the reference of Graph and Store.        self.graph=graph        self.store=store      def deletePersonNode(self,node):        #Load the node from the Neo4j Legacy Index cls = self.store.load_indexed('personIndex', 'name', node.name, Person)          #Invoke delete method of store class        self.store.delete(cls[0])      def deleteMovieNode(self,node):        #Load the node from the Neo4j Legacy Index cls = self.store.load_indexed('movieIndex',   'name',node.movieName, Movie)        #Invoke delete method of store class            self.store.delete(cls[0]) Deleting nodes will also delete the associated relationships, so there is no need to have functions for deleting relationships. Nodes without any relationship do not make much sense for many business use cases, especially in a social network, unless there is a specific need or an exceptional scenario. UpdateNodesRelationships: It will contain one method each for updating People nodes and Movie nodes and, in the __init__ method, we will establish the connection to the graph database. 
class UpdateNodesRelationships(object):    '''      Define the Update Operation on Nodes    '''      def __init__(self,host,port,username,password):        #Write code for connecting to server      def updatePersonNode(self,oldNode,newNode):        #Get the old node from the Index        cls = self.store.load_indexed('personIndex', 'name', oldNode.name, Person)        #Copy the new values to the Old Node        cls[0].name=newNode.name        cls[0].surname=newNode.surname        cls[0].age=newNode.age        cls[0].country=newNode.country        #Delete the Old Node form Index        self.store.delete(cls[0])       #Persist the updated values again in the Index        self.store.save_unique('personIndex', 'name', newNode.name, cls[0])      def updateMovieNode(self,oldNode,newNode):          #Get the old node from the Index        cls = self.store.load_indexed('movieIndex', 'name', oldNode.movieName, Movie)        #Copy the new values to the Old Node        cls[0].movieName=newNode.movieName        #Delete the Old Node form Index        self.store.delete(cls[0])        #Persist the updated values again in the Index        self.store.save_ unique('personIndex', 'name', newNode.name, cls[0]) CreateNodesRelationships: This class will contain methods for creating People and Movies nodes and relationships and will then further persist them to the database. As with the other classes/ module, it will establish the connection to the graph database in the __init__ method: class CreateNodesRelationships(object):    '''    Define the Create Operation on Nodes    '''    def __init__(self,host,port,username,password):        #Write code for connecting to server    '''    Create a person and store it in the Person Dictionary.    Node is not saved unless save() method is invoked. Helpful in bulk creation    '''    def createPerson(self,name,surName=None,age=None,country=None):        person = Person(name,surName,age,country)        return person      '''    Create a movie and store it in the Movie Dictionary.    Node is not saved unless save() method is invoked. Helpful in bulk creation    '''    def createMovie(self,movieName):        movie = Movie(movieName)        return movie      '''    Create a relationships between 2 nodes and invoke a local method of Store class.    Relationship is not saved unless Node is saved or save() method is invoked.    '''    def createFriendRelationship(self,startPerson,endPerson):        self.store.relate(startPerson, 'FRIEND', endPerson)      '''    Create a TEACHES relationships between 2 nodes and invoke a local method of Store class.    Relationship is not saved unless Node is saved or save() method is invoked.    '''    def createTeachesRelationship(self,startPerson,endPerson):        self.store.relate(startPerson, 'TEACHES', endPerson)    '''    Create a HAS_RATED relationships between 2 nodes and invoke a local method of Store class.    Relationship is not saved unless Node is saved or save() method is invoked.    '''    def createHasRatedRelationship(self,startPerson,movie,ratings):      self.store.relate(startPerson, 'HAS_RATED', movie,{'ratings':ratings})    '''    Based on type of Entity Save it into the Server/ database    '''    def save(self,entity,node):        if(entity=='person'):            self.store.save_unique('personIndex', 'name', node.name, node)        else:            self.store.save_unique('movieIndex','name',node.movieName,node) Next we will define other Python module operations, ExecuteSearchOperations.py. 
This module will define two classes, each containing one method for searching Person and Movie node and of-course the __init__ method for establishing a connection with the server: class SearchPerson(object):    '''    Class for Searching and retrieving the the People Node from server    '''      def __init__(self,host,port,username,password):        #Write code for connecting to server      def searchPerson(self,personName):        cls = self.store.load_indexed('personIndex', 'name', personName, Person)        return cls;   class SearchMovie(object):    '''    Class for Searching and retrieving the the Movie Node from server    '''    def __init__(self,host,port,username,password):        #Write code for connecting to server      def searchMovie(self,movieName):        cls = self.store.load_indexed('movieIndex', 'name', movieName, Movie)        return cls; We are done with our data model and the utility classes that will perform the CRUD and search operation over our social network data using py2neo OGM. Now let's move on to the next section and develop some REST services over our data model. Creating REST APIs over data models In this section, we will create and expose REST services for mutating and searching our social network data using the data model created in the previous section. In our social network data model, there will be operations on either the Person or Movie nodes, and there will be one more operation which will define the relationship between Person and Person or Person and Movie. So let's create another package service and define another module MutateSocialNetworkDataService.py. In this module, apart from regular imports from flask and flask_restful, we will also import classes from our custom packages created in the previous section and create objects of model classes for performing CRUD and search operations. Next we will define the different classes or services which will define the structure of our REST Services. The PersonService class will define the GET, POST, PUT, and DELETE operations for searching, creating, updating, and deleting the Person nodes. 
class PersonService(Resource):    '''    Defines operations with respect to Entity - Person    '''    #example - GET http://localhost:5000/person/Bradley    def get(self, name):        node = searchPerson.searchPerson(name)        #Convert into JSON and return it back        return jsonify(name=node[0].name,surName=node[0].surname,age=node[0].age,country=node[0].country)      #POST http://localhost:5000/person    #{"name": "Bradley","surname": "Green","age": "24","country": "US"}    def post(self):          jsonData = request.get_json(cache=False)        attr={}        for key in jsonData:            attr[key]=jsonData[key]            print(key,' = ',jsonData[key] )        person = createOperation.createPerson(attr['name'],attr['surname'],attr['age'],attr['country'])        createOperation.save('person',person)          return jsonify(result='success')    #POST http://localhost:5000/person/Bradley    #{"name": "Bradley1","surname": "Green","age": "24","country": "US"}    def put(self,name):        oldNode = searchPerson.searchPerson(name)        jsonData = request.get_json(cache=False)        attr={}        for key in jsonData:            attr[key] = jsonData[key]            print(key,' = ',jsonData[key] )        newNode = Person(attr['name'],attr['surname'],attr['age'],attr['country'])          updateOperation.updatePersonNode(oldNode[0],newNode)          return jsonify(result='success')      #DELETE http://localhost:5000/person/Bradley1    def delete(self,name):        node = searchPerson.searchPerson(name)        deleteOperation.deletePersonNode(node[0])        return jsonify(result='success') The MovieService class will define the GET, POST, and DELETE operations for searching, creating, and deleting the Movie nodes. This service will not support the modification of Movie nodes because, once the Movie node is defined, it does not change in our data model. Movie service is similar to our Person service and leverages our data model for performing various operations. The RelationshipService class only defines POST which will create the relationship between the person and other given entity and can either be another Person or Movie. Following is the structure of the POST method: '''    Assuming that the given nodes are already created this operation    will associate Person Node either with another Person or Movie Node.      
Request for Defining relationship between 2 persons: -        POST http://localhost:5000/relationship/person/Bradley        {"entity_type":"person","person.name":"Matthew","relationship": "FRIEND"}    Request for Defining relationship between Person and Movie        POST http://localhost:5000/relationship/person/Bradley        {"entity_type":"Movie","movie.movieName":"Avengers","relationship": "HAS_RATED"          "relationship.ratings":"4"}    '''    def post(self, entity,name):        jsonData = request.get_json(cache=False)        attr={}        for key in jsonData:            attr[key]=jsonData[key]            print(key,' = ',jsonData[key] )          if(entity == 'person'):            startNode = searchPerson.searchPerson(name)            if(attr['entity_type']=='movie'):                endNode = searchMovie.searchMovie(attr['movie.movieName'])                createOperation.createHasRatedRelationship(startNode[0], endNode[0], attr['relationship.ratings'])                createOperation.save('person', startNode[0])            elif (attr['entity_type']=='person' and attr['relationship']=='FRIEND'):                endNode = searchPerson.searchPerson(attr['person.name'])                createOperation.createFriendRelationship(startNode[0], endNode[0])                createOperation.save('person', startNode[0])            elif (attr['entity_type']=='person' and attr['relationship']=='TEACHES'):                endNode = searchPerson.searchPerson(attr['person.name'])                createOperation.createTeachesRelationship(startNode[0], endNode[0])                createOperation.save('person', startNode[0])        else:            raise HTTPException("Value is not Valid")          return jsonify(result='success') At the end, we will define our __main__ method, which will bind our services with the specific URLs and bring up our application: if __name__ == '__main__':    api.add_resource(PersonService,'/person','/person/<string:name>')    api.add_resource(MovieService,'/movie','/movie/<string:movieName>')    api.add_resource(RelationshipService,'/relationship','/relationship/<string:entity>/<string:name>')    webapp.run(debug=True) And we are done!!! Execute our MutateSocialNetworkDataService.py as a regular Python module and your REST-based services are up and running. Users of this app can use any REST-based clients such as SOAP-UI and can execute the various REST services for performing CRUD and search operations. Follow the comments provided in the code samples for the format of the request/response. In this section, we created and exposed REST-based services using Flask, Flask-RESTful, and OGM and performed CRUD and search operations over our social network data model. Using Neomodel in a Django app In this section, we will talk about the integration of Django and Neomodel. Django is a Python-based, powerful, robust, and scalable web-based application development framework. It is developed upon the Model-View-Controller (MVC) design pattern where developers can design and develop a scalable enterprise-grade application within no time. We will not go into the details of Django as a web-based framework but will assume that the readers have a basic understanding of Django and some hands-on experience in developing web-based and database-driven applications. Visit https://docs.djangoproject.com/en/1.7/ if you do not have any prior knowledge of Django. 
Django provides various signals or triggers that are activated and used to invoke or execute some user-defined functions on a particular event. The framework invokes various signals or triggers if there are any modifications requested to the underlying application data model such as pre_save(), post_save(), pre_delete, post_delete, and a few more. All the functions starting with pre_ are executed before the requested modifications are applied to the data model, and functions starting with post_ are triggered after the modifications are applied to the data model. And that's where we will hook our Neomodel framework, where we will capture these events and invoke our custom methods to make similar changes to our Neo4j database. We can reuse our social data model and the functions defined in ExploreSocialDataModel.CreateDataModel. We only need to register our event and things will be automatically handled by the Django framework. For example, you can register for the event in your Django model (models.py) by defining the following statement: signals.pre_save.connect(preSave, sender=Male) In the previous statement, preSave is the custom or user-defined method, declared in models.py. It will be invoked before any changes are committed to entity Male, which is controlled by the Django framework and is different from our Neomodel entity. Next, in preSave you need to define the invocations to the Neomodel entities and save them. Refer to the documentation at https://docs.djangoproject.com/en/1.7/topics/signals/ for more information on implementing signals in Django. Signals in Neomodel Neomodel also provides signals that are similar to Django signals and have the same behavior. Neomodel provides the following signals: pre_save, post_save, pre_delete, post_delete, and post_create. Neomodel exposes the following two different approaches for implementing signals: Define the pre..() and post..() methods in your model itself and Neomodel will automatically invoke it. For example, in our social data model, we can define def pre_save(self) in our Model.Male class to receive all events before entities are persisted in the database or server. Another approach is using Django-style signals, where we can define the connect() method in our Neomodel Model.py and it will produce the same results as in Django-based models: signals.pre_save.connect(preSave, sender=Male) Refer to http://neomodel.readthedocs.org/en/latest/hooks.html for more information on signals in Neomodel. In this section, we discussed about the integration of Django with Neomodel using Django signals. We also talked about the signals provided by Neomodel and their implementation approach. Summary Here we learned about creating web-based applications using Flask. We also used Flasks extensions such as Flask-RESTful for creating/exposing REST APIs for data manipulation. Finally, we created a full blown REST-based application over our social network data using Flask, Flask-RESTful, and py2neo OGM. We also learned about Neomodel and its various features and APIs provided to work with Neo4j. We also discussed about the integration of Neomodel with the Django framework. Resources for Article: Further resources on this subject: Firebase [article] Developing Location-based Services with Neo4j [article] Learning BeagleBone Python Programming [article]

Fine-tune the NGINX Configuration

Packt
14 Jul 2015
20 min read
In this article by Rahul Sharma, author of the book NGINX High Performance, we will cover the following topics: NGINX configuration syntax Configuring NGINX workers Configuring NGINX I/O Configuring TCP Setting up the server (For more resources related to this topic, see here.) NGINX configuration syntax This section aims to cover it in good detail. The complete configuration file has a logical structure that is composed of directives grouped into a number of sections. A section defines the configuration for a particular NGINX module, for example, the http section defines the configuration for the ngx_http_core module. An NGINX configuration has the following syntax: Valid directives begin with a variable name and then state an argument or series of arguments separated by spaces. All valid directives end with a semicolon (;). Sections are defined with curly braces ({}). Sections can be nested in one another. The nested section defines a module valid under the particular section, for example, the gzip section under the http section. Configuration outside any section is part of the NGINX global configuration. The lines starting with the hash (#) sign are comments. Configurations can be split into multiple files, which can be grouped using the include directive. This helps in organizing code into logical components. Inclusions are processed recursively, that is, an include file can further have include statements. Spaces, tabs, and new line characters are not part of the NGINX configuration. They are not interpreted by the NGINX engine, but they help to make the configuration more readable. Thus, the complete file looks like the following code: #The configuration begins here global1 value1; #This defines a new section section { sectionvar1 value1; include file1;    subsection {    subsectionvar1 value1; } } #The section ends here global2 value2; # The configuration ends here NGINX provides the -t option, which can be used to test and verify the configuration written in the file. If the file or any of the included files contains any errors, it prints the line numbers causing the issue: $ sudo nginx -t This checks the validity of the default configuration file. If the configuration is written in a file other than the default one, use the -c option to test it. You cannot test half-baked configurations, for example, you defined a server section for your domain in a separate file. Any attempt to test such a file will throw errors. The file has to be complete in all respects. Now that we have a clear idea of the NGINX configuration syntax, we will try to play around with the default configuration. This article only aims to discuss the parts of the configuration that have an impact on performance. The NGINX catalog has large number of modules that can be configured for some purposes. This article does not try to cover all of them as the details are beyond the scope of the book. Please refer to the NGINX documentation at http://nginx.org/en/docs/ to know more about the modules. Configuring NGINX workers NGINX runs a fixed number of worker processes as per the specified configuration. In the following sections, we will work with NGINX worker parameters. These parameters are mostly part of the NGINX global context. worker_processes The worker_processes directive controls the number of workers: worker_processes 1; The default value for this is 1, that is, NGINX runs only one worker. 
The value should be changed to an optimal value depending on the number of cores available, disks, network subsystem, server load, and so on. As a starting point, set the value to the number of cores available. Determine the number of cores available using lscpu: $ lscpu Architecture:     x86_64 CPU op-mode(s):   32-bit, 64-bit Byte Order:     Little Endian CPU(s):       4 The same can be accomplished by greping out cpuinfo: $ cat /proc/cpuinfo | grep 'processor' | wc -l Now, set this value to the parameter: # One worker per CPU-core. worker_processes 4; Alternatively, the directive can have auto as its value. This determines the number of cores and spawns an equal number of workers. When NGINX is running with SSL, it is a good idea to have multiple workers. SSL handshake is blocking in nature and involves disk I/O. Thus, using multiple workers leads to improved performance. accept_mutex Since we have configured multiple workers in NGINX, we should also configure the flags that impact worker selection. The accept_mutex parameter available under the events section will enable each of the available workers to accept new connections one by one. By default, the flag is set to on. The following code shows this: events { accept_mutex on; } If the flag is turned to off, all of the available workers will wake up from the waiting state, but only one worker will process the connection. This results in the Thundering Herd phenomenon, which is repeated a number of times per second. The phenomenon causes reduced server performance as all the woken-up workers take up CPU time before going back to the wait state. This results in unproductive CPU cycles and nonutilized context switches. accept_mutex_delay When accept_mutex is enabled, only one worker, which has the mutex lock, accepts connections, while others wait for their turn. The accept_mutex_delay corresponds to the timeframe for which the worker would wait, and after which it tries to acquire the mutex lock and starts accepting new connections. The directive is available under the events section with a default value of 500 milliseconds. The following code shows this: events{ accept_mutex_delay 500ms; } worker_connections The next configuration to look at is worker_connections, with a default value of 512. The directive is present under the events section. The directive sets the maximum number of simultaneous connections that can be opened by a worker process. The following code shows this: events{    worker_connections 512; } Increase worker_connections to something like 1,024 to accept more simultaneous connections. The value of worker_connections does not directly translate into the number of clients that can be served simultaneously. Each browser opens a number of parallel connections to download various components that compose a web page, for example, images, scripts, and so on. Different browsers have different values for this, for example, IE works with two parallel connections while Chrome opens six connections. The number of connections also includes sockets opened with the upstream server, if any. worker_rlimit_nofile The number of simultaneous connections is limited by the number of file descriptors available on the system as each socket will open a file descriptor. If NGINX tries to open more sockets than the available file descriptors, it will lead to the Too many opened files message in the error.log. 
Check the number of file descriptors using ulimit: $ ulimit -n Now, increase this to a value more than worker_process * worker_connections. The value should be increased for the user that runs the worker process. Check the user directive to get the username. NGINX provides the worker_rlimit_nofile directive, which can be an alternative way of setting the available file descriptor rather modifying ulimit. Setting the directive will have a similar impact as updating ulimit for the worker user. The value of this directive overrides the ulimit value set for the user. The directive is not present by default. Set a large value to handle large simultaneous connections. The following code shows this: worker_rlimit_nofile 20960; To determine the OS limits imposed on a process, read the file /proc/$pid/limits. $pid corresponds to the PID of the process. multi_accept The multi_accept flag enables an NGINX worker to accept as many connections as possible when it gets the notification of a new connection. The purpose of this flag is to accept all connections in the listen queue at once. If the directive is disabled, a worker process will accept connections one by one. The following code shows this: events{    multi_accept on; } The directive is available under the events section with the default value off. If the server has a constant stream of incoming connections, enabling multi_accept may result in a worker accepting more connections than the number specified in worker_connections. The overflow will lead to performance loss as the previously accepted connections, part of the overflow, will not get processed. use NGINX provides several methods for connection processing. Each of the available methods allows NGINX workers to monitor multiple socket file descriptors, that is, when there is data available for reading/writing. These calls allow NGINX to process multiple socket streams without getting stuck in any one of them. The methods are platform-dependent, and the configure command, used to build NGINX, selects the most efficient method available on the platform. If we want to use other methods, they must be enabled first in NGINX. The use directive allows us to override the default method with the method specified. The directive is part of the events section: events { use select; } NGINX supports the following methods of processing connections: select: This is the standard method of processing connections. It is built automatically on platforms that lack more efficient methods. The module can be enabled or disabled using the --with-select_module or --without-select_module configuration parameter. poll: This is the standard method of processing connections. It is built automatically on platforms that lack more efficient methods. The module can be enabled or disabled using the --with-poll_module or --without-poll_module configuration parameter. kqueue: This is an efficient method of processing connections available on FreeBSD 4.1, OpenBSD 2.9+, NetBSD 2.0, and OS X. There are the additional directives kqueue_changes and kqueue_events. These directives specify the number of changes and events that NGINX will pass to the kernel. The default value for both of these is 512. The kqueue method will ignore the multi_accept directive if it has been enabled. epoll: This is an efficient method of processing connections available on Linux 2.6+. The method is similar to the FreeBSD kqueue. There is also the additional directive epoll_events. This specifies the number of events that NGINX will pass to the kernel. 
The default value for this is 512. /dev/poll: This is an efficient method of processing connections available on Solaris 7 11/99+, HP/UX 11.22+, IRIX 6.5.15+, and Tru64 UNIX 5.1A+. This has the additional directives, devpoll_events and devpoll_changes. The directives specify the number of changes and events that NGINX will pass to the kernel. The default value for both of these is 32. eventport: This is an efficient method of processing connections available on Solaris 10. The method requires necessary security patches to avoid kernel crash issues. rtsig: Real-time signals is a connection processing method available on Linux 2.2+. The method has some limitations. On older kernels, there is a system-wide limit of 1,024 signals. For high loads, the limit needs to be increased by setting the rtsig-max parameter. For kernel 2.6+, instead of the system-wide limit, there is a limit on the number of outstanding signals for each process. NGINX provides the worker_rlimit_sigpending parameter to modify the limit for each of the worker processes: worker_rlimit_sigpending 512; The parameter is part of the NGINX global configuration. If the queue overflows, NGINX drains the queue and uses the poll method to process the unhandled events. When the condition is back to normal, NGINX switches back to the rtsig method of connection processing. NGINX provides the rtsig_overflow_events, rtsig_overflow_test, and rtsig_overflow_threshold parameters to control how a signal queue is handled on overflows. The rtsig_overflow_events parameter defines the number of events passed to poll. The rtsig_overflow_test parameter defines the number of events handled by poll, after which NGINX will drain the queue. Before draining the signal queue, NGINX will look up how much it is filled. If the factor is larger than the specified rtsig_overflow_threshold, it will drain the queue. The rtsig method requires accept_mutex to be set. The method also enables the multi_accept parameter. Configuring NGINX I/O NGINX can also take advantage of the Sendfile and direct I/O options available in the kernel. In the following sections, we will try to configure parameters available for disk I/O. Sendfile When a file is transferred by an application, the kernel first buffers the data and then sends the data to the application buffers. The application, in turn, sends the data to the destination. The Sendfile method is an improved method of data transfer, in which data is copied between file descriptors within the OS kernel space, that is, without transferring data to the application buffers. This results in improved utilization of the operating system's resources. The method can be enabled using the sendfile directive. The directive is available for the http, server, and location sections. http{ sendfile on; } The flag is set to off by default. Direct I/O The OS kernel usually tries to optimize and cache any read/write requests. Since the data is cached within the kernel, any subsequent read request to the same place will be much faster because there's no need to read the information from slow disks. Direct I/O is a feature of the filesystem where reads and writes go directly from the applications to the disk, thus bypassing all OS caches. This results in better utilization of CPU cycles and improved cache effectiveness. The method is used in places where the data has a poor hit ratio. Such data does not need to be in any cache and can be loaded when required. It can be used to serve large files. The directio directive enables the feature. 
Direct I/O

The OS kernel usually tries to optimize and cache any read/write requests. Since the data is cached within the kernel, any subsequent read request to the same place will be much faster because there is no need to read the information from slow disks. Direct I/O is a feature of the filesystem where reads and writes go directly from the application to the disk, thus bypassing all OS caches. This results in better utilization of CPU cycles and improved cache effectiveness. The method is used in places where the data has a poor hit ratio; such data does not need to be in any cache and can be loaded when required. It can be used to serve large files. The directio directive enables the feature and is available for the http, server, and location sections:

location /video/ {
   directio 4m;
}

Any file larger than the size specified in the directive will be loaded using direct I/O. The parameter is disabled by default. The use of direct I/O to serve a request automatically disables Sendfile for that particular request.

Direct I/O depends on the block size used while doing a data transfer. NGINX has the directio_alignment directive to set the block size. The directive is present under the http, server, and location sections:

location /video/ {
   directio 4m;
   directio_alignment 512;
}

The default value of 512 bytes works well in most cases, unless you are running a Linux implementation of XFS. In such a case, the size should be increased to 4 KB.

Asynchronous I/O

Asynchronous I/O allows a process to initiate I/O operations without having to block or wait for them to complete. The aio directive is available under the http, server, and location sections of an NGINX configuration. Depending on the section, the parameter will perform asynchronous I/O for the matching requests. The parameter works on Linux kernel 2.6.22+ and FreeBSD 4.3+. The following code shows this:

location /data {
   aio on;
}

By default, the parameter is set to off. On Linux, aio needs to be enabled together with directio, while on FreeBSD, sendfile needs to be disabled for aio to take effect. If NGINX has not been configured with the --with-file-aio option, any use of the aio directive will cause an "unknown directive aio" error.

The directive has a special value, threads, which enables multithreading for send and read operations. The multithreading support is only available on the Linux platform and can only be used with the epoll, kqueue, or eventport methods of processing requests. In order to use the threads value, configure multithreading into the NGINX binary using the --with-threads option. After this, add a thread pool in the NGINX global context using the thread_pool directive, and use the same pool in the aio configuration:

thread_pool io_pool threads=16;
http {
   ...
   location /data {
      sendfile on;
      aio      threads=io_pool;
   }
}

Mixing them up

The three directives can be mixed together to achieve different objectives on different platforms. The following configuration will use sendfile for files smaller than the size specified in directio, while files served by directio will be read using asynchronous I/O:

location /archived-data/ {
   sendfile on;
   aio on;
   directio 4m;
}

The aio directive has a sendfile value, which is available only on the FreeBSD platform. The value can be used to perform Sendfile in an asynchronous manner:

location /archived-data/ {
   sendfile on;
   aio sendfile;
}

NGINX invokes the sendfile() system call, which returns immediately if the data is not already in memory; NGINX then initiates the data transfer in an asynchronous manner.

Configuring TCP

HTTP is an application layer protocol that uses TCP as the transport layer. In TCP, data is transferred in the form of blocks known as TCP packets. NGINX provides directives to alter the behavior of the underlying TCP stack. These parameters alter the flags for an individual socket connection.

TCP_NODELAY

TCP/IP networks have the "small packet" problem, where single-character messages can cause network congestion on a highly loaded network. Such packets are 41 bytes in size, where 40 bytes are for the TCP/IP headers and only 1 byte carries useful information. These small packets have a huge overhead of around 4,000 percent and can saturate a network.

John Nagle solved the problem (Nagle's algorithm) by not sending small packets immediately. All such packets are collected for some amount of time and then sent in one go as a single packet. This results in improved efficiency of the underlying network; thus, a typical TCP/IP stack waits for up to 200 milliseconds before sending the data packets to the client. It is important to note that the problem exists with applications such as Telnet, where each keystroke is sent over the wire. The problem is not relevant to a web server, which serves static files. The files will mostly form full TCP packets, which can be sent immediately instead of waiting for 200 milliseconds.

The TCP_NODELAY option can be used while opening a socket to disable Nagle's buffering algorithm and send the data as soon as it is available. NGINX provides the tcp_nodelay directive to enable this option. The directive is available under the http, server, and location sections of an NGINX configuration:

http {
   tcp_nodelay on;
}

The directive is enabled by default. NGINX uses tcp_nodelay for connections in keep-alive mode.
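Since tcp_nodelay applies to connections in keep-alive mode, it is usually reviewed together with the keep-alive settings. The following is a minimal sketch; keepalive_timeout and keepalive_requests are standard directives that are not covered above, and the values shown are assumptions rather than recommendations:

http {
   tcp_nodelay on;
   # Keep idle client connections open for up to 30 seconds
   keepalive_timeout 30;
   # Allow up to 100 requests over a single keep-alive connection
   keepalive_requests 100;
}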
TCP_CORK

As an alternative to Nagle's algorithm, Linux provides the TCP_CORK option. The option tells the TCP stack to append packets and send them when they are full or when the application instructs it to send them by explicitly removing TCP_CORK. This results in optimally full data packets being sent and, thus, improves the efficiency of the network. The TCP_CORK option is available as the TCP_NOPUSH flag on FreeBSD and Mac OS.

NGINX provides the tcp_nopush directive to enable TCP_CORK on the connection socket. The directive is available under the http, server, and location sections of an NGINX configuration:

http {
   tcp_nopush on;
}

The directive is disabled by default. NGINX uses tcp_nopush for requests served with sendfile.

Setting them up

The two directives discussed previously may look contradictory: the former makes sure that network latency is reduced, while the latter tries to optimize the data packets that are sent. An application should set both of these options to get efficient data transfer. Enabling tcp_nopush along with sendfile makes sure that, while transferring a file, the kernel creates the maximum number of full TCP packets before sending them over the wire. The last packet(s) can be partial TCP packets, which could end up waiting with TCP_CORK enabled. NGINX makes sure it removes TCP_CORK to send these packets. Since tcp_nodelay is also set, these packets are sent over the network immediately, that is, without any delay.

Setting up the server

The following configuration sums up all the changes proposed in the preceding sections:

worker_processes 3;
worker_rlimit_nofile 8000;

events {
   multi_accept on;
   use epoll;
   worker_connections 1024;
}

http {
   sendfile on;
   aio on;
   directio 4m;
   tcp_nopush on;
   tcp_nodelay on;
   # Rest of the NGINX configuration removed for brevity
}

It is assumed that NGINX runs on a quad-core server. Thus, three worker processes have been spawned to take advantage of three of the four available cores, leaving one core for other processes. Each of the workers has been configured to work with 1,024 connections, and the nofile limit has been increased to 8,000 accordingly. By default, all worker processes operate with the accept mutex, so the flag has not been set explicitly. Each worker processes multiple connections in one go using the epoll method. In the http section, NGINX has been configured to serve files larger than 4 MB using direct I/O, while efficiently buffering smaller files using Sendfile. TCP options have also been set up to efficiently utilize the available network.
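Before measuring anything, it is worth validating that the updated configuration parses cleanly and then reloading NGINX gracefully so that the running workers pick up the changes. A minimal sketch, assuming the nginx binary is on the PATH and you have sufficient privileges:

$ sudo nginx -t          # parse and validate the configuration files
$ sudo nginx -s reload   # ask the master process to reload the workers gracefully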
Measuring gains

It is time to test the changes and make sure that they have yielded a performance gain. Run a series of tests using Siege or JMeter to get new performance numbers. The tests should be performed with the same configuration as the baseline tests to get comparable output:

$ siege -b -c 790 -r 50 -q http://192.168.2.100/hello

Transactions:               79000 hits
Availability:               100.00 %
Elapsed time:               24.25 secs
Data transferred:           12.54 MB
Response time:              0.20 secs
Transaction rate:           3257.73 trans/sec
Throughput:                 0.52 MB/sec
Concurrency:                660.70
Successful transactions:    39500
Failed transactions:        0
Longest transaction:        3.45
Shortest transaction:       0.00

The results from Siege should be evaluated and compared to the baseline:

Throughput: The transaction rate shows a throughput of around 3,250 requests per second
Error rate: Availability is reported as 100 percent; thus, the error rate is 0 percent
Response time: The results show a response time of 0.20 seconds

Thus, these new numbers demonstrate a performance improvement in various respects. After the server configuration has been updated with all the changes, re-run all the tests with increased load. The aim should be to determine the new baseline numbers for the updated configuration.

Summary

The article started with an overview of the NGINX configuration syntax. Going further, we discussed worker_connections and the related parameters, which allow you to take advantage of the available hardware. The article also talked about the different event processing mechanisms available on different platforms. The configuration discussed helps in processing more requests, thus improving the overall throughput.

NGINX is primarily a web server; thus, it has to serve all kinds of static content. Large files can take advantage of direct I/O, while smaller content can take advantage of Sendfile. The different disk modes make sure that we have an optimal configuration to serve the content. In the TCP stack, we discussed the flags available to alter the default behavior of TCP sockets. The tcp_nodelay directive helps in improving latency, while the tcp_nopush directive can help in efficiently delivering the content. Both of these flags lead to improved response times. In the last part of the article, we applied all the changes to our server and then ran performance tests to determine the effectiveness of the changes. In the next article, we will try to configure buffers, timeouts, and compression to improve the utilization of the available network.

Resources for Article:

Further resources on this subject:
Using Nginx as a Reverse Proxy [article]
Nginx proxy module [article]
Introduction to nginx [article]

Building Mobile Games with Crafty.js and PhoneGap, Part 3

Robi Sen
13 Jul 2015
9 min read
In this post, we will build upon what we learned in our previous series on using Crafty.js, HTML5, JavaScript, and PhoneGap to make a mobile game. In this post we will add a trigger to call back our monster AI, letting the monsters know it’s their turn to move, so each time the player moves the monsters will also move. Structuring our code with components Before we begin updating our game, let’s clean up our code a little bit. First let’s abstract out some of the code into separate files so it’s easier to work, read, edit, and develop our project. Let’s make a couple of components. The first one will be called PlayerControls.js and will tell the system what direction to move an entity when we touch on the screen. To do this, first create a new directory under your project WWW directory called src. Then create a new directory in src called com . In the folder create a new file called PlayerControls.js. Now open the file and make it look like the following: // create a simple object that describes player movement Crafty.c("PlayerControls", { init: function() { //lets now make the hero move where ever we touch Crafty.addEvent(this, Crafty.stage.elem, 'mousedown', function(e) { // lets simulate a 8 way controller or old school joystick //build out the direction of the mouse point. Remember that y increases as it goes 'downward' if (e.clientX < (player.x+Crafty.viewport.x) && (e.clientX - (player.x+Crafty.viewport.x))< 32) { myx = -1; } else if (e.clientX > (player.x+Crafty.viewport.x) && (e.clientX - (player.x+Crafty.viewport.x)) > 32){ myx = 1; } else { myx = 0; } if (e.clientY < (player.y+Crafty.viewport.y) && (e.clientY - (player.y+Crafty.viewport.y))< 32) { myy= -1; } else if (e.clientY > (player.y+Crafty.viewport.y) && (e.clientY - (player.y+Crafty.viewport.y)) > 32){ myy= 1; } else { myy = 0;} // let the game know we moved and where too var direction = [myx,myy]; this.trigger('Slide',direction); Crafty.trigger('Turn'); lastclientY = e.clientY; lastclientX = e.clientX; console.log("my x direction is " + myx + " my y direction is " + myy) console.log('mousedown at (' + e.clientX + ', ' + e.clientY + ')'); }); } }); You will note that this is very similar to the PlayerControls  component in our current index.html. One of the major differences is now we are decoupling the actual movement of our player from the mouse/touch controls. So if you look at the new PlayerControls component you will notice that all it does is set the X and Y direction, relative to a player object, and pass those directions off to a new component we are going to make called Slide. You will also see that we are using crafty.trigger to trigger an event called turn. Later in our code we are going to detect that trigger to active a callback to our monster AI letting the monsters know it’s their turn to move, so each time the player moves the monsters will also move. So let’s create a new component called Slide.js and it will go in your com directory with PlayerControls.js. 
Now open the file and make it look like this: Crafty.c("Slide", { init: function() { this._stepFrames = 5; this._tileSize = 32; this._moving = false; this._vx = 0; this._destX = 0; this._sourceX = 0; this._vy = 0; this._destY = 0; this._sourceY = 0; this._frames = 0; this.bind("Slide", function(direction) { // Don't continue to slide if we're already moving if(this._moving) return false; this._moving = true; // Let's keep our pre-movement location this._sourceX = this.x; this._sourceY = this.y; // Figure out our destination this._destX = this.x + direction[0] * 32; this._destY = this.y + direction[1] * 32; // Get our x and y velocity this._vx = direction[0] * this._tileSize / this._stepFrames; this._vy = direction[1] * this._tileSize / this._stepFrames; this._frames = this._stepFrames; }).bind("EnterFrame",function(e) { if(!this._moving) return false; // If we'removing, update our position by our per-frame velocity this.x += this._vx; this.y += this._vy; this._frames--; if(this._frames == 0) { // If we've run out of frames, // move us to our destination to avoid rounding errors. this._moving = false; this.x = this._destX; this.y = this._destY; } this.trigger('Moved', {x: this.x, y: this.y}); }); }, slideFrames: function(frames) { this._stepFrames = frames; }, // A function we'll use later to // cancel our movement and send us back to where we started cancelSlide: function() { this.x = this._sourceX; this.y = this._sourceY; this._moving = false; } }); As you can see, it is pretty straightforward. Basically, it handles movement by accepting a direction as a 0 or 1 within X and Y axis’. It then moves any entity that inherits its behavior some number of pixels; in this case 32, which is the height and width of our floor tiles. Now let’s do a little more housekeeping. Let’s pull out the sprite code in to a Sprites.js file and the asset loading code into a Loading.js  file. So create to new files, Sprites.js and Loading.js respectively, in your com directory and edit them to looking like the following two listings. Sprites.js: Crafty.sprite(32,"assets/dungeon.png", { floor: [0,1], wall1: [18,0], stairs: [3,1] }); // This will create entities called hero1 and blob1 Crafty.sprite(32,"assets/characters.png", { hero: [11,4], goblin1: [8,14] }); Loading.js: Crafty.scene("loading", function() { //console.log("pants") Crafty.load(["assets/dungeon.png","assets/characters.png"], function() { Crafty.scene("main"); // Run the main scene console.log("Done loading"); }, function(e) { //progress }, function(e) { //somethig is wrong, error loading console.log("Error,failed to load", e) }); }); Okay, now that is done let’s redo our index.html to make it cleaner: <!DOCTYPE html> <html> <head></head> <body> <div id="game"></div> <script type="text/javascript" src="lib/crafty.js"></script> <script type="text/javascript" src="src/com/loading.js"></script> <script type="text/javascript" src="src/com/sprites.js"></script> <script type="text/javascript" src="src/com/Slide.js"></script> <script type="text/javascript" src="src/com/PlayerControls.js"></script> <script> // Initialize Crafty Crafty.init(500, 320); // Background Crafty.background('green'); Crafty.scene("main",function() { Crafty.background("#FFF"); player = Crafty.e("2D, Canvas,PlayerControls, Slide, hero") .attr({x:0, y:0}) goblin = Crafty.e("2D, Canvas, goblin1") .attr({x:50, y:50}); }); Crafty.scene("loading"); </script> </body> </html> Go ahead save the file and load it in your browser. 
Everything should work as expected but now our index file and directory is a lot cleaner and easier to work with. Now that this is done, let’s get to giving the monster the ability to move on its own. Monster fun – moving game agents We are up to the point that we are able to move the hero of our game around the game screen with mouse clicks/touches. Now we need to make things difficult for our hero and make the monster move as well. To do this we need to add a very simple component that will move the monster around after our hero moves. To do this create a file called AI.js in the com directory. Now open it and edit it to look like this:   Crafty.c("AI",{ _directions: [[0,-1], [0,1], [1,0], [-1,0]], init: function() { this._moveChance = 0.5; this.requires('Slide'); this.bind("Turn",function() { if(Math.random() < this._moveChance) { this.trigger("Slide", this._randomDirection()); } }); }, moveChance: function(val) { this._moveChance = val; }, _randomDirection: function() { return this._directions[Math.floor(Math.random()*4)]; } }); As you can see all AI.js does, when called, is feed random directions to slide. Now we will add the AI component to the goblin entity. To do this editing your index.html to look like the following: <!DOCTYPE html> <html> <head></head> <body> <div id="game"></div> <script type="text/javascript" src="lib/crafty.js"></script> <script type="text/javascript" src="src/com/loading.js"></script> <script type="text/javascript" src="src/com/sprites.js"></script> <script type="text/javascript" src="src/com/Slide.js"></script> <script type="text/javascript" src="src/com/AI.js"></script> <script type="text/javascript" src="src/com/PlayerControls.js"></script> <script> Crafty.init(500, 320); Crafty.background('green'); Crafty.scene("main",function() { Crafty.background("#FFF"); player = Crafty.e("2D, Canvas,PlayerControls, Slide, hero") .attr({x:0, y:0}) goblin = Crafty.e("2D, Canvas, AI, Slide, goblin1") .attr({x:50, y:50}); }); Crafty.scene("loading"); </script> </body> </html> Here you will note we added a new entity called goblin and added the components Slide and AI. Now save the file and load it. When you move your hero you should see the goblin move as well like in this screenshot: Summary While this was a long post, you have learned a lot. Now that we have the hero and goblin moving in our game, we will build a dungeon in part 4, enable our hero to fight goblins, and create a PhoneGap build for our game. About the author Robi Sen, CSO at Department 13, is an experienced inventor, serial entrepreneur, and futurist whose dynamic twenty-plus year career in technology, engineering, and research has led him to work on cutting edge projects for DARPA, TSWG, SOCOM, RRTO, NASA, DOE, and the DOD. Robi also has extensive experience in the commercial space, including the co-creation of several successful start-up companies. He has worked with companies such as UnderArmour, Sony, CISCO, IBM, and many others to help build out new products and services. Robi specializes in bringing his unique vision and thought process to difficult and complex problems, allowing companies and organizations to find innovative solutions that they can rapidly operationalize or go to market with.

Learning BeagleBone Python Programming

Packt
10 Jul 2015
15 min read
In this article by Alexander Hiam, author of the book Learning BeagleBone Python Programming, we will go through the initial steps to get your BeagleBone Black set up. By the end of it, you should be ready to write your first Python program. We will cover the following topics:

Logging in to your BeagleBone
Connecting to the Internet
Updating and installing software
The basics of the PyBBIO and Adafruit_BBIO libraries

(For more resources related to this topic, see here.)

Initial setup

If you've never turned on your BeagleBone Black, there will be a bit of initial setup required. You should follow the most up-to-date official instructions found at http://beagleboard.org/getting-started, but to summarize, here are the steps:

Install the network-over-USB drivers for your PC's operating system.
Plug in the USB cable between your PC and the BeagleBone Black.
Open Chrome or Firefox and navigate to http://192.168.7.2 (Internet Explorer is not fully supported and might not work properly).

If all goes well, you should see a message on the web page served up by the BeagleBone indicating that it has successfully connected to the USB network: If you scroll down a little, you'll see a runnable Bonescript example, as in the following screenshot: If you press the run button, you should see the four LEDs next to the Ethernet connector on your BeagleBone light up for 2 seconds and then return to their normal function of indicating system and network activity. What's happening here is that the JavaScript running in your browser uses the Socket.IO (http://socket.io) library to issue remote procedure calls to the Node.js server that's serving up the web page. The server then calls the Bonescript API (http://beagleboard.org/Support/BoneScript), which controls the GPIO pins connected to the LEDs.

Updating your Debian image

The GNU/Linux distributions for platforms such as the BeagleBone are typically provided as ISO images, which are single-file copies of the flash memory with the distribution installed. BeagleBone images are flashed onto a microSD card that the BeagleBone can then boot from. It is important to update the Debian image on your BeagleBone to ensure that it has all the most up-to-date software and drivers, which can range from important security fixes to the latest and greatest features.

First, grab the latest BeagleBone Black Debian image from http://beagleboard.org/latest-images. You should now have a .img.xz file, which is an ISO image with XZ compression. Before the image can be flashed from a Windows PC, you'll have to decompress it. Install 7-Zip (http://www.7-zip.org/), which will let you decompress the file from the context menu by right-clicking on it. You can then install Win32 Disk Imager (http://sourceforge.net/projects/win32diskimager/) to flash the decompressed .img file to your microSD card.

Plug the microSD card you want your BeagleBone Black to boot from into your PC and launch Win32 Disk Imager. Select the drive letter associated with your microSD card; this process will erase the target device, so make sure the correct device is selected: Next, press the browse button, select the decompressed .img file, and then press Write: The image burning process will take a few minutes. Once it is complete, you can eject the microSD card, insert it into the BeagleBone Black, and boot it up. You can then return to http://192.168.7.2 to make sure the new image was flashed successfully and the BeagleBone is able to boot.
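If you are flashing the image from a Linux or Mac machine instead of Windows, the same result can be achieved from the terminal. This is only a sketch: the image filename is assumed, and /dev/sdX must be replaced with the device that actually corresponds to your microSD card, because dd will overwrite whatever device it is pointed at:

$ xz -d bone-debian.img.xz                         # decompress the downloaded image
$ sudo dd if=bone-debian.img of=/dev/sdX bs=1M     # write the image to the microSD card
$ sync                                             # flush write buffers before removing the card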
Connecting to your BeagleBone If you're running your BeagleBone with a monitor, keyboard, and mouse connected, you can use it like a standard desktop install of Debian. This book assumes you are running your BeagleBone headless (without a monitor). In that case, we will need a way to remotely connect to it. The Cloud9 IDE The BeagleBone Debian images include an instance of the Cloud9 IDE (https://c9.io) running on port 3000. To access it, simply navigate to your BeagleBone Black's IP address with the port appended after a colon, that is, http://192.168.7.2:3000. If it's your first time using Cloud9, you'll see the welcome screen, which lets you customize the look and feel: The left panel lets you organize, create, and delete files in your Cloud9 workspace. When you open a file for editing, it is shown in the center panel, and the lower panel holds a Bash shell and a Javascript REPL. Files and terminal instances can be opened in both the center and bottom panels. Bash instances start in the Cloud9 workspace, but you can use them to navigate anywhere on the BeagleBone's filesystem. If you've never used the Bash shell I'd encourage you to take a look at the Bash manual (https://www.gnu.org/software/bash/manual/), as well as walk through a tutorial or two. It can be very helpful and even essential at times, to be able to use Bash, especially with a platform such as BeagleBone without a monitor connected. Another great use for the Bash terminal in Cloud9 is for running the Python interactive interpreter, which you can launch in the terminal by running python without any arguments: SSH If you're a Linux user, or if you would prefer not to be doing your development through a web browser, you may want to use SSH to access your BeagleBone instead. SSH, or Secure Shell, is a protocol for securely gaining terminal access to a remote computer over a network. On Windows, you can download PuTTY from http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html, which can act as an SSH client. Run PuTTY, make sure SSH is selected, and enter your BeagleBone's IP address and the default SSH port of 22: When you press Open, PuTTY will open an SSH connection to your BeagleBone and give you a terminal window (the first time you connect to your BeagleBone it will ask you if you trust the SSH key; press Yes). Enter root as the username and press Enter to log in; you will be dropped into a Bash terminal: As in the Cloud9 IDE's terminals, from here, you can use the Linux tools to move around the filesystem, create and edit files, and so on, and you can run the Python interactive interpreter to try out and debug Python code. Connecting to the Internet Your BeagleBone Black won't be able to access the Internet with the default network-over-USB configuration, but there are a couple ways that you can connect your BeagleBone to the Internet. Ethernet The simplest option is to connect the BeagleBone to your network using an Ethernet cable between your BeagleBone and your router or a network switch. When the BeagleBone Black boots with an Ethernet connection, it will use DHCP to automatically request an IP address and register on your network. Once you have your BeagleBone registered on your network, you'll be able to log in to your router's interface from your web browser (usually found at http://192.168.1.1 or http://192.168.2.1) and find out the IP address that was assigned to your BeagleBone. Refer to your router's manual for more information. 
The current BeagleBone Black Debian images are configured to use the hostname beaglebone, so it should be pretty easy to find in your router's client list. If you are using a network on which you have no way of accessing this information through the router, you could use a tool such as Fing (http://www.overlooksoft.com) for Android or iPhone to scan the network and list the IP addresses of every device on it. Since this method results in your BeagleBone being assigned a new IP address, you'll need to use the new address to access the Getting Started pages and the Cloud9 IDE. Network forwarding If you don't have access to an Ethernet connection, or it's just more convenient to have your BeagleBone connected to your computer instead of your router, it is possible to forward your Internet connection to your BeagleBone over the USB network. On Windows, open your Network Connections window by navigating to it from the Control Panel or by opening the start menu, typing ncpa.cpl, and pressing Enter. Locate the Linux USB Ethernet network interface and take note of the name; in my case, its Local Area Network 4. This is the network interface used to connect to your BeagleBone: First, right-click on the network interface that you are accessing the Internet through, in my case, Wireless Network Connection, and select Properties. On the Sharing tab, check Allow other network users to connect through this computer's Internet connection, and select your BeagleBone's network interface from the dropdown: After pressing OK, Windows will assign the BeagleBone interface a static IP address, which will conflict with the static IP address of http://192.168.7.2 that the BeagleBone is configured to request on the USB network interface. To fix this, you'll want to right-click the Linux USB Ethernet interface and select Properties, then highlight Internet Protocol Version 4 (TCP/IPv4) and click on Properties: Select Obtain IP address automatically and click on OK; Your Windows PC is now forwarding its Internet connection to the BeagleBone, but the BeagleBone is still not configured properly to access the Internet. The problem is that the BeagleBone's IP routing table doesn't include 192.168.7.1 as a gateway, so it doesn't know the network path to the Internet. Access a Cloud9 or SSH terminal, and use the route tool to add the gateway, as shown in the following command: # route add default gw 192.168.7.1 Your BeagleBone should now have Internet access, which you can test by pinging a website: root@beaglebone:/var/lib/cloud9# ping -c 3 graycat.io PING graycat.io (198.100.47.208) 56(84) bytes of data. 64 bytes from 198.100.47.208.static.a2webhosting.com (198.100.47.208): icmp_req=1 ttl=55 time=45.6 ms 64 bytes from 198.100.47.208.static.a2webhosting.com (198.100.47.208): icmp_req=2 ttl=55 time=45.6 ms 64 bytes from 198.100.47.208.static.a2webhosting.com (198.100.47.208): icmp_req=3 ttl=55 time=46.0 ms   --- graycat.io ping statistics --- 3 packets transmitted, 3 received, 0% packet loss, time 2002ms rtt min/avg/max/mdev = 45.641/45.785/46.035/0.248 ms The IP routing will be reset at boot up, so if you reboot your BeagleBone, the Internet connection will stop working. This can be easily solved by using Cron, a Linux tool for scheduling the automatic running of commands. To add the correct gateway at boot, you'll need to edit the crontab file with the following command: # crontab –e This will open the crontab file in nano, which is a command line text editor. 
We can use the @reboot keyword to schedule the command to run after each reboot: @reboot /sbin/route add default gw 192.168.7.1 Press Ctrl + X to exit nano, then press Y, and then Enter to save the file. Your forwarded Internet connection should now remain after rebooting. Using the serial console If you are unable to use a network connection to your BeagleBone Black; for instance, if your network is too slow for Cloud9 or you can't find the BeagleBone's IP address, there is still hope! The BeagleBone Black includes a 6-pin male connector; labeled J1, right next to the P9 expansion header (we'll learn more about the P8 and P9 expansion headers soon!). You'll need a USB to 3.3 V TTL serial converter, for example, from Adafruit http://www.adafruit.com/products/70 or Logic Supply http://www.logicsupply.com/components/beaglebone/accessories/ls-ttl3vt. You'll need to download and install the FTDI virtual COM port driver for your operating system from http://www.ftdichip.com/Drivers/VCP.htm, then plug the connector into the J1 header such that the black wire lines up with the header's pin 1 indicator, as shown in the following screenshot: You can then use your favorite serial port terminal emulator, such as PuTTY or CoolTerm (http://freeware.the-meiers.org), and configure the serial port for a baud rate of 115200 with 1 stop bit and no parity. Once connected, press Enter and you should see a login prompt. Enter the user name root and you'll drop into a Bash shell. If you only need the console connection to find your IP address, you can do so using the following command: # ip addr Updating your software If this is the first time you've booted your BeagleBone Black, or if you've just flashed a new image, it's best to start by ensuring your installed software packages are all up to date. You can do so using Debian's apt package manager: # apt-get update && apt-get upgrade This process might take a few minutes. Next, use the pip Python package manager to update to the latest versions of the PyBBIO and Adafruit_BBIO libraries: # pip install --upgrade PyBBIO Adafruit_BBIO As both libraries are currently in active development, it's worth running this command from time to time to make sure you have all the latest features. The PyBBIO library The PyBBIO library was developed with Arduino users in mind. It emulates the structure of an Arduino (http://arduino.cc) program, as well as the Arduino API where appropriate. If you've never seen an Arduino program, it consists of a setup() function, which is called once when the program starts, and a loop() function, which is called repeatedly until the end of time (or until you turn off the Arduino). PyBBIO accomplishes a similar structure by defining a run() function that is passed two callable objects, one that is called once when the program starts, and another that is called repeatedly until the program stops. So the basic PyBBIO template looks like this: from bbio import *   def setup(): pinMode(GPIO1_16, OUTPUT)   def loop(): digitalWrite(GPIO1_16, HIGH) delay(500) digitalWrite(GPIO1_16, LOW) delay(500)   run(setup, loop) The first line imports everything from the PyBBIO library (the Python package is installed with the name bbio). Then, two functions are defined, and they are passed to run(), which tells the PyBBIO loop to begin. In this example, setup() will be called once, which configures the GPIO pin GPIO1_16 as a digital output with the pinMode() function. 
Then, loop() will be called until the PyBBIO loop is stopped, with each digitalWrite() call setting the GPIO1_16 pin to either a high (on) or low (off) state, and each delay() call causing the program to sleep for 500 milliseconds. The loop can be stopped by either pressing Ctrl + C or calling the stop() function. Any other error raised in your program will be caught, allowing PyBBIO to run any necessary cleanup, then it will be reraised. Don't worry if the program doesn't make sense yet, we'll learn about all that soon! Not everyone wants to use the Arduino style loop, and it's not always suitable depending on the program you're writing. PyBBIO can also be used in a more Pythonic way, for example, the above program can be rewritten as follows: import bbio   bbio.pinMode(bbio.GPIO1_16, bbio.OUTPUT) while True: bbio.digitalWrite(bbio.GPIO1_16, bbio.HIGH) bbio.delay(500) bbio.digitalWrite(bbio.GPIO1_16, bbio.LOW) bbio.delay(500) This still allows the bbio API to be used, but it is kept out of the global namespace. The Adafruit_BBIO library The Adafruit_BBIO library is structured differently than PyBBIO. While PyBBIO is structured such that, essentially, the entire API is accessed directly from the first level of the bbio package; Adafruit_BBIO instead has the package tree broken up by a peripheral subsystem. For instance, to use the GPIO API you have to import the GPIO package: from Adafruit_BBIO import GPIO Otherwise, to use the PWM API you would import the PWM package: from Adafruit_BBIO import PWM This structure follows a more standard Python library model, and can also save some space in your program's memory because you're only importing the parts you need (the difference is pretty minimal, but it is worth thinking about). The same program shown above using PyBBIO could be rewritten to use Adafruit_BBIO: from Adafruit_BBIO import GPIO import time   GPIO.setup("GPIO1_16", GPIO.OUT) try: while True:    GPIO.output("GPIO1_16", GPIO.HIGH)    time.sleep(0.5)    GPIO.output("GPIO1_16", GPIO.LOW)    time.sleep(0.5) except KeyboardInterrupt: GPIO.cleanup() Here the GPIO.setup() function is configuring the ping, and GPIO.output() is setting the state. Notice that we needed to import Python's built-in time library to sleep, whereas in PyBBIO we used the built-in delay() function. We also needed to explicitly catch KeyboardInterrupt (the Ctrl + C signal) to make sure all the cleanup is run before the program exits, whereas this is done automatically by PyBBIO. Of course, this means that you have much more control about when things such as initialization and cleanup happen using Adafruit_BBIO, which can be very beneficial depending on your program. There are some trade-offs, and the library you use should be chosen based on which model is better suited for your application. Summary In this article, you learned how to login to the BeagleBone Black, get it connected to the Internet, and update and install the software we need. We also looked at the basic structure of programs using the PyBBIO and Adafruit_BBIO libraries, and talked about some of the advantages of each. Resources for Article: Further resources on this subject: Overview of Chips [article] Getting Started with Electronic Projects [article] Beagle Boards [article]

Creating subtle UI details using Midnight.js, Wow.js, and Animate.css

Roberto González
10 Jul 2015
9 min read
Creating animations in CSS or JavaScript is often annoying and/or time-consuming, so most people tend to pay a lot of attention to the content that’s below "the fold" ("the fold" is quickly becoming an outdated concept, but you know what I mean). I’ll be covering a few techniques to help you add some nice touches to your landing pages that only take a few minutes to implement and require pretty much no development work at all. To create a base for this project, I put together a bunch of photographs from https://unsplash.com/ with some text on top so we have something to work with. Download the files from http://aerolab.github.io/subtle-animations/assets/basics.zip and put them in a new folder. You can also check out the final result at http://aerolab.github.io/subtle-animations. Dynamically change your fixed headers using Midnight.js If you took a look at the demo site, you probably noticed that the minimalistic header we are using for "A How To Guide" becomes illegible in very light backgrounds. When this happens in most sites, we typically end up putting a background on the header, which usually improves legibility at the cost of making the design worse. Midnight.js is a jQuery plugin that changes your headers as you scroll, so the header always has a design that matches the content below it. This is particularly useful for minimalistic websites as they often use transparent headers. Implementation is quite simple as the setup is pretty much automatic. Start by adding a fixed header into the site. The example has one ready to go: <nav class="fixed"> <div class="container"> <span class="logo">A How To Guide</span> </div> </nav> Most of the setting up comes in specifying which header corresponds to which section. This is done by adding data-midnight="your-class" to any section or piece of content that requires a different design for the header. For the first section, we’ll be using a white header, so we’ll add data-midnight="white" to this section (it doesn’t have to be only a section, any large element works well). <section class="fjords" data-midnight="white"> <article> <h1>Adding Subtle UI Details</h1> <p>Using Midnight.js, Wow.js and Animate.css</p> </article> </section> In the next section, which is a photo of ships in very thick white fog, we’ll be using a darker header to help improve contrast. Let’s use data-midnight="gray" for the second one and data-midgnight="pink" for the last one, so it feels more in line with the content: <section class="ships" data-midnight="gray"> <article> <h1>Be quiet</h1> <p>I'm hunting wabbits</p> </article> </section> <section class="puppy" data-midnight="pink"> <article> <h1>OMG A PUPPY &lt;3</h1> </article> </section> Now we just need to add some css rules to change the look of the header in those cases. We’ll just be changing the color of the text for the moment, so open up css/styles.css and add the following rules: /* Styles for White, Gray and Pink headers */.midnightHeader.white { color: #fff; } .midnightHeader.gray { color: #999; } .midnightHeader.pink { color: #ffc0cb; } Last but not least, we need to include the necessary libraries. 
We’ll add two libraries right before the end of the body: jQuery and Midnight.js (they are included in the project files inside the js folder): <script src="js/jquery-1.11.1.min.js"></script> <script src="js/midnight.jquery.min.js"></script> Right after that, we start Midnight.js on document.ready, using $('nav.fixed').midnight() (you can change the selector to whatever you are using on your site): <script> $(document).ready(function(){ $('nav.fixed').midnight(); }); </script> If you check the site now, you’ll notice that the fixed header gracefully changes color when you start scrolling into the ships section. It’s a very subtle effect, but it helps keep your designs clean. Bonus Feature! It’s possible to completely change the markup of your header just for a specific section. It’s mostly used to add some visual details that require extra markup, but it can be used to completely alter your headers as necessary. In this case, we’ll be changing the “logo" from "A How To Guide" to "Shhhhhhhhh" on the ships section, and a bunch of hearts for the part of the puppy for additional bad comedy. To do this, we need to alter our fixed header a bit. First we need to identify the “default" header (all headers that don't have custom markup will be based on this one), and then add the markup we need for any custom headers, like the gray one. This is done by creating multiple copies of the header and wrapping them in .midnightHeader.default,.midnightHeader.gray and .midnightHeader.pink respectively: <nav class="fixed"> <div class="midnightHeader default"> <div class="container"> <span class="logo">A How To Guide</span> </div> </div> <div class="midnightHeader gray"> <div class="container"> <span class="logo">Shhhhhhhhh</span> </div> </div> <div class="midnightHeader pink"> <div class="container"> <span class="logo">❤❤❤ OMG PUPPIES ❤❤❤</span> </div> </div> </nav> If you test the site now, you’ll notice that the header not only changes color, but it also changes the "name" of the site to match the section, which gives you more freedom in terms of navigation and design. Simple animations with Wow.js and Animate.css Wow.js looks more like a toy than a serious plugin, but it’s actually a very powerful library that’s extremely easy to implement. Wow.js lets you animate things as they come into view. For instance, you can fade something in when you scroll to that section, letting users enjoy some extra UI candy. You can choose from a large set of animations from Animate.css so you don’t even have to touch the CSS (but you can still do that if you want). To get Wow.JS to work, we have to include just two things: Animate.css, which contains all the animations we need. Of course, you can create your own, or even tweak those to match your tastes. Just add a link to animate.css in the head of the document: <linkrel="stylesheet"href="css/animate.css"/> Wow.JS. This is simply just including the script and initializing it, which is done by adding the following just before the end of the document: <script src="js/wow.min.js"></script> <script>new WOW().init()</script> That’s it! To animate an element as soon as it gets into view, you just need to add the .wow class to that element, and then any animation from Animate.css (like .fadeInUp, .slideInLeft, or one of the many options available at http://daneden.github.io/animate.css/). For example, to make something fade in from the bottom of the screen, you just have to add wow fadeInUp. 
Let’s try this on the h1 our first section: <section class="fjords" data-midnight="white"> <article> <h1 class="wow fadeInUp">Adding Subtle UI Details</h1> <p>Using Midnight.js, Wow.js and Animate.css</p> </article> </section> If you feel like altering the animation slightly, you have quite a bit of control over how it behaves. For instance, let’s fade in the subtitle but do it a few milliseconds after the title, so it follows a sequence. We can use data-wow-delay="0.5s" to make the subtitle wait for half a second before making its appearance: <section class="fjords" data-midnight="white"> <article> <h1 class="wow fadeInUp">Adding Subtle UI Details</h1> <p class="wow fadeInUp" data-wow-delay="0.5s">Using Midnight.js, Wow.js and Animate.css</p> </article> </section> We can even tweak how long the animation takes by using data-wow-duration="1.5s" so it lasts a second and a half. This is particularly useful in the second section, combined with another delay: <section class="ships" data-midnight="gray"> <article> <h1 class="wow fadeIn" data-wow-duration="1.5s">Be quiet</h1> <p class="wow fadeIn" data-wow-delay="0.5s" data-wow-duration="1.5s">I'm hunting wabbits</p> </article> </section> We can even repeat an animation a few times. Let’s make the last title shake a few times as soon as it gets into view with data-wow-iteration="5". We'll take this opportunity to use all the properties, like data-wow-duration="0.5s" to make each shake last half a second, and we'll also add a large delay for the last piece so it appears after the main animation has finished: <section class="puppy"> <article> <h1 class="wow shake" data-wow-iteration="5" data-wow-duration="0.5s">OMG A PUPPY &lt;3</h1> <p class="wow fadeIn" data-wow-delay="2.5s">Ok, this one wasn't subtle at all</p> </article> </section> Summary That’s pretty much all there is to know about using Midnight.js, Wow.js and Animate.css! All you need to do now is find a project and experiment a bit with different animations. It’s a great tool to add some last-minute eye candy and - as long as you don’t overdo it - looks fantastic on most sites. I hope you enjoyed the article! About the author Roberto González is the co-founder of Aerolab, "an awesome place where we really push the barriers to create amazing, well coded design for the best digital products."He can be reached at @robertcode. From the 11th to 17th April, save 50% on top web development eBooks and 70% on our specially selected video courses. From Angular 2 to React and much more, find them all here.

Clustering and Other Unsupervised Learning Methods

Packt
09 Jul 2015
19 min read
In this article by Ferran Garcia Pagans, author of the book Predictive Analytics Using Rattle and Qlik Sense, we will learn about the following: Define machine learning Introduce unsupervised and supervised methods Focus on K-means, a classic machine learning algorithm, in detail We'll create clusters of customers based on their annual money spent. This will give us a new insight. Being able to group our customers based on their annual money spent will allow us to see the profitability of each customer group and deliver more profitable marketing campaigns or create tailored discounts. Finally, we'll see hierarchical clustering, different clustering methods, and association rules. Association rules are generally used for market basket analysis. Machine learning – unsupervised and supervised learning Machine Learning (ML) is a set of techniques and algorithms that gives computers the ability to learn. These techniques are generic and can be used in various fields. Data mining uses ML techniques to create insights and predictions from data. In data mining, we usually divide ML methods into two main groups – supervisedlearning and unsupervisedlearning. A computer can learn with the help of a teacher (supervised learning) or can discover new knowledge without the assistance of a teacher (unsupervised learning). In supervised learning, the learner is trained with a set of examples (dataset) that contains the right answer; we call it the training dataset. We call the dataset that contains the answers a labeled dataset, because each observation is labeled with its answer. In supervised learning, you are supervising the computer, giving it the right answers. For example, a bank can try to predict the borrower's chance of defaulting on credit loans based on the experience of past credit loans. The training dataset would contain data from past credit loans, including if the borrower was a defaulter or not. In unsupervised learning, our dataset doesn't have the right answers and the learner tries to discover hidden patterns in the data. In this way, we call it unsupervised learning because we're not supervising the computer by giving it the right answers. A classic example is trying to create a classification of customers. The model tries to discover similarities between customers. In some machine learning problems, we don't have a dataset that contains past observations. These datasets are not labeled with the correct answers and we call them unlabeled datasets. In traditional data mining, the terms descriptive analytics and predictive analytics are used for unsupervised learning and supervised learning. In unsupervised learning, there is no target variable. The objective of unsupervised learning or descriptive analytics is to discover the hidden structure of data. There are two main unsupervised learning techniques offered by Rattle: Cluster analysis Association analysis Cluster analysis Sometimes, we have a group of observations and we need to split it into a number of subsets of similar observations. Cluster analysis is a group of techniques that will help you to discover these similarities between observations. Market segmentation is an example of cluster analysis. You can use cluster analysis when you have a lot of customers and you want to divide them into different market segments, but you don't know how to create these segments. Sometimes, especially with a large amount of customers, we need some help to understand our data. 
Clustering can help us to create different customer groups based on their buying behavior. In Rattle's Cluster tab, there are four cluster algorithms: KMeans EwKm Hierarchical BiCluster The two most popular families of cluster algorithms are hierarchical clustering and centroid-based clustering: Centroid-based clustering the using K-means algorithm I'm going to use K-means as an example of this family because it is the most popular. With this algorithm, a cluster is represented by a point or center called the centroid. In the initialization step of K-means, we need to create k number of centroids; usually, the centroids are initialized randomly. In the following diagram, the observations or objects are represented with a point and three centroids are represented with three colored stars: After this initialization step, the algorithm enters into an iteration with two operations. The computer associates each object with the nearest centroid, creating k clusters. Now, the computer has to recalculate the centroids' position. The new position is the mean of each attribute of every cluster member. This example is very simple, but in real life, when the algorithm associates the observations with the new centroids, some observations move from one cluster to the other. The algorithm iterates by recalculating centroids and assigning observations to each cluster until some finalization condition is reached, as shown in this diagram: The inputs of a K-means algorithm are the observations and the number of clusters, k. The final result of a K-means algorithm are k centroids that represent each cluster and the observations associated with each cluster. The drawbacks of this technique are: You need to know or decide the number of clusters, k. The result of the algorithm has a big dependence on k. The result of the algorithm depends on where the centroids are initialized. There is no guarantee that the result is the optimum result. The algorithm can iterate around a local optimum. In order to avoid a local optimum, you can run the algorithm many times, starting with different centroids' positions. To compare the different runs, you can use the cluster's distortion – the sum of the squared distances between each observation and its centroids. Customer segmentation with K-means clustering We're going to use the wholesale customer dataset we downloaded from the Center for Machine Learning and Intelligent Systems at the University of California, Irvine. You can download the dataset from here – https://archive.ics.uci.edu/ml/datasets/Wholesale+customers#. The dataset contains 440 customers (observations) of a wholesale distributor. It includes the annual spend in monetary units on six product categories – Fresh, Milk, Grocery, Frozen, Detergents_Paper, and Delicatessen. We've created a new field called Food that includes all categories except Detergents_Paper, as shown in the following screenshot: Load the new dataset into Rattle and go to the Cluster tab. Remember that, in unsupervised learning, there is no target variable. I want to create a segmentation based only on buying behavior; for this reason, I set Region and Channel to Ignore, as shown here: In the following screenshot, you can see the options Rattle offers for K-means. The most important one is Number of clusters; as we've seen, the analyst has to decide the number of clusters before running K-means: We have also seen that the initial position of the centroids can have some influence on the result of the algorithm. 
The position of the centroids is random, but we need to be able to reproduce the same experiment multiple times. When we're creating a model with K-means, we'll iteratively re-run the algorithm, tuning some options in order to improve the performance of the model. In this case, we need to be able to reproduce exactly the same experiment. Under the hood, R has a pseudo-random number generator based on a starting point called Seed. If you want to reproduce the exact same experiment, you need to re-run the algorithm using the same Seed. Sometimes, the performance of K-means depends on the initial position of the centroids. For this reason, sometimes you need to able to re-run the model using a different initial position for the centroids. To run the model with different initial positions, you need to run with a different Seed. After executing the model, Rattle will show some interesting information. The size of each cluster, the means of the variables in the dataset, the centroid's position, and the Within cluster sum of squares value. This measure, also called distortion, is the sum of the squared differences between each point and its centroid. It's a measure of the quality of the model. Another interesting option is Runs; by using this option, Rattle will run the model the specified number of times and will choose the model with the best performance based on the Within cluster sum of squares value. Deciding on the number of clusters can be difficult. To choose the number of clusters, we need a way to evaluate the performance of the algorithm. The sum of the squared distance between the observations and the associated centroid could be a performance measure. Each time we add a centroid to KMeans, the sum of the squared difference between the observations and the centroids decreases. The difference in this measure using a different number of centroids is the gain associated to the added centroids. Rattle provides an option to automate this test, called Iterative Clusters. If you set the Number of clusters value to 10 and check the Iterate Clusters option, Rattle will run KMeans iteratively, starting with 3 clusters and finishing with 10 clusters. To compare each iteration, Rattle provides an iteration plot. In the iteration plot, the blue line shows the sum of the squared differences between each observation and its centroid. The red line shows the difference between the current sum of squared distances and the sum of the squared distance of the previous iteration. For example, for four clusters, the red line has a very low value; this is because the difference between the sum of the squared differences with three clusters and with four clusters is very small. In the following screenshot, the peak in the red line suggests that six clusters could be a good choice. This is because there is an important drop in the Sum of WithinSS value at this point: In this way, to finish my model, I only need to set the Number of clusters to 3, uncheck the Re-Scale checkbox, and click on the Execute button: Finally, Rattle returns the six centroids of my clusters: Now we have the six centroids and we want Rattle to associate each observation with a centroid. Go to the Evaluate tab, select the KMeans option, select the Training dataset, mark All in the report type, and click on the Execute button as shown in the following screenshot. This process will generate a CSV file with the original dataset and a new column called kmeans. 
Now we have the six centroids, and we want Rattle to associate each observation with a centroid. Go to the Evaluate tab, select the KMeans option, select the Training dataset, mark All as the report type, and click on the Execute button, as shown in the following screenshot. This process generates a CSV file containing the original dataset and a new column called kmeans. The content of this column is a label (a number) representing the cluster associated with each observation (customer), as shown in the following screenshot:
After clicking on the Execute button, you will need to choose a folder to save the resulting file to and type in a filename. The generated data inside the CSV file will look similar to the following screenshot:
In the previous screenshot, you can see ten lines of the resulting file; note that the last column is kmeans.
Preparing the data in Qlik Sense
Our objective is to create the same data model, but using the new CSV file with the kmeans column. We're going to update our application by replacing the customer data file with this new data file. Save the new file in the same folder as the original file, open the Qlik Sense application, and go to Data load editor.
There are two differences between the original file and this one. For the original file, we added a line to the load script to create a customer identifier called Customer_ID; in this second file, the field is already in the dataset. The second difference is that this new file has the kmeans column. From Data load editor, go to the Wholesale customer data sheet, modify line 2, and add line 3. In line 2, we just load the content of Customer_ID, and in line 3, we load the content of the kmeans field and rename it to Cluster, as shown in the following screenshot. Finally, update the name of the file to the new one and click on the Load data button:
When the data load process finishes, open the data model viewer to check your data model, as shown here:
Note that you have the same data model with a new field called Cluster.
Creating a customer segmentation sheet in Qlik Sense
Now we can add a sheet to the application. We'll add three charts to see our clusters and how our customers are distributed among them. The first chart will describe the buying behavior of each cluster, as shown here:
The second chart will show all customers distributed in a scatter plot, and in the last chart we'll see the number of customers that belong to each cluster, as shown here:
I'll start with the chart at the bottom-right; it's a bar chart with Cluster as the dimension and Count([Customer_ID]) as the measure. This simple bar chart has something special – colors. Each customer cluster has a color code that we use in all three charts; in this way, cluster 5 is blue in all of them. To obtain this effect, we define the color with the expression color(fieldindex('Cluster', Cluster)), as shown in the following screenshot:
You can find this color trick and more in this interesting blog by Rob Wunderlich – http://qlikviewcookbook.com/.
My second chart is the one at the top. I copied the previous chart and pasted it onto a free place. I kept the dimension, but I replaced the measure with six new measures:
Avg([Detergents_Paper])
Avg([Delicassen])
Avg([Fresh])
Avg([Frozen])
Avg([Grocery])
Avg([Milk])
I placed my last chart at the bottom-left. I used a scatter plot to represent all 440 customers. I wanted to show the money each customer spends on food and on detergents, together with the customer's cluster. I used the y axis to show the money spent on detergents and the x axis for the money spent on food, and I used colors to highlight the cluster. The dimension is Customer_ID and the measures are Delicassen+Fresh+Frozen+Grocery+Milk (or Food) and [Detergents_Paper]. As the final step, I reused the color expression from the earlier charts.
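If you want to double-check the figures these charts show, the exported file can be summarized in a couple of lines of R. A small sketch; the file name and the column names are assumptions based on the file exported from Rattle:

# Cross-check the Qlik Sense charts against the file exported from Rattle;
# the file name and column names are assumptions.
scored <- read.csv("wholesale_kmeans.csv")

table(scored$kmeans)     # customers per cluster: the bar chart at the bottom-right
aggregate(cbind(Fresh, Milk, Grocery, Frozen, Detergents_Paper, Delicassen) ~ kmeans,
          data = scored, FUN = mean)   # average spend per cluster: the chart at the top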
Now our first Qlik Sense application has two sheets. The original one is 100 percent Qlik Sense and helps us to understand our customers, channels, and regions. The new sheet uses clustering to give us a different point of view: it groups the customers by similar buying behavior. All this information is useful for delivering better campaigns to our customers. Cluster 5 is our least profitable cluster, but it is the biggest one, with 227 customers. The main difference between cluster 5 and cluster 2 is the amount of money spent on fresh products. Can we deliver an offer to the customers in cluster 5 to try to sell them more fresh products? Select the retail customers and ask yourself: who are our best retail customers? To which cluster do they belong? Are they buying all our product categories?
Hierarchical clustering
Hierarchical clustering tries to group objects based on their similarity. To explain how this algorithm works, we're going to start with seven points (or observations) lying on a straight line:
We start by calculating the distance between each pair of points. I'll come back to the term distance later; in this example, the distance is simply the difference between two positions on the line. Points D and E are the ones with the smallest distance between them, so we group them into a cluster, as shown in this diagram:
Now, we substitute points D and E with their mean (the red point) and look for the two points with the next smallest distance between them. In this second iteration, the closest points are B and C, as shown in this diagram:
We continue iterating until we've grouped all the observations in the dataset, as shown here:
Note that, with this algorithm, we can decide on the number of clusters after running the algorithm. If we divide the dataset into two clusters, the first cluster is point G and the second cluster is A, B, C, D, E, and F. This gives the analyst the opportunity to see the big picture before deciding on the number of clusters. The lowest level of clustering is a trivial one; in this example, seven clusters with one point in each.
The chart I've created while explaining the algorithm is a basic form of a dendrogram, a tree diagram used in Rattle and other tools to illustrate the layout of the clusters produced by hierarchical clustering. In the following screenshot, we can see the dendrogram created by Rattle for the wholesale customer dataset. In Rattle's dendrogram, the y axis represents the observations or customers in the dataset, and the x axis represents the distance between the clusters:
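Rattle drives this algorithm from the Cluster tab, but the equivalent call in plain R is short. A sketch under the same assumptions as before; the file name, the column selection, and the linkage method are illustrative choices, not the exact settings Rattle uses:

# Hierarchical clustering of the wholesale customers in plain R;
# file name, column selection, and linkage method are assumptions.
spend <- read.csv("wholesale.csv")[, c("Fresh", "Milk", "Grocery",
                                       "Frozen", "Detergents_Paper", "Delicassen")]

d  <- dist(spend)                    # pairwise distances between customers
hc <- hclust(d, method = "ward.D2")  # agglomerative clustering with Ward linkage

plot(hc, labels = FALSE)             # the dendrogram
clusters <- cutree(hc, k = 6)        # decide on the number of clusters after the fact
table(clusters)                      # customers per cluster

The cutree() step is what makes this family attractive: you can look at the whole dendrogram first and only then decide where to cut it.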
Association analysis
Association rules, or association analysis, is also an important topic in data mining. This is an unsupervised method, so we start with an unlabeled dataset – a dataset without a variable that gives us the right answer. Association analysis attempts to find relationships between different entities. The classic example of association rules is market basket analysis: using a database of supermarket transactions to find items that are bought together. For example, a person who buys potatoes and burgers usually buys beer. This insight could be used to optimize the supermarket layout. Online stores are also a good example of association analysis: they usually suggest a new item to you based on the items you have already bought, analyzing online transactions to find patterns in buyers' behavior.
These algorithms assume all variables are categorical; they perform poorly with numeric variables. Association methods also need a lot of time to complete, and they use a lot of CPU and memory. Remember that Rattle runs on R, and the R engine loads all the data into RAM.
Suppose we have a dataset such as the following:
Our objective is to discover items that are purchased together. We'll create rules and represent them like this:
Chicken, Potatoes → Clothes
This rule means that when a customer buys Chicken and Potatoes, they tend to buy Clothes as well. As we'll see, the output of the model is a set of rules, so we need a way to evaluate the quality or interest of a rule. There are different measures; Rattle provides three of them:
Support
Confidence
Lift
Support indicates how often the rule appears in the whole dataset. In our dataset, the rule Chicken, Potatoes → Clothes has a support of 42.86 percent (3 occurrences / 7 transactions). Confidence measures how strong the association between the items is. In this dataset, the rule Chicken, Potatoes → Clothes has a confidence of 1: the items Chicken and Potatoes appear together in three transactions, the items Chicken, Potatoes, and Clothes also appear together in three transactions, and 3/3 = 1. A confidence close to 1 indicates a strong association.
In the following screenshot, I've highlighted the options on the Associate tab that we have to choose from before executing an association method in Rattle:
The first option is the Baskets checkbox. Depending on the kind of input data, we'll decide whether or not to check this option. If the option is checked, as in the preceding screenshot, Rattle needs an identification variable and a target variable. After this example, we'll try another example without this option.
The second option is the minimum Support value; by default, it is set to 0.1. Rattle will not return rules with a Support value lower than the one you set in this text box. If you choose a higher value, Rattle will only return rules that appear many times in your dataset; if you choose a lower value, Rattle will also return rules that appear only a few times. Usually, if you set a high value for Support, the system will return only the obvious relationships. I suggest you start with a high Support value and execute the method several times, lowering the value with each execution; in this way, new rules will appear in each execution for you to analyze.
The third parameter you have to set is Confidence; this parameter sets how strong the rules have to be. Finally, the length is the number of items contained in a rule; a rule like Beer → Chips has a length of two. The default for Min Length is 2; if you set this parameter to 2, Rattle will return all rules with two or more items in them.
After executing the model, you can see the rules created by Rattle by clicking on the Show Rules button, as illustrated here:
Rattle provides a very simple dataset for testing association rules in a file called dvdtrans.csv. Use this dataset to learn more about association rules.
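Under the hood, Rattle's Associate tab uses the arules package, so the same experiment can also be run directly from the R console. A sketch; the column names of dvdtrans.csv (an ID and an Item per row) and the parameter values are assumptions:

# Association rules on dvdtrans.csv with the arules package;
# the column names (ID, Item) and the parameter values are assumptions.
library(arules)

dvd     <- read.csv("dvdtrans.csv", stringsAsFactors = FALSE)
baskets <- split(dvd$Item, dvd$ID)      # one basket of items per transaction ID
trans   <- as(baskets, "transactions")

rules <- apriori(trans,
                 parameter = list(support = 0.1,    # minimum Support, as in Rattle
                                  confidence = 0.1, # minimum Confidence
                                  minlen = 2))      # rules with two or more items
inspect(sort(rules, by = "lift"))       # show the rules, strongest lift first

A lift greater than 1 means the items appear together more often than you would expect by chance, which is why the rules are sorted by it here.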
Further learning
In this article, we introduced supervised and unsupervised learning, the two main subgroups of machine learning algorithms. If you want to learn more about machine learning, I suggest you complete the MOOC called Machine Learning at Coursera: https://www.coursera.org/learn/machine-learning
The acronym MOOC stands for Massive Open Online Course; these are courses open to participation via the Internet and are generally free. Coursera is one of the leading platforms for MOOCs.
Machine Learning is a great course designed and taught by Andrew Ng, Associate Professor at Stanford University, Chief Scientist at Baidu, and Chairman and Co-founder of Coursera. This course is really interesting.
A very interesting book is Machine Learning with R by Brett Lantz, Packt Publishing.
Summary
In this article, we were introduced to machine learning and its supervised and unsupervised methods. We focused on unsupervised methods and covered centroid-based clustering, hierarchical clustering, and association rules. We used a simple dataset, but we saw how a clustering algorithm can complement a 100 percent Qlik Sense approach by adding more information.
Resources for Article:
Further resources on this subject:
Qlik Sense's Vision [article]
Securing QlikView Documents [article]
Conozca QlikView [article]