Packt
23 Sep 2014
42 min read
Setting up of Software Infrastructure on the Cloud

In this article by Roberto Freato, author of Microsoft Azure Development Cookbook, we mix some of the recipes of this book to build a complete overview of what we need to set up a software infrastructure on the cloud.

Microsoft Azure is Microsoft's platform for cloud computing. It provides developers with elastic building blocks to build scalable applications. These building blocks are services for web hosting, storage, computation, connectivity, and more, which are usable as standalone services or mixed together to build advanced scenarios. Building an application with Microsoft Azure essentially means choosing the appropriate services and mixing them together to run our application. We start by creating a SQL Database.

Creating a SQL Database server and database

SQL Database is a multitenant database system in which many distinct databases are hosted on many physical servers managed by Microsoft. SQL Database administrators have no control over the physical provisioning of a database to a particular physical server. Indeed, to maintain high availability, a primary and two secondary copies of each SQL Database are stored on separate physical servers, and users can't have any control over them. Consequently, SQL Database does not provide a way for the administrator to specify the physical layout of a database and its logs when creating a SQL Database. The administrator merely has to provide a name, maximum size, and service tier for the database.

A SQL Database server is the administrative and security boundary for a collection of SQL Databases hosted in a single Azure region. All connections to a database hosted by the server go through the service endpoint provided by the SQL Database server. At the time of writing this book, an Azure subscription can create up to six SQL Database servers, each of which can host up to 150 databases (including the master database).
These are soft limits that can be increased by arrangement with Microsoft Support. From a billing perspective, only the database unit is counted, as the server unit is just a container. However, to avoid wasting unused resources, an empty server is automatically deleted after 90 days of not hosting user databases.

The SQL Database server is provisioned on the Azure Portal. The Region, as well as the administrator login and password, must be specified during the provisioning process. After the SQL Database server has been provisioned, the firewall rules used to restrict access to the databases associated with the SQL Database server can be modified on the Azure Portal, using Transact SQL, or through the SQL Database Service Management REST API. The result of the provisioning process is a SQL Database server identified by a fully qualified DNS name such as SERVER_NAME.database.windows.net, where SERVER_NAME is an automatically generated (random and unique) string that differentiates this SQL Database server from any other. The provisioning process also creates the master database for the SQL Database server and adds a user and associated login for the administrator specified during the provisioning process. This user has the rights to create other databases associated with this SQL Database server, as well as any logins needed to access them.

Remember to distinguish between the SQL Database service and the familiar SQL Server engine, which is also available on the Azure platform, but as a plain installation over VMs. In the latter case, you retain complete control of the instance that runs SQL Server, the installation details, and the effort needed to maintain it over time. Also, remember that SQL Server virtual machines have different pricing from the standard VMs due to their license costs.

An administrator can create a SQL Database either on the Azure Portal or using the CREATE DATABASE Transact SQL statement.
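As mentioned, the server-level firewall rules can also be managed in Transact SQL by connecting to the master database. A minimal sketch follows; the rule name and IP addresses are placeholders, not values from this recipe:

```sql
-- Run against the master database of the SQL Database server.
-- Creates (or updates) a server-level firewall rule for a single IP.
EXEC sp_set_firewall_rule
    @name = N'AllowMyWorkstation',
    @start_ip_address = '203.0.113.42',
    @end_ip_address = '203.0.113.42';

-- Inspect the current rules, and remove one when no longer needed.
SELECT * FROM sys.firewall_rules;
EXEC sp_delete_firewall_rule @name = N'AllowMyWorkstation';
```

The special rule created by the Allow Windows Azure Services checkbox shows up here as the 0.0.0.0 to 0.0.0.0 range.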
At the time of writing this book, SQL Database runs in the following two different modes:

Version 1.0: This refers to the Web or Business Editions
Version 2.0: This refers to the Basic, Standard, or Premium service tiers with performance levels

The first version will be deprecated in a few months. Web Edition was designed for small databases under 5 GB, and Business Edition for databases of 10 GB and larger (up to 150 GB). There is no difference between these editions other than the maximum size and billing increment. The second version introduced service tiers (the equivalent of Editions) with an additional parameter (performance level) that sets the amount of dedicated resources for a given database. The new service tiers (Basic, Standard, and Premium) introduced a lot of advanced features, such as active/passive geo-replication, point-in-time restore, cross-region copy, and restore. Different performance levels have different limits, such as the Database Throughput Unit (DTU) and the maximum DB size. An updated list of service tiers and performance levels can be found at http://msdn.microsoft.com/en-us/library/dn741336.aspx.

Once a SQL Database has been created, the ALTER DATABASE Transact SQL statement can be used to alter either the edition or the maximum size of the database. The maximum size is important, as the database is made read-only once it reaches that size (with error number 40544 and the message The database has reached its size quota).

In this recipe, we'll learn how to create a SQL Database server and a database using the Azure Portal and T-SQL.

Getting Ready

To perform the majority of the operations in this recipe, just a plain internet browser is needed. However, to connect directly to the server, we will use SQL Server Management Studio (also available in the Express version).

How to do it...

First, we are going to create a SQL Database server using the Azure Portal.
We will do this using the following steps:

1. On the Azure Portal, go to the SQL DATABASES section and then select the SERVERS tab.
2. In the bottom menu, select Add.
3. In the CREATE SERVER window, provide an administrator login and password.
4. Select a Subscription and Region that will host the server. To enable access to the server from the other Azure services, you can check the Allow Windows Azure Services to access the server checkbox; this is a special firewall rule that allows the 0.0.0.0 to 0.0.0.0 IP range.
5. Confirm and wait a few seconds to complete the operation.
6. After that, using the Azure Portal, go to the SQL DATABASES section and then the SERVERS tab.
7. Select the previously created server by clicking on its name.
8. In the server page, go to the DATABASES tab.
9. In the bottom menu, click on Add; then, after clicking on NEW SQL DATABASE, the CUSTOM CREATE window will open.
10. Specify a name and select the Web Edition. Set the maximum database size to 5 GB and leave the COLLATION dropdown at its default.

SQL Database fees are charged differently if you are using the Web/Business Edition rather than the Basic/Standard/Premium service tiers. The most updated pricing scheme for SQL Database can be found at http://azure.microsoft.com/en-us/pricing/details/sql-database/.

11. Verify the server on which you are creating the database (it is specified in the SERVER dropdown) and confirm.
12. Alternatively, using Transact SQL, launch Microsoft SQL Server Management Studio and open the Connect to Server window.
13. In the Server name field, specify the fully qualified name of the newly created SQL Database server in the following form: serverName.database.windows.net.
14. Choose the SQL Server Authentication method.
15. Specify the administrative username and password associated earlier.
16. Click on the Options button and check the Encrypt connection checkbox. This setting is particularly critical while accessing a remote SQL Database.
Without encryption, a malicious user could extract from the network traffic all the information needed to log in to the database. By specifying the Encrypt connection flag, we are telling the client to connect only if a valid certificate is found on the server side.

17. Optionally check the Remember password checkbox and connect to the server. To connect remotely to the server, a firewall rule should be created.
18. In the Object Explorer window, locate the server you connected to, navigate to the Databases | System Databases folder, then right-click on the master database and select New Query.
19. Copy and execute this query and wait for its completion:

CREATE DATABASE DATABASE_NAME
(
    MAXSIZE = 1 GB
)

How it works...

The first part is pretty straightforward. In steps 1 and 2, we go to the SQL Database section of the Azure Portal, locating the tab to manage the servers. In step 3, we fill the online popup with the administrative login details, and in step 4, we select a Region to place the SQL Database server. As a server (with its databases) is located in a Region, it is not possible to automatically migrate it to another Region.

After the creation of the container resource (the server), we create the SQL Database by adding a new database to the newly created server, as stated in steps 6 to 9. In step 10, we can optionally change the default collation of the database and its maximum size.

In the last part, we use SQL Server Management Studio (SSMS) (step 12) to connect to the remote SQL Database instance. We notice that even before creating a database of our own, there is a default database (the master one) we can connect to. After we set up the parameters in steps 13, 14, and 15, we enable the encryption requirement for the connection in step 16. Remember to always set the encryption before connecting to or listing the databases of a remote endpoint, as every single operation performed without encryption sends plain credentials over the network.
In step 17, we connect to the server if it grants access to our IP. Finally, in step 18, we open a contextual query window, and in step 19, we execute the creation query, specifying a maximum size for the database. Note that the database Edition should be specified in the CREATE DATABASE query as well. By default, the Web Edition is used. To override this, the following query can be used:

CREATE DATABASE MyDB
(
    Edition='Basic'
)

There's more…

We can also use the web-based Management Portal to perform various operations against the SQL Database, such as invoking Transact SQL commands, altering tables, viewing occupancy, and monitoring performance. We will launch the Management Portal using the following steps:

1. Obtain the name of the SQL Database server that contains the SQL Database.
2. Go to https://serverName.database.windows.net.
3. In the Database field, enter the database name (leave it empty to connect to the master database).
4. Fill the Username and Password fields with the login information and confirm.

Increasing the size of a database

We can use the ALTER DATABASE command to increase the size (or the Edition, with the Edition parameter) of a SQL Database by connecting to the master database and invoking the following Transact SQL command:

ALTER DATABASE DATABASE_NAME
MODIFY
(
    MAXSIZE = 5 GB
)

We must use one of the allowable database sizes.

Connecting to a SQL Database with Entity Framework

Azure SQL Database is a SQL Server-like, fully managed relational database engine. In many other recipes, we showed you how to connect transparently to a SQL Database just as we would to SQL Server, as SQL Database speaks the same TDS protocol as its on-premises sibling.
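As a baseline for what follows, a raw ADO.NET connection to a SQL Database looks like the minimal sketch below. The server name, credentials, and query are placeholders, and note the Encrypt=True flag discussed earlier:

```csharp
using System;
using System.Data.SqlClient;

class RawAdoNetExample
{
    static void Main()
    {
        // Placeholders: replace with your server name and credentials.
        var connectionString =
            "Server=tcp:SERVER_NAME.database.windows.net,1433;" +
            "Database=Northwind;" +
            "User ID=adminUser@SERVER_NAME;Password=myPassword;" +
            "Encrypt=True;TrustServerCertificate=False;";

        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            // Hardcoded SQL: nothing checks this string at compile time.
            var cmd = new SqlCommand(
                "SELECT CompanyName FROM Customers WHERE Country = @country",
                conn);
            cmd.Parameters.AddWithValue("@country", "Italy");
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // GetString(0) is untyped, positional access: a wrong
                    // index or type fails only at run time.
                    Console.WriteLine(reader.GetString(0));
                }
            }
        }
    }
}
```

The string-based query and the untyped, positional field access shown here are exactly the weaknesses that Entity Framework addresses.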
In addition, using raw ADO.NET could lead to some of the following issues:

Hardcoded SQL: Despite the fact that a developer should always write good code and make no errors, there is a real possibility of making mistakes while writing stringified SQL, which is not verified at design time and might lead to runtime issues. These kinds of errors only surface at run time, as everything that stays within the quotation marks compiles. The solution is to reduce every line of code to a command that is compile-time safe.

Type safety: As ADO.NET components were designed to provide a common layer of abstraction to developers who connect to several different data sources, the interfaces provided are generic for the retrieval of values from the fields of a data row. A developer could make a mistake by casting a field to the wrong data type, and they would realize it only at run time. The solution is to map table fields to the correct data types at compile time.

Long repetitive actions: We can always write our own wrapper to reduce code replication in the application, but using a high-level library, such as an ORM, can take away most of the repetitive work of opening a connection, reading data, and so on.

Entity Framework hides the complexity of the data access layer and provides developers with an intermediate abstraction layer to let them operate on a collection of objects instead of rows of tables. The power of the ORM itself is enhanced by the usage of LINQ, a library of extension methods that, in synergy with the language capabilities (anonymous types, expression trees, lambda expressions, and so on), makes DB access easier and less error prone than in the past. This recipe is an introduction to Entity Framework, the ORM of Microsoft, in conjunction with the Azure SQL Database.

Getting Ready

The database used in this recipe is the Northwind sample database of Microsoft.
It can be downloaded from CodePlex at http://northwinddatabase.codeplex.com/.

How to do it…

We are going to connect to the SQL Database using Entity Framework and perform various operations on data. We will do this using the following steps:

1. Add a new class named EFConnectionExample to the project.
2. Add a new ADO.NET Entity Data Model named Northwind.edmx to the project; the Entity Data Model Wizard window will open.
3. Choose Generate from database in the Choose Model Contents step.
4. In the Choose Your Data Connection step, select the Northwind connection from the dropdown or create a new connection if it is not shown.
5. Save the connection settings in the App.config file for later use and name the setting NorthwindEntities. If VS asks for the version of EF to use, select the most recent one.
6. In the last step, choose the objects to include in the model. Select the Tables, Views, Stored Procedures, and Functions checkboxes.
7. Add the following method, retrieving every CompanyName, to the class:

private IEnumerable<string> NamesOfCustomerCompanies()
{
    using (var ctx = new NorthwindEntities())
    {
        return ctx.Customers
            .Select(p => p.CompanyName).ToArray();
    }
}

8. Add the following method, updating every customer located in Italy, to the class:

private void UpdateItalians()
{
    using (var ctx = new NorthwindEntities())
    {
        ctx.Customers.Where(p => p.Country == "Italy")
            .ToList().ForEach(p => p.City = "Milan");
        ctx.SaveChanges();
    }
}

9. Add the following method, inserting a new order for the first Italian company alphabetically, to the class:

private int FirstItalianPlaceOrder()
{
    using (var ctx = new NorthwindEntities())
    {
        var order = new Orders()
        {
            EmployeeID = 1,
            OrderDate = DateTime.UtcNow,
            ShipAddress = "My Address",
            ShipCity = "Milan",
            ShipCountry = "Italy",
            ShipName = "Good Ship",
            ShipPostalCode = "20100"
        };
        ctx.Customers.Where(p => p.Country == "Italy")
            .OrderBy(p => p.CompanyName)
            .First().Orders.Add(order);
        ctx.SaveChanges();
        return order.OrderID;
    }
}

10. Add the following method, removing the previously inserted order, to the class:

private void RemoveTheFunnyOrder(int orderId)
{
    using (var ctx = new NorthwindEntities())
    {
        var order = ctx.Orders
            .FirstOrDefault(p => p.OrderID == orderId);
        if (order != null) ctx.Orders.Remove(order);
        ctx.SaveChanges();
    }
}

11. Add the following method, using the methods added earlier, to the class:

public static void UseEFConnectionExample()
{
    var example = new EFConnectionExample();
    var customers = example.NamesOfCustomerCompanies();
    foreach (var customer in customers)
    {
        Console.WriteLine(customer);
    }
    example.UpdateItalians();
    var order = example.FirstItalianPlaceOrder();
    example.RemoveTheFunnyOrder(order);
}

How it works…

This recipe uses EF to connect to and operate on a SQL Database. In step 1, we create a class that contains the recipe, and in step 2, we open the wizard for the creation of the Entity Data Model (EDMX). We create the model starting from an existing database in step 3 (it is also possible to write our own model and then persist it in an empty database), and then we select the connection in step 4. In fact, there is no reference in the entire code to the Windows Azure SQL Database. The only reference should be in the App.config settings created in step 5; this can be changed to point to a SQL Server instance, leaving the code untouched.

The last step of the EDMX creation consists of the concrete mapping between the relational tables and the object model, as shown in step 6. This method generates the code classes that map the table schema, using strong types and collections referred to as navigation properties. It is also possible to start from the code, writing the classes that represent the database schema. This method is known as Code-First.

In step 7, we ask for every CompanyName in the Customers table. Every table in EF is represented by DbSet<Type>, where Type is the class of the entity.
In steps 7 and 8, Customers is DbSet<Customers>, and we use one lambda expression to project (select) a property field and another one to create a filter (where) based on a property value. The SaveChanges method in step 8 persists to the database the changes detected in the disconnected object data model. This magic is one of the purposes of an ORM tool.

In step 9, we use the navigation property (relationship) between a Customers object and the Orders collection (table) to add a new order with sample data. We use the OrderBy extension method to order the results by the specified property, and finally, we save the newly created item. Here too, EF automatically keeps track of the newly added item. Additionally, after the SaveChanges method, EF populates the identity field of the Order (OrderID) with the actual value created by the database engine.

In step 10, we use the previously obtained OrderID to remove the corresponding order from the database. We use the FirstOrDefault() method to test the existence of the ID, and then we remove the resulting object just as we would remove an object from a plain old collection. In step 11, we use the methods created earlier to run the demo and show the results.

Deploying a Website

Creating a Website is an administrative task, which is performed in the Azure Portal in the same way we provision every other building block. The Website created is like a "deployment slot", or better, a "web space", since the abstraction given to the user is exactly that. Azure Websites does not require additional knowledge compared to an old-school hosting provider, where FTP was the standard for the deployment process. Actually, FTP is just one of the supported deployment methods in Websites, and Web Deploy is probably the better choice for several scenarios. Web Deploy is a Microsoft technology used to copy files and provision additional content and configuration as part of the deployment process.
Web Deploy runs over HTTP and HTTPS with basic (username and password) authentication. This makes it a good choice in networks where FTP is forbidden or the firewall rules are strict. Some time ago, Microsoft introduced the concept of the Publish Profile, an XML file containing all the available deployment endpoints of a particular website that, if given to Visual Studio or WebMatrix, makes deployment easier. Every Azure Website comes with a publish profile with unique credentials, so one can distribute it to developers without granting them rights on the Azure Subscription. WebMatrix is a client tool of Microsoft, useful for editing live sites directly from an intuitive GUI. It uses Web Deploy to provide access to the remote filesystem and to perform remote changes.

In Websites, we can host several websites on the same server farm, making administration easier and isolating the environment from the neighborhood. Moreover, virtual directories can be defined from the Azure Portal, enabling complex scenarios or making migrations easier. In this recipe, we will cope with the deployment process, using FTP and Web Deploy with some variants.

Getting ready

This recipe assumes we have an FTP client installed on the local machine (for example, FileZilla) and, of course, a valid Azure Subscription. We also need Visual Studio 2013 with the latest Azure SDK installed (at the time of writing, SDK Version 2.3).

How to do it…

We are going to create a new Website, create a new ASP.NET project, deploy it through FTP and Web Deploy, and also use virtual directories. We do this as follows:

1. Create a new Website in the Azure Portal, specifying the following details:
   The URL prefix (that is, TestWebSite) is set to [prefix].azurewebsites.net
   The Web Hosting Plan (create a new one)
   The Region/Location (select West Europe)
2. Click on the newly created Website and go to the Dashboard tab.
3. Click on Download the publish profile and save it on the local computer.
4. Open Visual Studio and create a new ASP.NET web application named TestWebSite, with an empty template and Web Forms references.
5. Add a sample Default.aspx page to the project and paste the following HTML into it:

<h1>Root Application</h1>

6. Press F5 and test whether the web application is displayed correctly.
7. Create a local publish target: right-click on the project and select Publish.
8. Select Custom and name the profile Local Folder.
9. In the Publish method, select File System and provide a local folder where Visual Studio will save the files; then click on Publish to complete.
10. To publish via FTP, open FileZilla and then open the publish profile (saved in step 3) with a text editor. Locate the FTP endpoint and specify the following:
    publishUrl as the Host field
    userName as the Username field
    userPWD as the Password field
11. Delete the hostingstart.html file that is already present on the remote space.

When we create a new Azure Website, there is a single HTML file in the root folder by default, which is served to clients as the default page. If left in the Website, this file could also be served after users' deployments, whenever no valid default documents are found.

12. Drag and drop all the contents of the local folder with the binaries to the remote folder.
13. Run the website.
14. To publish via Web Deploy, right-click on the project and select Publish.
15. Go back to the start of the Publish Web wizard and select Import, providing the previously downloaded publish profile file.
16. When Visual Studio reads the Web Deploy settings, it populates the next window; click on Confirm and publish the web application.
17. To create an additional virtual directory, go to the Configure tab of the Website on the Azure Portal.
18. At the bottom, in the virtual applications and directories section, add /app01 with the path site\app01 and mark it as Application.
19. Open the publish profile file and duplicate the <publishProfile> tag that has the FTP method; then, in the copy, add the suffix App01 to profileName and replace wwwroot with app01 in publishUrl.
20. Create a new ASP.NET web application called TestWebSiteApp01 and create a new Default.aspx page in it with the following code:

<h1>App01 Application</h1>

21. Right-click on the TestWebSiteApp01 project and select Publish.
22. Select Import and provide the edited publish profile file.
23. In the first step of the Publish Web wizard (go back if necessary), select the App01 method and select Publish.
24. Run the Website's virtual application by appending the /app01 suffix to the site URL.

How it works...

In step 1, we create the Website on the Azure Portal, specifying the minimal set of parameters. If an existing web hosting plan is selected, the Website starts in the specified tier. In this recipe, by specifying a new web hosting plan, the Website is created in the free tier, with some limitations in configuration.

The recipe uses the Azure Portal located at https://manage.windowsazure.com. However, the new Azure Portal will be at https://portal.azure.com. New features will probably be added only in the new Portal.

In steps 2 and 3, we download the publish profile file, which is an XML file containing the various endpoints for publishing the Website. At the time of writing, Web Deploy and FTP are supported by default. In steps 4, 5, and 6, we create a new ASP.NET web application with a sample ASPX page and run it locally. In steps 7, 8, and 9, we publish the binaries of the Website, without source code files, into a local folder somewhere on the local machine. This unit of deployment (the folder) can then be sent across the wire via FTP, as we do in steps 10 to 13, using the credentials and the hostname available in the Publish Profile file.
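The publish profile file is plain XML, which is why duplicating an endpoint by hand works. The following is a rough sketch of its shape; the endpoint values and credentials are placeholders, and real files carry additional attributes:

```xml
<publishData>
  <!-- Web Deploy endpoint -->
  <publishProfile profileName="TestWebSite - Web Deploy"
                  publishMethod="MSDeploy"
                  publishUrl="testwebsite.scm.azurewebsites.net:443"
                  msdeploySite="TestWebSite"
                  userName="$TestWebSite"
                  userPWD="PLACEHOLDER" />
  <!-- FTP endpoint: publishUrl, userName, and userPWD are the
       values used when configuring the FTP client -->
  <publishProfile profileName="TestWebSite - FTP"
                  publishMethod="FTP"
                  publishUrl="ftp://waws-prod-xx.ftp.azurewebsites.windows.net/site/wwwroot"
                  userName="TestWebSite\$TestWebSite"
                  userPWD="PLACEHOLDER" />
</publishData>
```

Duplicating the FTP <publishProfile> element and pointing its publishUrl at the virtual application folder is all that is needed to obtain a second deployment target.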
In steps 14 to 16, we use the publish profile file directly from Visual Studio, which recognizes the different methods of deployment and suggests Web Deploy as the default one. If we have already performed steps 10 to 13, steps 14 to 16 overwrite the existing deployment. Actually, Web Deploy compares the target files with the ones to deploy, making the deployment incremental for those files that have been modified or added. This is extremely useful to avoid unnecessary transfers and to save bandwidth.

In steps 17 and 18, we configure a new Virtual Application, specifying its name and location. We can use an FTP client to browse the root folder of a website endpoint, since there are several folders there, such as wwwroot, locks, diagnostics, and deployments. In step 19, we manually edit the publish profile file to support a second FTP endpoint, pointing to the new folder of the Virtual Application. Visual Studio will correctly understand this while parsing the file again in step 22, showing the new deployment option. Finally, we verify that there are two applications: one on the root folder / and one on the /app01 alias.

There's more…

Suppose we need to edit the website on the fly, editing a CSS or JS file or editing the HTML somewhere. We can do this using WebMatrix, which is available from the Azure Portal itself through a ClickOnce installation:

1. Go to the Dashboard tab of the Website and click on WebMatrix at the bottom.
2. Follow the instructions to install the software (if not yet installed) and, when it opens, select Edit live site directly (the magic is done through the publish profile file and Web Deploy).
3. In the left-side tree, edit the Default.aspx file, then save and run the Website again.

Azure Websites gallery

Since Azure Websites is a PaaS service, with no lock-in and no particular knowledge or framework required to run it, it can host several open source CMSs in different languages.
Azure provides a set of built-in web applications to choose from while creating a new website. This is probably not the best choice for production environments; however, for testing or development purposes, it can be a faster option than starting from scratch. Wizards have been, for a while, the primary resource for developers to quickly start off projects and speed up the process of creating complex environments. The Websites gallery creates instances of well-known CMSs with predefined configurations. Production environments, by contrast, are usually crafted manually, customizing each aspect of the installation. To create a new Website using the gallery, proceed as follows:

1. Create a new Website, specifying from gallery.
2. Select the web application to deploy and follow the optional configuration steps.

If we create some resources (such as databases) while using the gallery, they will be linked to the site in the Linked Resources tab.

Building a simple cache for applications

Azure Cache is a managed service with (at the time of writing this book) the following three offerings:

Basic: This offering has a unit size of 128 MB, up to 1 GB, with one named cache (the default one)
Standard: This offering has a unit size of 1 GB, up to 10 GB, with 10 named caches and support for notifications
Premium: This offering has a unit size of 5 GB, up to 150 GB, with 10 named caches, support for notifications, and high availability

Different offerings have different unit prices, and remember that when changing from one offering to another, all the cache data is lost. In all offerings, users can define the items' expiration.

The Cache service listens on a specific TCP port. Accessing it from a .NET application is quite simple with the Microsoft ApplicationServer Caching library, available on NuGet.
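Once the Microsoft.WindowsAzure.Caching NuGet package is installed, the client can be configured either in code (as shown later in this recipe) or declaratively in App.config/Web.config. The following is a minimal sketch of the configuration section the package adds; the endpoint identifier is a placeholder, and the access key placeholder follows the bracket convention used in this recipe:

```xml
<configuration>
  <configSections>
    <section name="dataCacheClients"
             type="Microsoft.ApplicationServer.Caching.DataCacheClientsSection, Microsoft.ApplicationServer.Caching.Core"
             allowLocation="true" allowDefinition="Everywhere" />
  </configSections>
  <dataCacheClients>
    <dataCacheClient name="default">
      <!-- Endpoint of the Cache service created via PowerShell -->
      <autoDiscover isEnabled="true" identifier="myCache.cache.windows.net" />
      <securityProperties mode="Message" sslEnabled="true">
        <messageSecurity authorizationInfo="[cache token/key]" />
      </securityProperties>
    </dataCacheClient>
  </dataCacheClients>
</configuration>
```

With this section in place, new DataCacheFactory() can be called without any code-based configuration, which is the counterpart of the DataCacheFactoryConfiguration approach used in the recipe.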
In the Microsoft.ApplicationServer.Caching namespace, the following classes are all that are needed to operate:

DataCacheFactory: This class is responsible for instantiating the Cache proxies, interpreting the configuration settings.
DataCache: This class is responsible for the read/write operations against the cache endpoint.
DataCacheFactoryConfiguration: This is the model class for the configuration settings of a cache factory. Its usage is optional, as the cache can also be configured in the App/Web.config file, in a specific configuration section.

Azure Cache is a key-value cache. We can insert, and later get, even complex objects with arbitrary tree depth, using string keys to locate them. The key is critical, as in a single named cache only one object can exist for a given key. Architects and developers should have a proper strategy in place to deal with unique (and hierarchical) names.

Getting ready

This recipe assumes that we have a valid Azure Cache endpoint of the Standard type. We need the Standard type because we use multiple named caches and, in later recipes, notifications. We can create a Standard Cache endpoint of 1 GB via PowerShell. Perform the following steps to create the Standard Cache endpoint:

1. Open the Azure PowerShell and type Add-AzureAccount. A popup window might appear. Type the credentials connected to a valid Azure subscription and continue.
2. Optionally, select the proper Subscription, if not the default one.
3. Type this command to create a new Cache endpoint, replacing myCache with a proper unique name:

New-AzureManagedCache -Name myCache -Location "West Europe" -Sku Standard -Memory 1GB

4. After waiting some minutes until the endpoint is ready, go to the Azure Portal and look for the Manage Keys section to get one of the two Access Keys of the Cache endpoint.
5. In the Configure section of the Cache endpoint, a cache named default is created by default.
In addition, create two named caches with the following parameters: Expiry Policy: Absolute Time: 10 Notifications: Enabled Expiry Policy could be Absolute (the default expiration time or the one set by the user is absolute, regardless of how many times the item has been accessed), Sliding (each time the item has been accessed, the expiration timer resets), or Never (items do not expire). This Azure Cache endpoint is now available in the Management Portal, and it will be used in the entire article. How to do it… We are going to create a DataCache instance through a code-based configuration. We will perform simple operations with Add, Get, Put, and Append/Prepend, using a secondary-named cache to transfer all the contents of the primary one. We will do this by performing the following steps: Add a new class named BuildingSimpleCacheExample to the project. Install the Microsoft.WindowsAzure.Caching NuGet package. Add the following using statement to the top of the class file: using Microsoft.ApplicationServer.Caching; Add the following private members to the class: private DataCacheFactory factory = null; private DataCache cache = null; Add the following constructor to the class: public BuildingSimpleCacheExample(string ep, string token,string cacheName) { DataCacheFactoryConfiguration config = new DataCacheFactoryConfiguration(); config.AutoDiscoverProperty = new DataCacheAutoDiscoverProperty(true, ep); config.SecurityProperties = new DataCacheSecurity(token, true); factory = new DataCacheFactory(config); cache = factory.GetCache(cacheName); } Add the following method, creating a palindrome string into the cache: public void CreatePalindromeInCache() { var objKey = "StringArray"; cache.Put(objKey, ""); char letter = 'A'; for (int i = 0; i < 10; i++) { cache.Append(objKey, char.ConvertFromUtf32((letter+i))); cache.Prepend(objKey, char.ConvertFromUtf32((letter + i))); } Console.WriteLine(cache.Get(objKey)); } Add the following method, adding an item into the cache to 
analyze its subsequent retrievals:

public void AddAndAnalyze()
{
    var randomKey = DateTime.Now.Ticks.ToString();
    var value = "Cached string";
    cache.Add(randomKey, value);
    DataCacheItem cacheItem = cache.GetCacheItem(randomKey);
    Console.WriteLine(string.Format(
        "Item stored in {0} region with {1} expiration",
        cacheItem.RegionName, cacheItem.Timeout));
    cache.Put(randomKey, value, TimeSpan.FromSeconds(60));
    cacheItem = cache.GetCacheItem(randomKey);
    Console.WriteLine(string.Format(
        "Item stored in {0} region with {1} expiration",
        cacheItem.RegionName, cacheItem.Timeout));
    var version = cacheItem.Version;
    var obj = cache.GetIfNewer(randomKey, ref version);
    if (obj == null)
    {
        // No updates
    }
}

8. Add the following method, transferring the contents of the first named cache into a second one:

public void BackupToDestination(string destCacheName)
{
    var destCache = factory.GetCache(destCacheName);
    var dump = cache.GetSystemRegions()
        .SelectMany(p => cache.GetObjectsInRegion(p))
        .ToDictionary(p => p.Key, p => p.Value);
    foreach (var item in dump)
    {
        destCache.Put(item.Key, item.Value);
    }
}

9. Add the following method to clear the first named cache:

public void ClearCache()
{
    cache.Clear();
}

10. Add the following method, using the methods added earlier, to the class:

public static void RunExample()
{
    var cacheName = "[named cache 1]";
    var backupCache = "[named cache 2]";
    string endpoint = "[cache endpoint]";
    string token = "[cache token/key]";

    BuildingSimpleCacheExample example =
        new BuildingSimpleCacheExample(endpoint, token, cacheName);
    example.CreatePalindromeInCache();
    example.AddAndAnalyze();
    example.BackupToDestination(backupCache);
    example.ClearCache();
}

How it works...

From steps 1 to 3, we set up the class. In step 4, we add private members to store the DataCacheFactory object used to create the DataCache object that accesses the Cache service.
In the constructor that we add in step 5, we initialize the DataCacheFactory object using a configuration model class (DataCacheFactoryConfiguration). This strategy supports code-based initialization whenever settings cannot stay in the App.config/Web.config file. In step 6, we use the Put() method to write an empty string into the StringArray bucket. We then use the Append() and Prepend() methods, designed to concatenate strings to existing strings, to build a palindrome string in the memory cache. This sample does not make much sense in real-world scenarios, and we must pay attention to the following issues:

- Writing an empty string into the cache is somewhat useless.
- Each Append() or Prepend() operation travels over TCP to the cache and back. Though very simple, it consumes resources, and we should always try to consolidate calls.

In step 7, we use the Add() method to add a string to the cache. The difference between the Add() and Put() methods is that the first throws an exception if the item already exists, while the second always overwrites the existing value (or writes it for the first time). GetCacheItem() returns a DataCacheItem object, which wraps the value together with other metadata properties, such as the following:

- CacheName: This is the named cache where the object is stored.
- Key: This is the key of the associated bucket.
- RegionName (user defined or system defined): This is the region of the cache where the object is stored.
- Size: This is the size of the stored object.
- Tags: These are the optional tags of the object, if it is located in a user-defined region.
- Timeout: This is the current timeout before the object expires.
- Version: This is the version of the object. It is a DataCacheItemVersion object whose properties are not accessible due to their access modifiers. However, it is not important to access these properties, as the Version object is used as a token against the Cache service to implement optimistic concurrency.
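To make the role of the version token concrete, here is a sketch of an optimistic update, using the `cache` instance from the recipe. It assumes the DataCache.Put() overload that accepts a DataCacheItemVersion: the write succeeds only if the cached item still has the version we read; otherwise, the service rejects it with a DataCacheException and we can re-read and retry. The key and value are purely illustrative:

```csharp
// Read the item together with its version token.
DataCacheItem item = cache.GetCacheItem("StringArray");
string current = (string)item.Value;

try
{
    // Conditional write: succeeds only if no one else
    // updated the item since we read it.
    cache.Put("StringArray", current + "!", item.Version);
}
catch (DataCacheException)
{
    // Another writer won the race: re-read the item and
    // decide whether to retry or discard our change.
}
```

This is the classic compare-and-swap pattern applied to a distributed cache; it avoids taking locks while still preventing lost updates.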
As for the timestamp value, its semantics can stay hidden from developers. The first Add() method does not specify a timeout for the object, leaving the default global expiration timeout, while the next Put() method does, as we can check in the subsequent GetCacheItem() call. We finally ask the cache about the object with the GetIfNewer() method, passing the latest version token we have. This conditional Get method returns null if the object we own is already the latest one.

In step 8, we list all the keys of the first named cache, using the GetSystemRegions() method (to first list the system-defined regions), and for each region, we ask for its objects, copying them into the second named cache. In step 9, we clear all the contents of the first cache. In step 10, we call the methods added earlier, specifying the Cache endpoint to connect to and the token/password, along with the two named caches in use. Replace [named cache 1], [named cache 2], [cache endpoint], and [cache token/key] with actual values.

There's more…

Code-based configuration is useful when the settings stay in a different place from the default .NET config files. It is not a best practice to hardcode them, so this is the standard way to declare them in the App.config file:

<configSections>
  <section name="dataCacheClients"
           type="Microsoft.ApplicationServer.Caching.DataCacheClientsSection, Microsoft.ApplicationServer.Caching.Core"
           allowLocation="true"
           allowDefinition="Everywhere" />
</configSections>

The XML mentioned earlier declares a custom section, which should be as follows:

<dataCacheClients>
  <dataCacheClient name="[name of cache]">
    <autoDiscover isEnabled="true" identifier="[domain of cache]" />
    <securityProperties mode="Message" sslEnabled="true">
      <messageSecurity authorizationInfo="[token of endpoint]" />
    </securityProperties>
  </dataCacheClient>
</dataCacheClients>

In the upcoming recipes, we will use this convention to set up the DataCache objects.
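With the dataCacheClients section in place, the factory can be created without any code-based settings. The following is a minimal sketch, assuming the DataCacheFactoryConfiguration constructor overload that accepts the client name declared in configuration ([name of cache] and [named cache] are the same placeholders used above):

```csharp
// Reads the dataCacheClient named "[name of cache]"
// from the App.config/Web.config dataCacheClients section.
DataCacheFactory factory = new DataCacheFactory(
    new DataCacheFactoryConfiguration("[name of cache]"));

// Opens the named cache created in the Azure Cache service.
DataCache cache = factory.GetCache("[named cache]");
```

This keeps endpoints and tokens out of the source code, which is the point of the configuration-based approach described above.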
ASP.NET Support

With almost no effort, the Azure Cache can be used in ASP.NET both as an Output Cache provider and as a session state store. To enable this, in addition to the configuration mentioned earlier, we need to include these declarations in the <system.web> section as follows:

<sessionState mode="Custom" customProvider="AFCacheSessionStateProvider">
  <providers>
    <add name="AFCacheSessionStateProvider"
         type="Microsoft.Web.DistributedCache.DistributedCacheSessionStateStoreProvider, Microsoft.Web.DistributedCache"
         cacheName="[named cache]"
         dataCacheClientName="[name of cache]"
         applicationName="AFCacheSessionState"/>
  </providers>
</sessionState>
<caching>
  <outputCache defaultProvider="AFCacheOutputCacheProvider">
    <providers>
      <add name="AFCacheOutputCacheProvider"
           type="Microsoft.Web.DistributedCache.DistributedCacheOutputCacheProvider, Microsoft.Web.DistributedCache"
           cacheName="[named cache]"
           dataCacheClientName="[name of cache]"
           applicationName="AFCacheOutputCache" />
    </providers>
  </outputCache>
</caching>

The difference between [name of cache] and [named cache] is as follows:

- The [name of cache] part is the friendly name (an alias) of the cache client declared in the configuration shown earlier.
- The [named cache] part is the named cache created in the Azure Cache service.

Connecting to the Azure Storage service

In an Azure Cloud Service, the storage account name and access key are stored in the service configuration file. By convention, the account name and access key for data access are provided in a setting named DataConnectionString. The account name and access key needed for Azure Diagnostics must be provided in a setting named Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString. The DataConnectionString setting must be declared in the ConfigurationSettings section of the service definition file. However, unlike other settings, the connection string setting for Azure Diagnostics is implicitly defined when the Diagnostics module is specified in the Imports section of the service definition file.
Consequently, it must not be specified in the ConfigurationSettings section. A best practice is to use different storage accounts for application data and diagnostic data. This reduces the possibility of application data access being throttled by competition for concurrent writes from the diagnostics monitor.

What is Throttling? In shared services, where the same resources are shared between tenants, limiting concurrent access to them is critical to providing service availability. If a client misuses the service or, more precisely, generates a huge amount of traffic, other tenants pointing to the same shared resource could experience unavailability. Throttling (also known as traffic control plus request cutting) is one of the most widely adopted solutions to this issue.

Using separate accounts also provides a security boundary between application data and diagnostics data, as diagnostics data might be accessed by individuals who should have no access to application data.

In the Azure Storage library, access to the storage service is through one of the client classes. There is one client class for each of the Blob service, Queue service, and Table service; they are CloudBlobClient, CloudQueueClient, and CloudTableClient, respectively. Instances of these classes store the pertinent endpoint, as well as the account name and access key.

The CloudBlobClient class provides methods to access containers, list their contents, and get references to containers and blobs. The CloudQueueClient class provides methods to list queues and get a reference to the CloudQueue instance that is used as an entry point to the Queue service functionality. The CloudTableClient class provides methods to manage tables and get the TableServiceContext instance that is used to access the WCF Data Services functionality while accessing the Table service.
Note that the CloudBlobClient, CloudQueueClient, and CloudTableClient instances are not thread safe, so distinct instances should be used when accessing these services concurrently. The client classes must be initialized with the account name and access key, as well as the appropriate storage service endpoint. The Microsoft.WindowsAzure.Storage.Auth namespace has several helper classes. The StorageCredentials class initializes an instance from an account name and access key, or from a shared access signature. In this recipe, we'll learn how to use the CloudBlobClient, CloudQueueClient, and CloudTableClient instances to connect to the storage service.

Getting ready

This recipe assumes that the application's configuration file contains the following:

<appSettings>
  <add key="DataConnectionString" value="DefaultEndpointsProtocol=https;AccountName={ACCOUNT_NAME};AccountKey={ACCOUNT_KEY}"/>
  <add key="AccountName" value="{ACCOUNT_NAME}"/>
  <add key="AccountKey" value="{ACCOUNT_KEY}"/>
</appSettings>

We must replace {ACCOUNT_NAME} and {ACCOUNT_KEY} with appropriate values for the storage account name and access key, respectively. We are not working in a Cloud Service but in a simple console application: storage services, like many other building blocks of Azure, can also be used on their own from on-premise environments.

How to do it...

We are going to connect to the Table service, the Blob service, and the Queue service, and perform a simple operation on each. We will do this using the following steps:

1. Add a new class named ConnectingToStorageExample to the project.
2. Add the following using statements to the top of the class file:

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using Microsoft.WindowsAzure.Storage.Queue;
using Microsoft.WindowsAzure.Storage.Table;
using Microsoft.WindowsAzure.Storage.Auth;
using System.Configuration;

The System.Configuration assembly should be added via the Add Reference action on the project, as it is not referenced in most of the Visual Studio project templates.

3. Add the following method, connecting to the Blob service, to the class:

private static void UseCloudStorageAccountExtensions()
{
    CloudStorageAccount cloudStorageAccount =
        CloudStorageAccount.Parse(
            ConfigurationManager.AppSettings["DataConnectionString"]);
    CloudBlobClient cloudBlobClient =
        cloudStorageAccount.CreateCloudBlobClient();
    CloudBlobContainer cloudBlobContainer =
        cloudBlobClient.GetContainerReference("{NAME}");
    cloudBlobContainer.CreateIfNotExists();
}

4. Add the following method, connecting to the Table service, to the class:

private static void UseCredentials()
{
    string accountName = ConfigurationManager.AppSettings["AccountName"];
    string accountKey = ConfigurationManager.AppSettings["AccountKey"];
    StorageCredentials storageCredentials =
        new StorageCredentials(accountName, accountKey);
    CloudStorageAccount cloudStorageAccount =
        new CloudStorageAccount(storageCredentials, true);
    CloudTableClient tableClient = new CloudTableClient(
        cloudStorageAccount.TableEndpoint, storageCredentials);
    CloudTable table = tableClient.GetTableReference("{NAME}");
    table.CreateIfNotExists();
}

5. Add the following method, connecting to the Queue service, to the class:

private static void UseCredentialsWithUri()
{
    string accountName = ConfigurationManager.AppSettings["AccountName"];
    string accountKey = ConfigurationManager.AppSettings["AccountKey"];
    StorageCredentials storageCredentials =
        new StorageCredentials(accountName, accountKey);
    StorageUri baseUri = new StorageUri(new Uri(string.Format(
"https://{0}.queue.core.windows.net/", accountName))); CloudQueueClient cloudQueueClient = new CloudQueueClient(baseUri, storageCredentials); CloudQueue cloudQueue = cloudQueueClient.GetQueueReference("{NAME}"); cloudQueue.CreateIfNotExists(); } Add the following method, using the other methods, to the class: public static void UseConnectionToStorageExample() { UseCloudStorageAccountExtensions(); UseCredentials(); UseCredentialsWithUri(); } How it works... In steps 1 and 2, we set up the class. In step 3, we implement the standard way to access the storage service using the Storage Client library. We use the static CloudStorageAccount.Parse() method to create a CloudStorageAccount instance from the value of the connection string stored in the configuration file. We then use this instance with the CreateCloudBlobClient() extension method of the CloudStorageAccount class to get the CloudBlobClient instance that we use to connect to the Blob service. We can also use this technique with the Table service and the Queue service, using the relevant extension methods, CreateCloudTableClient() and CreateCloudQueueClient(), respectively, for them. We complete this example using the CloudBlobClient instance to get a CloudBlobContainer reference to a container and then create it if it does not exist We need to replace {NAME} with the name for a container. In step 4, we create a StorageCredentials instance directly from the account name and access key. We then use this to construct a CloudStorageAccount instance, specifying that any connection should use HTTPS. Using this technique, we need to provide the Table service endpoint explicitly when creating the CloudTableClient instance. We then use this to create the table. We need to replace {NAME} with the name of a table. We can use the same technique with the Blob service and Queue service using the relevant CloudBlobClient or CloudQueueClient constructor. 
In step 5, we use a similar technique, except that we avoid the intermediate step of using a CloudStorageAccount instance and explicitly provide the endpoint for the Queue service. We use the CloudQueueClient instance created in this step to create the queue. We need to replace {NAME} with the name of a queue. Note that we hardcoded the endpoint for the Queue service. Though this last method is officially supported, it is not a best practice to bind our code to hardcoded strings with endpoint URIs, so it is preferable to use one of the previous methods, which hide the complexity of URI generation at the library level. In step 6, we add a method that invokes the methods added in the earlier steps.

There's more…

With the general availability of the .NET Framework version 4.5, many libraries of the CLR have added support for asynchronous methods with the Async/Await pattern. The latest versions of the Azure Storage library also have these overloads, which are useful when developing mobile applications and fast web APIs, and more generally whenever we need to bring the task-based execution model into our applications. Almost every long-running method of the library has a corresponding methodAsync() counterpart to be called as follows:

await cloudQueue.CreateIfNotExistsAsync();

In the rest of the book, we will continue to use the standard, synchronous pattern.

Adding messages to a Storage queue

The CloudQueue class in the Azure Storage library provides both synchronous and asynchronous methods to add a message to a queue. A message comprises up to 64 KB of data (48 KB when Base64-encoded). By default, the Storage library Base64 encodes message content to ensure that the request payload containing the message is valid XML. This encoding adds overhead that reduces the actual maximum size of a message. A queue message is not intended to transport a big payload, since the purpose of a queue is messaging, not storage.
If required, a user can store the payload in a blob and use a queue message to point to it, letting the receiver fetch the message along with the blob from its remote location. Each message added to a queue has a time-to-live property after which it is deleted automatically. The maximum and default time-to-live value is 7 days. In this recipe, we'll learn how to add messages to a queue.

Getting ready

This recipe assumes the following code is in the application configuration file:

<appSettings>
  <add key="DataConnectionString" value="DefaultEndpointsProtocol=https;AccountName={ACCOUNT_NAME};AccountKey={ACCOUNT_KEY}"/>
</appSettings>

We must replace {ACCOUNT_NAME} and {ACCOUNT_KEY} with appropriate values of the account name and access key.

How to do it...

We are going to create a queue and add some messages to it. We do this as follows:

1. Add a new class named AddMessagesOnStorageExample to the project.
2. Install the WindowsAzure.Storage NuGet package and add the following assembly reference to the project:

System.Configuration

3. Add the following using statements to the top of the class file:

using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;
using System.Configuration;

4. Add the following private member to the class:

private CloudQueue cloudQueue;

5. Add the following constructor to the class:

public AddMessagesOnStorageExample(String queueName)
{
    CloudStorageAccount cloudStorageAccount =
        CloudStorageAccount.Parse(
            ConfigurationManager.AppSettings["DataConnectionString"]);
    CloudQueueClient cloudQueueClient =
        cloudStorageAccount.CreateCloudQueueClient();
    cloudQueue = cloudQueueClient.GetQueueReference(queueName);
    cloudQueue.CreateIfNotExists();
}

6. Add the following method, adding three messages, to the class:

public void AddMessages()
{
    String content1 = "Do something";
    CloudQueueMessage message1 = new CloudQueueMessage(content1);
    cloudQueue.AddMessage(message1);

    String content2 = "Do something that expires in 1 day";
    CloudQueueMessage
    message2 = new CloudQueueMessage(content2);
    cloudQueue.AddMessage(message2, TimeSpan.FromDays(1.0));

    String content3 = "Do something that expires in 2 hours," +
        " starting in 1 hour from now";
    CloudQueueMessage message3 = new CloudQueueMessage(content3);
    cloudQueue.AddMessage(message3,
        TimeSpan.FromHours(2), TimeSpan.FromHours(1));
}

7. Add the following method, which uses the AddMessages() method, to the class:

public static void UseAddMessagesExample()
{
    String queueName = "{QUEUE_NAME}";
    AddMessagesOnStorageExample example =
        new AddMessagesOnStorageExample(queueName);
    example.AddMessages();
}

How it works...

In steps 1 through 3, we set up the class. In step 4, we add a private member to store the CloudQueue object used to interact with the Queue service. We initialize this in the constructor we add in step 5, where we also create the queue. In step 6, we add a method that adds three messages to a queue. We create three CloudQueueMessage objects. We add the first message to the queue with the default time-to-live of seven days, the second is added specifying an expiration of 1 day, and the third becomes visible 1 hour after entering the queue, with an absolute expiration of 2 hours. Note that a client (library) exception is thrown if we specify a visibility delay greater than the absolute TTL of the message; this is naturally obvious, and it is enforced on the client side instead of making a (failing) server call. In step 7, we add a method that invokes the methods we added earlier. We need to replace {QUEUE_NAME} with an appropriate name for a queue.

There's more…

To clear the queue of the messages we added in this recipe, we can call the Clear() method of the CloudQueue class as follows:

public void ClearQueue()
{
    cloudQueue.Clear();
}

Summary

In this article, we have worked through some of the recipes needed to build a complete overview of the software infrastructure that we need to set up on the cloud.
Windows Phone 8 Applications

Packt
23 Sep 2014
17 min read

In this article by Abhishek Sur, author of Visual Studio 2013 and .NET 4.5 Expert Cookbook, we will build your first Windows Phone 8 application following the MVVM pattern. We will work with Launchers and Choosers in a Windows Phone, relational databases and persistent storage, and notifications in a Windows Phone.
In this article by Abhishek Sur, author of Visual Studio 2013 and .NET 4.5 Expert Cookbook, we will build your first Windows Phone 8 application following the MVVM pattern. We will work with Launchers and Choosers in a Windows Phone, relational databases and persistent storage, and notifications in a Windows Phone (For more resources related to this topic, see here.) Introduction Windows Phones are the newest smart device that has come on to the market and host the Windows operating system from Microsoft. The new operating system that was recently introduced to the market significantly differs from the previous Windows mobile operating system. Microsoft has shifted gears on producing a consumer-oriented phone rather than an enterprise mobile environment. The operating system is stylish and focused on the consumer. It was built keeping a few principles in mind: Simple and light, with focus on completing primary tasks quickly Distinct typography (Segoe WP) for all its UI Smart and predefined animation for the UI Focus on content, not chrome (the whole screen is available to the application for use) Honesty in design Unlike the previous Windows Phone operating system, Windows Phone 8 is built on the same core on which Windows PC is now running. The shared core indicates that the Windows core system includes the same Windows OS, including NT Kernel, NT filesystem, and networking stack. Above the core, there is a Mobile Core specific to mobile devices, which includes components such as Multimedia, Core CLR, and IE Trident, as shown in the following screenshot: In the preceding screenshot, the Windows Phone architecture has been depicted. The Windows Core System is shared between the desktop and mobile devices. The Mobile Core is specific to mobile devices that run Windows Phone Shell, all the apps, and platform services such as background downloader/uploader and scheduler. 
It is important to note that even though both Windows 8 and Windows Phone 8 share the same core and most of the APIs, the implementations of the APIs differ from one another. The Windows 8 APIs are considered WinRT, while the Windows Phone 8 APIs are considered Windows Phone Runtime (WinPRT).

Building your first Windows Phone 8 application following the MVVM pattern

Windows Phone applications are generally created using either HTML5 or Silverlight. Most people still use the Silverlight approach, as it has the full flavor of backend languages such as C#, and the JavaScript library is still in its infancy. With Silverlight or XAML, the architecture that always comes to the developer's mind is MVVM. Like all XAML-based development, Windows Phone 8 Silverlight apps also inherently support the MVVM pattern and hence, people tend to adopt it more often when developing Windows Phone apps. In this recipe, we are going to take a quick look at how you can use the MVVM pattern to implement an application.

Getting ready

Before starting to develop an application, you first need to set up your machine with the appropriate SDK, which lets you develop a Windows Phone application and also gives you an emulator to debug the application without a device. The SDK for Windows Phone 8 apps can be downloaded from Windows Phone Dev Center at http://dev.windowsphone.com. The Windows Phone SDK includes the following:

- Microsoft Visual Studio 2012 Express for Windows Phone
- Microsoft Blend 2012 Express for Windows Phone
- The Windows Phone Device Emulator
- Project templates, reference assemblies, and headers/libraries

You also need a Windows 8 PC to run Visual Studio 2012 for Windows Phone. After everything has been set up for application development, you can open Visual Studio and create a Windows Phone app. When you create the project, it will first ask for the target platform; choose Windows Phone 8 as the default and select OK. You need to name and create the project.

How to do it...
Now that the template is created, let's follow these steps to demonstrate how we can start creating an application:

By default, the project template that is loaded will display a split view with the Visual Studio designer on the left-hand side and the XAML markup on the right-hand side. The MainPage.xaml file should already be loaded, with a lot of initial adjustments to support Windows Phone form factors. Microsoft makes sure that it gives the developer the best layout to start with. So the important thing that you need to look at is defining the content inside the ContentPanel property, which represents the workspace area of the page. The Visual Studio template for Windows Phone 8 already gives you a lot of hints on how to start writing your first app. The comments indicate where to start and how the project template behaves on code edits in XAML. Now let's define some XAML designs for the page. We will create a small page and use MVVM to connect to the data. For simplicity, we use dummy data to show on screen. Let's create a login screen for the application to start with.
Add a new page, call it Login.xaml, and add the following code in the ContentPanel defined inside the page:

<Grid x:Name="ContentPanel" Grid.Row="1" Margin="12,0,12,0" VerticalAlignment="Center">
    <Grid.RowDefinitions>
        <RowDefinition Height="Auto" />
        <RowDefinition Height="Auto" />
        <RowDefinition Height="Auto" />
    </Grid.RowDefinitions>
    <Grid.ColumnDefinitions>
        <ColumnDefinition />
        <ColumnDefinition />
    </Grid.ColumnDefinitions>
    <TextBlock Text="UserId" Grid.Row="0" Grid.Column="0" HorizontalAlignment="Right" VerticalAlignment="Center"/>
    <TextBox Text="{Binding UserId, Mode=TwoWay}" Grid.Row="0" Grid.Column="1" InputScope="Text"/>
    <TextBlock Text="Password" Grid.Row="1" Grid.Column="0" HorizontalAlignment="Right" VerticalAlignment="Center"/>
    <PasswordBox x:Name="txtPassword" Grid.Row="1" Grid.Column="1" PasswordChanged="txtPassword_PasswordChanged"/>
    <Button Command="{Binding LoginCommand}" Content="Login" Grid.Row="2" Grid.Column="0" />
    <Button Command="{Binding ClearCommand}" Content="Clear" Grid.Row="2" Grid.Column="1" />
</Grid>

In the preceding UI design, we added a TextBox and a PasswordBox inside the ContentPanel. Each TextBox has an InputScope property, which you can define to specify the behavior of the input. We define it as Text, which specifies that the TextBox can hold any textual data. The PasswordBox takes any input from the user, but shows asterisks (*) instead of the actual data. The actual data is stored in an encrypted format inside the control and can only be recovered using its Password property.

We are going to follow the MVVM pattern to design the application. We create a folder named Model in the solution and put a LoginDataContext class in it. This class is used to generate and validate the login of the UI. Make the class implement INotifyPropertyChanged, which indicates that the properties can take part in binding with the corresponding DependencyProperty that exists in the control, thereby interacting to and fro with the UI.
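The OnPropertyChanged helper called by the property setters that follow is not shown in the article; a minimal sketch of the INotifyPropertyChanged plumbing (the helper's body is our assumption, shown here with a sample UserId property mirroring the one defined next) could look like this:

```csharp
using System.ComponentModel;

public class LoginDataContext : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    private string userid;
    public string UserId
    {
        get { return userid; }
        // Assign the backing field, not the property itself,
        // to avoid infinite recursion; then notify the UI.
        set { userid = value; OnPropertyChanged("UserId"); }
    }

    // Raises PropertyChanged so that TwoWay bindings refresh the UI.
    protected void OnPropertyChanged(string propertyName)
    {
        var handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }
}
```

The data-binding engine subscribes to PropertyChanged, so every setter that raises it keeps the bound control in sync with the view model.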
We create properties for UserId, Password, and Status, as shown in the following code:

private string userid;
public string UserId
{
    get { return userid; }
    set
    {
        userid = value;
        this.OnPropertyChanged("UserId");
    }
}

private string password;
public string Password
{
    get { return password; }
    set
    {
        password = value;
        this.OnPropertyChanged("Password");
    }
}

public bool Status { get; set; }

You can see in the preceding code that each property setter invokes OnPropertyChanged, raising the PropertyChanged event. This ensures that updates to the properties are reflected in the UI controls:

public ICommand LoginCommand
{
    get
    {
        return new RelayCommand((e) =>
        {
            this.Status = this.UserId == "Abhishek" && this.Password == "winphone";
            if (this.Status)
            {
                var rootframe = App.Current.RootVisual as PhoneApplicationFrame;
                rootframe.Navigate(new Uri(string.Format(
                    "/FirstPhoneApp;component/MainPage.xaml?name={0}", this.UserId),
                    UriKind.Relative));
            }
        });
    }
}

public ICommand ClearCommand
{
    get
    {
        return new RelayCommand((e) =>
        {
            this.UserId = this.Password = string.Empty;
        });
    }
}

We also define two more properties of type ICommand. The UI Button control implements the command pattern and uses the ICommand interface to invoke a command. The RelayCommand used in the code is an implementation of the ICommand interface, which can be used to invoke an action. Now let's bind the Text property of the TextBox in XAML to the UserId property, and make it a TwoWay binding. The binder automatically subscribes to the PropertyChanged event. When the UserId property is set and the PropertyChanged event is raised, the UI automatically receives the notification, which updates the text in the UI.
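RelayCommand is not part of the framework, and its source is not included in the article; a minimal sketch matching the `new RelayCommand((e) => …)` usage above (the book's actual implementation may differ) could be:

```csharp
using System;
using System.Windows.Input;

// A reusable ICommand that forwards Execute to a delegate.
public class RelayCommand : ICommand
{
    private readonly Action<object> execute;
    private readonly Func<object, bool> canExecute;

    public RelayCommand(Action<object> execute,
                        Func<object, bool> canExecute = null)
    {
        if (execute == null) throw new ArgumentNullException("execute");
        this.execute = execute;
        this.canExecute = canExecute;
    }

    public event EventHandler CanExecuteChanged;

    // With no predicate supplied, the command is always enabled.
    public bool CanExecute(object parameter)
    {
        return canExecute == null || canExecute(parameter);
    }

    public void Execute(object parameter)
    {
        execute(parameter);
    }
}
```

When a bound Button is tapped, the binding infrastructure calls CanExecute and then Execute, which runs the lambda supplied in the view model.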
Similarly, we add two buttons, name them Login and Clear, and bind them to the LoginCommand and ClearCommand properties, as shown in the following code:

<Button Command="{Binding LoginCommand}" Content="Login" Grid.Row="2" Grid.Column="0" />
<Button Command="{Binding ClearCommand}" Content="Clear" Grid.Row="2" Grid.Column="1" />

In the preceding XAML, we defined the two buttons and specified a command for each of them. We create another page so that when the login is successful, we can navigate from the Login page to somewhere else. Let's make use of the existing MainPage.xaml file as follows:

<StackPanel x:Name="TitlePanel" Grid.Row="0" Margin="12,17,0,28">
    <TextBlock Text="MY APPLICATION" x:Name="txtApplicationDescription" Style="{StaticResource PhoneTextNormalStyle}" Margin="12,0"/>
    <TextBlock Text="Enter Details" Margin="9,-7,0,0" Style="{StaticResource PhoneTextTitle1Style}"/>
</StackPanel>

We add the preceding XAML to show the message that is passed from the Login screen. We create another class and name it MainDataContext. We define a property that will hold the data to be displayed on the screen. We go to Login.xaml.cs, created as the code-behind of Login.xaml, create an instance of LoginDataContext, and assign it to the DataContext of the page. We assign this in the page constructor, right after the call to InitializeComponent(), as shown in the following code:

this.DataContext = new LoginDataContext();

Now, go to Properties in the Solution Explorer pane, open the WMAppManifest file, and specify Login.xaml as the navigation page. Once this is done, if you run the application in any of the emulators available with Visual Studio, you will see what is shown in the following screenshot:

You can enter data in the UserId and Password fields and click on Login, but nothing happens. Put a breakpoint on LoginCommand and press Login again with the credentials, and you will see that the Password property is never set to anything and evaluates to null.
Note that PasswordBox in XAML does not support binding to its Password property. To deal with this, we define a PasswordChanged event handler on the PasswordBox and specify the following code:

private void txtPassword_PasswordChanged(object sender, RoutedEventArgs e)
{
    this.logindataContext.Password = txtPassword.Password;
}

The preceding code will ensure that the password flows properly to the ViewModel. Finally, clicking on Login, you will see Status is set to true. However, our idea is to move from the Login screen to MainPage.xaml. To do this, we change the LoginCommand property to navigate the page, as shown in the following code:

if (this.Status)
{
    var rootframe = App.Current.RootVisual as PhoneApplicationFrame;
    rootframe.Navigate(new Uri(string.Format(
        "/FirstPhoneApp;component/MainPage.xaml?name={0}", this.UserId),
        UriKind.Relative));
}

Each Windows Phone app contains a root PhoneApplicationFrame that is used to show the UI. The application frame can use the Navigate method to move from one page to another; the method uses the navigation service to redirect the page to the URI provided. Here in the code, after authentication, we pass UserId as a query string to MainPage. We design the MainPage.xaml file to include a pivot control. A pivot control is just like a traditional tab control, but looks great in a phone environment. Let's add the following code:

<phone:Pivot>
    <phone:PivotItem Header="Main">
        <StackPanel Orientation="Vertical">
            <TextBlock Text="Choose your avatar" />
            <Image x:Name="imgSelection" Source="{Binding AvatarImage}"/>
            <Button x:Name="btnChoosePhoto" ClickMode="Release" Content="Choose Photo" Command="{Binding ChoosePhoto}" />
        </StackPanel>
    </phone:PivotItem>
    <phone:PivotItem Header="Task">
        <StackPanel>
            <phone:LongListSelector ItemsSource="{Binding LongList}" />
        </StackPanel>
    </phone:PivotItem>
</phone:Pivot>

The phone prefix refers to a namespace that has been added automatically in the page header, where the Pivot class exists.
In the previously defined Pivot class, there are two PivotItem with headers Main and Task. When Main is selected, it allows you to choose a photo from MediaLibrary and the image is displayed on the Image control. The ChoosePhoto command defined inside MainDataContext sets the image to its source, as shown in the following code:

public ICommand ChoosePhoto
{
    get
    {
        return new RelayCommand((e) =>
        {
            PhotoChooserTask pTask = new PhotoChooserTask();
            pTask.Completed += pTask_Completed;
            pTask.Show();
        });
    }
}

void pTask_Completed(object sender, PhotoResult e)
{
    if (e.TaskResult == TaskResult.OK)
    {
        var bitmap = new BitmapImage();
        bitmap.SetSource(e.ChosenPhoto);
        this.AvatarImage = bitmap;
    }
}

In the preceding code, the RelayCommand that is invoked when the button is clicked uses PhotoChooserTask to select an image from MediaLibrary, and that image is shown on the AvatarImage property bound to the image source. On the other hand, the other PivotItem shows LongList, where the ItemsSource is bound to a long list of strings, as shown in the following code:

public List<string> LongList
{
    get
    {
        this.longList = this.longList ?? this.LoadList();
        return this.longList;
    }
}

The long list can be anything that needs to be shown in the ListBox class.

How it works...

Windows Phone, being an XAML-based technology, uses Silverlight to generate UI and controls supporting the Model-View-ViewModel (MVVM) pattern. Each of the controls present in the Windows Phone environment implements a number of DependencyProperties. The DependencyProperty is a special type of property that supports DataBinding. When bound to another CLR object, these properties try to find the INotifyPropertyChanged interface and subscribe to the PropertyChanged event. When the data is modified in the controls, the actual bound object gets modified automatically by the dependency property system, and vice versa. Similar to normal DependencyProperties, there is a Command property that allows you to call a method. 
Just like a normal property, Command expects an implementation of the ICommand interface, whose Execute action is invoked when the command fires. The RelayCommand here is an implementation of the ICommand interface, which can be bound to the Command property of Button.

There's more...

Now let's talk about some other options, or possibly some pieces of general information that are relevant to this task.

Using ApplicationBar on the app

Just like any of the modern smartphones, Windows Phone also provides a standard way of communicating with any application. Each application can have a standard set of icons at the bottom of the application, which enable the user to perform some actions on the application. The ApplicationBar class is present at the bottom of any application across the operating system and hence, people tend to expect commands to be placed on ApplicationBar rather than on the application itself, as shown in the following screenshot. The ApplicationBar class has a height of 72 pixels, which cannot be modified by code. When an application is open, the application bar is shown at the bottom of the screen. The preceding screenshot shows how the ApplicationBar class is laid out with two buttons, login and clear. Each ApplicationBar class can also associate a number of menu items for additional commands. The menu could be opened by clicking on the … button in the left-hand side of ApplicationBar. A page in Windows Phone allows you to define one application bar. 
There is a property called ApplicationBar on PhoneApplicationPage that lets you define the ApplicationBar class of that particular page, as shown in the following code:

<phone:PhoneApplicationPage.ApplicationBar>
  <shell:ApplicationBar>
    <shell:ApplicationBarIconButton Click="ApplicationBarIconButton_Click" Text="Login" IconUri="/Assets/next.png"/>
    <shell:ApplicationBarIconButton Click="ApplicationBarIconButtonSave_Click" Text="clear" IconUri="/Assets/delete.png"/>
    <shell:ApplicationBar.MenuItems>
      <shell:ApplicationBarMenuItem Click="about_Click" Text="about" />
    </shell:ApplicationBar.MenuItems>
  </shell:ApplicationBar>
</phone:PhoneApplicationPage.ApplicationBar>

In the preceding code, we defined two ApplicationBarIconButton classes. Each of them defines a command item placed on the ApplicationBar class. The ApplicationBar.MenuItems property allows us to add menu items to the application. There can be a maximum of four application bar buttons and four menus per page. The ApplicationBar button also uses a special type of icon. There are a number of these icons shipped with the SDK, which can be used for the application. They can be found at DriveName\Program Files\Microsoft SDKs\Windows Phone\v8.0\Icons. There are separate folders for both dark and light themes. It should be noted that ApplicationBar buttons do not allow command bindings.

Tombstoning

When dealing with Windows Phone applications, there are some special things to consider. When a user navigates out of the application, the application is transferred to a dormant state, where all the pages and state of the pages are still in memory but their execution is totally stopped. When the user navigates back to the application again, the state of the application is resumed and the application is again activated. Sometimes, it might also be possible that the app gets tombstoned after the user navigates away from the app. 
In this case, the app is not preserved in memory, but some information about the app is stored. Once the user comes back to the app, the application needs to be restored and resumed in such a way that the user gets the same state as he or she left it in. In the following figure, you can see the entire process: There are four states defined. The first one is the Not Running state, where there is no existence of the process in memory. The Activated state is when the app is tapped by the user. When the user moves out of the app, it goes from Suspending to Suspended. It can be reactivated, or it will be terminated after a certain time automatically. Let's look at the Login screen, where you might sometimes tombstone the login page while entering the user ID and password. To deal with storing the user state data before tombstoning, we use PhoneApplicationPage. The idea is to serialize the whole DataModel once the user navigates away from the page and retrieve the page state again when the user navigates back. Let's annotate the UserId and Password properties of LoginDataContext with DataMember and LoginDataContext itself with DataContract, as shown in the following code:

[DataContract]
public class LoginDataContext : PropertyBase
{
    private string userid;

    [DataMember]
    public string UserId
    {
        get { return userid; }
        set
        {
            userid = value;
            this.OnPropertyChanged("UserId");
        }
    }

    private string password;

    [DataMember]
    public string Password
    {
        get { return password; }
        set
        {
            password = value;
            this.OnPropertyChanged("Password");
        }
    }
}

The DataMember attribute indicates that the properties are capable of being serialized. As the user types into these properties, the properties get filled with data so that when the user navigates away, the model will always have the latest data present. In LoginPage, we define a field called _isNewPageInstance and set it to false, and in the constructor, we set it to true. 
This will indicate that _isNewPageInstance is set to true only when the page is instantiated. Now, when the user navigates away from the page, OnNavigatedFrom gets called, and we save the ViewModel into State, as shown in the following code:

protected override void OnNavigatedFrom(NavigationEventArgs e)
{
    base.OnNavigatedFrom(e);
    if (e.NavigationMode != System.Windows.Navigation.NavigationMode.Back)
    {
        // Save the ViewModel variable in the page's State dictionary.
        State["ViewModel"] = logindataContext;
    }
}

Once the DataModel is saved in the State object, it is persistent and can be retrieved later on when the application is resumed, as follows:

protected override void OnNavigatedTo(NavigationEventArgs e)
{
    base.OnNavigatedTo(e);
    if (_isNewPageInstance)
    {
        if (this.logindataContext == null)
        {
            if (State.Count > 0)
            {
                this.logindataContext = (LoginDataContext)State["ViewModel"];
            }
            else
            {
                this.logindataContext = new LoginDataContext();
            }
        }
        DataContext = this.logindataContext;
    }
    _isNewPageInstance = false;
}

When the application is resumed from tombstoning, it calls OnNavigatedTo and retrieves the DataModel back from the state.

Summary

In this article, we learned device application development with the Windows Phone environment. It provided us with simple solutions to some of the common problems when developing a Windows Phone application. Resources for Article: Further resources on this subject: Layout with Ext.NET [article] ASP.NET: Creating Rich Content [article] ASP.NET: Using jQuery UI Widgets [article]

Packt
23 Sep 2014
15 min read

Installing RHEV Manager

This article by Pradeep Subramanian, author of Getting Started with Red Hat Enterprise Virtualization, describes setting up RHEV-M, including the installation, initial configuration, and connection to the administrator and user portal of the manager web interface. (For more resources related to this topic, see here.) Setting up the RHEL operating system for the manager Prior to starting the installation of RHEV-M, please make sure all the prerequisite are met to set up RHEV environment. Consider the following when setting up RHEL OS for RHEV-M: Install Red Hat Enterprise Linux 6 with latest minor update of 5, and during package selection step, select minimal or basic server as an option. Don't select any custom package. The hostname should be set to FQDN. Set up basic networking; use of static IP is recommended for your manager with a default gateway and primary and secondary DNS client configured. SELinux and iptables are enabled by default as part of the operating system installation. For more security, it's highly recommended to keep it on. To disable SELinux on Red Hat Enterprise Linux, please run the following command as the root user: # setenforce Permissive This command will switch off SELinux enforcement temporarily until the machine is rebooted. If you would like to permanently disable it, edit /etc/sysconfig/selinux and enter SELINUX=disabled. Registering with Red Hat Network To install RHEV-M, you need to first register your manager machine with Red Hat Network and subscribe to the relevant channels. You need to connect your machine to the Red Hat Network with a valid account with access to the relevant software channels to register your machine and deploy RHEV-M packages. If your environment does not have access to the Red Hat Network, you can perform an offline installation of RHEV-M. For more information, please refer to https://access.redhat.com/site/articles/216983. 
To register your machine with the Red Hat Network using RHN Classic, please run the following command from the shell and follow the onscreen instructions: # rhn_register This command will register your manager machine to the parent channel of your operating system version. It's strongly recommended to use Red Hat Subscription Manager to register and subscribe to the relevant channel. To use Red Hat Subscription Manager, please refer to the Subscribing to the Red Hat Enterprise Virtualization Manager Channels using Subscription Manager section from the RHEV 3.3 installation guide at https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.3/html/Installation_Guide/index.html. After successful registration of your manager machine to the Red Hat Network, subscribe the manager machine using the following command to subscribe to the relevant channels. Then download and install the manager-related software packages. The following command will prompt you to enter your Red Hat Network login credentials: # rhn-channel -a -c rhel-x86_64-server-6-rhevm-3.3 -c rhel-x86_64-server-supplementary-6 -c jbappplatform-6-x86_64-server-6-rpm Username: "yourrhnlogin" Password: XXXX To cross-check whether your manager machine is registered with Red Hat Network and subscribed to the relevant channels, please run the following command. This will return all the channels mentioned earlier plus the base channel of your operating system version, as shown in the following yum command output: # yum repolist repo id repo name status jbappplatform-6-x86_64-server-6-rpm Red Hat JBoss EAP (v 6) for 6Server x86_64 1,415 rhel-x86_64-server-6 Red Hat Enterprise Linux Server (v. 6 for 64-bit x86_64) 12,662 rhel-x86_64-server-6-rhevm-3.3 Red Hat Enterprise Virtualization Manager (v.3.3 x86_64) 164 rhel-x86_64-server-supplementary-6 RHEL Server Supplementary (v. 
6 64-bit x86_64) 370 You are now ready to start downloading and installing the software required to set up and run your RHEV-M. Installing the RHEV-Manager packages Update your base Red Hat Enterprise Linux operating system to the latest up-to-date version by running the following command: # yum -y upgrade Reboot the machine if the upgrade installed the latest version of the kernel. After a successful upgrade, run the following command to install RHEV-M and its dependent packages: # yum -y install rhevm There are a few conditions you need to consider before configuring RHEV-M: We need a working DNS for forward and reverse lookup of FQDN. We are going to use the Red Hat IdM server configured with the DNS role in the rest of the article for domain name resolution of the entire virtualization infrastructure. Refer to the Red Hat Identity Management Guide for more information on how to add forward and reverse zone records to the configured IdM DNS at https://access.redhat.com/documentation/en- US/Red_Hat_Enterprise_Linux/6/html/Identity_Management_Guide/Working_with_DNS.html. You can't install Identity Management software on the same box where the manager is going to be deployed due to some package conflicts. To store ISO images of operating systems in order to create a virtual machine, you need Network File Server (NFS) with a planned NFS export path. If your manager machine has sufficient storage space to host all your ISOs, you can set up the ISO domain while configuring the manager to set up the NFS share automatically through the installer to store all your ISO images. If you have an existing NFS server, it's recommended to use a dedicated export for the ISO domain to store the ISO images instead of using the manager server to serve the NFS service. Here we are going to use a dedicated local mount point named /rhev-iso-library on the RHEV Manager box to store our ISO images to provision the virtual machine. 
Note that the mount point should be empty and only contain the user and group ownership and permission sets before running the installer: # chown -R 36:36 /rhev-iso-library ; chmod 0755 /rhev-iso-library It will also be useful to have the following information at hand: Ports to be used for HTTP and HTTPS communication. FQDN of the manager. A reverse lookup is performed on your hostname. At the time of writing this article, RHEV supported only the PostgreSQL database for use with RHEV-M. You can use a local database or remote database setup. Here we are going to use the local database. In the case of a remote database setup, keep all database login credentials ready. Please refer to the Preparing a PostgreSQL Database for Use with Red Hat Enterprise Virtualization Manager section for detailed information on setting up a remote database to use with manager at https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.3/html-single/Installation_Guide/index.html#Preparing_a_Postgres_Database_Server_for_use_with_Red_Hat_Enterprise_Virtualization_Manager. Password for internal admin account of RHEV-M. Organization name for the RHEV-M SSL certificate. Leave the default storage type to NFS for the initial default data center. We will create a new data center in the latter stage of our setup. Provide the file system path and display name for NFS ISO library configuration, so that the manager will configure NFS of the supplied filesystem path, and make it visible by the display name under the Storage tab section on administration portal of RHEV-M. Running the initial engine setup Once you're prepared with all the answers to the questions we discussed in the previous section, it's time to run the initial configuration script called engine-setup to perform the initial configuration and setting up of RHEV-M. The installer will ask you several questions, which have been discussed above, and based on your input, it will configure your RHEV-M. 
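The pre-flight checks above lend themselves to a small script. The helpers below are purely illustrative (they are not part of any RHEV tooling): one validates that a name looks like an FQDN, and the other verifies that the ISO domain path exists and is empty before engine-setup runs:

```shell
# Hypothetical pre-flight helpers, not part of RHEV: quick sanity checks
# to run before engine-setup.

# A plausible FQDN: dot-separated labels ending in an alphabetic TLD.
is_fqdn() {
  printf '%s\n' "$1" | grep -Eq '^([A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?\.)+[A-Za-z]{2,}$'
}

# The ISO domain path must exist and be empty before the installer runs.
iso_dir_ready() {
  [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

is_fqdn "rhevmanager.example.com" && echo "FQDN ok"
is_fqdn "localhost" || echo "not an FQDN; set a proper hostname first"
```

Running such a check before the installer catches the two most common setup mistakes (a bare hostname and a non-empty ISO directory) while they are still cheap to fix.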
Leave the default settings as they are and press Enter if you feel the installer's default answers are appropriate to your setup. Once the installer takes in all your input, it will ask you for the final confirmation of your supplied configuration setting; type in OK and press Enter to continue the setup. For better understanding, please refer to the following output of the engine-setup installer while setting up a lab for this article. Log in to manager as the root user, and from the shell of your Manager machine, run the following engine-setup command: # engine-setup Once you execute this command, engine-setup performs the following set of tasks on the system: First check whether any updates are available for this system. Accept the default Yes and proceed further: Checking for product updates and update if available. Enter Default Yes. Set the hostname of the RHEV-M system. The administration portal web access will get bound to the FQDN entered here: Host fully qualified DNS name of this server [rhevmanager.example.com]: Set up the firewall rule on the manager system, and this will backup your existing firewall rule configured on the manager system if any: Do you want Setup to configure the firewall? (Yes, No) [Yes]: No Local will set up the PostgreSQL database instance on the manager system; optionally, you can choose Remote to use the existing remote PostgreSQL database instance to use with manager: Where is the database located? (Local, Remote) [Local]: If you selected Local, you will get an option to customize the PostgreSQL database setup by choosing the relevant option: Would you like Setup to automatically configure PostgreSQL, or prefer to perform that manually? 
(Automatic, Manual) [Automatic]: Set up the internal admin user password to access the manager web interface for initial setup of the virtualization infrastructure: Engine admin password: Confirm engine admin password: RHEV supports the use of clusters to manage Gluster storage bricks in addition to virtualization hosts. Choosing both will give the flexibility to use hypervisor hosts to host virtual machines as well as other sets of hypervisor hosts to manage Gluster storage bricks in your RHEV environment: Application mode (Both, Virt, Gluster) [Both]: Engine installer creates a data center named Default as part of the initial setup. The following step will ask you to select the type of storage to be used with the data center. Mixing storage domains of different types is not supported in the 3.3 release, but it is supported in the latest 3.4 release. Choose the default NFS option and proceed further. We are going to create a new data center, using the administration portal, from scratch after the engine setup and then select the storage type as ISCSI for the rest of this article: Default storage type: (NFS, FC, ISCSI, POSIXFS) [NFS]: The manager uses certificates to communicate securely with its hosts. Provide your organization's name for the certificate: Organization name for certificate [example.com]: The manager uses the Apache web server to present a landing page to users. The engine-setup script can make the landing page of the manager the default page presented by Apache: Do you wish to set the application as the default page of the web server? (Yes, No) [Yes]: By default, external SSL (HTTPS) communications with the manager are secured with the self-signed certificate created in the PKI configuration stage for secure communication with hosts. Another certificate may be chosen for external HTTPS connections without affecting how the manager communicates with hosts: Setup can configure apache to use SSL using a certificate issued from the internal CA. 
Do you wish Setup to configure that, or prefer to perform that manually? (Automatic, Manual) [Automatic]: Choose Yes to set up an NFS share on the manager system and provide the export path to be used to dump the ISO images in a later part. Finally, label the ISO domain with a name that will be unique and easily identifiable on the Storage tab of the administration portal: Configure an NFS share on this server to be used as an ISO Domain? (Yes, No) [Yes]: Local ISO domain path [/var/lib/exports/iso]: /rhev-iso-library Local ISO domain name [ISO_DOMAIN]: ISO_Datastore The engine-setup script can optionally configure a WebSocket proxy server in order to allow users to connect with virtual machines via the noVNC or HTML 5 consoles: Configure WebSocket Proxy on this machine? (Yes, No) [Yes]: The final step will ask you to provide proxy server credentials if the manager system is hosted behind a proxy server to access the Internet. RHEV supports the Red Hat Access Plugin, which will help you collect the logs and open a service request with Red Hat Global Support Services from the administration portal of the manager: Would you like transactions from the Red Hat Access Plugin sent from the RHEV Manager to be brokered through a proxy server? (Yes, No) [No]: Finally, if you feel all the input and configurations are satisfactory, press Enter to complete the engine setup. It will show you the configuration preview, and if you feel satisfied, press OK: Please confirm installation settings (OK, Cancel) [OK]: After the successful setup of RHEV-M, you can see the summary, which will show various bits of information such as how to access the admin portal of RHEV-M, the installed logs, the configured iptables firewall, the required ports, and so on. 
Connecting to the admin and user portal

Now access the admin portal, as shown in the following screenshot, using the following URLs: http://rhevmanager.example.com:80/ovirt-engine https://rhevmanager.example.com:443/ovirt-engine Use the user admin and password specified during the setup to log in to the oVirt engine (also called RHEV-M). Click on Administration Portal and log in using the credentials you set up for the admin account during the engine setup. Then click on User Portal and log in using the credentials you set up for the admin account during the engine setup. You will see a difference in the portal with a very trimmed-down user interface that is useful for self-service. We will see how to integrate the manager with other active directory services and efficiently use the user portal for self-service consumption later in the article.

RHEV reporting

RHEV bundles two optional components. The first is the history management database, which holds the historical information of various virtualization resources such as data centers, clusters, hosts, virtual machines, and others so that any other external application can consume them for reporting. The second optional component is the customized JasperServer and JasperReports. JasperServer is an open source reporting tool capable of generating and exporting reports in various formats such as PDF, Word, and CSV for end user consumption. To enable the reporting functionality, you need to install the specific components that we discussed. For simplicity, we are installing both the components at one go using the command described in the following section. 
Installing the RHEV history database and report server To install the history database and report servers, execute the following command: # yum install rhevm-dwh rhevm-reports Once you have installed the reporting components, you need to start with setting up the RHEV history database by using the following command: # rhevm-dwh-setup This will momentarily stop and start the oVirt engine service during the setup. Further, it will ask you to create a read-only user account to access the history database. Create it if you want to allow remote access to the history database and follow the onscreen instructions and finish the setup. Once the oVirt engine history database (also known as the RHEV Manager history database) is created, move on to setting up the report server. From the RHEV-M server, run the following command to set up the reporting server: # rhevm-reports-setup #setup will prompt to restart ovirt-engine service. In order to proceed the installer must stop the ovirt-engine service Would you like to stop the ovirt-engine service? (yes|no): #The command then performs a number of actions before prompting you to set the password for the Red Hat Enterprise Virtualization Manager Reports administrative users (rhevm-admin and superuser) Please choose a password for the reports admin user(s) (rhevm-admin and superuser): Downloading the example code You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. Follow the onscreen instructions and enter Yes to stop the oVirt-engine and set up a password for the default internal super user account called rhevm-admin to access and manage the report portal and proceed further with the setup. Note that this user is different from the internal admin account we set up during the engine setup of RHEV-M. 
The rhevm-admin user is used only for accessing and managing the report portal, not for the admin or user portal. Accessing the RHEV report portal After the successful installation and initial configuration setup of the report portal, you can access it by https://rhevmanager.example.com/rhevm-reports/login.html from your client machine. You can also access the report portal from the manager web interface by clicking on the Reports Portal hyperlink, which will redirect you to the report portal. Log in with rhevm-admin and the password credentials we set while running the RHEV-M report setup script in the previous section to generate reports and create and manage users to access the report portal. Initially, most of the report portal is empty since we are yet to set up and create the virtual infrastructure. It will take at least a day or two after the complete virtualization infrastructure setup to view various resources and generate reports. To learn more about using and gathering reports using the report portal, please refer to Reports, History Database Reports, and Dashboards at https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.3/html/Administration_Guide/chap-Reports_History_Database_Reports_and_Dashboards.html. Summary In this article, we discussed setting up our basic virtualization infrastructure, which includes installing RHEV-M and report server and connecting to various portals such as admin, user, and report portal. Resources for Article: Further resources on this subject: Designing a XenDesktop® Site [article] XenMobile™ Solutions Bundle [article] Installing Virtual Desktop Agent – server OS and desktop OS [article]

Packt
23 Sep 2014
16 min read

Using Socket.IO and Express together

In this article by Joshua Johanan, the author of the book Building Scalable Apps with Redis and Node.js, tells us that Express application is just the foundation. We are going to add features until it is a fully usable app. We currently can serve web pages and respond to HTTP, but now we want to add real-time communication. It's very fortunate that we just spent most of this article learning about Socket.IO; it does just that! Let's see how we are going to integrate Socket.IO with an Express application. (For more resources related to this topic, see here.) We are going to use Express and Socket.IO side by side. Socket.IO does not use HTTP like a web application. It is event based, not request based. This means that Socket.IO will not interfere with Express routes that we have set up, and that's a good thing. The bad thing is that we will not have access to all the middleware that we set up for Express in Socket.IO. There are some frameworks that combine these two, but it still has to convert the request from Express into something that Socket.IO can use. I am not trying to knock down these frameworks. They simplify a complex problem and most importantly, they do it well (Sails is a great example of this). Our app, though, is going to keep Socket.IO and Express separated as much as possible with the least number of dependencies. We know that Socket.IO does not need Express, as all our examples have not used Express in any way. This has an added benefit in that we can break off our Socket.IO module and run it as its own application at a future point in time. The other great benefit is that we learn how to do it ourselves. We need to go into the directory where our Express application is. Make sure that our pacakage.json has all the additional packages for this article and run npm.install. The first thing we need to do is add our configuration settings. Adding Socket.IO to the config We will use the same config file that we created for our Express app. 
Open up config.js and change the file to what I have done in the following code:

var config = {
  port: 3000,
  secret: 'secret',
  redisPort: 6379,
  redisHost: 'localhost',
  routes: {
    login: '/account/login',
    logout: '/account/logout'
  }
};
module.exports = config;

We are adding two new attributes, redisPort and redisHost. This is because of how the redis package configures its clients. We are also removing the redisUrl attribute. We can configure all our clients with just these two Redis config options. Next, create a directory under the root of our project named socket.io. Then, create a file called index.js. This will be where we initialize Socket.IO and wire up all our event listeners and emitters. We are just going to use one namespace for our application. If we were to add multiple namespaces, I would just add them as files underneath the socket.io directory. Open up app.js and change the following lines in it:

//variable declarations at the top
var io = require('./socket.io');
//after all the middleware and routes
var server = app.listen(config.port);
io.startIo(server);

We will define the startIo function shortly, but let's talk about our app.listen change. Previously, we had app.listen execute without capturing its return value in a variable; now we are capturing it. Socket.IO listens using Node's http.createServer. It does this automatically if you pass a number into its listen function. When Express executes app.listen, it returns an instance of the HTTP server. We capture that, and now we can pass the HTTP server to Socket.IO's listen function. Let's create that startIo function. 
Open up index.js present in the socket.io location and add the following lines of code to it:

var io = require('socket.io');
var config = require('../config');
var socketConnection = function socketConnection(socket){
  socket.emit('message', {message: 'Hey!'});
};
exports.startIo = function startIo(server){
  io = io.listen(server);
  var packtchat = io.of('/packtchat');
  packtchat.on('connection', socketConnection);
  return io;
};

We are exporting the startIo function that expects a server object that goes right into Socket.IO's listen function. This should start Socket.IO serving. Next, we get a reference to our namespace and listen on the connection event, sending a message event back to the client. We are also loading our configuration settings. Let's add some code to the layout and see whether our application has real-time communication. We will need the Socket.IO client library, so link to it from node_modules like you have been doing, and put it in our static directory under a newly created js directory. Open layout.ejs present in the packtchat\views location and add the following lines to it:

<!-- put these right before the body end tag -->
<script type="text/javascript" src="/js/socket.io.js"></script>
<script>
var socket = io.connect("http://localhost:3000/packtchat");
socket.on('message', function(d){
  console.log(d);
});
</script>

We just listen for a message event and log it to the console. Fire up Node and load your application, http://localhost:3000. Check to see whether you get a message in your console. You should see your message logged to the console, as seen in the following screenshot: Success! Our application now has real-time communication. We are not done though. We still have to wire up all the events for our app.

Who are you?

There is one glaring issue. How do we know who is making the requests? Express has middleware that parses the session to see if someone has logged in. Socket.IO does not even know about a session. 
Socket.IO lets anyone connect who knows the URL. We do not want anonymous connections that can listen to all our events and send events to the server; we only want authenticated users to be able to create a WebSocket. We need to give Socket.IO access to our sessions.

Authorization in Socket.IO

We haven't discussed it yet, but Socket.IO has middleware. Before the connection event gets fired, we can execute a function and either allow the connection or deny it. This is exactly what we need.

Using the authorization handler

Authorization can happen at two places: on the default namespace or on a named namespace connection. Both authorizations happen through the handshake, and the function's signature is the same either way. It is passed the socket, which has some things we need, such as the connection's headers. For now, we will add a simple authorization function to see how it works with Socket.IO. Open up index.js, present at the packtchat/socket.io location, and add a new function that will sit next to the socketConnection function, as seen in the following code:

var io = require('socket.io');
var socketAuth = function socketAuth(socket, next){
  return next();
  return next(new Error('Nothing Defined'));
};
var socketConnection = function socketConnection(socket){
  socket.emit('message', {message: 'Hey!'});
};
exports.startIo = function startIo(server){
  io = io.listen(server);
  var packtchat = io.of('/packtchat');
  packtchat.use(socketAuth);
  packtchat.on('connection', socketConnection);
  return io;
};

I know that there are two returns in this function. We are going to comment one out, load the site, and then switch which line is commented out. The socket that is passed in has a reference to the handshake data that we will use shortly. The next function works just like it does in Express: if we execute it without arguments, the middleware chain will continue; if it is executed with an error, the chain stops.
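This continue-or-stop behavior can be modeled with plain functions. The following is a minimal sketch (illustrative only, not Socket.IO's actual implementation) of how a next()-style middleware chain short-circuits when next is called with an error:

```javascript
// Minimal model of a next()-style middleware chain (illustrative only).
function runChain(middlewares, done) {
  var i = 0;
  function next(err) {
    if (err) return done(err);           // an error stops the chain
    var mw = middlewares[i++];
    if (!mw) return done(null);          // chain finished: allow the connection
    mw(next);
  }
  next();
}

var calls = [];
runChain([
  function (next) { calls.push('auth'); next(); },
  function (next) { calls.push('deny'); next(new Error('Not Authenticated')); },
  function (next) { calls.push('never runs'); next(); }
], function (err) {
  calls.push(err ? 'stopped: ' + err.message : 'allowed');
});
// calls is now ['auth', 'deny', 'stopped: Not Authenticated']
```

The third middleware never executes, which is exactly why returning next(new Error(...)) from socketAuth prevents the connection handler from ever firing.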
Let's load up our site and test both branches by switching which return gets executed. We can now allow or deny connections as we please, but how do we know who is trying to connect?

Cookies and sessions

We will do it the same way Express does. We will look at the cookies that are passed and see if there is a session. If there is a session, then we will load it up and see what is in it. At this point, we should have the same knowledge about the Socket.IO connection that Express has about a request. The first thing we need to do is get a cookie parser. We will use a very aptly named package called cookie. This should already be installed if you updated your package.json and installed all the packages. Add a reference to it at the top of index.js, present in the packtchat/socket.io location, with all the other variable declarations:

var cookie = require('cookie');

And now we can parse our cookies. Socket.IO passes the cookie in with the socket object in our middleware. Here is how we parse it; add the following code in the socketAuth function:

var handshakeData = socket.request;
var parsedCookie = cookie.parse(handshakeData.headers.cookie);

At this point, we will have an object that has our connect.sid in it. Remember that this is a signed value, so we cannot use it as it is to get the session ID. We will need to parse this signed cookie. This is where cookie-parser comes in. We will now create a reference to it, as follows:

var cookieParser = require('cookie-parser');

We can now parse the signed connect.sid cookie to get our session ID. Add the following code right after our parsing code:

var sid = cookieParser.signedCookie(parsedCookie['connect.sid'], config.secret);

This takes the value from parsedCookie and, using our secret passphrase, returns the unsigned value. We will do a quick check to make sure this was a valid signed cookie by comparing the unsigned value to the original.
We will do this in the following way:

if (parsedCookie['connect.sid'] === sid)
  return next(new Error('Not Authenticated'));

This check makes sure we only proceed with valid signed session IDs: a validly signed cookie unsigns to a different value, so if the unsigned value still equals the original, the cookie was not properly signed. The following screenshot will show you the values of an example Socket.IO authorization with a cookie:

Getting the session

We now have a session ID, so we can query Redis and get the session out. The default session store object of Express is extended by connect-redis. To use connect-redis, we use the same session package as we did with Express, express-session. The following code creates all of this in index.js, present at packtchat/socket.io:

//at the top with the other variable declarations
var expressSession = require('express-session');
var ConnectRedis = require('connect-redis')(expressSession);
var redisSession = new ConnectRedis({host: config.redisHost, port: config.redisPort});

The final line creates the object that will connect to Redis and fetch our session. This is the same object used with Express when setting the store option for the session. We can now get the session from Redis and see what's inside of it.
What follows is the entire socketAuth function along with all our variable declarations:

var io = require('socket.io'),
  cookie = require('cookie'),
  cookieParser = require('cookie-parser'),
  expressSession = require('express-session'),
  ConnectRedis = require('connect-redis')(expressSession),
  redis = require('redis'),
  config = require('../config'),
  redisSession = new ConnectRedis({host: config.redisHost, port: config.redisPort});

var socketAuth = function socketAuth(socket, next){
  var handshakeData = socket.request;
  var parsedCookie = cookie.parse(handshakeData.headers.cookie);
  var sid = cookieParser.signedCookie(parsedCookie['connect.sid'], config.secret);
  if (parsedCookie['connect.sid'] === sid)
    return next(new Error('Not Authenticated'));
  redisSession.get(sid, function(err, session){
    if (err || !session)
      return next(new Error('Not Authenticated'));
    if (session.isAuthenticated)
    {
      socket.user = session.user;
      socket.sid = sid;
      return next();
    }
    else
      return next(new Error('Not Authenticated'));
  });
};

Note that we unsign the cookie with cookieParser.signedCookie, the same call we added earlier, and that we guard against a Redis error or a missing session before reading its attributes. We use redisSession and sid to get the session out of Redis and check its attributes. As far as our packages are concerned, we are just another Express app getting session data. Once we have the session data, we check the isAuthenticated attribute. If it's true, we know the user is logged in; if not, we do not let them connect yet. We are adding properties to the socket object to store information from the session; later on, after a connection is made, we can get this information. As an example, we are going to change our socketConnection function to send the user object to the client. The following should be our socketConnection function:

var socketConnection = function socketConnection(socket){
  socket.emit('message', {message: 'Hey!'});
  socket.emit('message', socket.user);
};

Now, let's load up our browser and go to http://localhost:3000. Log in and then check the browser's console.
The following screenshot will show that the client is receiving the messages:

Adding application-specific events

The next thing to do is to build out all the real-time events that Socket.IO is going to listen for and respond to. We are just going to create the skeleton for each of these listeners. Open up index.js, present in packtchat/socket.io, and change the entire socketConnection function to the following code:

var socketConnection = function socketConnection(socket){
  socket.on('GetMe', function(){});
  socket.on('GetUser', function(room){});
  socket.on('GetChat', function(data){});
  socket.on('AddChat', function(chat){});
  socket.on('GetRoom', function(){});
  socket.on('AddRoom', function(r){});
  socket.on('disconnect', function(){});
};

Most of our emit events will happen in response to a listener.

Using Redis as the store for Socket.IO

The final thing we are going to add is to switch Socket.IO's internal store to Redis. By default, Socket.IO uses a memory store to save any data you attach to a socket. As we know by now, we cannot have application state that is stored on only one server; we need to store it in Redis. Therefore, we add it to index.js, present in packtchat/socket.io. Add the following line to the variable declarations:

var redisAdapter = require('socket.io-redis');

Application state is a flexible idea. We can store the application state locally when the state does not need to be shared; a simple example is keeping the path to a local temp file. When the data will be needed by multiple connections, it must be put into a shared space. Anything in a user's session will need to be shared, for example. The next thing we need to do is add some code to our startIo function.
The following code is what our startIo function should look like:

exports.startIo = function startIo(server){
  io = io.listen(server);
  io.adapter(redisAdapter({host: config.redisHost, port: config.redisPort}));
  var packtchat = io.of('/packtchat');
  packtchat.use(socketAuth);
  packtchat.on('connection', socketConnection);
  return io;
};

The first thing is to start the server listening. Next, we call io.adapter with a new Redis adapter, passing in the Redis hostname and port. From then on, Socket.IO uses Redis pub/sub to share events between server processes, so multiple Socket.IO servers stay in sync.

Socket.IO inner workings

We are not going to completely dive into everything that Socket.IO does, but we will discuss a few topics.

WebSockets

This is what makes Socket.IO work. All web servers serve HTTP; that is what makes them web servers. This works great when all you want to do is serve pages, which are served based on requests. The browser must ask for information before receiving it. If you want to have real-time connections, though, it is difficult and requires some workarounds. HTTP was not designed to have the server initiate the request. This is where WebSockets come in. WebSockets allow the server and client to create a connection and keep it open. Inside of this connection, either side can send messages back and forth. This is what Socket.IO (technically, Engine.io) leverages to create real-time communication. Socket.IO even has fallbacks if you are using a browser that does not support WebSockets. The browsers that do support WebSockets at the time of writing include the latest versions of Chrome, Firefox, Safari, Safari on iOS, Opera, and IE 11. This means the browsers that do not support WebSockets are all the older versions of IE. Socket.IO will use different techniques to simulate a WebSocket connection. This involves creating an Ajax request and keeping the connection open for a long time.
If data needs to be sent, it will send it in an Ajax request. Eventually, that request will close and the client will immediately create another request. Socket.IO even has an Adobe Flash implementation if you have to support really old browsers (IE 6, for example). It is not enabled by default. WebSockets also are a little different when scaling our application. Because each WebSocket creates a persistent connection, we may need more servers to handle Socket.IO traffic then regular HTTP. For example, when someone connects and chats for an hour, there will have only been one or two HTTP requests. In contrast, a WebSocket will have to be open for the entire hour. The way our code base is written, we can easily scale up more Socket.IO servers by themselves. Ideas to take away from this article The first takeaway is that for every emit, there needs to be an on. This is true whether the sender is the server or the client. It is always best to sit down and map out each event and which direction it is going. The next idea is that of note, which entails building our app out of loosely coupled modules. Our app.js kicks everything that deals with Express off. Then, it fires the startIo function. While it does pass over an object, we could easily create one and use that. Socket.IO just wants a basic HTTP server. In fact, you can just pass the port, which is what we used in our first couple of Socket.IO applications (Ping-Pong). If we wanted to create an application layer of Socket.IO servers, we could refactor this code out and have all the Socket.IO servers run on separate servers other than Express. Summary At this point, we should feel comfortable about using real-time events in Socket.IO. We should also know how to namespace our io server and create groups of users. We also learned how to authorize socket connections to only allow logged-in users to connect. 
Resources for Article: Further resources on this subject: Exploring streams [article] Working with Data Access and File Formats Using Node.js [article] So, what is Node.js? [article]
Packt
22 Sep 2014
18 min read

Improving Code Quality

In this article by Alexandru Vlăduţu, author of Mastering Web Application Development with Express, we are going to see how to test Express applications and how to improve the code quality of our code by leveraging existing NPM modules. (For more resources related to this topic, see here.) Creating and testing an Express file-sharing application Now, it's time to see how to develop and test an Express application with what we have learned previously. We will create a file-sharing application that allows users to upload files and password-protect them if they choose to. After uploading the files to the server, we will create a unique ID for that file, store the metadata along with the content (as a separate JSON file), and redirect the user to the file's information page. When trying to access a password-protected file, an HTTP basic authentication pop up will appear, and the user will have to only enter the password (no username in this case). The package.json file, so far, will contain the following code: { "name": "file-uploading-service", "version": "0.0.1", "private": true, "scripts": { "start": "node ./bin/www" }, "dependencies": { "express": "~4.2.0", "static-favicon": "~1.0.0", "morgan": "~1.0.0", "cookie-parser": "~1.0.1", "body-parser": "~1.0.0", "debug": "~0.7.4", "ejs": "~0.8.5", "connect-multiparty": "~1.0.5", "cuid": "~1.2.4", "bcrypt": "~0.7.8", "basic-auth-connect": "~1.0.0", "errto": "~0.2.1", "custom-err": "0.0.2", "lodash": "~2.4.1", "csurf": "~1.2.2", "cookie-session": "~1.0.2", "secure-filters": "~1.0.5", "supertest": "~0.13.0", "async": "~0.9.0" }, "devDependencies": { } } When bootstrapping an Express application using the CLI, a /bin/www file will be automatically created for you. The following is the version we have adopted to extract the name of the application from the package.json file. 
This way, in case we decide to change it, we won't have to alter our debugging code because it will automatically adapt to the new name, as shown in the following code:

#!/usr/bin/env node
var pkg = require('../package.json');
var debug = require('debug')(pkg.name + ':main');
var app = require('../app');
app.set('port', process.env.PORT || 3000);
var server = app.listen(app.get('port'), function() {
  debug('Express server listening on port ' + server.address().port);
});

The application configuration will be stored inside config.json:

{ "filesDir": "files", "maxSize": 5 }

The properties listed in the preceding code are the files folder (where the files will be uploaded), which is relative to the root, and the maximum allowed file size. The main file of the application is named app.js and lives in the root. We need the connect-multiparty module to support file uploads, the csurf module for CSRF protection, and cookie-session for session support. The rest of the dependencies are standard and we have used them before.
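One detail worth noting: maxSize in config.json is expressed in megabytes, while connect-multiparty's maxFilesSize option expects bytes, hence the 1024 * 1024 * config.maxSize conversion we will use in app.js. A quick sketch of the arithmetic (using an inline object in place of require('./config.json')):

```javascript
// Inline stand-in for require('./config.json')
const config = { filesDir: 'files', maxSize: 5 };

// connect-multiparty's maxFilesSize option is expressed in bytes
const maxFilesSize = 1024 * 1024 * config.maxSize;
// 5 MB -> 5242880 bytes
```

Keeping the config value in megabytes makes the JSON file friendlier to edit, at the cost of this one conversion.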
The full code for the app.js file is as follows: var express = require('express'); var path = require('path'); var favicon = require('static-favicon'); var logger = require('morgan'); var cookieParser = require('cookie-parser'); var session = require('cookie-session'); var bodyParser = require('body-parser'); var multiparty = require('connect-multiparty'); var Err = require('custom-err'); var csrf = require('csurf'); var ejs = require('secure-filters').configure(require('ejs')); var csrfHelper = require('./lib/middleware/csrf-helper'); var homeRouter = require('./routes/index'); var filesRouter = require('./routes/files'); var config = require('./config.json'); var app = express(); var ENV = app.get('env'); // view engine setup app.engine('html', ejs.renderFile); app.set('views', path.join(__dirname, 'views')); app.set('view engine', 'html'); app.use(favicon()); app.use(bodyParser.json()); app.use(bodyParser.urlencoded()); // Limit uploads to X Mb app.use(multiparty({ maxFilesSize: 1024 * 1024 * config.maxSize })); app.use(cookieParser()); app.use(session({ keys: ['rQo2#0s!qkE', 'Q.ZpeR49@9!szAe'] })); app.use(csrf()); // add CSRF helper app.use(csrfHelper); app.use('/', homeRouter); app.use('/files', filesRouter); app.use(express.static(path.join(__dirname, 'public'))); /// catch 404 and forward to error handler app.use(function(req, res, next) { next(Err('Not Found', { status: 404 })); }); /// error handlers // development error handler // will print stacktrace if (ENV === 'development') { app.use(function(err, req, res, next) { res.status(err.status || 500); res.render('error', { message: err.message, error: err }); }); } // production error handler // no stacktraces leaked to user app.use(function(err, req, res, next) { res.status(err.status || 500); res.render('error', { message: err.message, error: {} }); }); module.exports = app; Instead of directly binding the application to a port, we are exporting it, which makes our lives easier when testing with 
supertest. We won't need to care about things such as the default port availability or specifying a different port environment variable when testing. To avoid having to create the whole input when including the CSRF token, we have created a helper for that inside lib/middleware/csrf-helper.js: module.exports = function(req, res, next) { res.locals.csrf = function() { return "<input type='hidden' name='_csrf' value='" + req.csrfToken() + "' />"; } next(); }; For the password–protection functionality, we will use the bcrypt module and create a separate file inside lib/hash.js for the hash generation and password–compare functionality: var bcrypt = require('bcrypt'); var errTo = require('errto'); var Hash = {}; Hash.generate = function(password, cb) { bcrypt.genSalt(10, errTo(cb, function(salt) { bcrypt.hash(password, salt, errTo(cb, function(hash) { cb(null, hash); })); })); }; Hash.compare = function(password, hash, cb) { bcrypt.compare(password, hash, cb); }; module.exports = Hash; The biggest file of our application will be the file model, because that's where most of the functionality will reside. We will use the cuid() module to create unique IDs for files, and the native fs module to interact with the filesystem. The following code snippet contains the most important methods for models/file.js: function File(options, id) { this.id = id || cuid(); this.meta = _.pick(options, ['name', 'type', 'size', 'hash', 'uploadedAt']); this.meta.uploadedAt = this.meta.uploadedAt || new Date(); }; File.prototype.save = function(path, password, cb) { var _this = this; this.move(path, errTo(cb, function() { if (!password) { return _this.saveMeta(cb); } hash.generate(password, errTo(cb, function(hashedPassword) { _this.meta.hash = hashedPassword; _this.saveMeta(cb); })); })); }; File.prototype.move = function(path, cb) { fs.rename(path, this.path, cb); }; For the full source code of the file, browse the code bundle. 
Next, we will create the routes for the file (routes/files.js), which will export an Express router. As mentioned before, the authentication mechanism for password-protected files will be the basic HTTP one, so we will need the basic-auth-connect module. At the beginning of the file, we will include the dependencies and create the router: var express = require('express'); var basicAuth = require('basic-auth-connect'); var errTo = require('errto'); var pkg = require('../package.json'); var File = require('../models/file'); var debug = require('debug')(pkg.name + ':filesRoute'); var router = express.Router(); We will have to create two routes that will include the id parameter in the URL, one for displaying the file information and another one for downloading the file. In both of these cases, we will need to check if the file exists and require user authentication in case it's password-protected. This is an ideal use case for the router.param() function because these actions will be performed each time there is an id parameter in the URL. 
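Since our files only have a password and no username, the basic-auth-connect check will receive an empty user. The underlying HTTP Basic scheme is simple: a base64-encoded user:password pair in the Authorization header. A minimal sketch of the encoding and decoding:

```javascript
// HTTP Basic auth: 'Basic ' + base64('user:password'); we use an empty username.
function basicAuthHeader(password) {
  return 'Basic ' + Buffer.from(':' + password).toString('base64');
}

// What a server does to read the credentials back out:
function decodeBasicAuth(header) {
  const decoded = Buffer.from(header.slice('Basic '.length), 'base64').toString();
  const idx = decoded.indexOf(':');
  return { user: decoded.slice(0, idx), pwd: decoded.slice(idx + 1) };
}

const header = basicAuthHeader('sample-password');
const creds = decodeBasicAuth(header);   // { user: '', pwd: 'sample-password' }
```

This is also why the functional tests later in this article build the header from ':' + pwd: the colon separates an empty username from the file's password.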
The code is as follows: router.param('id', function(req, res, next, id) { File.find(id, errTo(next, function(file) { debug('file', file); // populate req.file, will need it later req.file = file; if (file.isPasswordProtected()) { // Password – protected file, check for password using HTTP basic auth basicAuth(function(user, pwd, fn) { if (!pwd) { return fn(); } // ignore user file.authenticate(pwd, errTo(next, function(match) { if (match) { return fn(null, file.id); } fn(); })); })(req, res, next); } else { // Not password – protected, proceed normally next(); } })); }); The rest of the routes are fairly straightforward, using response.download() to send the file to the client, or using response.redirect() after uploading the file: router.get('/', function(req, res, next) { res.render('files/new', { title: 'Upload file' }); }); router.get('/:id.html', function(req, res, next) { res.render('files/show', { id: req.params.id, meta: req.file.meta, isPasswordProtected: req.file.isPasswordProtected(), hash: hash, title: 'Download file ' + req.file.meta.name }); }); router.get('/download/:id', function(req, res, next) { res.download(req.file.path, req.file.meta.name); }); router.post('/', function(req, res, next) { var tempFile = req.files.file; if (!tempFile.size) { return res.redirect('/files'); } var file = new File(tempFile); file.save(tempFile.path, req.body.password, errTo(next, function() { res.redirect('/files/' + file.id + '.html'); })); }); module.exports = router; The view for uploading a file contains a multipart form with a CSRF token inside (views/files/new.html): <%- include ../layout/header.html %> <form action="/files" method="POST" enctype="multipart/form-data"> <div class="form-group"> <label>Choose file:</label> <input type="file" name="file" /> </div> <div class="form-group"> <label>Password protect (leave blank otherwise):</label> <input type="password" name="password" /> </div> <div class="form-group"> <%- csrf() %> <input type="submit" /> </div> 
</form>
<%- include ../layout/footer.html %>

To display the file's details, we will create another view (views/files/show.html). Besides showing the basic file information, we will display a special message in case the file is password-protected, so that the client is notified that a password should also be shared along with the link:

<%- include ../layout/header.html %>
<p>
<table>
<tr> <th>Name</th> <td><%= meta.name %></td> </tr>
<tr> <th>Type</th> <td><%= meta.type %></td> </tr>
<tr> <th>Size</th> <td><%= meta.size %> bytes</td> </tr>
<tr> <th>Uploaded at</th> <td><%= meta.uploadedAt %></td> </tr>
</table>
</p>
<p>
<a href="/files/download/<%- id %>">Download file</a> | <a href="/files">Upload new file</a>
</p>
<p>
To share this file with your friends use the <a href="/files/<%- id %>">current link</a>.
<% if (isPasswordProtected) { %>
<br />
Don't forget to tell them the file password as well!
<% } %>
</p>
<%- include ../layout/footer.html %>

Running the application

To run the application, we need to install the dependencies and run the start script:

$ npm i
$ npm start

The default port for the application is 3000, so if we visit http://localhost:3000/files, we should see the following page: After uploading the file, we should be redirected to the file's page, where its details will be displayed:

Unit tests

Unit testing allows us to test individual parts of our code in isolation and verify their correctness. By keeping our tests focused on these small components, we decrease the complexity of the setup, and most likely, our tests will execute faster. Using the following command, we'll install a few modules to help us in our quest:

$ npm i mocha should sinon --save-dev

We are going to write unit tests for our file model, but there's nothing stopping us from doing the same thing for our routes or other files from /lib.
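These tests will lean heavily on sinon stubs, so it is worth seeing how small the core idea is: replace a method, record its calls, and restore the original afterwards. Here is a simplified stdlib-only model (not sinon's actual implementation):

```javascript
// Minimal model of sinon.stub(obj, 'method') -- illustrative only.
function stubMethod(obj, name) {
  const original = obj[name];
  function stub() {
    stub.callCount += 1;
    stub.calls.push(Array.prototype.slice.call(arguments));
  }
  stub.callCount = 0;
  stub.calls = [];
  stub.calledWith = function () {
    const expected = Array.prototype.slice.call(arguments);
    return stub.calls.some(args => expected.every((v, i) => args[i] === v));
  };
  stub.restore = function () { obj[name] = original; };
  obj[name] = stub;   // swap the real method out for the recorder
  return stub;
}

const fakeFs = { rename: function (from, to) { /* real work would happen here */ } };
const stub = stubMethod(fakeFs, 'rename');
fakeFs.rename('/from/path', '/to/path');
const wasCalled = stub.callCount === 1 && stub.calledWith('/from/path', '/to/path');
stub.restore();   // fakeFs.rename is the original function again
```

sinon adds much more (return values, callArgWith, spies), but every assertion we make below, such as calledOnce and calledWith, is built on exactly this call-recording pattern.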
The dependencies will be listed at the top of the file (test/unit/file-model.js): var should = require('should'); var path = require('path'); var config = require('../../config.json'); var sinon = require('sinon'); We will also need to require the native fs module and the hash module, because these modules will be stubbed later on. Apart from these, we will create an empty callback function and reuse it, as shown in the following code: // will be stubbing methods on these modules later on var fs = require('fs'); var hash = require('../../lib/hash'); var noop = function() {}; The tests for the instance methods will be created first: describe('models', function() { describe('File', function() { var File = require('../../models/file'); it('should have default properties', function() { var file = new File(); file.id.should.be.a.String; file.meta.uploadedAt.should.be.a.Date; }); it('should return the path based on the root and the file id', function() { var file = new File({}, '1'); file.path.should.eql(File.dir + '/1'); }); it('should move a file', function() { var stub = sinon.stub(fs, 'rename'); var file = new File({}, '1'); file.move('/from/path', noop); stub.calledOnce.should.be.true; stub.calledWith('/from/path', File.dir + '/1', noop).should.be.true; stub.restore(); }); it('should save the metadata', function() { var stub = sinon.stub(fs, 'writeFile'); var file = new File({}, '1'); file.meta = { a: 1, b: 2 }; file.saveMeta(noop); stub.calledOnce.should.be.true; stub.calledWith(File.dir + '/1.json', JSON.stringify(file.meta), noop).should.be.true; stub.restore(); }); it('should check if file is password protected', function() { var file = new File({}, '1'); file.meta.hash = 'y'; file.isPasswordProtected().should.be.true; file.meta.hash = null; file.isPasswordProtected().should.be.false; }); it('should allow access if matched file password', function() { var stub = sinon.stub(hash, 'compare'); var file = new File({}, '1'); file.meta.hash = 'hashedPwd'; 
  file.authenticate('password', noop);
  stub.calledOnce.should.be.true;
  stub.calledWith('password', 'hashedPwd', noop).should.be.true;
  stub.restore();
});

We are stubbing the functionality of the fs and hash modules because we want to test our code in isolation. Once we are done with the tests, we restore the original functionality of the methods. Now that we're done testing the instance methods, we will go on to test the static ones (assigned directly onto the File object):

describe('.dir', function() {
  it('should return the root of the files folder', function() {
    path.resolve(__dirname + '/../../' + config.filesDir).should.eql(File.dir);
  });
});
describe('.exists', function() {
  var stub;
  beforeEach(function() {
    stub = sinon.stub(fs, 'exists');
  });
  afterEach(function() {
    stub.restore();
  });
  it('should callback with an error when the file does not exist', function(done) {
    File.exists('unknown', function(err) {
      err.should.be.an.instanceOf(Error).and.have.property('status', 404);
      done();
    });
    // call the function passed as argument[1] with the parameter `false`
    stub.callArgWith(1, false);
  });
  it('should callback with no arguments when the file exists', function(done) {
    File.exists('existing-file', function(err) {
      (typeof err === 'undefined').should.be.true;
      done();
    });
    // call the function passed as argument[1] with the parameter `true`
    stub.callArgWith(1, true);
  });
});
});
});

To stub asynchronous functions and execute their callback, we use the stub.callArgWith() function provided by sinon, which executes the function passed at the given argument index of the stubbed call, forwarding the remaining arguments to it. For more information, check out the official documentation at http://sinonjs.org/docs/#stubs. When running tests, Node developers expect the npm test command to trigger the test suite, so we need to add that script to our package.json file.
However, since we are going to have different tests to be run, it would be even better to add a unit-tests script and make npm test run that for now. The scripts property should look like the following code: "scripts": { "start": "node ./bin/www", "unit-tests": "mocha --reporter=spec test/unit", "test": "npm run unit-tests" }, Now, if we run the tests, we should see the following output in the terminal: Functional tests So far, we have tested each method to check whether it works fine on its own, but now, it's time to check whether our application works according to the specifications when wiring all the things together. Besides the existing modules, we will need to install and use the following ones: supertest: This is used to test the routes in an expressive manner cheerio: This is used to extract the CSRF token out of the form and pass it along when uploading the file rimraf: This is used to clean up our files folder once we're done with the testing We will create a new file called test/functional/files-routes.js for the functional tests. As usual, we will list our dependencies first: var fs = require('fs'); var request = require('supertest'); var should = require('should'); var async = require('async'); var cheerio = require('cheerio'); var rimraf = require('rimraf'); var app = require('../../app'); There will be a couple of scenarios to test when uploading a file, such as: Checking whether a file that is uploaded without a password can be publicly accessible Checking that a password-protected file can only be accessed with the correct password We will create a function called uploadFile that we can reuse across different tests. This function will use the same supertest agent when making requests so it can persist the cookies, and will also take care of extracting and sending the CSRF token back to the server when making the post request. In case a password argument is provided, it will send that along with the file. 
The function will assert that the status code for the upload page is 200 and that the user is redirected to the file page after the upload. The full code of the function is listed as follows:

```javascript
function uploadFile(agent, password, done) {
  agent
    .get('/files')
    .expect(200)
    .end(function(err, res) {
      (err == null).should.be.true;

      var $ = cheerio.load(res.text);
      var csrfToken = $('form input[name=_csrf]').val();
      csrfToken.should.not.be.empty;

      var req = agent
        .post('/files')
        .field('_csrf', csrfToken)
        .attach('file', __filename);

      if (password) {
        req = req.field('password', password);
      }

      req
        .expect(302)
        .expect('Location', /files\/(.*)\.html/)
        .end(function(err, res) {
          (err == null).should.be.true;

          var fileUid = res.headers['location'].match(/files\/(.*)\.html/)[1];
          done(null, fileUid);
        });
    });
}
```

Note that we will use rimraf in an after function to clean up the files folder, but it would be best to have a separate path for uploading files while testing (other than the one used for development and production):

```javascript
describe('Files-Routes', function(done) {
  after(function() {
    var filesDir = __dirname + '/../../files';

    rimraf.sync(filesDir);
    fs.mkdirSync(filesDir);
  });
```

When testing the file uploads, we want to make sure that, without providing the correct password, access will not be granted to the file pages:

```javascript
  describe("Uploading a file", function() {
    it("should upload a file without password protecting it", function(done) {
      var agent = request.agent(app);

      uploadFile(agent, null, done);
    });

    it("should upload a file and password protect it", function(done) {
      var agent = request.agent(app);
      var pwd = 'sample-password';

      uploadFile(agent, pwd, function(err, filename) {
        async.parallel([
          function getWithoutPwd(next) {
            agent
              .get('/files/' + filename + '.html')
              .expect(401)
              .end(function(err, res) {
                (err == null).should.be.true;
                next();
              });
          },
          function getWithPwd(next) {
            agent
              .get('/files/' + filename + '.html')
              .set('Authorization', 'Basic ' + new Buffer(':' + pwd).toString('base64'))
              .expect(200)
              .end(function(err, res) {
                (err == null).should.be.true;
                next();
              });
          }
        ], function(err) {
          (err == null).should.be.true;
          done();
        });
      });
    });
  });
});
```

It's time to do the same thing we did for the unit tests: make a script so we can run them with npm by using npm run functional-tests. At the same time, we should update the npm test script to include both our unit tests and our functional tests:

```json
"scripts": {
  "start": "node ./bin/www",
  "unit-tests": "mocha --reporter=spec test/unit",
  "functional-tests": "mocha --reporter=spec --timeout=10000 --slow=2000 test/functional",
  "test": "npm run unit-tests && npm run functional-tests"
}
```

If we run npm test now, both suites should be executed.

Running tests before committing in Git

It's a good practice to run the test suite before committing to Git and to only allow the commit to pass if the tests have been executed successfully. The same applies to other version control systems. To achieve this, we should add the .git/hooks/pre-commit file, which should take care of running the tests and exiting with an error in case they failed. Luckily, this is a repetitive task (which can be applied to all Node applications), so there is an NPM module that creates this hook file for us. All we need to do is install the pre-commit module (https://www.npmjs.org/package/pre-commit) as a development dependency, using the following command:

```shell
$ npm i pre-commit --save-dev
```

This should automatically create the pre-commit hook file so that all the tests are run before committing (using the npm test command). The pre-commit module also supports running custom scripts specified in the package.json file. For more details on how to achieve that, read the module documentation at https://www.npmjs.org/package/pre-commit.

Summary

In this article, we have learned about writing tests for Express applications and, in the process, explored a variety of helpful modules.
Packt
22 Sep 2014
39 min read

Building, Publishing, and Supporting Your Force.com Application

In this article by Andrew Fawcett, the author of Force.com Enterprise Architecture, we will use the declarative aspects of the platform to quickly build an initial version of an application, which will give you an opportunity to get some hands-on experience with some of the packaging and installation features that are needed to release applications to subscribers. We will also take a look at the facilities available to publish your application through Salesforce AppExchange (equivalent to the Apple App Store) and finally provide end user support.

We will then use this application as a basis for incrementally releasing new versions of the application to build our understanding of Enterprise Application Development. The following topics outline what we will achieve in this article:

- Required organizations
- Introducing the sample application
- Package types and benefits
- Creating your first managed package
- Package dependencies and uploading
- Introduction to AppExchange and creating listings
- Installing and testing your package
- Becoming a Salesforce partner and its benefits
- Licensing
- Supporting your application
- Customer metrics
- Trialforce and Test Drive

Required organizations

Several Salesforce organizations are required to develop, package, and test your application. You can sign up for these organizations at https://developer.salesforce.com/, though in due course, as your relationship with Salesforce becomes more formal, you will have the option of accessing their Partner Portal website to create organizations of different types and capabilities. We will discuss more on this later.

It's a good idea to have some kind of naming convention to keep track of the different organizations and logins. Use the following table as a guide and create the following organizations via https://developer.salesforce.com/.
As stated earlier, these organizations will be used only for the purposes of learning and exploring:

- myapp@packaging.my.com (Packaging): Though we will perform initial work in this org, it will eventually be reserved solely for assembling and uploading a release.
- myapp@testing.my.com (Testing): In this org, we will install the application and test upgrades. You may want to create several of these in practice, via the Partner Portal website described later in this article.
- myapp@dev.my.com (Developing): Later, we will shift development of the application into this org, leaving the packaging org to focus only on packaging.

You will have to substitute myapp and my.com (perhaps by reusing your company domain name to avoid naming conflicts) with your own values; the author's packaging org, for example, is andyapp@packaging.andyinthecloud.com.

The following are other organization types that you will eventually need in order to manage the publication and licensing of your application:

- Production / CRM Org: Your organization may already be using this org for managing contacts, leads, opportunities, cases, and other CRM objects. Make sure that you have the complete authority to make changes, if any, to this org since this is where you run your business. If you do not have such an org, you can request one via the Partner Program website described later in this article, by requesting (via a case) a CRM ISV org. Even if you choose to not fully adopt Salesforce for this part of your business, such an org is still required when it comes to utilizing the licensing aspects of the platform.
- AppExchange Publishing Org (APO): This org is used to manage your use of AppExchange. We will discuss this a little later in this article. This org is actually the same Salesforce org you designate as your production org, where you conduct your sales and support activities from.
- License Management Org (LMO): Within this organization, you can track who installs your application (as leads), the licenses you grant to them, and for how long. It is recommended that this is the same org as the APO described earlier.
- Trialforce Management Org (TMO): Trialforce is a way to provide orgs with your preconfigured application data for prospective customers to try out your application before buying. It will be discussed later in this article.
- Trialforce Source Org (TSO)

Typically, the LMO and APO can be the same as your primary Salesforce production org, which allows you to track all your leads and future opportunities in the same place. This leads to the rule of APO = LMO = production org, though neither of them should be your actual developer or test orgs. You can work with Salesforce support and your Salesforce account manager to plan and assign these orgs.

Introducing the sample application

For this article, we will use the world of Formula1 motor car racing as the basis for a packaged application that we will build together. Formula1 is, for me, the motor sport that is equivalent to Enterprise applications software, due to its scale and complexity. It is also a sport that I follow, both of which helped me when building the examples that we will use. We will refer to this application as FormulaForce, though please keep in mind Salesforce's branding policies when naming your own application, as they prevent the use of the word "Force" in company or product titles.

This application will focus on the data collection aspects of the races, drivers, and their many statistics, utilizing platform features to structure, visualize, and process this data in both historic and current contexts. For this article, we will create some initial Custom Objects as detailed in the following table. Do not worry about creating any custom tabs just yet. You can use your preferred approach for creating these initial objects.
Ensure that you are logged in to your packaging org.

- Season__c: Name (text)
- Race__c: Name (text); Season__c (Master-Detail to Season__c)
- Driver__c: Name
- Contestant__c: Name (Auto Number, CONTESTANT-{00000000}); Race__c (Master-Detail to Race__c); Driver__c (Lookup to Driver__c)

The preceding objects can be reviewed within the Schema Builder tool, available under the Setup menu.

Package types and benefits

A package is a container that holds your application components such as Custom Objects, Apex code, Apex triggers, Visualforce pages, and so on. This makes up your application. While there are other ways to move components between Salesforce orgs, a package provides a container that you can use for your entire application or to deliver optional features by leveraging so-called extension packages.

There are two types of packages, managed and unmanaged. Unmanaged packages result in the transfer of components from one org to another; however, the result is as if those components had been originally created in the destination org, meaning that they can be readily modified or even deleted by the administrator of that org. They are also not upgradable and are not particularly ideal from a support perspective. Moreover, the Apex code that you write is also visible for all to see, so your Intellectual Property is at risk. Unmanaged packages can be used for sharing template components that are intended to be changed by the subscriber. If you are not using GitHub and the GitHub Salesforce Deployment Tool (https://github.com/afawcett/githubsfdeploy), they can also provide a means to share open source libraries with developers.

Features and benefits of managed packages

Managed packages have the following features that are ideal for distributing your application.
The org where your application package is installed is referred to as a subscriber org, since users of this org are subscribing to the services your application provides:

- Intellectual Property (IP) protection: Users in the subscriber org cannot see your Apex source code, although they can see your Visualforce pages code and static resources. While the Apex code is hidden, JavaScript code is not, so you may want to consider using a minify process to partially obscure such code.
- The naming scope: Your component names are unique to your package through the use of a namespace. This means that even if you have object X in your application, and the subscriber has an object of the same name, they remain distinct. You will define a namespace later in this article.
- The governor scope: Code in your application executes within its own governor limit scope (such as DML and SOQL governors, subject to passing the Salesforce Security Review) and is not affected by other applications or code within the subscriber org. Note that some governors, such as the CPU time governor, are shared by the whole execution context (discussed in a later article) regardless of the namespace.
- Upgrades and versioning: Once the subscribers have started using your application, creating data, making configurations, and so on, you will want to provide upgrades and patches with new versions of your application.

There are other benefits to managed packages, but these are only accessible after becoming a Salesforce Partner and completing the security review process; these benefits are described later in this article.

Salesforce provides the ISVforce Guide (otherwise known as the Packaging Guide), in which these topics are discussed in depth; bookmark it now! The guide is available at http://login.salesforce.com/help/pdfs/en/salesforce_packaging_guide.pdf.

Creating your first managed package

Packages are created in your packaging org.
There can be only one managed package being developed in your packaging org (though additional unmanaged packages are supported, it is not recommended to mix them with your managed package). You can also install other dependent managed packages and reference their components from your application. The steps to be performed are discussed in the following sections:

- Setting your package namespace
- Creating the package and assigning it to the namespace
- Adding components to the package

Setting your package namespace

An important decision when creating a managed package is the namespace; this is a prefix applied to all your components (Custom Objects, Visualforce pages, and so on) and is used by developers in subscriber orgs to uniquely distinguish between your packaged components and others, even those from other packages. The namespace prefix is an important part of the branding of your application, since it is implicitly attached to any Apex code or other components that you include in your package. It can be up to 15 characters, though I personally recommend that you keep it shorter than this, as a long prefix becomes hard to remember and leads to frustrating typos. I would also avoid underscore characters.

It is a good idea to have a naming convention if you are likely to create more managed packages in the future (in different packaging orgs). The following is the format of an example naming convention:

[company acronym - 1 to 4 characters][package prefix - 1 to 4 characters]

For example, the ACME Corporation's Road Runner application might be named acmerr.

When the namespace has not been set, the Packages page (accessed under the Setup menu, under the Create submenu) indicates that only unmanaged packages can be created. Click on the Edit button to begin a small wizard to enter your desired namespace. This can only be done once, and the namespace must be globally unique (meaning it cannot be set in any other org), much like a website domain name.
Once you have set the namespace, the Packages page should reflect the namespace prefix that you have used. You are now ready to create a managed package and assign it to the namespace.

Creating the package and assigning it to the namespace

Click on the New button on the Packages page and give your package a name (it can be changed later). Make sure to tick the Managed checkbox as well. Click on Save and return to the Packages page.

Adding components to the package

On the Packages page, click on the link to your package in order to view its details. From this page, you can manage the contents of your package and upload it. Click on the Add button to add the Custom Objects created earlier in this article. Note that you do not need to add any custom fields; these are added automatically.

When you review the components added to the package, you will see that some components can be removed while others cannot. This is because the platform implicitly adds some components for you, as they are dependencies. As we progress, adding different component types, you will see this list automatically grow in some cases; in others, we must explicitly add components.

Extension packages

As the name suggests, extension packages extend or add to the functionality delivered by the existing packages they are based on, though they cannot change the base package contents. They can extend one or more base packages, and you can even have several layers of extension packages, though you may want to keep an eye on how extensively you use this feature, as inter-package dependencies can become quite complex to manage, especially during development.
Extension packages are created in pretty much the same way as the process you've just completed (including requiring their own packaging org), except that the packaging org must also have the dependent packages installed in it, as code and Visualforce pages contained within extension packages make reference to other Custom Objects, fields, Apex code, and Visualforce pages present in base packages. The platform tracks these dependencies and the version of the base package present at the time the reference was made. When an extension package is installed, this dependency information ensures that the subscriber org must have the correct (minimum) version of the base packages installed before permitting the installation to complete. You can also manage the dependencies between extension packages and base packages yourself, through the Versions tab or the XML metadata for applicable components.

Package dependencies and uploading

Packages can have dependencies on platform features and/or other packages. You can review and manage these dependencies through the Package detail page and the use of dynamic coding conventions, as described here. While some features of Salesforce are common, customers can purchase different editions and features according to their needs. Developer Edition organizations have access to most of these features for free. This means that as you develop your application, it is important to understand when and when not to use those features. By default, when referencing a certain Standard Object, field, or component type, you will generate a prerequisite dependency on your package, which your customers will need to have before they can complete the installation. Some Salesforce features, for example Multi-Currency or Chatter, have either a configuration or, in some cases, a cost impact for your users (in different org editions). Carefully consider which features your package is dependent on.
Most of the feature dependencies, though not all, are visible via the View Dependencies button on the Package details page (this information is also available on the Upload page, allowing you to make a final check). It is a good practice to add this check to your packaging procedures to ensure that no unwanted dependencies have crept in. Clicking on this button for the package that we have been building in this article so far confirms that there are no dependencies.

Uploading the release and beta packages

Once you have checked your dependencies, click on the Upload button. You will be prompted to give a name and version to your package. The version will be managed for you in subsequent releases. Packages are uploaded in one of two modes (beta or release). We will perform a release upload by selecting the Managed - Released option from the Release Type field, so make sure you are happy with the objects created in the earlier section of this article, as they cannot easily be changed after this point. Once you are happy with the information on the screen, click on the Upload button once again to begin the packaging process. Once the upload process completes, you will see a confirmation page.

Packages can be uploaded in one of two states, as described here:

- Release packages can be installed into subscriber production orgs and also provide an upgrade path from previous releases. The downside is that you cannot delete previously released components or change certain things, such as a field's type. Changes to components that are marked global, such as Apex code and Visualforce components, are also restricted. While Salesforce is gradually enhancing the platform to provide the ability to modify certain released aspects, you need to be certain that your application release is stable before selecting this option.
- Beta packages cannot be installed into subscriber production orgs; you can install them only into Developer Edition (such as your testing org), sandbox, or Partner Portal created orgs. Also, beta packages cannot be upgraded once installed, which is the reason why Salesforce does not permit their installation into production orgs. The key benefit is the ability to continue to change new components of the release, to address bugs and features relating to user feedback.

The ability to delete previously published components (uploaded within a release package) is in pilot. It can be enabled by raising a support case with Salesforce Support; once you have understood the full implications, they will enable it.

We have simply added some Custom Objects, so the upload should complete reasonably quickly. Note that what you're actually uploading to is AppExchange, which will be covered in the following sections. If you want to protect your package, you can provide a password (it can be changed afterwards). The user performing the installation will be prompted for it during the installation process.

Optional package dependencies

It is possible to make some Salesforce features and/or base package component references (Custom Objects and fields) an optional aspect of your application. There are two approaches to this, depending on the type of the feature.

Dynamic Apex and Visualforce

For example, the Multi-Currency feature adds a CurrencyIsoCode field to standard and Custom Objects. If you explicitly reference this field, for example in your Apex or Visualforce pages, you will incur a hard dependency on your package. If you want to avoid this and make it a configuration option (for example) in your application, you can utilize dynamic Apex and Visualforce.

Extension packages

If you wish to package component types that are only available in subscriber orgs of certain editions, you can choose to include these in extension packages.
For example, you may wish to support Professional Edition, which does not support record types. In this case, create an Enterprise Edition extension package for the parts of your application's functionality that leverage features from this edition. Note that you will need multiple testing organizations for each combination of features that you utilize in this way, to effectively test the configuration options or installation options that your application requires.

Introduction to AppExchange and listings

Salesforce provides a website referred to as AppExchange, which lets prospective customers find, try out, and install applications built using Force.com. Applications listed here can also receive ratings and feedback. You can also list your mobile applications on this site.

In this section, I will be using an AppExchange package that I already own and that has already gone through the process, to help illustrate the steps that are involved. For this reason, you do not need to perform these steps; they can be revisited at a later phase in your development, once you're happy to start promoting your application.

Once your package is known to AppExchange, each time you click on the Upload button on your released package (as described previously), you effectively create a private listing. Private listings are not visible to the public until you decide to make them so. This gives you the chance to prepare any relevant marketing details and pricing information while final testing is completed. Note that you can still distribute your package to other Salesforce users, or even early beta or pilot customers, without having to make your listing public.

In order to start building a listing, you need to log in to AppExchange using the login details you designated to your AppExchange Publishing Org (APO). Go to www.appexchange.com and click on Login in the banner at the top-right corner. This will present you with the usual Salesforce login screen.
Once logged in, select the Publishing Console option from the menu, then click on the Create New Listing button and complete the steps shown in the wizard to associate the packaging org with AppExchange; once completed, you should see it listed.

It's really important that you consistently log in to AppExchange using your APO user credentials. Salesforce will let you log in with other users. To make it easy to confirm, consider changing the user's display name to something like MyCompany Packaging.

It is not a requirement to complete the listing steps unless you want to try out the process a little further to see the type of information required; you can delete any private listings that you create afterwards.

Installing and testing your package

When you uploaded your package earlier in this article, you should have received an e-mail with a link to install the package. If not, review the Versions tab on the Package detail page in your packaging org. Ensure that you're logged out and click on the link. When prompted, log in to your testing org, and the installation process will start; click on the Continue button and follow the default installation prompts to complete the installation.

Package installation covers the following aspects (once the user has entered the package password, if one was set):

- Package overview: The platform provides the user with an overview of the components that will be added or updated (if this is an upgrade). Note that due to the namespace assigned to your package, these will not overwrite existing components created by the subscriber in the subscriber org.
- Connected App and Remote Access: If the package contains components that represent connections to services outside of Salesforce, the user is prompted to approve these.
- Approve Package API Access: If the package contains components that make use of the client API (such as JavaScript code), the user is prompted to confirm and/or configure this access. Such components will generally not be called much; features such as JavaScript Remoting are preferred, and they leverage the Apex runtime security configured post install.
- Security configuration: In this step, you can determine the initial visibility of the components being installed (objects, pages, and so on), selecting either admin-only visibility or the Profiles to be updated. This option predates the introduction of permission sets, which permit post-installation configuration. If you package profiles in your application, the user will need to remember to map these to the existing profiles in the subscriber org as per step 2. This is a one-time option, as the profiles in the package are not actually installed, only merged. I recommend that you utilize permission sets to provide security configurations for your application; these are installed and are much more granular in nature.

When the installation is complete, navigate to the Installed Packages menu option under the Setup menu. Here, you can see confirmation of some of your package details, such as namespace and version, as well as any licensing details, which will be discussed later in this article.

It is also possible to provide a Configure link for your package, which will be displayed next to the package when it is listed on the Installed Packages page in the subscriber org. Here, you can provide a Visualforce page to access configuration options and processes, for example. If you have enabled Seat based licensing, there will also be a Manage Licenses link to determine which users in the subscriber org have access to your package components, such as tabs, objects, and Visualforce pages. Licensing, in general, is discussed in more detail later in this article.
Automating package installation

It is possible to automate some of these processes using the Salesforce Metadata API and associated tools, such as the Salesforce Migration Toolkit (available from the Tools menu under Setup), which can be run from the popular Apache Ant scripting environment. This can be useful if you want to automate the deployment of your packages to customers or test orgs.

Options that require a user response, such as the security configuration, are not covered by automation. However, password-protected managed packages are supported. You can find more details on this by looking up the InstalledPackage component in the online help for the Salesforce Metadata API at https://www.salesforce.com/us/developer/docs/api_meta/.

As an aid to performing this from Ant, a custom Ant task can be found in the sample code related to this article (see /lib/ant-salesforce.xml). The following /build.xml Ant script uninstalls and reinstalls the package. Note that the installation will also upgrade a package if the package is already installed:

```xml
<project name="FormulaForce" basedir=".">

  <!-- Downloaded from Salesforce Tools page under Setup -->
  <typedef uri="antlib:com.salesforce"
           resource="com/salesforce/antlib.xml"
           classpath="${basedir}/lib/ant-salesforce.jar"/>

  <!-- Import macros to install/uninstall packages -->
  <import file="${basedir}/lib/ant-salesforce.xml"/>

  <target name="package.installdemo">

    <uninstallPackage namespace="yournamespace"
                      username="${sf.username}"
                      password="${sf.password}"/>

    <installPackage namespace="yournamespace" version="1.0"
                    username="${sf.username}"
                    password="${sf.password}"/>

  </target>

</project>
```

You can try the preceding example with your testing org by replacing the namespace attribute values with the namespace you entered earlier in this article.
Enter the following command, all on one line, from the folder that contains the build.xml file (note that the password value is the org password with the security token appended):

```shell
ant package.installdemo -Dsf.username=testorgusername -Dsf.password=testorgpasswordtestorgtoken
```

You can also use the Salesforce Metadata API to list the packages installed in an org, for example, if you wanted to determine whether a dependent package needs to be installed or upgraded before sending an installation request. Finally, you can also uninstall packages if you wish.

Becoming a Salesforce partner and benefits

The Salesforce Partner Program has many advantages. The first place to visit is http://www.salesforce.com/partners/overview. You will want to focus on the areas of the site relating to being an Independent Software Vendor (ISV) partner. From here, you can click on Join. It is free to join, though you will want to read through the various agreements carefully, of course. Once you wish to start listing a package and charging users for it, you will need to arrange billing details for Salesforce to take the various fees involved.

Pay careful attention to the Standard Objects used in your package, as this will determine the license type required by your users and the overall cost to them, in addition to your charges. Obviously, Salesforce would prefer your application to use as many features of the CRM application as possible, which may also be beneficial to you as a feature of your application, since it's an appealing immediate integration not found on other platforms, such as the ability to instantly integrate with accounts and contacts. If you're planning on using Standard Objects and are in doubt about the costs (as they do vary depending on the type), you can request a conversation with Salesforce to discuss this; this is something to keep in mind in the early stages.

Once you have completed the signup process, you will gain access to the Partner Portal (your user will end with @partnerforce.com).
You must log in to this specific site as opposed to the standard Salesforce login; currently, the URL is https://www.salesforce.com/partners/login. Starting from July 2014, the http://partners.salesforce.com URL provides access to the Partner Community; logging in to this service using your production org user credentials is recommended.

The Partner Portal is your primary place to communicate with Salesforce and also to access additional materials and announcements relevant to ISVs, so do keep checking often. You can raise cases and provide additional logins to other users in your organization, such as other developers who may wish to report issues or ask questions.

There is also the facility to create test or developer orgs; here, you can choose the appropriate edition (Professional, Group, Enterprise, and others) you want to test against. You can also create Partner Developer Edition orgs from this option. These carry additional licenses and limits over the public, so-called Single Developer Edition orgs and are thus recommended for use once you start using the Partner Portal. Note, however, that these orgs do expire, subject to either continued activity over 6 months or renewing the security review process (described in the following section) each year. Once you click on the create a test org button, there is a link on the displayed page that navigates to a table describing the benefits, processes, and expiry rules.

Security review and benefits

The following features require that a completed package release goes through a Salesforce-driven process known as the security review, which is initiated via your listing when logged in to AppExchange. Unless you plan to give your package away for free, there is a charge involved in putting your package through this process. However, the review is optional.
There is nothing stopping you from distributing your package installation URL directly. However, you will not be able to benefit from the ability to list your new application on AppExchange for others to see and review. More importantly, you will also not have access to the following features to help you deploy, license, and support your application. The following is a list of the benefits you get once your package has passed the security review:

- Bypass subscriber org setup limits: Limits such as the number of tabs and Custom Objects are bypassed. This means that if the subscriber org has reached its maximum number of Custom Objects, your package will still install. This feature is sometimes referred to as Aloha. Without this, your package installation may fail. You can determine whether Aloha has been enabled via the Subscriber Overview page that comes with the LMA application, which is discussed in the next section.
- Licensing: You are able to utilize the Salesforce-provided License Management Application in your LMO (License Management Org, as described previously).
- Subscriber support: With this feature, the users in the subscriber org can enable, for a specific period, a means for you to log in to their org (without exchanging passwords), reproduce issues, and enable much more detailed debug information such as Apex stack traces. In this mode, you can also see custom settings that you have declared as protected in your package, which are useful for enabling additional debug or advanced features.
- Push upgrade: Using this feature, you can automatically apply upgrades to your subscribers without their manual intervention, either directly by you or on a scheduled basis. You may use this for applying either smaller bug fixes that don't affect the Custom Objects or APIs, or for deploying full upgrades. The latter requires careful coordination and planning with your subscribers to ensure that changes and new features are adopted properly.
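The distinction between smaller bug fixes and full upgrades is usually reflected in the package version number. The following is a purely illustrative sketch (not a Salesforce API — the tuple-based version handling is an assumption for the example) of how you might classify an upgrade when planning a push:

```python
# Hypothetical sketch (not a Salesforce API): classify an upgrade from
# one package version to another as a patch (safe for a broad push) or
# a major upgrade (needs coordination with subscribers).

def classify_upgrade(installed, target):
    """Versions are (major, minor, patch) tuples."""
    if target <= installed:
        return "none"
    if target[:2] == installed[:2]:
        return "patch"   # only the patch component moved
    return "major"       # objects or APIs may have changed

print(classify_upgrade((1, 4, 0), (1, 4, 2)))  # patch
print(classify_upgrade((1, 4, 2), (1, 5, 0)))  # major
print(classify_upgrade((1, 5, 0), (1, 5, 0)))  # none
```

A check like this could gate which subscribers receive an automatic push and which are contacted first.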
Salesforce asks you to perform an automated security scan of your software via a web page (http://security.force.com/security/tools/forcecom/scanner). This service can be quite slow depending on how many scans are in the queue. Another option is to obtain the Eclipse plugin from the actual vendor, CheckMarx, at http://www.checkmarx.com, which runs the same scan but allows you to control it locally. Finally, for the ultimate confidence as you develop your application, Salesforce can provide a license to integrate it into your Continuous Integration (CI) build system.

Keep in mind that if you make any callouts to external services, Salesforce will also most likely ask you and/or the service provider to run a BURP scanner to check for security flaws. Make sure you plan a reasonable amount of time (at least 2–3 weeks, in my experience) to go through the security review process; it is a must in order to initially list your package, though if timing becomes an issue, you have the option of issuing your package install URL directly to initial customers and early adopters.

Licensing

Once you have completed the security review, you are able to request access to the LMA by raising support cases via the Partner Portal. Once this is provided by Salesforce, use the installation URL to install it like any other package into your LMO. If you have requested a CRM for ISVs org (through a case raised within the Partner Portal), you may find the LMA already installed. The following screenshot shows the main tabs of the License Management Application once installed:

In this section, I will use a package that I already own and have already taken through the process to help illustrate the steps involved. For this reason, you do not need to perform these steps. After completing the installation, return to AppExchange and log in. Then, locate your listing in Publisher Console under Uploaded Packages. Next to your package, there will be a Manage Licenses link.
The first time you click on this link, you will be asked to connect your package to your LMO org. Once this is done, you will be able to define the license requirements for your package. The following example shows the license for a free package, with an immediately active license for all users in the subscriber org:

In most cases, for packages that you intend to charge for, you would select a free trial rather than setting the license default to active immediately. For paid packages, select a license length, unless perhaps it's a one-off charge, in which case select the license that does not expire. Finally, if you're providing a trial license, you need to consider carefully the default number of seats (users); users may need to be able to assign themselves different roles in your application to get the full experience.

While licensing is currently expressed at the package level, it is very likely that more granular licensing around the modules or features in your package will be provided by Salesforce in the future. This will likely be driven by the Permission Sets feature. As such, keep a functional orientation in mind for your Permission Set design.

The Manage Licenses link is shown on the Installed Packages page next to your package if you configure a number of seats against the license. The administrator in the subscriber org can use this page to assign applicable users to your package. The following screenshot shows how your installed package looks to the administrator when the package has licensing enabled:

Note that you do not need to keep reapplying the license requirements for each version you upload; the last details you defined will be carried forward to new versions of your package until you change them. Either way, these details can also be completely overridden on the License page of the LMA application. You may want to apply a site-wide (org-wide) active license to extensions or add-on packages.
This allows you to at least track who has installed such packages, even though you don't intend to manage any licenses around them, since you are addressing licensing on the main package.

The Licenses tab and managing customer licenses

The Licenses tab provides a list of individual license records that are automatically generated when users install your package into their orgs. Salesforce captures this action and creates the relevant details, including Lead information, and also records the contact details of the organization and person who performed the install, as shown in the following screenshot:

From each of these records, you can modify the current license details to extend the expiry period or disable the application completely. If you do this, the package will remain installed with all of its data. However, none of the users will be able to access the objects, Apex code, or pages — not even the administrator. You can also re-enable the license at any time. The following screenshot shows the License Edit section:

The Subscribers tab

The Subscribers tab lists all your customers or subscribers (it shows their Organization Name from the company profile) that have your packages installed (only those linked via AppExchange). This includes their organization ID, edition (Developer, Enterprise, or others), and also the type of instance (sandbox or production).

The Subscriber Overview page

When you click on Organization Name from the list in this tab, you are taken to the Subscriber Overview page. This page is sometimes known as the Partner Black Tab. This page is packed with useful information, such as the contact details (also seen via the Leads tab) and the login access that may have been granted (we will discuss this in more detail in the next section), as well as which of your packages they have installed, their current licensed status, and when they were installed.
The following is a screenshot of the Subscriber Overview page:

How licensing is enforced in the subscriber org

Licensing is enforced in one of two ways, depending on the execution context from which your packaged Custom Objects, fields, and Apex code are being accessed.

The first context is where a user is interacting directly with your objects, fields, tabs, and pages via the user interface or via the Salesforce APIs (Partner and Enterprise). If the user or the organization is not licensed for your package, these will simply be hidden from view and, in the case of the API, return an error. Note that administrators can still see packaged components under the Setup menu.

The second context is access made from Apex code, such as an Apex trigger or controller, written by the customers themselves or from within another package. This indirect way of accessing your package components is permitted if the license is site (org) wide or there is at least one user in the organization who is allocated a seat. This means that even if the current user has not been assigned a seat (via the Manage Licenses link), they can still access your application's objects and code indirectly, for example, via a customer-specific utility page or an Apex trigger that automates the creation of some records or the defaulting of fields in your package.

Your application's Apex triggers (for example, the ones you might add to Standard Objects) will always execute, even if the current user does not have a seat license, as long as there is at least one user seat license assigned in the subscriber org to your package. However, if that license expires, the Apex trigger will no longer be executed by the platform until the license expiry is extended.

Providing support

Once your package has completed the security review, additional functionality for supporting your customers is enabled.
Specifically, this includes the ability to log in securely (without exchanging passwords) to their environments and debug your application. When logged in this way, you can see everything the user sees, in addition to extended Debug Logs that contain the same level of detail as they would in a developer org.

First, your customer enables access via the Grant Account Login page. This time, however, your organization (note that this is the Company Name as defined in the packaging org under Company Profile) will be listed as one of those available, in addition to Salesforce Support. The following screenshot shows the Grant Account Login page:

Next, you log in to your LMO and navigate to the Subscribers tab as described. Open Subscriber Overview for the customer, and you should now see the link to log in as that user. From this point on, you can follow the steps given to you by your customer and utilize the standard Debug Log and Developer Console tools to capture the debug information you need. The following screenshot shows a user who has been granted login access via your package to their org:

This mode of access also permits you to see protected custom settings if you have included any in your package. If you have not encountered these before, they are well worth researching, as they provide an ideal way to enable and disable debug, diagnostic, or advanced configurations that you don't want your customers to normally see.

Customer metrics

Since the Spring '14 release of the platform, Salesforce has started to expose information relating to the usage of your package components in subscriber orgs. This enables you to report which Custom Objects and Visualforce pages your customers are using and, more importantly, which they are not. This information is provided by Salesforce and cannot be opted out of by the customer. At the time of writing, this facility is in pilot and needs to be enabled by Salesforce Support.
Once enabled, the MetricsDataFile object is available in your production org and will periodically receive a data file that contains the metrics records. The Usage Metrics Visualization application can be found by searching on AppExchange and can help with visualizing this information.

Trialforce and Test Drive

Large enterprise applications often require some consultation with customers to tune and customize them to their needs after the initial package installation. If you wish to provide trial versions of your application, Salesforce provides a means to take snapshots of the results of this installation and setup process, including sample data. You can then allow prospects that visit your AppExchange listing or your website to sign up to receive a personalized instance of a Salesforce org based on the snapshot you made. The potential customers can then use this to fully explore the application for a limited duration until they sign up to become paying customers. Such orgs will eventually expire when the Salesforce trial period ends for the org created (typically 14 days). Thus, you should keep this in mind when setting the default expiry on your package licensing.

The standard approach is to offer a web form for the prospect to complete in order to obtain the trial. Review the Providing a Free Trial on your Website and Providing a Free Trial on AppExchange sections of the ISVForce Guide for more on this. You can also consider utilizing the Signup Request API, which gives you more control over how the process is started and the ability to monitor it, such that you can create the lead records yourself. You can find out more about this in the Creating Signups using the API section of the ISVForce Guide.

Alternatively, if the prospect wishes to try your package in their sandbox environment, for example, you can permit them to install the package directly, either from AppExchange or from your website.
In this case, ensure that you have defined a default expiry on your package license as described earlier. In this scenario, you or the prospect will have to perform the setup steps after installation.

Finally, there is a third option called Test Drive, which does not create a new org for the prospect on request, but does require you to set up an org with your application, preconfigure it, and then link it to your listing via AppExchange. Instead of completing a signup page, users click on the Test Drive button on your AppExchange listing. This logs them into your test drive org as a read-only user. Because this is a shared org, the user experience and features you can offer are limited to those that mainly read information. I recommend that you consider Trialforce over this option unless there is some really compelling reason to use it.

When defining your listing in AppExchange, the Leads tab can be used to configure the creation of lead records for trials, test drives, and other activities on your listing. Enabling this will result in a form being presented to the user before they access these features on your listing. If you provide access to trials through signup forms on your website, for example, lead information will not be captured.

Summary

This article has given you a practical overview of the initial package creation process, through to installing the package into another Salesforce organization. While some of the features discussed cannot be fully exercised until you're close to your first release phase, you can now head into development with a good understanding of how early decisions, such as references to Standard Objects, are critical to your licensing and cost decisions. It is also important to keep in mind that while tools such as Trialforce help automate the setup, this does not apply to installing and configuring your customer environments.
Thus, when making choices regarding configurations and defaults in your design, keep in mind the costs to the customer during the implementation cycle. Make sure you plan for the security review process in your release cycle (the free online version has limited bandwidth) and ideally integrate the scanner into your CI build system (a paid facility) as early as possible, since the tool not only monitors security flaws but also helps report breaches of best practices, such as a lack of test asserts and SOQL or DML statements in loops.

As you revisit the tools covered in this article, be sure to reference the excellent ISVForce Guide at http://www.salesforce.com/us/developer/docs/packagingGuide/index.htm for the latest detailed steps and instructions on how to access, configure, and use these features.
Creating Our First Universe

Packt
22 Sep 2014
18 min read
In this article, by Taha M. Mahmoud, the author of the book Creating Universes with SAP BusinessObjects, we will learn how to run the SAP BO Information Design Tool (IDT), and we will have an overview of the different views in the main IDT window. This will help us understand the main function and purpose of each part of the IDT main window. Then, we will use SAP BO IDT to create our first Universe.

In this article, we will create a local project to contain our Universe and the other resources related to it. After that, we will create an ODBC connection. Then, we will create a simple Data Foundation layer that will contain only one table (Customers). After that, we will create the corresponding Business layer by creating the associated business objects. The main target of this article is to make you familiar with the Universe creation process from start to end. At the end, we will talk about how to get help while creating a new Universe, using the Universe creation wizard or Cheat Sheets.

In this article, we will cover the following topics:

- Running the IDT
- Getting familiar with SAP BO IDT's interface and views
- Creating a local project and setting up a relational connection
- Creating a simple Data Foundation layer
- Creating a simple Business layer
- Publishing our first Universe
- Getting help using the Universe wizard and Cheat Sheets

Information Design Tool

The Information Design Tool is a client tool that is used to develop BO Universes. It is a new tool released by SAP in BO release 4. There are several SAP BO tools that we can use to create a Universe, such as the SAP BO Universe Designer Tool (UDT), SAP BO Universe Builder, and SAP BO IDT. The SAP BO Universe Designer has been the main tool for creating Universes since the release of BO 6.x.
This tool is still supported in the current SAP BI 4.x release, and you can still use it to create UNV Universes. You need to plan which tool you will use to build your Universe based on the target solution. For example, if you need to connect to a BEx query, you should use the UDT, as the IDT can't do this. On the other hand, if you want to create a Universe query from SAP Dashboard Designer, then you should use the IDT. The BO Universe Builder is used to build a Universe from a supported XML metadata file.

You can use the Universe conversion wizard to convert a UNV Universe created with the UDT to a UNX Universe for the IDT. Sometimes, you might get errors or warnings while converting a Universe from .unv to .unx; you need to resolve these manually. It is preferable to convert a Universe from the previous SAP BO release, XI 3.x, instead of converting a Universe from an earlier release such as BI XI R2 or BO 6.5. There will always be complete support for the previous release.

The main features of the IDT

The IDT is one of the major new features introduced in SAP BI 4.0. We can now build a Universe that combines data from multiple data sources, and also build a dimensional Universe on top of an OLAP connection. There is also a major enhancement to the design environment in the form of a multiuser development environment. This helps designers work in teams and share Universe resources, as well as maintain Universe version control. For more information on the new features introduced in the IDT, refer to the SAP Community Network at http://wiki.scn.sap.com/ and search for SAP BI 4.0 new features and changes.
The Information Design Tool interface

We need to cover the following requirements before we create our first Universe:

- BO client tools are installed on your machine, or you have access to a PC with client tools already installed
- We have access to a SAP BO server
- We have a valid username and password to connect to this server
- We have created an ODBC connection for the Northwind Microsoft Access database

Now, to run the IDT, perform the following steps:

1. Click on the Start menu and navigate to All Programs.
2. Click on the SAP BusinessObjects BI platform 4 folder to expand it.
3. Click on the Information Design Tool icon, as shown in the following screenshot:

The IDT will open, and then we can move on and create our new Universe. In this section, we will get to know the different views that we have in the IDT. We can show or hide any view from the Window menu, as shown in the following screenshot:

You can also access the same views from the main window toolbar, as displayed in the following screenshot:

Local Projects

The Local Projects view is used to navigate to and maintain local project resources, so you can edit and update any project resource, such as the relational connection, Data Foundation, and Business layers, from this view. A project is a new concept introduced in the IDT, and there is no equivalent for it in the UDT. We can see the Local Projects main window in the following screenshot:

Repository Resources

You can access more than one repository using the IDT. However, usually, we work with only one repository at a time. This view will help you initiate a session with the required repository and will keep a list of all the available repositories. You can use repository resources to access and modify the secured connections stored on the BO server. You can also manage and organize published Universes. We can see the Repository Resources main window in the following screenshot:

Security Editor

Security Editor is used to create data and business security profiles.
It can be used to apply security restrictions to BO users and groups. Security Editor is equivalent to Manage Security under Tools in the UDT. We can see the main Security Editor window in the following screenshot:

Project Synchronization

The Project Synchronization view is used to synchronize shared projects stored in the repository with your local projects. From this view, you will be able to see the differences between your local projects and shared projects, such as added, deleted, or updated project resources. Project Synchronization is one of the major enhancements introduced in the IDT to overcome the lack of a multiuser development environment in the UDT. We can see the Project Synchronization window in the following screenshot:

Check Integrity Problems

The Check Integrity Problems view is used to check the Universe's integrity; it is equivalent to Check Integrity under Tools in the UDT. Check Integrity is an automated test of your Data Foundation layer as well as your Business layer. This wizard will display any errors or warnings discovered during the test, and we need to fix them to avoid having wrong data or errors in our reports. It is part of BO best practices to always check and correct integrity problems before publishing the Universe. We can see the Check Integrity window in the following screenshot:

Creating your first Universe step by step

After we've opened the IDT, we want to start creating our NorthWind Universe. We need to create the following three main resources to build a Universe:

- Data connection: This resource is used to establish a connection with the data source. There are two main types of connections that we can create: relational connections and OLAP connections.
- Data Foundation: This resource will store the metadata, such as tables, joins, and cardinalities, for the physical layer.
- The Business layer: This resource will store the metadata for the business model. Here, we will create our business objects, such as dimensions, measures, attributes, and filters. This layer is our Universe's interface, and end users should be able to access it to build their own reports and analytics by dragging and dropping the required objects.

We need to create a local project to hold all of the preceding Universe resources. The local project is just a container that will store the Universe's contents locally on your machine. Finally, we need to publish our Universe to make it ready to be used.

Creating a new project

You can think of a project as a folder that will contain all the resources required by your Universe. Normally, we will start any Universe by creating a local project. Then, later on, we might need to share the entire project and make it available to other Universe designers and developers as well. This folder is stored locally on your machine, and you can access it at any time from the IDT Local Projects window or using the Open option from the File menu. The resources inside this project will be available only to local machine users.

Let's try to create our first local project using the following steps:

1. Go to the File menu and select New Project, or click on the New icon on the toolbar.
2. Select Project, as shown in the following screenshot:
3. The New Project creation wizard will open. Enter NorthWind in the Project Name field, and leave the Project Location field as default. Note that your project will be stored locally in this folder.
4. Click on Finish, as shown in the following screenshot:

Now, you can see the NorthWind empty project in the Local Projects window.
You can add resources to your local project by performing the following actions:

- Creating new resources
- Converting a .unv Universe
- Importing a published Universe

Creating a new data connection

A data connection will store all the information required to access a specific data source, such as the IP address, username, and password. A data connection connects to a specific type of data source, and you can use the same data connection to create multiple Data Foundation layers. There are two types of data connections: relational data connections, which are used to connect to relational databases such as Teradata and Oracle, and OLAP connections, which are used to connect to an OLAP cube. To create a data connection, we need to do the following:

1. Right-click on the NorthWind project.
2. Select a new Relational Data Connection.
3. Enter NorthWind as the connection name, and write a brief description of this connection. The best practice is to always add a description for each created object. Like code comments, descriptions will help others understand why an object has been created, how to use it, and for which purpose they should use it. We can see the first page of the New Relational Connection wizard in the following screenshot:
4. On the second page, expand the MS Access 2007 driver and select ODBC Drivers.
5. Use the NorthWind ODBC connection.
6. Click on Test Connection to make sure that the connection to the data source is successfully established.
7. Click on Next to edit the connection's advanced options, or click on Finish to use the default settings, as shown in the following screenshot:

We can see the first parameters page of the MS Access 2007 connection in the following screenshot:

You can now see the NorthWind connection under the NorthWind project in the Local Projects window. The local relational connection is stored as a .cnx file, while the shared secured connection is stored as a shortcut with the .cns extension.
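The file extensions mentioned here follow a simple pattern that also covers the other resources we create later (.dfx for a Data Foundation and .blx for a Business layer). As a purely illustrative aid (this helper is not part of any SAP tool), the mapping can be captured like this:

```python
# Illustrative helper (not part of any SAP tool): map an IDT project
# resource filename to the kind of resource it holds.

RESOURCE_KINDS = {
    ".cnx": "local relational connection",
    ".cns": "shortcut to a secured (published) connection",
    ".dfx": "Data Foundation layer",
    ".blx": "Business layer",
}

def resource_kind(filename):
    """Classify an IDT project resource by its file extension."""
    for ext, kind in RESOURCE_KINDS.items():
        if filename.lower().endswith(ext):
            return kind
    return "unknown resource"

print(resource_kind("NorthWind.cnx"))  # local relational connection
print(resource_kind("NorthWind.blx"))  # Business layer
```

Keeping this mapping in mind helps when reading a project folder at a glance, particularly the .cnx versus .cns distinction that matters when publishing.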
The local connection can be used in your local projects only, and you need to publish it to the BO repository to share it with other Universe designers.

Creating a new Data Foundation

After we have successfully created a relational connection to the Northwind Microsoft Access database, we can start creating our Data Foundation. A Data Foundation is a physical model that stores tables as well as the relations between them (joins). The Data Foundation in the IDT is equivalent to the physical data layer in the UDT. To create a new Data Foundation, right-click on the NorthWind project in the Local Projects window, select New Data Foundation, and then perform the following steps:

1. Enter NorthWind as the resource name, and enter a brief description of the NorthWind Data Foundation.
2. Select Single Source Data Foundation.
3. Select the NorthWind.cnx connection.

After that, expand the NorthWind connection, navigate to NorthWind.accdb, and perform the following steps:

1. Navigate to the Customers table and then drag it to an empty area in the Master view window on the right-hand side.
2. Save your Data Foundation. An asterisk (*) will be displayed beside the resource name to indicate that it has been modified but not saved.

We can see the Connection panel in the NorthWind.dfx Universe resource in the following screenshot:

Creating a new Business layer

Now, we will create a simple Business layer based on the Customers table that we already added to the NorthWind Data Foundation. Each Business layer maps to exactly one Data Foundation. The Business layer in the IDT is equivalent to the business model in the UDT. To create a new Business layer, right-click on the NorthWind project and then select New Business Layer from the menu. Then, we need to perform the following steps:

1. The first step to create a Business layer is to select the type of the data source that we will use.
In our case, select Relational Data Foundation, as shown in the following screenshot:

2. Enter NorthWind as the resource name and a brief description for our Business layer.
3. In the next Select Data Foundation window, select the NorthWind Data Foundation from the list. Make sure that the Automatically create folders and objects option is selected, as shown in the following screenshot:

Now, you should be able to see the Customers folder under the NorthWind Business layer. If not, just drag it from the NorthWind Data Foundation and drop it under the NorthWind Business layer. Then, save the NorthWind Business layer, as shown in the following screenshot:

A new folder is created automatically for the Customers table. This folder is also populated with the corresponding dimensions. The Business layer now needs to be published to the BO server; the end users will then be able to access it and build their own reports on top of our Universe.

If you have successfully completed all the steps from the previous sections, the project folder should contain the relational data connection (NorthWind.cnx), the Data Foundation layer (NorthWind.dfx), and the Business layer (NorthWind.blx). The project should appear as displayed in the following screenshot:

Saving and publishing the NorthWind Universe

We need to perform one last step before we publish our first simple Universe and make it available to other Universe designers: we need to publish our relational data connection and save it in the repository instead of on our local machine. Publishing a connection makes it available to everyone on the server. Before publishing the Universe, we will replace the NorthWind.cnx resource in our project with a shortcut to the NorthWind secured connection stored on the SAP BO server. After publishing a Universe, other developers as well as business users will be able to see and access it from the SAP BO repository.
Publishing a Universe from the IDT is equivalent to exporting a Universe from the UDT (navigate to File | Export). To publish the NorthWind connection, we need to right-click on the NorthWind.cnx resource in the Local Projects window. Then, select Publish Connection to a Repository. As we don't have an active session with the BO server, you will need to initiate one by performing the following steps:

1. Create a new session.
2. Type your <system name:port number> in the System field. The default port number is 6400.
3. Select the Authentication type.
4. Enter your username and password.

There are several authentication types, such as Enterprise, LDAP, Windows Active Directory (AD), and SAP. Enterprise authentication stores user security information inside the BO server, and the credential can only be used to log in to BO. LDAP, on the other hand, stores user security information in the LDAP server, and the credential can be used to log in to multiple systems; the BO server sends the user information to the LDAP server to authenticate the user, and then allows them to access the system upon successful authentication. Windows AD similarly authenticates users using the security information stored in Active Directory. We can see the Open Session window in the following screenshot:

A pop-up window will inform you about the connection status (successful here), and it will ask you whether you want to create a shortcut for this connection in the same project folder. We should select Yes in our case, because we need to link to the secured published connection instead of the local one. We will not be able to publish our Universe to the BO repository with a local connection.
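The System field expects a <server name>:<port> pair, with 6400 as the default CMS port when no port is given. The following is a small hypothetical helper (not part of any SAP tool) that models how such a value splits into host and port:

```python
# Hypothetical helper (not part of any SAP tool): split a BO "System"
# field value into host and port, defaulting to the CMS port 6400.

DEFAULT_CMS_PORT = 6400

def parse_system(system):
    """Return (host, port) for a 'host' or 'host:port' system string."""
    host, sep, port = system.partition(":")
    return host, int(port) if sep else DEFAULT_CMS_PORT

print(parse_system("boserver:8080"))  # ('boserver', 8080)
print(parse_system("boserver"))       # ('boserver', 6400)
```

This is only a model of the field's format; the IDT itself performs this interpretation for you when you open a session.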
We can see the Publish Connection window in the following screenshot:

Finally, we need to link our Data Foundation layer to the secured connection instead of the local one. To do this, open NorthWind.dfx, replace the NorthWind.cnx connection with the NorthWind.cns connection shortcut, and save your Data Foundation resource. We can see how to change the connection in NorthWind.dfx in the following screenshot:

After redirecting our Data Foundation layer to the newly created shortcut connection, go to the Local Projects window, right-click on NorthWind.blx, and navigate to Publish | To a Repository.... The Check Integrity window will be displayed; just select Finish. Our Universe will be saved on the repository with the same name assigned to the Business layer. Congratulations! We have created our first Universe.

Finding help while creating a Universe

In most cases, you will use the step-by-step approach to create a Universe. However, there are two other ways to create a Universe. In this section, we will try to create the NorthWind Universe again, using the Universe wizard and Cheat Sheets.

The Universe wizard

The Universe wizard is simply a wizard that launches the project, connection, Data Foundation, and Business layer wizards in a sequence. We already explained each wizard individually in an earlier section. Each wizard collects the required information to create the associated Universe resource; for example, the project wizard ends after collecting the information required to create a project, and the project folder is created as an output. The Universe wizard launches all the mentioned wizards and ends after collecting all the information required to create the Universe. A Universe with all the required resources will be created after finishing this wizard. The Universe wizard is equivalent to the Quick Design wizard in the UDT.
You can open the Universe wizard from the welcome screen or from the File menu. As a practice, we can create the NorthWind2 Universe using the Universe wizard.

The Universe wizard and welcome screen are new features in SAP BO 4.1.

Cheat Sheets

Cheat Sheets is another way of getting help while you are building your Universe. They provide step-by-step guidance and detailed descriptions that will help you create your relational Universe. We need to perform the following steps to use Cheat Sheets to build the NorthWind3 Universe, which is exactly the same as the NorthWind Universe that we created earlier in the step-by-step approach:

1. Go to the Help menu and select Cheat Sheets.
2. Follow the steps in the Cheat Sheets window to create the NorthWind3 Universe using the same information that we used to complete the NorthWind Universe.
3. If you face any difficulties in completing any steps, just click on the Click to perform button to guide you. Click on the Click when completed link to move to the next step.

Cheat Sheets is a new help method introduced in the IDT, and there is no equivalent for it in the UDT. We can see the Cheat Sheets window in the following screenshot:

Summary

In this article, we discussed the difference between IDT views, and we tried to get familiar with the IDT user interface. Then, we had an overview of the Universe creation process from start to end. In real-life project environments, the first step is to create a local project to hold all the related Universe resources. Then, we initiated the project by adding the main three resources that are required by each Universe. These resources are the data connection, Data Foundation, and Business layer. After that, we published our Universe to make it available to other Universe designers and users. This is done by publishing our data connection first and then by redirecting our foundation layer to refer to a shortcut for the shared secured published connection.
At this point, we will be able to publish and share our Universe. We also learned how to use the Universe wizard and Cheat Sheets to create a Universe.

Resources for Article:

Further resources on this subject:
- Report Data Filtering [Article]
- Exporting SAP BusinessObjects Dashboards into Different Environments [Article]
- SAP BusinessObjects: Customizing the Dashboard [Article]
Packt
22 Sep 2014
18 min read

Handling Long-running Requests in Play

In this article by Julien Richard-Foy, author of Play Framework Essentials, we will dive into the framework internals and explain how to leverage its reactive programming model to manipulate data streams. (For more resources related to this topic, see here.)

Firstly, I would like to mention that the code called by controllers must be thread-safe. We also noticed that the result of calling an action has the type Future[Result] rather than just Result. This article explains these subtleties and answers questions such as "How are concurrent requests processed by Play applications?" More precisely, this article presents the challenges of stream processing and the way the Play framework solves them. You will learn how to consume, produce, and transform data streams in a non-blocking way using the Iteratee library. Then, you will leverage these skills to stream results and push real-time notifications to your clients. By the end of the article, you will be able to do the following:

- Produce, consume, and transform streams of data
- Process a large request body chunk by chunk
- Serve HTTP chunked responses
- Push real-time notifications using WebSockets or server-sent events
- Manage the execution context of your code

Play application's execution model

The streaming programming model provided by Play has been influenced by the execution model of Play applications, which itself has been influenced by the nature of the work a web application performs. So, let's start from the beginning: what does a web application do? For now, our example application does the following: the HTTP layer invokes some business logic via the service layer, and the service layer does some computations by itself and also calls the database layer. It is worth noting that in our configuration, the database system runs on the same machine as the web application, but this is not a requirement.
In fact, there are chances that in real-world projects, your database system is decoupled from your HTTP layer and that both run on different machines. It means that while a query is executed on the database, the web layer does nothing but wait for the response. Actually, the HTTP layer is often waiting for some response coming from another system; it could, for example, retrieve some data from an external web service, or the business layer itself could be located on a remote machine. Decoupling the HTTP layer from the business layer or the persistence layer gives a finer control on how to scale the system (more details about that are given further in this article). Anyway, the point is that the HTTP layer may essentially spend time waiting. With that in mind, consider the following diagram showing how concurrent requests could be executed by a web application using a threaded execution model, that is, a model where each request is processed in its own thread.

Threaded execution model

Several clients (shown on the left-hand side in the preceding diagram) perform queries that are processed by the application's controller. On the right-hand side of the controller, the figure shows an execution thread corresponding to each action's execution. The filled rectangles represent the time spent performing computations within a thread (for example, for processing data or computing a result), and the lines represent the time waiting for some remote data. Each action's execution is distinguished by a particular color. In this fictive example, the action handling the first request may execute a query to a remote database, hence the line (illustrating that the thread waits for the database result) between the two pink rectangles (illustrating that the action performs some computation before querying the database and after getting the database result).
The action handling the third request may perform a call to a distant web service and then a second one, after the response of the first one has been received; hence, the two lines between the green rectangles. And the action handling the last request may perform a call to a distant web service that streams a response of an infinite size; hence, the multiple lines between the purple rectangles. The problem with this execution model is that each request requires the creation of a new thread. Threads have an overhead at creation, because they consume memory (essentially because each thread has its own stack), and during execution, when the scheduler switches contexts. However, we can see that these threads spend a lot of time just waiting. If we could use the same thread to process another request while the current action is waiting for something, we could avoid the creation of threads, and thus save resources. This is exactly what the execution model used by Play—the evented execution model—does, as depicted in the following diagram:

Evented execution model

Here, the computation fragments are executed on two threads only. Note that the same action can have its computation fragments run by different threads (for example, the pink action). Also note that several threads are still in use; that's why the code must be thread-safe. The time spent waiting between computing things is the same as before, and you can see that the time required to completely process a request is about the same as with the threaded model (for instance, the second pink rectangle ends at the same position as in the earlier figure, same for the third green rectangle, and so on). A comparison between the threaded and evented models can be found in the master's thesis of Benjamin Erb, Concurrent Programming for Scalable Web Architectures, 2012. An online version is available at http://berb.github.io/diploma-thesis/.
An attentive reader may think that I have cheated; the rectangles in the second figure are often thinner than their equivalents in the first figure. That's because, in the first model, there is an overhead for scheduling threads and, above all, even if you have a lot of threads, your machine still has a limited number of cores effectively executing the code of your threads. More precisely, if you have more threads than your number of cores, you necessarily have threads in an idle state (that is, waiting). This means that if we suppose that the machine executing the application has only two cores, in the first figure, there is even time spent waiting in the rectangles!

Scaling up your server

The previous section raises the question of how to handle a higher number of concurrent requests, as depicted in the following diagram:

A server under an increasing load

The previous section explained how to avoid wasting resources to leverage the computing power of your server. But actually, there is no magic; if you want to compute even more things per unit of time, you need more computing power, as depicted in the following diagram:

Scaling using more powerful hardware

One solution could be to have a more powerful server. But you could be smarter than that and avoid buying expensive hardware by studying the shape of the workload and making appropriate decisions at the software level. Indeed, there are chances that your workload varies a lot over time, with peaks and holes of activity. This information suggests that if you wanted to buy more powerful hardware, its performance characteristics would be driven by your highest activity peak, even if it occurs very occasionally. Obviously, this solution is not optimal because you would buy expensive hardware even if you actually needed it only one percent of the time (and more powerful hardware often also means more power-consuming hardware).
A better way to handle the workload elasticity consists of adding or removing server instances according to the activity level, as depicted in the following diagram:

Scaling using several server instances

This architecture design allows you to finely (and dynamically) tune your server capacity according to your workload. That's actually the cloud computing model. Nevertheless, this architecture has a major implication on your code; you cannot assume that subsequent requests issued by the same client will be handled by the same server instance. In practice, it means that you must treat each request independently of the others; you cannot, for instance, store a counter on a server instance to count the number of requests issued by a client (your server would miss some requests if one is routed to another server instance). In a nutshell, your server has to be stateless. Fortunately, Play is stateless, so as long as you don't explicitly have a mutable state in your code, your application is stateless. Note that the first implementation I gave of the shop was not stateless; indeed, the state of the application was stored in the server's memory.

Embracing non-blocking APIs

In the first section of this article, I claimed the superiority of the evented execution model over the threaded execution model, in the context of web servers. That being said, to be fair, the threaded model has an advantage over the evented model: it is simpler to program with. Indeed, in such a case, the framework is responsible for creating the threads and the JVM is responsible for scheduling the threads, so that you don't even have to think about this at all, yet your code is concurrently executed. On the other hand, with the evented model, concurrency control is explicit and you should care about it. Indeed, the fact that the same execution thread is used to run several concurrent actions has an important implication on your code: it should not block the thread.
Indeed, while the code of an action is executed, no other action code can be concurrently executed on the same thread. What does blocking mean? It means holding a thread for too long a duration. It typically happens when you perform a heavy computation or wait for a remote response. However, we saw that these cases, especially waiting for remote responses, are very common in web servers, so how should you handle them? You have to wait in a non-blocking way or implement your heavy computations as incremental computations. In all the cases, you have to break down your code into computation fragments, where the execution is managed by the execution context. In the diagram illustrating the evented execution model, computation fragments are materialized by the rectangles. You can see that rectangles of different colors are interleaved; you can find rectangles of another color between two rectangles of the same color. However, by default, the code you write forms a single block of execution instead of several computation fragments. It means that, by default, your code is executed sequentially; the rectangles are not interleaved! This is depicted in the following diagram:

Evented execution model running blocking code

The previous figure still shows both the execution threads. The second one handles the blue action and then the purple infinite action, so that all the other actions can only be handled by the first execution context. This figure illustrates the fact that while the evented model can potentially be more efficient than the threaded model, it can also have negative consequences on the performances of your application: infinite actions block an execution thread forever, and the sequential execution of actions can lead to much longer response times. So, how can you break down your code into blocks that can be managed by an execution context?
In Scala, you can do so by wrapping your code in a Future block:

Future {
  // This is a computation fragment
}

The Future API comes from the standard Scala library. For Java users, Play provides a convenient wrapper named play.libs.F.Promise:

Promise.promise(() -> {
  // This is a computation fragment
});

Such a block is a value of type Future[A] or, in Java, Promise<A> (where A is the type of the value computed by the block). We say that these blocks are asynchronous because they break the execution flow; you have no guarantee that the block will be sequentially executed before the following statement. When the block is effectively evaluated depends on the execution context implementation that manages it. The role of an execution context is to schedule the execution of computation fragments. In the figure showing the evented model, the execution context consists of a thread pool containing two threads (represented by the two lines under the rectangles). Actually, each time you create an asynchronous value, you have to supply the execution context that will manage its evaluation. In Scala, this is usually achieved using an implicit parameter of type ExecutionContext. You can, for instance, use an execution context provided by Play that consists, by default, of a thread pool with one thread per processor:

import play.api.libs.concurrent.Execution.Implicits.defaultContext

In Java, this execution context is automatically used by default, but you can explicitly supply another one:

Promise.promise(() -> { ... }, myExecutionContext);

Now that you know how to create asynchronous values, you need to know how to manipulate them. For instance, a sequence of several Future blocks is concurrently executed; how do we define an asynchronous computation depending on another one?
You can eventually schedule a computation after an asynchronous value has been resolved using the foreach method:

val futureX = Future { 42 }
futureX.foreach(x => println(x))

In Java, you can perform the same operation using the onRedeem method:

Promise<Integer> futureX = Promise.promise(() -> 42);
futureX.onRedeem((x) -> System.out.println(x));

More interestingly, you can eventually transform an asynchronous value using the map method:

val futureIsEven = futureX.map(x => x % 2 == 0)

The map method exists in Java too:

Promise<Boolean> futureIsEven = futureX.map((x) -> x % 2 == 0);

If the function you use to transform an asynchronous value returned an asynchronous value too, you would end up with an inconvenient Future[Future[A]] value (or a Promise<Promise<A>> value, in Java). So, use the flatMap method in that case:

val futureIsEven = futureX.flatMap(x => Future { x % 2 == 0 })

The flatMap method is also available in Java:

Promise<Boolean> futureIsEven = futureX.flatMap((x) -> Promise.promise(() -> x % 2 == 0));

The foreach, map, and flatMap functions (or their Java equivalents) all have in common that they set a dependency between two asynchronous values; the computation they take as the parameter is always evaluated after the asynchronous computation they are applied to. Another method that is worth mentioning is zip:

val futureXY: Future[(Int, Int)] = futureX.zip(futureY)

The zip method is also available in Java:

Promise<Tuple<Integer, Integer>> futureXY = futureX.zip(futureY);

The zip method returns an asynchronous value eventually resolved to a tuple containing the two resolved asynchronous values. It can be thought of as a way to join two asynchronous values without specifying any execution order between them.
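These combinator shapes are not specific to Play's Future and Promise types: Java 8's standard CompletableFuture exposes the same pattern, with thenApply playing the role of map, thenCompose the role of flatMap, and thenCombine the role of zip. The following self-contained sketch (plain JDK, no Play dependency) shows the three combinators at work:

```java
import java.util.concurrent.CompletableFuture;

public class FutureCombinators {
    public static void main(String[] args) {
        CompletableFuture<Integer> futureX = CompletableFuture.supplyAsync(() -> 42);
        CompletableFuture<Integer> futureY = CompletableFuture.supplyAsync(() -> 8);

        // thenApply ~ map: transform the eventual value
        CompletableFuture<Boolean> futureIsEven = futureX.thenApply(x -> x % 2 == 0);

        // thenCompose ~ flatMap: chain a computation that is itself asynchronous,
        // avoiding a nested CompletableFuture<CompletableFuture<Integer>>
        CompletableFuture<Integer> doubled =
            futureX.thenCompose(x -> CompletableFuture.supplyAsync(() -> x * 2));

        // thenCombine ~ zip: join two independent asynchronous values
        CompletableFuture<Integer> sum = futureX.thenCombine(futureY, (x, y) -> x + y);

        // join() blocks until resolution; it is only used here to print the results
        System.out.println(futureIsEven.join()); // true
        System.out.println(doubled.join());      // 84
        System.out.println(sum.join());          // 50
    }
}
```

As with Play's promises, nothing here runs on the caller's thread: each supplyAsync fragment is scheduled on a thread pool (the common ForkJoinPool by default), and only the final join() calls wait for the results.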
If you want to join more than two asynchronous values, you can use the zip method several times (for example, futureX.zip(futureY).zip(futureZ).zip(…)), but an alternative is to use the Future.sequence function:

val futureXs: Future[Seq[Int]] =
  Future.sequence(Seq(futureX, futureY, futureZ, …))

This function transforms a sequence of future values into a future sequence value. In Java, this function is named Promise.sequence. In the preceding descriptions, I always used the word eventually, and there is a reason for that. Indeed, if we use an asynchronous value to manipulate a result sent by a remote machine (such as a database system or a web service), the communication may eventually fail due to some technical issue (for example, if the network is down). For this reason, asynchronous values have error recovery methods; for example, the recover method:

futureX.recover { case NonFatal(e) => y }

The recover method is also available in Java:

futureX.recover((throwable) -> y);

The previous code resolves futureX to the value of y in the case of an error. Libraries performing remote calls (such as an HTTP client or a database client) return such asynchronous values when they are implemented in a non-blocking way. You should always be careful whether the libraries you use are blocking or not and keep in mind that, by default, Play is tuned to be efficient with non-blocking APIs. It is worth noting that JDBC is blocking. It means that the majority of Java-based libraries for database communication are blocking. Obviously, once you get a value of type Future[A] (or Promise<A>, in Java), there is no way to get the A value unless you wait (and block) for the value to be resolved. We saw that the map and flatMap methods make it possible to manipulate the future A value, but you still end up with a Future[SomethingElse] value (or a Promise<SomethingElse>, in Java).
It means that if your action's code calls an asynchronous API, it will end up with a Future[Result] value rather than a Result value. In that case, you have to use Action.async instead of Action, as illustrated in this typical code example:

val asynchronousAction = Action.async { implicit request =>
  service.asynchronousComputation().map(result => Ok(result))
}

In Java, there is nothing special to do; simply make your method return a Promise<Result> object:

public static Promise<Result> asynchronousAction() {
  return service.asynchronousComputation().map((result) -> ok(result));
}

Managing execution contexts

Because Play uses explicit concurrency control, controllers are also responsible for using the right execution context to run their action's code. Generally, as long as your actions do not invoke heavy computations or blocking APIs, the default execution context should work fine. However, if your code is blocking, it is recommended to use a distinct execution context to run it.

An application with two execution contexts (represented by the black and grey arrows). You can specify in which execution context each action should be executed, as explained in this section.

Unfortunately, there is no non-blocking standard API for relational database communication (JDBC is blocking). It means that all our actions that invoke code executing database queries should be run in a distinct execution context so that the default execution context is not blocked. This distinct execution context has to be configured according to your needs. In the case of JDBC communication, your execution context should be a thread pool with as many threads as your maximum number of connections. The following diagram illustrates such a configuration:

The preceding diagram shows two execution contexts, each with two threads. The execution context at the top of the figure runs database code, while the default execution context (on the bottom) handles the remaining (non-blocking) actions.
In practice, it is convenient to use Akka to define your execution contexts, as they are easily configurable. Akka is a library used for building concurrent, distributed, and resilient event-driven applications. This article assumes that you have some knowledge of Akka; if that is not the case, do some research on it. Play integrates Akka and manages an actor system that follows your application's life cycle (that is, it is started and shut down with the application). For more information on Akka, visit http://akka.io. Here is how you can create an execution context with a thread pool of 10 threads, in your application.conf file:

jdbc-execution-context {
  thread-pool-executor {
    core-pool-size-factor = 10.0
    core-pool-size-max = 10
  }
}

You can use it as follows in your code:

import play.api.libs.concurrent.Akka
import play.api.Play.current

implicit val jdbc =
  Akka.system.dispatchers.lookup("jdbc-execution-context")

The Akka.system expression retrieves the actor system managed by Play. Then, the execution context is retrieved using Akka's API. The equivalent Java code is the following:

import play.libs.Akka;
import akka.dispatch.MessageDispatcher;
import play.core.j.HttpExecutionContext;

MessageDispatcher jdbc =
  Akka.system().dispatchers().lookup("jdbc-execution-context");

Note that controllers retrieve the current request's information from a thread-local static variable, so you have to attach it to the execution context's thread before using it from a controller's action:

play.core.j.HttpExecutionContext.fromThread(jdbc)

Finally, forcing the use of a specific execution context for a given action can be achieved as follows (provided that my.execution.context is an implicit execution context):

import my.execution.context

val myAction = Action.async {
  Future { … }
}

The Java equivalent code is as follows:

public static Promise<Result> myAction() {
  return Promise.promise(
    () -> { … },
    HttpExecutionContext.fromThread(myExecutionContext));
}

Does this feel like clumsy code?
Buy the book to learn how to reduce the boilerplate!

Summary

This article detailed a lot of things on the internals of the framework. You now know that Play uses an evented execution model to process requests and serve responses, and that it implies that your code should not block the execution thread. You know how to use future blocks and promises to define computation fragments that can be concurrently managed by Play's execution context, and how to define your own execution context with a different threading policy, for example, if you are constrained to use a blocking API.

Resources for Article:

Further resources on this subject:
- Play! Framework 2 – Dealing with Content [article]
- So, what is Play? [article]
- Play Framework: Introduction to Writing Modules [article]
Packt
22 Sep 2014
18 min read

Adding Real-time Functionality Using Socket.io

In this article by Amos Q. Haviv, the author of MEAN Web Development, we will see how Socket.io enables Node.js developers to support real-time communication using WebSockets in modern browsers and legacy fallback protocols in older browsers. (For more resources related to this topic, see here.)

Introducing WebSockets

Modern web applications such as Facebook, Twitter, or Gmail are incorporating real-time capabilities, which enable the application to continuously present the user with recently updated information. Unlike traditional applications, in real-time applications the common roles of browser and server can be reversed, since the server needs to update the browser with new data regardless of the browser's request state. This means that unlike the common HTTP behavior, the server won't wait for the browser's requests. Instead, it will send new data to the browser whenever this data becomes available. This reverse approach is often called Comet, a term coined by a web developer named Alex Russell back in 2006 (the term was a word play on the AJAX term; both Comet and AJAX are common household cleaners in the US). In the past, there were several ways to implement a Comet functionality using the HTTP protocol. The first and easiest way is XHR polling. In XHR polling, the browser makes periodic requests to the server. The server then returns an empty response unless it has new data to send back. Upon a new event, the server will return the new event data to the next polling request. While this works quite well for most browsers, this method has two problems. The most obvious one is that it generates a large number of requests that hit the server for no particular reason, since a lot of requests return empty. The second problem is that the update time depends on the request period. This means that new data will only get pushed to the browser on the next request, causing delays in updating the client state.
To solve these issues, a better approach was introduced: XHR long polling. In XHR long polling, the browser makes an XHR request to the server, but a response is not sent back unless the server has a new data. Upon an event, the server responds with the event data and the browser makes a new long polling request. This cycle enables a better management of requests, since there is only a single request per session. Furthermore, the server can update the browser immediately with new information, without having to wait for the browser's next request. Because of its stability and usability, XHR long polling has become the standard approach for real-time applications and was implemented in various ways, including Forever iFrame, multipart XHR, JSONP long polling using script tags (for cross-domain, real-time support), and the common long-living XHR. However, all these approaches were actually hacks using the HTTP and XHR protocols in a way they were not meant to be used. With the rapid development of modern browsers and the increased adoption of the new HTML5 specifications, a new protocol emerged for implementing real-time communication: the full duplex WebSockets. In browsers that support the WebSockets protocol, the initial connection between the server and browser is made over HTTP and is called an HTTP handshake. Once the initial connection is made, the browser and server open a single ongoing communication channel over a TCP socket. Once the socket connection is established, it enables bidirectional communication between the browser and server. This enables both parties to send and retrieve messages over a single communication channel. This also helps to lower server load, decrease message latency, and unify PUSH communication using a standalone connection. However, WebSockets still suffer from two major problems. First and foremost is browser compatibility. 
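The hold-then-respond cycle described above can be captured in a few lines. The following dependency-free sketch simulates the server-side logic in-process (no real HTTP; createLongPollServer, poll, and publish are illustrative names, not part of any library): a poll is answered immediately if an event is already queued, and is otherwise held open until the next event arrives.

```javascript
// In-process sketch of the long-polling hold/resume logic.
function createLongPollServer() {
  var waiting = null;   // the held poll's response callback, if any
  var queue = [];       // events that arrived while no poll was pending
  return {
    poll: function(respond) {             // the client's long-poll request
      if (queue.length > 0) respond(queue.shift());
      else waiting = respond;             // hold the "response" open
    },
    publish: function(event) {            // the server-side event source
      if (waiting) {
        var respond = waiting;
        waiting = null;
        respond(event);                   // resolve the held request immediately
      } else {
        queue.push(event);
      }
    }
  };
}

// One request per event: the client re-polls as soon as a response arrives.
var server = createLongPollServer();
var received = [];
function clientPoll() { server.poll(function(event) { received.push(event); }); }

clientPoll();                 // request held open: no data yet
server.publish('update-1');   // delivered immediately, no polling delay
clientPoll();
server.publish('update-2');
console.log(received);        // [ 'update-1', 'update-2' ]
```

Note how each event costs exactly one request/response pair, and an event that arrives while a poll is pending is delivered immediately rather than on the next polling interval, which addresses both problems of plain XHR polling.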
The WebSockets specification is fairly new, so older browsers don't support it, and though most modern browsers now implement the protocol, a large group of users are still using these older browsers. The second problem is HTTP proxies, firewalls, and hosting providers. Since WebSockets use a different communication protocol than HTTP, a lot of these intermediaries don't support it yet and block any socket communication. As it has always been with the Web, developers are left with a fragmentation problem, which can only be solved using an abstraction library that optimizes usability by switching between protocols according to the available resources. Fortunately, a popular library called Socket.io was already developed for this purpose, and it is freely available to the Node.js developer community.

Introducing Socket.io

Created in 2010 by JavaScript developer Guillermo Rauch, Socket.io aimed to abstract Node.js real-time application development. Since then, it has evolved dramatically, going through nine major releases before being broken, in its latest version, into two different modules: Engine.io and Socket.io. Previous versions of Socket.io were criticized for being unstable, since they first tried to establish the most advanced connection mechanisms and then fell back to more primitive protocols. This caused serious issues with using Socket.io in production environments and posed a threat to the adoption of Socket.io as a real-time library. To solve this, the Socket.io team redesigned it and wrapped the core functionality in a base module called Engine.io. The idea behind Engine.io was to create a more stable real-time module, which first opens a long-polling XHR communication and then tries to upgrade the connection to a WebSockets channel. The new version of Socket.io uses the Engine.io module and provides the developer with various features, such as events, rooms, and automatic connection recovery, which you would otherwise have to implement by yourself.
In this article's examples, we will use the new Socket.io 1.0, which is the first version to use the Engine.io module. Versions of Socket.io prior to 1.0 do not use the new Engine.io module and are therefore much less stable in production environments. When you include the Socket.io module, it provides you with two objects: a socket server object that is responsible for the server functionality and a socket client object that handles the browser's functionality. We'll begin by examining the server object.

The Socket.io server object

The Socket.io server object is where it all begins. You start by requiring the Socket.io module, and then use it to create a new Socket.io server instance that will interact with socket clients. The server object supports both a standalone implementation and the ability to use it in conjunction with the Express framework. The server instance then exposes a set of methods that allow you to manage the Socket.io server operations. Once the server object is initialized, it will also be responsible for serving the socket client JavaScript file for the browser. A simple implementation of the standalone Socket.io server will look as follows:

var io = require('socket.io')();
io.on('connection', function(socket){ /* ... */ });
io.listen(3000);

This will open a Socket.io server on port 3000 and serve the socket client file at the URL http://localhost:3000/socket.io/socket.io.js. Implementing the Socket.io server in conjunction with an Express application will be a bit different:

var app = require('express')();
var server = require('http').Server(app);
var io = require('socket.io')(server);
io.on('connection', function(socket){ /* ... */ });
server.listen(3000);

This time, you first use the http module of Node.js to create a server and wrap the Express application. The server object is then passed to the Socket.io module and serves both the Express application and the Socket.io server.
Once the server is running, it will be available for socket clients to connect to. A client trying to establish a connection with the Socket.io server will start by initiating the handshaking process.

Socket.io handshaking

When a client wants to connect to the Socket.io server, it will first send a handshake HTTP request. The server will then analyze the request to gather the necessary information for ongoing communication. It will then look for configuration middleware that is registered with the server and execute it before firing the connection event. When the client is successfully connected to the server, the connection event listener is executed, exposing a new socket instance. Once the handshaking process is over, the client is connected to the server and all communication with it is handled through the socket instance object. For example, handling a client's disconnection event will be as follows:

var app = require('express')();
var server = require('http').Server(app);
var io = require('socket.io')(server);
io.on('connection', function(socket){
  socket.on('disconnect', function() {
    console.log('user has disconnected');
  });
});
server.listen(3000);

Notice how the socket.on() method adds an event handler to the disconnection event. Although the disconnection event is a predefined event, this approach works the same for custom events as well, as you will see in the following sections. While the handshake mechanism is fully automatic, Socket.io does provide you with a way to intercept the handshake process using a configuration middleware.

The Socket.io configuration middleware

Although the Socket.io configuration middleware existed in previous versions, in the new version it is even simpler and allows you to manipulate socket communication before the handshake actually occurs.
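Conceptually, a chain of configuration middleware gates the handshake: each registered function either lets it continue or aborts it. The following is a rough sketch of that decision flow, not Socket.io's actual internals. The next callback here is simplified to take only an error argument, and the checkToken middleware with its token property is invented purely for illustration.

```javascript
// Runs each middleware in order; any middleware that passes an error
// to next() aborts the chain, otherwise the handshake proceeds.
function runMiddleware(middleware, socket, done) {
  function run(i) {
    if (i === middleware.length) return done(null);  // all passed: connect
    middleware[i](socket, function next(err) {
      if (err) return done(err);                     // abort the handshake
      run(i + 1);
    });
  }
  run(0);
}

// Example middleware: reject sockets whose handshake request carries
// no token. The "token" property is hypothetical.
function checkToken(socket, next) {
  if (socket.request.token) {
    next();
  } else {
    next(new Error('unauthorized'));
  }
}
```

Running runMiddleware([checkToken], socket, done) calls done(null) for a socket whose request has a token and done(err) otherwise, which mirrors the accept-or-reject decision Socket.io makes before firing the connection event.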
To create a configuration middleware, you will need to use the server's use() method, which is very similar to the Express application's use() method:

var app = require('express')();
var server = require('http').Server(app);
var io = require('socket.io')(server);
io.use(function(socket, next) {
  /* ... */
  next(null, true);
});
io.on('connection', function(socket){
  socket.on('disconnect', function() {
    console.log('user has disconnected');
  });
});
server.listen(3000);

As you can see, the io.use() method callback accepts two arguments: the socket object and a next callback. The socket object is the same socket object that will be used for the connection, and it holds some connection properties. One important property is the socket.request property, which represents the handshake HTTP request. In the following sections, you will use the handshake request to incorporate the Passport session with the Socket.io connection. The next argument is a callback method that accepts two arguments: an error object and a Boolean value. The next callback tells Socket.io whether or not to proceed with the handshake process, so if you pass an error object or a false value to the next method, Socket.io will not initiate the socket connection. Now that you have a basic understanding of how handshaking works, it is time to discuss the Socket.io client object.

The Socket.io client object

The Socket.io client object is responsible for the implementation of the browser socket communication with the Socket.io server. You start by including the Socket.io client JavaScript file, which is served by the Socket.io server. The Socket.io JavaScript file exposes an io() method that connects to the Socket.io server and creates the client socket object. A simple implementation of the socket client will be as follows:

<script src="/socket.io/socket.io.js"></script>
<script>
  var socket = io();
  socket.on('connect', function() {
    /* ... */
  });
</script>

Notice the default URL for the Socket.io client object.
Although this can be altered, you can usually leave it like this and just include the file from the default Socket.io path. Another thing you should notice is that the io() method will automatically try to connect to the default base path when executed with no arguments; however, you can also pass a different server URL as an argument. As you can see, the socket client is much easier to implement, so we can move on to discuss how Socket.io handles real-time communication using events.

Socket.io events

To handle the communication between the client and the server, Socket.io uses a structure that mimics the WebSockets protocol and fires event messages across the server and client objects. There are two types of events: system events, which indicate the socket connection status, and custom events, which you'll use to implement your business logic. The system events on the socket server are as follows:

- io.on('connection', ...): This is emitted when a new socket is connected
- socket.on('message', ...): This is emitted when a message is sent using the socket.send() method
- socket.on('disconnect', ...): This is emitted when the socket is disconnected

The system events on the client are as follows:

- socket.io.on('open', ...): This is emitted when the socket client opens a connection with the server
- socket.io.on('connect', ...): This is emitted when the socket client is connected to the server
- socket.io.on('connect_timeout', ...): This is emitted when the socket client connection with the server is timed out
- socket.io.on('connect_error', ...): This is emitted when the socket client fails to connect with the server
- socket.io.on('reconnect_attempt', ...): This is emitted when the socket client tries to reconnect with the server
- socket.io.on('reconnect', ...): This is emitted when the socket client is reconnected to the server
- socket.io.on('reconnect_error', ...): This is emitted when the socket client fails to reconnect with the server
- socket.io.on('reconnect_failed', ...):
This is emitted when the socket client fails to reconnect with the server
- socket.io.on('close', ...): This is emitted when the socket client closes the connection with the server

Handling events

While system events help us with connection management, the real magic of Socket.io lies in using custom events. In order to do so, Socket.io exposes two methods, both on the client and server objects. The first method is the on() method, which binds event handlers to events, and the second method is the emit() method, which is used to fire events between the server and client objects. An implementation of the on() method on the socket server is very simple:

var app = require('express')();
var server = require('http').Server(app);
var io = require('socket.io')(server);
io.on('connection', function(socket){
  socket.on('customEvent', function(customEventData) {
    /* ... */
  });
});
server.listen(3000);

In the preceding code, you bound an event listener to the customEvent event. The event handler is called when the socket client object emits the customEvent event. Notice how the event handler accepts the customEventData argument that is passed to the event handler from the socket client object. An implementation of the on() method on the socket client is also straightforward:

<script src="/socket.io/socket.io.js"></script>
<script>
  var socket = io();
  socket.on('customEvent', function(customEventData) {
    /* ... */
  });
</script>

This time the event handler is called when the socket server emits the customEvent event that sends customEventData to the socket client event handler. Once you set your event handlers, you can use the emit() method to send events from the socket server to the socket client and vice versa.

Emitting events

On the socket server, the emit() method is used to send events to a single socket client or a group of connected socket clients.
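Under the hood, the on()/emit() pairing used in all these snippets is the classic event-emitter pattern. As a rough in-memory sketch of the idea (not the real socket object, which also carries each event across the network), a minimal emitter looks like this:

```javascript
// Minimal event emitter: on() registers handlers per event name,
// emit() invokes every handler registered for that name.
function createEmitter() {
  var handlers = {};
  return {
    on: function (event, fn) {
      (handlers[event] = handlers[event] || []).push(fn);
    },
    emit: function (event, data) {
      (handlers[event] || []).forEach(function (fn) { fn(data); });
    }
  };
}
```

A handler bound with on('customEvent', ...) fires whenever emit('customEvent', data) is called, just like in the snippets above; the real server and client socket objects add the network hop in between.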
The emit() method can be called from the connected socket object, which will send the event to a single socket client, as follows:

io.on('connection', function(socket){
  socket.emit('customEvent', customEventData);
});

The emit() method can also be called from the io object, which will send the event to all connected socket clients, as follows:

io.on('connection', function(socket){
  io.emit('customEvent', customEventData);
});

Another option is to send the event to all connected socket clients except the sender, using the broadcast property, as shown in the following lines of code:

io.on('connection', function(socket){
  socket.broadcast.emit('customEvent', customEventData);
});

On the socket client, things are much simpler. Since the socket client is only connected to the socket server, the emit() method will only send the event to the socket server:

var socket = io();
socket.emit('customEvent', customEventData);

Although these methods allow you to switch between personal and global events, they still lack the ability to send events to a group of connected socket clients. Socket.io offers two options to group sockets together: namespaces and rooms.

Socket.io namespaces

In order to easily control socket management, Socket.io allows developers to split socket connections according to their purpose using namespaces. So instead of creating different socket servers for different connections, you can just use the same server to create different connection endpoints. This means that socket communication can be divided into groups, which will then be handled separately.

Socket.io server namespaces

To create a socket server namespace, you will need to use the socket server's of() method, which returns a socket namespace.
Once you obtain the socket namespace, you can just use it the same way you use the socket server object:

var app = require('express')();
var server = require('http').Server(app);
var io = require('socket.io')(server);
io.of('/someNamespace').on('connection', function(socket){
  socket.on('customEvent', function(customEventData) {
    /* ... */
  });
});
io.of('/someOtherNamespace').on('connection', function(socket){
  socket.on('customEvent', function(customEventData) {
    /* ... */
  });
});
server.listen(3000);

In fact, when you use the io object, Socket.io actually uses a default empty namespace, as follows:

io.on('connection', function(socket){
  /* ... */
});

The preceding lines of code are actually equivalent to this:

io.of('').on('connection', function(socket){
  /* ... */
});

Socket.io client namespaces

On the socket client, the implementation is a little different:

<script src="/socket.io/socket.io.js"></script>
<script>
  var someSocket = io('/someNamespace');
  someSocket.on('customEvent', function(customEventData) {
    /* ... */
  });
  var someOtherSocket = io('/someOtherNamespace');
  someOtherSocket.on('customEvent', function(customEventData) {
    /* ... */
  });
</script>

As you can see, you can use multiple namespaces in the same application without much effort. However, once sockets are connected to different namespaces, you will not be able to send an event to all these namespaces at once. This means that namespaces are not very good for a more dynamic grouping logic. For this purpose, Socket.io offers a different feature called rooms.

Socket.io rooms

Socket.io rooms allow you to partition connected sockets into different groups in a dynamic way. Connected sockets can join and leave rooms, and Socket.io provides you with a clean interface to manage rooms and emit events to the subset of sockets in a room. The rooms functionality is handled solely on the socket server but can easily be exposed to the socket client.
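Before looking at the API, it helps to see what rooms amount to on the server: plain bookkeeping of which sockets belong to which named group. The following sketch mimics joining, leaving, and emitting to a room while optionally excluding the sender, as broadcast.to() does. The names here are invented for illustration; this is not Socket.io's internal implementation.

```javascript
// Room registry: maps each room name to the list of member sockets.
function createRooms() {
  var rooms = {};
  return {
    join: function (room, socket) {
      (rooms[room] = rooms[room] || []).push(socket);
    },
    leave: function (room, socket) {
      var members = rooms[room] || [];
      var i = members.indexOf(socket);
      if (i !== -1) members.splice(i, 1);
    },
    emitTo: function (room, event, data, except) {
      (rooms[room] || []).forEach(function (socket) {
        if (socket !== except) socket.emit(event, data);  // skip the sender
      });
    }
  };
}
```

With this registry, emitTo('someRoom', 'customEvent', data, sender) reaches every member of the room except the sender, which is the behavior the real broadcast.to() call provides.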
Joining and leaving rooms

Joining a room is handled using the socket's join() method, while leaving a room is handled using the leave() method. So, a simple subscription mechanism can be implemented as follows:

io.on('connection', function(socket) {
  socket.on('join', function(roomData) {
    socket.join(roomData.roomName);
  });
  socket.on('leave', function(roomData) {
    socket.leave(roomData.roomName);
  });
});

Notice that the join() and leave() methods both take the room name as the first argument.

Emitting events to rooms

To emit events to all the sockets in a room, you will need to use the in() method. So, emitting an event to all socket clients who joined a room is quite simple and can be achieved with the help of the following code snippet:

io.on('connection', function(socket){
  io.in('someRoom').emit('customEvent', customEventData);
});

Another option is to send the event to all connected socket clients in a room except the sender, by using the broadcast property and the to() method:

io.on('connection', function(socket){
  socket.broadcast.to('someRoom').emit('customEvent', customEventData);
});

This pretty much covers the simple yet powerful room functionality of Socket.io. In the next section, you will learn how to implement Socket.io in your MEAN application and, more importantly, how to use the Passport session to identify users in the Socket.io session. While we covered most of Socket.io's features, you can learn more about Socket.io by visiting the official project page at https://socket.io.

Summary

In this article, you learned how the Socket.io module works. You went over the key features of Socket.io and learned how the server and client communicate. You configured your Socket.io server and learned how to integrate it with your Express application. You also used the Socket.io handshake configuration to integrate the Passport session.
In the end, you built a fully functional chat example and learned how to wrap the Socket.io client with an AngularJS service.
Visualization as a Tool to Understand Data

Packt
22 Sep 2014
23 min read
In this article by Nazmus Saquib, the author of Mathematica Data Visualization, we will look at a few simple examples that demonstrate the importance of data visualization. We will then discuss the types of datasets that we will encounter over the course of this book, and learn about the Mathematica interface to get ourselves warmed up for coding. (For more resources related to this topic, see here.) In the last few decades, the quick growth in the volume of information we produce and the capacity of digital information storage have opened a new door for data analytics. We have moved on from the age of terabytes to that of petabytes and exabytes. Traditional data analysis is now augmented with the term big data analysis, and computer scientists are pushing the bounds for analyzing this huge sea of data using statistical, computational, and algorithmic techniques. Along with the size, the types and categories of data have also evolved. Along with the typical and popular data domain in Computer Science (text, image, and video), graphs and various categorical data that arise from Internet interactions have become increasingly interesting to analyze. With the advances in computational methods and computing speed, scientists nowadays produce an enormous amount of numerical simulation data that has opened up new challenges in the field of Computer Science. Simulation data tends to be structured and clean, whereas data collected or scraped from websites can be quite unstructured and hard to make sense of. For example, let's say we want to analyze some blog entries in order to find out which blogger gets more follows and referrals from other bloggers. This is not as straightforward as getting some friends' information from social networking sites. Blog entries consist of text and HTML tags; thus, a combination of text analytics and tag parsing, coupled with a careful observation of the results would give us our desired outcome. 
Regardless of whether the data is simulated or empirical, the key word here is observation. In order to make intelligent observations, data scientists tend to follow a certain pipeline. The data needs to be acquired and cleaned to make sure that it is ready to be analyzed using existing tools. Analysis may take the route of visualization, statistics, and algorithms, or a combination of any of the three. Inference and refining the analysis methods based on the inference is an iterative process that needs to be carried out several times until we think that a set of hypotheses is formed, or a clear question is asked for further analysis, or a question is answered with enough evidence. Visualization is a very effective and perceptive method to make sense of our data. While statistics and algorithmic techniques provide good insights about data, an effective visualization makes it easy for anyone with little training to gain beautiful insights about their datasets. The power of visualization resides not only in the ease of interpretation, but it also reveals visual trends and patterns in data, which are often hard to find using statistical or algorithmic techniques. It can be used during any step of the data analysis pipeline—validation, verification, analysis, and inference—to aid the data scientist. How have you visualized your data recently? If you still have not, it is okay, as this book will teach you exactly that. However, if you had the opportunity to play with any kind of data already, I want you to take a moment and think about the techniques you used to visualize your data so far. Make a list of them. Done? Do you have 2D and 3D plots, histograms, bar charts, and pie charts in the list? If yes, excellent! We will learn how to style your plots and make them more interactive using Mathematica. Do you have chord diagrams, graph layouts, word cloud, parallel coordinates, isosurfaces, and maps somewhere in that list? 
If yes, then you are already familiar with some modern visualization techniques, but if you have not had the chance to use Mathematica as a data visualization language before, we will explore how visualization prototypes can be built seamlessly in this software using very little code. The aim of this book is to teach a Mathematica beginner the data-analysis and visualization powerhouse built into Mathematica, and at the same time, familiarize the reader with some of the modern visualization techniques that can be easily built with Mathematica. We will learn how to load, clean, and dissect different types of data, visualize the data using Mathematica's built-in tools, and then use the Mathematica graphics language and interactivity functions to build prototypes of a modern visualization. The importance of visualization Visualization has a broad definition, and so does data. The cave paintings drawn by our ancestors can be argued as visualizations as they convey historical data through a visual medium. Map visualizations were commonly used in wars since ancient times to discuss the past, present, and future states of a war, and to come up with new strategies. Astronomers in the 17th century were believed to have built the first visualization of their statistical data. In the 18th century, William Playfair invented many of the popular graphs we use today (line, bar, circle, and pie charts). Therefore, it appears as if many, since ancient times, have recognized the importance of visualization in giving some meaning to data. To demonstrate the importance of visualization in a simple mathematical setting, consider fitting a line to a given set of points. Without looking at the data points, it would be unwise to try to fit them with a model that seemingly lowers the error bound. It should also be noted that sometimes, the data needs to be changed or transformed to the correct form that allows us to use a particular tool. 
Visualizing the data points ensures that we do not fall into any trap. The following screenshot shows the visualization of a polynomial as a circle:

Figure 1.1: Fitting a polynomial

In figure 1.1, the points are distributed around a circle. Imagine we are given these points in a Cartesian space (orthogonal x and y coordinates), and we are asked to fit a simple linear model. There is not much benefit if we try to fit these points to any polynomial in a Cartesian space; what we really need to do is change the parameter space to polar coordinates. A 1-degree polynomial in polar coordinate space (essentially a circle) would nicely fit these points when they are converted to polar coordinates, as shown in figure 1.1. Visualizing the data points in more complicated but similar situations can save us a lot of trouble. The following is a screenshot of Anscombe's quartet:

Figure 1.2: Anscombe's quartet, generated using Mathematica

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/2999OT_coloredimages.PDF.

Anscombe's quartet (figure 1.2), named after the statistician Francis Anscombe, is a classic example of how simple data visualization like plotting can save us from making wrong statistical inferences. The quartet consists of four datasets that have nearly identical statistical properties (such as mean, variance, and correlation), and give rise to the same linear model when a regression routine is run on them. However, the second dataset does not really constitute a linear relationship; a spline would fit the points better. The third dataset (at the bottom-left corner of figure 1.2) actually has a different regression line, but the outlier exerts enough influence to force the same regression line on the data.
The fourth dataset is not even a linear relationship, but the outlier enforces the same regression line again. These two examples demonstrate the importance of "seeing" our data before we blindly run algorithms and statistics. Fortunately, for visualization scientists like us, the world of data types is quite vast. Every now and then, this gives us the opportunity to create new visual tools other than the traditional graphs, plots, and histograms. These visual signatures and tools serve the same purpose as the graph-plotting examples we just saw—spy and investigate data to infer valuable insights—but on different types of datasets other than just point clouds. Another important use of visualization is to enable the data scientist to interactively explore the data. Two features make today's visualization tools very attractive—the ability to view data from different perspectives (viewing angles) and at different resolutions. These features facilitate the investigator in understanding both the micro- and macro-level behavior of their dataset.

Types of datasets

There are many different types of datasets that a visualization scientist encounters in their work. This book's aim is to prepare an enthusiastic beginner to delve into the world of data visualization. Certainly, we will not comprehensively cover each and every visualization technique out there. Our aim is to learn to use Mathematica as a tool to create interactive visualizations. To achieve that, we will focus on a general classification of datasets that will determine which Mathematica functions and programming constructs we should learn in order to visualize the broad class of data covered in this book.

Tables

The table is one of the most common data structures in Computer Science.
You might have already encountered this in a computer science, database, or even statistics course, but for the sake of completeness, we will describe the ways in which one could use this structure to represent different kinds of data. Consider the following table as an example:

         | Attribute 1 | Attribute 2 | …
Item 1   |             |             |
Item 2   |             |             |
Item 3   |             |             |

When storing datasets in tables, each row in the table represents an instance of the dataset, and each column represents an attribute of that data point. For example, a set of two-dimensional Cartesian vectors can be represented as a table with two attributes, where each row represents a vector, and the attributes are the x and y coordinates relative to an origin. For three-dimensional vectors or more, we could just increase the number of attributes accordingly. Tables can be used to store more advanced forms of scientific, time series, and graph data. We will cover some of these datasets over the course of this book, so it is a good idea for us to get introduced to them now. Here, we explain the general concepts.

Scalar fields

There are many kinds of scientific datasets out there. In order to aid their investigations, scientists have created their own data formats and mathematical tools to analyze the data. Engineers have also developed their own visualization language in order to convey ideas in their community. In this book, we will cover a few typical datasets that are widely used by scientists and engineers. We will eventually learn how to create molecular visualizations and biomedical dataset exploration tools when we feel comfortable manipulating these datasets. In practice, multidimensional data (just like vectors in the previous example) is usually augmented with one or more characteristic variable values. As an example, let's think about how a physicist or an engineer would keep track of the temperature of a room.
In order to tackle the problem, they would begin by measuring the geometry and the shape of the room, and put temperature sensors at certain places to measure the temperature. They will note the exact positions of those sensors relative to the room's coordinate system, and then, they will be all set to start measuring the temperature. Thus, the temperature of a room can be represented, in a discrete sense, by using a set of points that represent the temperature sensor locations and the actual temperature at those points. We immediately notice that the data is multidimensional in nature (the location of a sensor can be considered as a vector), and each data point has a scalar value associated with it (temperature). Such a discrete representation of multidimensional data is quite widely used in the scientific community. It is called a scalar field. The following screenshot shows the representation of a scalar field in 2D and 3D:

Figure 1.3: In practice, scalar fields are discrete and ordered

Figure 1.3 depicts how one would represent an ordered scalar field in 2D or 3D. Each point in the 2D field has a well-defined x and y location, and a single temperature value gets associated with it. To represent a 3D scalar field, we can think of it as a set of 2D scalar field slices placed at a regular interval along the third dimension. Each point in the 3D field is a point that has {x, y, z} values, along with a temperature value. A scalar field can be represented using a table. We will denote each {x, y} point (for 2D) or {x, y, z} point values (for 3D) as a row, but this time, an additional attribute for the scalar value will be created in the table. Thus, a row will have the attributes {x, y, z, T}, where T is the temperature associated with the point defined by the x, y, and z coordinates. This is the most common representation of scalar fields. A widely used visualization technique to analyze scalar fields is to find out the isocontours or isosurfaces of interest.
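To ground the {x, y, z, T} representation, here is a small illustrative example (the book itself works in Mathematica; the JavaScript here, the sensor values, and the nearIso helper are all invented for illustration). Selecting the rows whose scalar value lies close to a target is the discrete idea behind extracting an isocontour:

```javascript
// A tiny scalar field stored as a table of {x, y, z, T} rows.
var field = [
  { x: 0, y: 0, z: 0, T: 18.5 },
  { x: 1, y: 0, z: 0, T: 20.1 },
  { x: 0, y: 1, z: 0, T: 19.9 },
  { x: 1, y: 1, z: 0, T: 23.4 }
];

// Rows whose scalar value lies within tol of the isovalue.
function nearIso(rows, iso, tol) {
  return rows.filter(function (p) { return Math.abs(p.T - iso) <= tol; });
}
```

Here nearIso(field, 20, 0.2) picks out the two sensor points whose temperature is within 0.2 of 20, a crude stand-in for the isocontour extraction that real visualization tools perform by interpolation between sample points.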
However, for now, let's take a look at the kinds of application areas such analysis will enable one to pursue. Instead of temperature, one could think of associating regularly spaced points with any relevant scalar value to form problem-specific scalar fields. In an electrochemical simulation, it is important to keep track of the charge density in the simulation space. Thus, the chemist would create a scalar field with charge values at specific points. For an aerospace engineer, it is quite important to understand how air pressure varies across airplane wings; they would keep track of the pressure by forming a scalar field of pressure values. Scalar field visualization is very important in many other significant areas, ranging from biomedical analysis to particle physics. In this book, we will cover how to visualize this type of data using Mathematica.

Time series

Another widely used data type is the time series. A time series is a sequence of data points, usually measured over a uniform interval of time. Time series arise in many fields, but in today's world, they are mostly known for their applications in Economics and Finance. Other than these, they are frequently used in statistics, weather prediction, signal processing, astronomy, and so on. It is not the purpose of this book to describe the theory and mathematics of time series data. However, we will cover some of Mathematica's excellent capabilities for visualizing time series, and in the course of this book, we will construct our own visualization tool to view time series data. Time series can be easily represented using tables. Each row of the time series table will represent one point in the series, with one attribute denoting the timestamp (the time at which the data point was recorded) and the other attribute storing the actual data value. If the starting time and the time interval are known, then we can get rid of the time attribute and simply store the data value in each row.
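When the time attribute is dropped this way, each timestamp is implicit in the row's position. A small illustrative sketch (the function name and values are invented):

```javascript
// Reconstructs the implicit timestamps of an evenly spaced series:
// row i was recorded at startTime + i * interval.
function timestamps(startTime, interval, values) {
  return values.map(function (v, i) {
    return { time: startTime + i * interval, value: v };
  });
}
```

For example, timestamps(1000, 10, [2.5, 2.7, 2.6]) assigns the readings to times 1000, 1010, and 1020, recovering the full two-attribute table from the compact single-attribute form.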
The actual timestamp of each value can be calculated using the initial time and time interval. Images and videos can be represented as tables too, with pixel-intensity values occupying each entry of the table. As we focus on visualization and not image processing, we will skip those types of data.

Graphs

Nowadays, graphs arise in all contexts of computer science and social science. This particular data structure provides a way to convert real-world problems into a set of entities and relationships. Once we have a graph, we can use a plethora of graph algorithms to find beautiful insights about the dataset. Technically, a graph can be stored as a table. However, Mathematica has its own graph data structure, so we will stick to its norm. Sometimes, visualizing the graph structure reveals quite a lot of hidden information. Graph visualization itself is a challenging problem, and is an active research area in computer science. A proper visualization layout, along with proper color maps and size distribution, can produce very useful outputs.

Text

The most common form of data that we encounter everywhere is text. Mathematica does not provide any specific visualization package for state-of-the-art text visualization methods.

Cartographic data

As mentioned before, map visualization is one of the ancient forms of visualization known to us. Nowadays, with the advent of GPS, smartphones, and publicly available country-based data repositories, maps are providing an excellent way to contrast and compare different countries, cities, or even communities. Cartographic data comes in various forms. A common form of a single data item is one that includes latitude, longitude, location name, and an attribute (usually numerical) that records a relevant quantity. However, instead of a latitude and longitude coordinate, we may be given a set of polygons that describe the geographical shape of the place.
The attributable quantity may not be numerical, but rather something qualitative, like text. Thus, there is really no standard form that one can expect when dealing with cartographic data. Fortunately, Mathematica provides us with excellent data-mining and dissecting capabilities to build custom formats out of the data available to us.

Mathematica as a tool for visualization

At this point, you might be wondering why Mathematica is suited for visualizing all the kinds of datasets that we have mentioned in the preceding examples. There are many excellent tools and packages out there to visualize data. Mathematica is quite different from other languages and packages because of the unique set of capabilities it presents to its user. Mathematica has its own graphics language, with which graphics primitives can be interactively rendered inside the worksheet. This makes Mathematica's capability similar to many widely used visualization languages. Mathematica provides a plethora of functions to combine these primitives and make them interactive. Speaking of interactivity, Mathematica provides a suite of functions to interactively display any of its processes. Not only visualization, but any function or code evaluation can be interactively visualized. This is particularly helpful when managing and visualizing big datasets. Mathematica provides many packages and functions to visualize the kinds of datasets we have mentioned so far. We will learn to use the built-in functions to visualize structured and unstructured data. These functions include point, line, and surface plots; histograms; standard statistical charts; and so on. Other than these, we will learn to use the advanced functions that will let us build our own visualization tools. Another interesting feature is the built-in datasets that this software provides to its users. This feature provides a nice playground for the user to experiment with different datasets and visualization functions.
From our discussion so far, we have learned that visualization tools are used to analyze very large datasets. While Mathematica is not really suited for dealing with petabytes or exabytes of data (and many other popularly used visualization tools are not suited for that either), often, one needs to build quick prototypes of such visualization tools using smaller sample datasets. Mathematica is very well suited to prototype such tools because of its efficient and fast data-handling capabilities, along with its loads of convenient functions and user-friendly interface. It also supports GPU and other high-performance computing platforms. Although it is not within the scope of this book, a user who knows how to harness the computing power of Mathematica can couple that knowledge with visualization techniques to build custom big data visualization solutions. Another feature that Mathematica presents to a data scientist is the ability to keep the workflow within one worksheet. In practice, many data scientists tend to do their data analysis with one package, visualize their data with another, and export and present their findings using something else. Mathematica provides a complete suite of a core language, mathematical and statistical functions, a visualization platform, and versatile data import and export features inside a single worksheet. This helps the user focus on the data instead of irrelevant details. By now, I hope you are convinced that Mathematica is worth learning for your data-visualization needs. If you still do not believe me, I hope I will be able to convince you again at the end of the book, when we will be done developing several visualization prototypes, each requiring only a few lines of code!

Getting started with Mathematica

We will need to know a few basic Mathematica notebook essentials. Assuming you already have Mathematica installed on your computer, let's open a new notebook by navigating to File|New|Notebook, and do the following experiments.
Creating and selecting cells

In Mathematica, a chunk of code or any number of mathematical expressions can be written within a cell. Each cell in the notebook can be evaluated to see the output immediately below it. To start a new cell, simply start typing at the position of the blinking cursor. Each cell can be selected by clicking on the respective rightmost bracket. To select multiple cells, press Ctrl + right-mouse button in Windows or Linux (or cmd + right-mouse button on a Mac) on each of the cells. The following screenshot shows several cells selected together, along with the output from each cell:

Figure 1.4: Selecting and evaluating cells in Mathematica

We can place a new cell in between any set of cells in order to change the sequence of instruction execution. Use the mouse to place the cursor in between two cells, and start typing your commands to create a new cell. We can also cut, copy, and paste cells by selecting them and applying the usual shortcuts (for example, Ctrl + C, Ctrl + X, and Ctrl + V in Windows/Linux, or cmd + C, cmd + X, and cmd + V in Mac) or using the Edit menu bar. In order to delete cell(s), select the cell(s) and press the Delete key.

Evaluating a cell

A cell can be evaluated by pressing Shift + Enter. Multiple cells can be selected and evaluated in the same way. To evaluate the full notebook, press Ctrl + A (to select all the cells) and then press Shift + Enter. In this case, the cells will be evaluated one after the other in the sequence in which they appear in the notebook. To see examples of notebooks filled with commands, code, and mathematical expressions, you can open the notebooks supplied with this article, which are the polar coordinates fitting and Anscombe's quartet examples, and select each cell (or all of them) and evaluate them. If we evaluate a cell that uses variables declared in a previous cell, and the previous cell was not already evaluated, then we may get errors.
It is possible that Mathematica will treat the unevaluated variables as a symbolic expression; in that case, no error will be displayed, but the results will not be numeric anymore.

Suppressing output from a cell

If we don't wish to see the intermediate output as we load data or assign values to variables, we can add a semicolon (;) at the end of each line that we want to leave out from the output.

Cell formatting

Mathematica input cells treat everything inside them as mathematical and/or symbolic expressions. By default, every new cell you create by typing at the horizontal cursor will be an input expression cell. However, you can convert the cell to other formats for convenient typesetting. In order to change the format of cell(s), select the cell(s) and navigate to Format|Style from the menu bar, and choose a text format style from the available options. You can add mathematical symbols to your text by selecting Palettes|Basic Math Assistant. Note that evaluating a text cell will have no effect/output.

Commenting

We can write any comment in a text cell as it will be ignored during the evaluation of our code. However, if we would like to write a comment inside an input cell, we use the (* operator to open a comment and the *) operator to close it, as shown in the following code snippet:

(* This is a comment *)

The shortcut Ctrl + / (cmd + / in Mac) is used to comment/uncomment a chunk of code too. This operation is also available in the menu bar.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Aborting evaluation

We can abort the currently running evaluation of a cell by navigating to Evaluation|Abort Evaluation in the menu bar, or simply by pressing Alt + . (period).
This is useful when you want to end a time-consuming process that you suddenly realize will not give you the correct results at the end of the evaluation, or end a process that might use up the available memory and shut down the Mathematica kernel.

Further reading

The history of visualization deserves a separate book, as it is really fascinating how the field has matured over the centuries, and it is still growing very strongly. Michael Friendly, from York University, published a historical development paper that is freely available online, titled Milestones in History of Data Visualization: A Case Study in Statistical Historiography. This is an entertaining compilation of the history of visualization methods. The book The Visual Display of Quantitative Information by Edward R. Tufte, published by Graphics Press USA, is an excellent resource and a must-read for every data visualization practitioner. This is a classic book on the theory and practice of data graphics and visualization. Since we will not have the space to discuss the theory of visualization, the interested reader can consider reading this book for deeper insights.

Summary

In this article, we discussed the importance of data visualization in different contexts. We also introduced the types of dataset that will be visualized over the course of this book. The flexibility and power of Mathematica as a visualization package was discussed, and we will see the demonstration of these properties throughout the book with beautiful visualizations. Finally, we have taken the first step to writing code in Mathematica.

Resources for Article:

Further resources on this subject:

- Driving Visual Analyses with Automobile Data (Python) [article]
- Importing Dynamic Data [article]
- Interacting with Data for Dashboards [article]
Packt
19 Sep 2014
11 min read

Mobility

In this article by Martyn Coupland, author of the book Microsoft System Center Configuration Manager Advanced Deployments, we will explore some of these options and look at how they can help you manage mobility in your workforce. We'll cover the following topics:

- Deploying company resource profiles
- Managing roaming devices
- Integrating the Microsoft Exchange connector
- Using Windows Intune

(For more resources related to this topic, see here.)

Deploying company resource profiles

One of the improvements that shipped with the R2 release of Configuration Manager 2012 was the ability to deploy company resources such as Wi-Fi profiles, certificates, and VPN profiles. This functionality really opened up the management story for organizations that already have a big take-up of bring your own device or have mobility in their long-term strategy. You do not need Windows Intune to deploy company resource profiles. The company resource profiles are really useful in extending some of the services that you provide to domain-based clients using Group Policy. Some examples of this include deploying VPN and Wi-Fi profiles to domain clients using Group Policy preferences. As you cannot deploy a group policy to non-domain-joined devices, it becomes really useful to manage and deploy these via Configuration Manager. Another great use case for company resource profiles is deploying certificates. Configuration Manager includes the functionality to allow managed clients to have certificates enrolled to them. This can include those resources that rarely or never contact the domain. This scenario is becoming more common, so it is important that we have the capability to deploy these settings to users without relying on the domain.

Managing Wi-Fi profiles with Configuration Manager

The deployment of Wi-Fi profiles in Configuration Manager is very similar to that of a manual setup.
The wizard provides you with the same options that you would expect to see should you configure the network manually within Windows. You can also configure a number of security settings, such as certificates for client and server authentication. You can configure the following device types with Wi-Fi profiles:

- Windows 8.1 32-bit
- Windows 8.1 64-bit
- Windows RT 8.1
- Windows Phone 8.1
- iOS 5, iOS 6, and iOS 7
- Android devices that run Version 4

Configuring a Wi-Fi network profile in Configuration Manager is a simple process that is wizard-driven. First, in the Assets and Compliance workspace, expand Compliance Settings and Company Resource Access, and then click on Wi-Fi Profiles. Right-click on the node and select Create Wi-Fi Profile, or select this option from the home tab on the ribbon. On the general page of the wizard, provide a name for the profile. If required, you can add a description here as well. If you have exported the settings from a Windows 8.1 device, you can import them here as well. Click on Next. On the Wi-Fi Profile page, you need to provide information about the network you want to connect to. Network Name is what is displayed on the users' devices, so it should be friendly for them. You also need to enter the SSID of the network. Make sure this is entered correctly, as clients will use this to attempt to connect to the network. You can also specify other settings here, as you can in Windows, such as whether to connect if the network is not broadcasting or while the network is in range. Click on Next to continue to the security configuration page. Depending on the security, encryption, and Extensible Authentication Protocol (EAP) settings that you select, some items on this page of the wizard might not be available. As shown in the previous screenshot, the settings you configure here replicate those that you can configure in Windows when manually connecting to the network.
On the Advanced Settings page of Create Wi-Fi Profile Wizard, specify any additional settings for the Wi-Fi profile. These can include the authentication mode, single sign-on options, and Federal Information Processing Standards (FIPS) compliance. If you require any proxy settings, you can also configure these on the next page, as well as providing information on which platforms should process this profile. When the profile has been created, you can then right-click on the profile to deploy it to a collection.

Managing certificates with Configuration Manager

Deploying a certificate profile in Configuration Manager is actually a little quicker than creating a Wi-Fi profile. However, before you move on to deploying a certificate, you need some prerequisites in your environment. First, you need to deploy the Network Device Enrollment Service (NDES), which is part of the Certificate Services functionality in Windows Server. You can find guidance on deploying NDES in the Active Directory TechNet library at http://bit.ly/1kjpgxD. You must then install and configure at least one certificate registration point in the Configuration Manager hierarchy; you can install this site system role in the central administration site or in a primary site. In the preceding screenshot, you can see the configuration screen in the wizard to deploy the certificate enrollment point in Configuration Manager. For the URL, enter the address in the https://<FQDN>/certsrv/mscep/mscep.dll format. For the root certificate, you should browse for the certificate file of your certificate authority. If you are using certificates in Configuration Manager, this will be the same certificate that you imported in the Client Communication tab in Site Settings.
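As a quick illustration of the URL formats above, both the NDES enrollment address and the certificate registration point address are simply composed from the site system FQDN. The following sketch is ours, purely for illustration; the helper names are hypothetical and not part of Configuration Manager:

```javascript
// Illustrative helpers only (hypothetical names): composing the NDES
// enrollment URL and the certificate registration point URL described above.
function ndesEnrollmentUrl(fqdn) {
  // Format from the text: https://<FQDN>/certsrv/mscep/mscep.dll
  return 'https://' + fqdn + '/certsrv/mscep/mscep.dll';
}

function registrationPointUrl(fqdn, virtualApp) {
  // The default virtual application name is CMCertificateRegistration
  virtualApp = virtualApp || 'CMCertificateRegistration';
  return 'https://' + fqdn + '/' + virtualApp;
}

console.log(ndesEnrollmentUrl('scep1.contoso.com'));
// https://scep1.contoso.com/certsrv/mscep/mscep.dll
console.log(registrationPointUrl('scep1.contoso.com'));
// https://scep1.contoso.com/CMCertificateRegistration
```

Keeping the FQDN in one place like this makes it harder to mistype the two addresses when you fill in the wizards.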
When this is configured on the server that runs the NDES, log on as a domain administrator and copy the following files from the <ConfigMgrInstallationMedia>\SMSSETUP\POLICYMODULE\X64 folder on the Configuration Manager installation media to a folder on your server:

- PolicyModule.msi
- PolicyModuleSetup.exe

On the Certificate Registration Point page, specify the URL of the certificate registration point and the virtual application name. The default virtual application name is CMCertificateRegistration. For example, if the site system server has an FQDN of scep1.contoso.com and you used the default virtual application name, specify https://scep1.contoso.com/CMCertificateRegistration.

Creating certificate profiles

Click on Certificate Profiles in the Assets and Compliance workspace under the Compliance Settings folder. On the General page, provide the name and description of the profile, and then provide information about the type of certificate that you want to deploy. Select the trusted CA certificate profile type if you want to deploy a trusted root certification authority (CA) or intermediate CA certificate; for example, you might want to deploy your own internal CA certificate to your own workgroup devices managed by Configuration Manager. Select the SCEP certificate profile type if you want to request a certificate for a user or device using the Simple Certificate Enrollment Protocol and the Network Device Enrollment Service role service. You will be presented with different settings depending on the option that you specify. If you select SCEP, then you will be asked about the number of retries and storage information about the TPM. You can find specific information about each of the settings in the TechNet library at http://bit.ly/1n5CtZF.
Configuring a trusted CA certificate is much simpler; provide the certificate settings and the destination store, as shown in the following screenshot. When you have finished configuring your certificate profile, select the supported platforms for the profile and continue through the wizard to create the profile. When it has been created, you can right-click on the profile to deploy it to a collection.

Managing VPN profiles with Configuration Manager

At a high level, the process to create VPN profiles is the same as creating Wi-Fi profiles; no prerequisites, such as deploying certificates, are required. Click on VPN Profiles in the Assets and Compliance workspace under the Compliance Settings folder. Create a new VPN profile, and on the initial screen, provide simple information about the profile. The following table provides an overview of which connection types are supported on which device:

Connection type               | iOS | Windows 8.1 | Windows RT | Windows RT 8.1 | Windows Phone 8.1
Cisco AnyConnect              | Yes | No          | No         | No             | No
Juniper Pulse                 | Yes | Yes         | No         | Yes            | Yes
F5 Edge Client                | Yes | Yes         | No         | Yes            | Yes
Dell SonicWALL Mobile Connect | Yes | Yes         | No         | Yes            | Yes
Check Point Mobile VPN        | Yes | Yes         | No         | Yes            | Yes
Microsoft SSL (SSTP)          | No  | Yes         | Yes        | Yes            | No
Microsoft Automatic           | No  | Yes         | Yes        | Yes            | No
IKEv2                         | No  | Yes         | Yes        | Yes            | Yes
PPTP                          | Yes | Yes         | Yes        | Yes            | No
L2TP                          | Yes | Yes         | Yes        | Yes            | No

Specific options will be required depending on which technology you choose from the drop-down list. Ensure that the settings are specified, and move on to the profile information in the authentication method. If you require proxy settings with your VPN profile, then specify these settings on the Proxy Settings page of the wizard. See the following screenshot for an example of this screen. Continue through the wizard and select the supported platforms for the profile. When the profile is created, you can right-click on the profile and select Deploy.
Managing Internet-based devices

We have already looked at deploying certain company resources to those clients with which we have very little connectivity on a regular basis. We can use Configuration Manager to manage these devices just like domain-based clients over the Internet. This scenario works really well when the clients do not use VPN or DirectAccess, or when we do not deploy a remote access solution for our remote users. This is where we can use Configuration Manager to manage clients using Internet-based client management (IBCM).

How Internet-based client management works

We have the ability to manage Internet-based clients in Configuration Manager by deploying certain site system roles in the DMZ. By doing this, we make the management point, distribution point, and software update point Internet-facing and configure clients to connect to them while on the Internet. With these measures in place, we now have the ability to manage clients that are on the Internet, extending our management capabilities.

Functionality in Internet-based client management

In general, functionality will not be supported for Internet-based client management when it relies on network functionality that is not appropriate on a public network or on some kind of communication with Active Directory. The following is not supported for Internet-based clients:

- Client push and software-update-based client deployment
- Automatic site assignment
- Network access protection
- Wake-On-LAN
- Operating system deployment
- Remote control
- Out-of-band management

Software distribution for users is only supported when the Internet-based management point can authenticate the user in Active Directory using Windows authentication.

Requirements for Internet-based client management

In terms of requirements, the list is fairly short, but depending on your current setup, this might take a while to set up.
The first requirement seems fairly obvious: any site system server or client must have Internet connectivity. This might mean some firewall configuration, depending on your setup. A public key infrastructure (PKI) is also required. It must be able to deploy and manage certificates for clients that are on the Internet and site systems that are Internet-based. This does not mean deploying certificates over the public Internet. The following information can help you plan and deploy Internet-based client management in your environment:

- Planning for Internet-based client management (http://bit.ly/1p1qtsU)
- Planning for certificates (http://bit.ly/1kj9PFr)
- PKI certificate requirements (http://bit.ly/1hssMFM)

Using Internet-based client management

As the administrator, you have no additional concerns and requirements in terms of how you manage your clients when they are based on the Internet and are reporting to an Internet-facing management point. When you are administering clients that are Internet-based, you will see them report to the Internet-facing management point; this is the only difference you will notice, apart from the unsupported features listed earlier not working. The icon for the client in the list of devices does not change. This is one of the reasons the functionality is powerful, as it gives you many of the management capabilities you already use on your on-premises devices. Lots of people will implement DirectAccess to get around the need to set up additional Configuration Manager infrastructure and provision certificates; DirectAccess with the Manage Out functionality is a viable alternative.

Summary

In this article, we explored a number of ways in which you can manage the growing popularity of bring your own device and also looked at how we can manage mobility in your user estate.
We explored the deployment of profiles that contain settings for Wi-Fi, VPN profiles for Windows and other devices, as well as deploying certificates via Configuration Manager.

Resources for Article:

Further resources on this subject:

- Wireless and Mobile Hacks [Article]
- Introduction to Mobile Forensics [Article]
- Getting Started with Microsoft Dynamics CRM 2013 Marketing [Article]
Packt
19 Sep 2014
24 min read

Creating a RESTful API

In this article by Jason Krol, the author of Web Development with MongoDB and NodeJS, we will review the following topics:

- Introducing RESTful APIs
- Installing a few basic tools
- Creating a basic API server and sample JSON data
- Responding to GET requests
- Updating data with POST and PUT
- Removing data with DELETE
- Consuming external APIs from Node

(For more resources related to this topic, see here.)

What is an API?

An Application Programming Interface (API) is a set of tools that a computer system makes available to provide unrelated systems or software the ability to interact with each other. Typically, a developer uses an API when writing software that will interact with a closed, external software system. The external software system provides an API as a standard set of tools that all developers can use. Many popular social networking sites provide developers with access to APIs to build tools to support those sites. The most obvious examples are Facebook and Twitter. Both have a robust API that provides developers with the ability to build plugins and work with data directly, without being granted full access, as a general security precaution. As you will see in this article, providing your own API is not only fairly simple, but it also empowers you to provide your users with access to your data. You also have the added peace of mind of knowing that you are in complete control over what level of access you grant, what sets of data you make read-only, and what data can be inserted and updated.

What is a RESTful API?

Representational State Transfer (REST) is a fancy way of saying CRUD over HTTP. What this means is that when you use a REST API, you have a uniform means to create, read, and update data using simple HTTP URLs with a standard set of HTTP verbs. The most basic form of a REST API will accept one of the HTTP verbs at a URL and return some kind of data as a response.
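To make the verb-at-a-URL idea concrete, the dispatch a RESTful server performs can be sketched as a tiny routine. This is purely illustrative; the function name and action labels below are ours, not part of any framework:

```javascript
// Hypothetical sketch of RESTful dispatch: an HTTP method plus a URL path
// determines which CRUD action a server should perform.
function restAction(method, path) {
  // Split '/v1/accounts/1' into ['v1', 'accounts', '1'] and check whether
  // the last segment is a numeric Id (a single-record request).
  var parts = path.split('/').filter(Boolean);
  var hasId = parts.length > 0 && /^\d+$/.test(parts[parts.length - 1]);
  switch (method.toUpperCase()) {
    case 'GET':    return hasId ? 'read one' : 'read list';
    case 'POST':   return 'create';
    case 'PUT':    return 'update (full)';
    case 'PATCH':  return 'update (partial)';
    case 'DELETE': return 'delete';
    default:       return 'unsupported';
  }
}

console.log(restAction('GET', '/v1/accounts'));   // read list
console.log(restAction('GET', '/v1/accounts/1')); // read one
console.log(restAction('POST', '/v1/accounts'));  // create
```

A real framework such as Express performs essentially this mapping for you when you register handlers for each verb and route pattern.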
Typically, a REST API GET request will always return some kind of data, such as JSON, XML, HTML, or plain text. A POST or PUT request to a RESTful API URL will accept data to create or update. The URL for a RESTful API is known as an endpoint, and while working with these endpoints, it is typically said that you are consuming them. The standard HTTP verbs used while interfacing with REST APIs include:

- GET: This retrieves data
- POST: This submits data for a new record
- PUT: This submits data to update an existing record
- PATCH: This submits data to update only specific parts of an existing record
- DELETE: This deletes a specific record

Typically, RESTful API endpoints are defined in a way that they mimic the data models and have semantic URLs that are somewhat representative of the data models. What this means is that to request a list of models, for example, you would access an API endpoint of /models. Likewise, to retrieve a specific model by its ID, you would include that in the endpoint URL via /models/:Id. Some sample RESTful API endpoint URLs are as follows:

- GET http://myapi.com/v1/accounts: This returns a list of accounts
- GET http://myapi.com/v1/accounts/1: This returns a single account by Id: 1
- POST http://myapi.com/v1/accounts: This creates a new account (data submitted as a part of the request)
- PUT http://myapi.com/v1/accounts/1: This updates an existing account by Id: 1 (data submitted as part of the request)
- GET http://myapi.com/v1/accounts/1/orders: This returns a list of orders for account Id: 1
- GET http://myapi.com/v1/accounts/1/orders/21345: This returns the details for a single order by Order Id: 21345 for account Id: 1

It's not a requirement that the URL endpoints match this pattern; it's just common convention.

Introducing Postman REST Client

Before we get started, there are a few tools that will make life much easier when you're working directly with APIs.
The first of these tools is called Postman REST Client, and it's a Google Chrome application that can run right in your browser or as a standalone packaged application. Using this tool, you can easily make any kind of request to any endpoint you want. The tool provides many useful and powerful features that are very easy to use and, best of all, free!

Installation instructions

Postman REST Client can be installed in two different ways, but both require Google Chrome to be installed and running on your system. The easiest way to install the application is by visiting the Chrome Web Store at https://chrome.google.com/webstore/category/apps. Perform a search for Postman REST Client and multiple results will be returned. There is the regular Postman REST Client that runs as an application built into your browser, and then a separate Postman REST Client (packaged app) that runs as a standalone application on your system in its own dedicated window. Go ahead and install your preference. If you install the application as the standalone packaged app, an icon to launch it will be added to your dock or taskbar. If you installed it as a regular browser app, you can launch it by opening a new tab in Google Chrome, going to Apps, and finding the Postman REST Client icon. After you've installed and launched the app, you should be presented with an output similar to the following screenshot:

A quick tour of Postman REST Client

Using Postman REST Client, we're able to submit REST API calls to any endpoint we want as well as modify the type of request. Then, we can have complete access to the data that's returned from the API as well as any errors that might have occurred. To test an API call, enter the URL of your favorite website in the Enter request URL here field and leave the dropdown next to it as GET. This will mimic a standard GET request that your browser performs anytime you visit a website. Click on the blue Send button.
The request is made and the response is displayed at the bottom half of the screen. In the following screenshot, I sent a simple GET request to http://kroltech.com and the HTML is returned as follows. If we change this URL to that of the RSS feed URL for my website, you can see the XML returned. The XML view has a few more features, as it exposes a sidebar to the right that gives you a handy outline to glimpse the tree structure of the XML data. Not only that, you can now see a history of the requests we've made so far along the left sidebar. This is great when we're doing more advanced POST or PUT requests and don't want to repeat the data setup for each request while testing an endpoint. Here is a sample API endpoint I submitted a GET request to that returns JSON data in its response. A really nice thing about making API calls to endpoints that return JSON using Postman Client is that it parses and displays the JSON in a very nicely formatted way, and each node in the data is expandable and collapsible. The app is very intuitive, so make sure you spend some time playing around and experimenting with different types of calls to different URLs.

Using the JSONView Chrome extension

There is one other tool I want to let you know about (while extremely minor) that is actually a really big deal. The JSONView Chrome extension is a very small plugin that will instantly convert any JSON you view directly via the browser into a more usable JSON tree (exactly like Postman Client). Here is an example of pointing to a URL that returns JSON from Chrome before JSONView is installed, and here is that same URL after JSONView has been installed. You should install the JSONView Google Chrome extension the same way you installed Postman REST Client: access the Chrome Web Store and perform a search for JSONView. Now that you have the tools to be able to easily work with and test API endpoints, let's take a look at writing your own and handling the different request types.
Creating a Basic API server

Let's create a super basic Node.js server using Express that we'll use to create our own API. Then, we can send tests to the API using Postman REST Client to see how it all works. In a new project workspace, first install the npm modules that we're going to need in order to get our server up and running:

$ npm init
$ npm install --save express body-parser underscore

Now that the package.json file for this project has been initialized and the modules installed, let's create a basic server file to bootstrap up an Express server. Create a file named server.js and insert the following block of code:

var express = require('express'),
    bodyParser = require('body-parser'),
    _ = require('underscore'),
    json = require('./movies.json'),
    app = express();

app.set('port', process.env.PORT || 3500);

app.use(bodyParser.urlencoded());
app.use(bodyParser.json());

var router = new express.Router();
// TO DO: Setup endpoints ...
app.use('/', router);

var server = app.listen(app.get('port'), function() {
    console.log('Server up: http://localhost:' + app.get('port'));
});

Most of this should look familiar to you. In the server.js file, we are requiring the express, body-parser, and underscore modules. We're also requiring a file named movies.json, which we'll create next. After our modules are required, we set up the standard configuration for an Express server with the minimum amount of configuration needed to support an API server. Notice that we didn't set up Handlebars as a view-rendering engine because we aren't going to be rendering any HTML with this server, just pure JSON responses.
Creating sample JSON data

Let's create the sample movies.json file that will act as our temporary data store (even though the API we build for the purposes of demonstration won't actually persist data beyond the app's life cycle):

[{
    "Id": "1",
    "Title": "Aliens",
    "Director": "James Cameron",
    "Year": "1986",
    "Rating": "8.5"
}, {
    "Id": "2",
    "Title": "Big Trouble in Little China",
    "Director": "John Carpenter",
    "Year": "1986",
    "Rating": "7.3"
}, {
    "Id": "3",
    "Title": "Killer Klowns from Outer Space",
    "Director": "Stephen Chiodo",
    "Year": "1988",
    "Rating": "6.0"
}, {
    "Id": "4",
    "Title": "Heat",
    "Director": "Michael Mann",
    "Year": "1995",
    "Rating": "8.3"
}, {
    "Id": "5",
    "Title": "The Raid: Redemption",
    "Director": "Gareth Evans",
    "Year": "2011",
    "Rating": "7.6"
}]

This is just a really simple JSON list of a few of my favorite movies. Feel free to populate it with whatever you like. Boot up the server to make sure you aren't getting any errors (note that we haven't set up any routes yet, so it won't actually do anything if you try to load it via a browser):

$ node server.js
Server up: http://localhost:3500

Responding to GET requests

Adding simple GET request support is fairly simple, and you've seen this before already in the app we built. Here is some sample code that responds to a GET request and returns a simple JavaScript object as JSON. Insert the following code in the routes section, where we left the // TO DO: Setup endpoints ... comment:

router.get('/test', function(req, res) {
    var data = {
        name: 'Jason Krol',
        website: 'http://kroltech.com'
    };

    res.json(data);
});

Let's tweak the function a little bit and change it so that it responds to a GET request against the root URL (that is, /) route and returns the JSON data from our movies file.
Add this new route after the /test route added previously:

router.get('/', function(req, res) {
    res.json(json);
});

The res (response) object in Express has a few different methods to send data back to the browser. Each of these ultimately falls back on the base send method, which includes header information, status codes, and so on. res.json and res.jsonp will automatically format JavaScript objects into JSON and then send them using res.send. res.render will render a template view as a string and then send it using res.send as well.

With that code in place, if we launch the server.js file, the server will be listening for a GET request to the / URL route and will respond with the JSON data of our movies collection. Let's first test it out using the Postman REST Client tool:

GET requests are nice because we could just as easily have pulled that same URL via our browser and received the same result:

However, we're going to use Postman for the remainder of our endpoint testing, as it's a little more difficult to send POST and PUT requests using a browser.

Receiving data – POST and PUT requests

When we want to allow the users of our API to insert or update data, we need to accept a request using a different HTTP verb. When inserting new data, POST is the preferred verb for accepting data and knowing it's for an insert. Let's take a look at code that accepts a POST request along with its data, inserts a record into our collection, and returns the updated JSON. Insert the following block of code after the route you added previously for GET:

router.post('/', function(req, res) {
    // insert the new item into the collection (validate first)
    if(req.body.Id && req.body.Title && req.body.Director && req.body.Year && req.body.Rating) {
        json.push(req.body);
        res.json(json);
    } else {
        res.json(500, { error: 'There was an error!' });
    }
});

You can see that the first thing we do in the POST function is check to make sure the required fields were submitted along with the actual request. Assuming our data checks out and all the required fields are accounted for (in our case, every field), we insert the entire req.body object into the array as is, using the array's push function. If any of the required fields aren't submitted with the request, we return a 500 error message instead.

Let's submit a POST request this time to the same endpoint using the Postman REST Client. (Don't forget to make sure your API server is running with node server.js.):

First, we submitted a POST request with no data, so you can clearly see the 500 error response that was returned. Next, we provided the actual data using the x-www-form-urlencoded option in Postman and provided each of the name/value pairs with some new custom data. You can see from the results that the STATUS was 200, which indicates success, and the updated JSON data was returned as a result. Reloading the main GET endpoint in a browser yields our original movies collection with the new one added.

PUT requests will work in almost exactly the same way, except traditionally the Id property of the data is handled a little differently. In our example, we are going to require the Id attribute as a part of the URL and not accept it as a parameter in the submitted data (since it's usually not common for an update function to change the actual Id of the object it's updating).
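The required-field check in the POST route can be factored out into a small standalone validator, which also makes the rule easy to unit test on its own. This is just a sketch; the validateMovie name and the REQUIRED_FIELDS constant are our own additions, not part of the original code:

```javascript
// Fields every movie record must carry (mirrors the checks in the POST route).
var REQUIRED_FIELDS = ['Id', 'Title', 'Director', 'Year', 'Rating'];

// Returns true only if every required field is present and non-empty.
function validateMovie(body) {
    return REQUIRED_FIELDS.every(function(field) {
        return Boolean(body && body[field]);
    });
}

// The route body would then collapse to something like:
//   if (validateMovie(req.body)) { json.push(req.body); res.json(json); }
//   else { res.json(500, { error: 'There was an error!' }); }
```

Keeping the field list in one place means the POST and PUT routes can't silently drift apart as the record shape changes.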
Insert the following code for the PUT route after the existing POST route you added earlier:

router.put('/:id', function(req, res) {
    // update the item in the collection
    if(req.params.id && req.body.Title && req.body.Director && req.body.Year && req.body.Rating) {
        _.each(json, function(elem, index) {
            // find and update:
            if (elem.Id === req.params.id) {
                elem.Title = req.body.Title;
                elem.Director = req.body.Director;
                elem.Year = req.body.Year;
                elem.Rating = req.body.Rating;
            }
        });

        res.json(json);
    } else {
        res.json(500, { error: 'There was an error!' });
    }
});

This code again validates that the required fields are included with the data submitted along with the request. Then, it performs an _.each loop (using the underscore module) to look through the collection of movies and find the one whose Id parameter matches the Id included in the URL. Assuming there's a match, the individual fields for that matched object are updated with the new values sent with the request. Once the loop is complete, the updated JSON data is sent back as the response. As in the POST request, if any of the required fields are missing, a simple 500 error message is returned.

The following screenshot demonstrates a successful PUT request updating an existing record. After including the value 1 in the URL as the Id parameter, providing the individual fields to update as x-www-form-urlencoded values, and finally sending the request as PUT, the response from Postman shows that the first item in our movies collection is now the original Alien (not Aliens, its sequel, as we originally had).

Removing data – DELETE

The final stop on our whirlwind tour of the different REST API HTTP verbs is DELETE. It should be no surprise that sending a DELETE request should do exactly what it sounds like.
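The find-and-update loop above can also be expressed without underscore, using plain Array.prototype.forEach. The following is a dependency-free sketch of the same logic; updateMovie is a name we made up for illustration:

```javascript
// Updates the movie whose Id matches, copying over the editable fields.
// Mutates and returns the collection, just as the PUT route responds with it.
function updateMovie(collection, id, fields) {
    collection.forEach(function(movie) {
        if (movie.Id === id) {
            movie.Title = fields.Title;
            movie.Director = fields.Director;
            movie.Year = fields.Year;
            movie.Rating = fields.Rating;
        }
    });
    return collection;
}
```

Note that, like the route, this deliberately never touches Id, since an update shouldn't change the identifier it was addressed by.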
Let's add another route that accepts DELETE requests and deletes an item from our movies collection. Here is the code that takes care of DELETE requests, which should be placed after the existing block of code from the previous PUT route:

router.delete('/:id', function(req, res) {
    var indexToDel = -1;
    _.each(json, function(elem, index) {
        if (elem.Id === req.params.id) {
            indexToDel = index;
        }
    });
    if (~indexToDel) {
        json.splice(indexToDel, 1);
    }
    res.json(json);
});

This code loops through the collection of movies and finds a matching item by comparing the values of Id. If a match is found, the array index for the matched item is held until the loop is finished. Using the array.splice function, we can remove an array item at a specific index. Once the data has been updated by removing the requested item, the JSON data is returned. Notice in the following screenshot that the updated JSON returned no longer displays the original second item we deleted.

Notice that ~ in there? That's a little bit of JavaScript black magic! The tilde (~) in JavaScript will bit-flip a value. In other words, it takes a value and returns the negative of that value incremented by one, that is, ~n === -(n+1). Typically, the tilde is used with functions that return -1 as a false response. By using ~ on -1, you are converting it to 0. If you were to perform a Boolean check on -1 in JavaScript, it would return true. You will see that ~ is used primarily with the indexOf function and jQuery's $.inArray(); both return -1 as a false response.

All of the endpoints defined in this article are extremely rudimentary, and most of these should never see the light of day in a production environment! Whenever you have an API that accepts anything other than GET requests, you need to be sure to enforce extremely strict validation and authentication rules. After all, you are basically giving your users direct access to your data.
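The tilde trick is easy to verify directly in Node. This standalone snippet demonstrates why ~ works as a found/not-found test when paired with indexOf:

```javascript
// ~n === -(n + 1), so ~-1 === 0 (falsy) while any real index becomes truthy.
console.log(~-1);           // 0  -> falsy: "not found"
console.log(~0);            // -1 -> truthy: found at index 0
console.log(~2);            // -3 -> truthy: found at index 2

var list = ['a', 'b', 'c'];
if (~list.indexOf('b')) {
    console.log('found b'); // runs: indexOf returns 1, and ~1 === -2 is truthy
}
if (!~list.indexOf('z')) {
    console.log('no z');    // runs: indexOf returns -1, and ~-1 === 0 is falsy
}
```

In modern JavaScript you would more likely reach for Array.prototype.includes, but the tilde idiom still turns up in older codebases like this one.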
Consuming external APIs from Node.js

There will undoubtedly be a time when you want to consume an API directly from within your Node.js code. Perhaps your own API endpoint needs to first fetch data from some other, unrelated third-party API before sending a response. Whatever the reason, the act of sending a request to an external API endpoint and receiving a response can be done fairly easily using a popular and well-known npm module called Request. Request was written by Mikeal Rogers and is currently the third most popular (and most relied-upon) npm module, after async and underscore. Request is basically a super simple HTTP client, so everything you've been doing with Postman REST Client so far is basically what Request can do, only the resulting data is available to you in your Node code, as are the response status codes and/or errors, if any.

Consuming an API endpoint using Request

Let's do a neat trick and actually consume our own endpoint as if it were some third-party external API. First, we need to ensure we have Request installed and can include it in our app:

$ npm install --save request

Next, edit server.js and make sure you include Request as a required module at the start of the file:

var express = require('express'),
    bodyParser = require('body-parser'),
    _ = require('underscore'),
    json = require('./movies.json'),
    app = express(),
    request = require('request');

Now let's add a new endpoint after our existing routes, which will be accessible in our server via a GET request to /external-api. This endpoint will actually consume another endpoint on another server, although for the purposes of this example, that other server is actually the same server we're currently running! The Request module accepts an options object with a number of different parameters and settings, but for this particular example, we only care about a few.
We're going to pass an object that has a setting for the method (GET, POST, PUT, and so on) and the URL of the endpoint we want to consume. After the request is made and a response is received, we want an inline callback function to execute. Place the following block of code after your existing list of routes in server.js:

router.get('/external-api', function(req, res) {
    request({
        method: 'GET',
        uri: 'http://localhost:' + (process.env.PORT || 3500)
    }, function(error, response, body) {
        if (error) { throw error; }

        var movies = [];
        _.each(JSON.parse(body), function(elem, index) {
            movies.push({
                Title: elem.Title,
                Rating: elem.Rating
            });
        });
        res.json(_.sortBy(movies, 'Rating').reverse());
    });
});

The callback function accepts three parameters: error, response, and body. The response object is like any other response that Express handles and has all of the various parameters as such. The third parameter, body, is what we're really interested in. It contains the actual result of the request to the endpoint we called. In this case, it is the JSON data from the main GET route we defined earlier, which returns our own list of movies. It's important to note that the data returned from the request comes back as a string. We need to use JSON.parse to convert that string into actual usable JSON data.

Using the data that came back from the request, we transform it a little bit. That is, we take the data and manipulate it to suit our needs. In this example, we took the master list of movies and returned a new collection that consists of only the title and rating of each movie, sorted with the top scores first. Load this new endpoint by pointing your browser to http://localhost:3500/external-api, and you can see the new transformed JSON output to the screen.
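The transformation inside that callback (project each movie down to Title/Rating, then sort best-first) can also be written without underscore. Here is a dependency-free sketch; topRated is our own name, and note one subtle difference from the route above: since Rating values in movies.json are strings, this version compares them numerically with parseFloat rather than lexicographically:

```javascript
// Projects each movie to { Title, Rating } and sorts highest rating first.
function topRated(movies) {
    return movies
        .map(function(m) {
            return { Title: m.Title, Rating: m.Rating };
        })
        .sort(function(a, b) {
            return parseFloat(b.Rating) - parseFloat(a.Rating);
        });
}
```

Numeric comparison matters once a rating like "10.0" enters the data, where string ordering would put it before "8.5".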
Let's take a look at another example that's a little more real world. Let's say that we want to display a list of similar movies for each one in our collection, but we want to look that data up somewhere such as www.imdb.com. Here is sample code that will send a GET request to IMDB's JSON API, specifically for the word aliens, and return a list of related movies by title and year. Go ahead and place this block of code after the previous route for /external-api:

router.get('/imdb', function(req, res) {
    request({
        method: 'GET',
        uri: 'http://sg.media-imdb.com/suggests/a/aliens.json'
    }, function(err, response, body) {
        var data = body.substring(body.indexOf('(') + 1);
        data = JSON.parse(data.substring(0, data.length - 1));

        var related = [];
        _.each(data.d, function(movie, index) {
            related.push({
                Title: movie.l,
                Year: movie.y,
                Poster: movie.i ? movie.i[0] : ''
            });
        });

        res.json(related);
    });
});

If we take a look at this new endpoint in a browser, we can see that the JSON data returned from our /imdb endpoint is actually itself retrieved from some other API endpoint:

Note that the JSON endpoint I'm using for IMDB isn't actually from their API, but rather what they use on their homepage when you type in the main search box. This wouldn't really be the most appropriate way to use their data, but it's more of a hack to show this example. In reality, to use their API (like most other APIs), you would need to register and get an API key that you would use so that they can properly track how much data you request on a daily or hourly basis. Most APIs will require you to use a private key for this same reason.
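The two substring calls in the /imdb route are stripping a JSONP wrapper: the endpoint returns something like callbackName({...}) rather than bare JSON. That unwrapping step can be isolated into a small helper, which is easier to test in isolation; parseJsonp is a name we invented for this sketch:

```javascript
// Strips a JSONP wrapper like `cb({"d": []})` down to the inner JSON text
// and parses it. Mirrors the substring logic used in the /imdb route.
function parseJsonp(body) {
    var inner = body.substring(body.indexOf('(') + 1);
    return JSON.parse(inner.substring(0, inner.length - 1));
}
```

With this in place, the route's callback could simply call parseJsonp(body) and then loop over the .d array as before.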
Summary

In this article, we took a brief look at how APIs work in general and at the RESTful API approach to semantic URL paths and arguments, and we created a bare-bones API. We used Postman REST Client to interact with the API by consuming endpoints and testing the different types of request methods (GET, POST, PUT, and so on). You also learned how to consume an external API endpoint by using the third-party node module Request.

Resources for Article:

Further resources on this subject:
RESTful Services JAX-RS 2.0 [Article]
REST – Where It Begins [Article]
RESTful Web Services – Server-Sent Events (SSE) [Article]
Jump Right In

Packt
19 Sep 2014
21 min read
In this article by Victor Quinn, J.D., the author of the book Getting Started with tmux, we'll go on a little tour, simulate everyday use of tmux, and point out some key concepts along the way. tmux is short for Terminal Multiplexer.

(For more resources related to this topic, see here.)

Running tmux

For now, let's jump right in and start playing with it. Open up your favorite terminal application and let's get started. Just run the following command:

$ tmux

You'll probably see a screen flash, and it'll seem like not much else has happened; it looks like you're right where you were previously, with a command prompt. The word tmux is gone, but not much else appears to have changed. However, you should notice that there is now a bar along the bottom of your terminal window. This can be seen in the following screenshot of the terminal window:

Congratulations! You're now running tmux. That bar along the bottom is provided by tmux. We call this bar the status line. The status line gives you information about the session and window you are currently viewing, which other windows are available in this session, and more. Some of what's on that line may look like gibberish now, but we'll learn more about what things mean as we progress through this book. We'll also learn how to customize the status bar to ensure it always shows the most useful items for your workflow. These customizations include things that are a part of tmux (such as the time, date, server you are connected to, and so on) or things that are in third-party libraries (such as the battery level of your laptop, current weather, or number of unread mail messages).

Sessions

By running tmux with no arguments, you create a brand new session. In tmux, the base unit is called a session. A session can have one or more windows. A window can be broken into one or more panes.
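As a small preview of that kind of customization, here is a ~/.tmux.conf fragment. The option names (status-left, status-right, status-interval) are standard tmux options; the particular strings shown are just illustrative choices, not anything the article prescribes:

```
# Show the session name on the left of the status line
set -g status-left '[#S] '

# Show the hostname and a clock on the right, refreshed every 15 seconds
set -g status-right '#H | %d %b %Y %H:%M'
set -g status-interval 15
```

Here #S expands to the session name and #H to the hostname; the date and time portions use strftime-style formats.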
As a quick preview of this topic: what we have on the current screen is a single pane taking up the whole window in a single session. Imagine that it could be split into two or more different terminals, all running different programs, and each visible split of the terminal is a pane.

What is a session in tmux? It may be useful to think of a tmux session as a login on your computer. You can log on to your computer, which initiates a new session. After you log on by entering your username and password, you arrive at an empty desktop. This is similar to a fresh tmux session. You can run one or more programs in this session, where each program has its own window or windows and each window has its own state.

In most operating systems, there is a way for you to log out, log back in, and arrive back at the same session, with the windows just as you left them. Often, some of the programs that you had opened will continue to run in the background when you log out, even though their windows are no longer visible. A session in tmux works in much the same way. So, it may be useful to think of tmux as a mini operating system that manages running programs, windows, and more, all within a session.

You can have multiple sessions running at the same time. This is convenient if you want to have a session for each task you might be working on. You might have one for an application you are developing by yourself and another that you could use for pair programming. Alternatively, you might have a separate session for each application you are developing. This way, everything can be neat, clean, and separate.

Naming the session

Each session has a name that you can set or change. Notice the [0] at the very left of the status bar? This is the name of the session, in brackets. Here, since you just started tmux without any arguments, it was given the name 0. However, this is not a very useful name, so let's change it.
In the prompt, just run the following command:

$ tmux rename-session tutorial

This tells tmux that you want to rename the current session and that tutorial is the name you'd like it to have. Of course, you can name it anything you'd like. You should see that your status bar has now been updated; instead of [0] on the left-hand side, it should now say [tutorial]. Here's a screenshot of my screen:

Of course, it's nice that the status bar now has a pretty name we defined rather than 0, but it provides many more utilities than this, as we'll see in a bit! It's worth noting that here we were giving a session a name, but this same command can also be used to rename an existing session.

The window string

The status bar has a string that represents each window, to inform us about the things that are currently running. The following steps will help us explore this a bit more. Let's fire up a text editor to pretend we're doing some coding:

$ nano test

Now type some stuff in there to simulate working very hard on some code:

First, notice how the text blob in our status bar just to the right of our session name ([tutorial]) has changed. It used to be 0:~* and now it's 0:nano*. Depending on the version of tmux and your chosen shell, yours may be slightly different (for example, 0:bash*). Let's decode this string a bit. This little string encodes a lot of information, some of which is provided in the following bullet points:
- The zero in front represents the number of the window. As we'll shortly see, each window is given a number that we can use to identify and switch to it.
- The colon separates the window number from the name of the program running in that window.
- The symbols ~ or nano in the previous screenshot are loosely the names of the running program. We say "loosely" because you'll notice that ~ is not the name of a program, but the directory we were visiting.
tmux is pretty slick about this; it knows some state of the program you're using and changes the default name of the window accordingly. Note that the name given is the default; it's possible to explicitly set one for the window, as we'll see later.
- The symbol * indicates that this is the currently viewed window. We only have one at the moment, so it's not too exciting; however, once we get more than one, it'll be very helpful.

Creating another window

OK! Now that we know a bit about a part of the status line, let's create a second window so we can run a terminal command. Just press Ctrl + b, then c, and you will be presented with a new window! A few things to note are as follows:
- There is now a new window with the label 1:~*. It is given the number 1 because the last one was 0. The next will be 2, then 3, 4, and so on.
- The asterisk that denoted the currently active window has moved to 1, since it is now the active one.
- The nano application is still running in window 0.
- The asterisk on window 0 has been replaced by a hyphen (-). The - symbol denotes the previously opened window. This is very helpful when you have a bunch of windows.

Let's run a command here just to illustrate how it works. Run the following commands:

$ echo "test" > test
$ cat test

The output of these commands can be seen in the following screenshot:

This is just some stuff to help us identify this window. Imagine, though, that in the real world you are moving a file, performing operations with Git, viewing log files, running top, or anything else.

Let's jump back to window 0 so we can see nano still running. Simply press Ctrl + b and l to switch back to the previously opened window (the one with the hyphen; l stands for last). As shown in the following screenshot, you'll see that nano is alive and well; it looks exactly as we left it:

The prefix key

There is a special key in tmux called the prefix key that is used to perform most of the keyboard shortcuts.
We have even used it already quite a bit! In this section, we will learn more about it and run through some examples of its usage. You will notice that in the preceding exercise, we pressed Ctrl + b before creating a window, Ctrl + b again before switching back, and Ctrl + b before a number to jump to that window. When using tmux, we'll be pressing this key a lot. It's even got a name! We call it the prefix key.

Its default binding in tmux is Ctrl + b, but you can change that if you prefer something else or if it conflicts with a key in a program you often use within tmux. You can send the Ctrl + b key combination through to the program by pressing Ctrl + b twice in a row; however, if it's a keyboard command you use often, you'll most likely want to change it. This key is used before almost every command we'll use in tmux, so we'll be seeing it a lot. From here on, if we need to reference the prefix key, we'll write it as <Prefix>. This way, if you rebind it, the text will still make sense. If you haven't rebound it and see <Prefix>, just type Ctrl + b.

Let's create another window for another task. Just run <Prefix>, c again. Now we've got three windows: 0, 1, and 2. We've got one running nano and two running shells, as shown in the following screenshot:

Some more things to note are as follows:
- Window 2 is now active. See the asterisk?
- Window 0 now has a hyphen because it was the last window we viewed.
- This is a clear, blank shell because the one we typed stuff into is over in window 1.

Let's switch back to window 1 to see our test commands from earlier still active. The last time we switched windows, we used <Prefix>, l to jump to the last window, but that will not work to get us to window 1 at this point because the hyphen is on window 0. So, going to the last selected window will not get us to 1. Thankfully, it is very easy to switch to a window directly by its number. Just press <Prefix>, then the window number, to jump to that window.
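Rebinding the prefix is done in ~/.tmux.conf. The following fragment uses standard tmux commands to move the prefix to Ctrl + a, a popular choice borrowed from GNU Screen; pick any key that suits you:

```
# Use Ctrl+a as the prefix instead of the default Ctrl+b
unbind C-b
set -g prefix C-a

# Press Ctrl+a twice to send a literal Ctrl+a through to the program
bind C-a send-prefix
```

After editing the file, reload it with the tmux command source-file ~/.tmux.conf (or restart tmux) for the change to take effect.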
So, <Prefix>, 1 will jump to window 1 even though it wasn't the last one we opened, as shown in the following screenshot:

Sure enough, window 1 is now active and everything is present, just as we left it. We typed some silly commands here, but it could just as well have been an active running process, such as unit tests, code linting, or top. Any such process would run in the background in tmux without an issue. This is one of the most powerful features of tmux. In the traditional world, to have a long-running process in a terminal window and still get some work done in a terminal, you would need two different terminal windows open; if you accidentally closed one, the work done in that window would be gone. tmux allows you to keep just one terminal window open, and this window can have a multitude of different windows within it, housing all the different running processes. Closing this terminal window won't terminate the running processes; tmux will continue humming along in the background with all of the programs running behind the scenes.

Help on key bindings

Now, a keen observer may notice that the trick of entering the window number will only work for the first 10 windows. This is because once you get into double digits, tmux won't be able to tell when you're done entering the number. If this trick of using the prefix key plus the number only works for the first 10 windows (windows 0 to 9), how will we select a window beyond 10? Thankfully, tmux gives us many powerful ways to move between windows. One of my favorites is the choose-window interface. However, oh gee! This is embarrassing. Your author seems to have entirely forgotten the key combination to access the choose-window interface. Don't fear, though; tmux has a nice built-in way to access all of the key bindings. So let's use it! Press <Prefix>, ? to see your screen change to show a list with bind-key to the left, the key binding in the middle, and the command it runs to the right.
You can use your arrow keys to scroll up and down, but there are a lot of entries there! Thankfully, there is a quicker way to get to the item you want without scrolling forever. Press Ctrl + s and you'll see a prompt appear that says Search Down:, where you can type a string and it will search the help document for that string.

Emacs or vi mode

tmux tries hard to play nicely with developer defaults, so it actually includes two different modes for many key combinations, tailored for the two most popular terminal editors: Emacs and vi. These are referred to in tmux parlance as status-keys and mode-keys, which can be set to either Emacs or vi. The tmux default mode is Emacs for all the key combinations, but it can be changed to vi via configuration. It may also be set to vi automatically based on the global $EDITOR setting in your shell. If you are used to Emacs, Ctrl + s should feel very natural, since it's the command Emacs uses to search. So, if you try Ctrl + s and it has no effect, your keys are probably in the vi mode. We'll try to provide guidance when there is a mode-specific key like this by including the vi mode's counterpart in parentheses after the default key. For example, in this case, the command would look like Ctrl + s (/), since the default is Ctrl + s and / is the command in the vi mode.

Type in choose-window and hit Enter to search down and find the choose-window key binding. Oh look! There it is; it's w:

However, what exactly does that mean? Well, all it means is that we can type our prefix key (<Prefix>), followed by the key in that help document, to run the mentioned command. First, let's get out of these help docs. To get out of these or any screens like them generated by tmux, simply press q for quit, and you should be back at the shell prompt for window 2. If you ever forget any key bindings, this should be your first step. A nice feature of this key binding help page is that it is dynamically updated as you change your key bindings.
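Switching both key tables to vi style takes two lines in ~/.tmux.conf. These are standard tmux options (mode-keys is a window option, hence setw); leave them out if you prefer the Emacs defaults:

```
# Use vi-style keys in copy mode and at the tmux command prompt
setw -g mode-keys vi
set -g status-keys vi
```

With mode-keys set to vi, searching in help and copy-mode screens uses / instead of Ctrl + s, as noted above.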
Later, when we get to Configuration, you may want to change bindings or bind new shortcuts. They'll all show up in this interface with the configuration you provide them. Can't do that with manpages!

Now, to open the choose-window interface, simply type <Prefix>, w, since w was the key shown in the help bound to choose-window, and voilà:

Notice how it nicely lays out all of the currently open windows in a task-manager-like interface. It's interactive too. You can use the arrow keys to move up and down to highlight whichever window you like and then just hit Enter to open it. Let's open the window with nano running. Move up to highlight window 0 and hit Enter.

You may notice a few more convenient and intuitive ways to switch between the currently active windows when browsing through the key bindings help. For example, <Prefix>, p will switch to the previous window and <Prefix>, n will switch to the next window. Whether you're refreshing your recollection of a key binding you've already learned or seeking to discover a new one, the key bindings help is an excellent resource.

Searching for text

Now, we only have three windows, so it's pretty easy to remember what's where, but what if we had 30 or 300? With tmux, that's totally possible. (Though this is not terribly likely or useful! What would you do with 300 active windows?)

One other convenient way to switch between windows is to use the find-window feature. This will prompt us for some text, search all the active windows, and open the window that contains that text. If you've been following along, you should have the window with nano currently open (window 0). Remember we had a shell in window 1 where we had typed some silly commands? Let's try to switch to that one using the find-window feature. Type <Prefix>, f and you'll see a find-window prompt, as shown in the following screenshot:

Here, type in cat test and hit Enter. You'll see you've switched to window 1 because it had the cat test command in it.
However, what if you search for some text that is ambiguous? For example, if you've followed along, the word test appears multiple times in both windows 0 and 1. So, if you try find-window with just the word test, it can't magically switch right away, because it wouldn't know which window you mean. Thankfully, tmux is smart enough to handle this. It will give you a prompt, similar to the choose-window interface shown earlier, but with only the windows that match the query (in our case, windows 0 and 1; window 2 did not have the word test in it). It also includes the first matching line in each window, for context. Pick window 0 to open it.

Detaching and attaching

Now press <Prefix>, d. Uh oh! Looks like tmux is gone! The familiar status bar is no more. The <Prefix> key does nothing anymore. You may think we, the authors, have led you astray, causing you to lose your work. What will you do without that detailed document you were writing in nano?

Fear not, explorer; we are simply demonstrating another very powerful feature of tmux. <Prefix>, d simply detaches the currently active session, but it keeps running happily in the background! Yes, although it looks like it's gone, our session is alive and well. How can we get back to it? First, let's view the active sessions. In your terminal, run the following command:

$ tmux list-sessions

You should see a nice list that has your session name, number of windows, date of creation, and dimensions. If you had more than one session, you'd see them here too. To reattach to your detached session, simply run the following command:

$ tmux attach-session -t tutorial

This tells tmux to attach a session, with the session to attach given as the target (hence -t). In this case, we want to attach the session named tutorial. Sure enough, you should be back in your tmux session, with the now familiar status bar along the bottom and your nano masterpiece back in view.
Note that this is the most verbose version of this command. You can actually omit the target if there is only one running session, as is the case in our scenario. This shortens the command to tmux attach-session. It can be further shortened because attach-session has a shorter alias, attach. So, we could accomplish the same thing with just tmux attach. Throughout this text, we will generally use the more verbose versions, as they tend to be more descriptive, and leave shorter analogues as exercises for the reader.

Explaining tmux commands

Now you may notice that attach-session sounds like a pretty long command. The same goes for list-sessions, and there are many others in the lexicon of tmux commands that seem rather verbose.

Tab completion

There is less complexity to the long commands than it may seem because most of them can be tab-completed. Try going to your command prompt and typing the following:

$ tmux list-se

Next, hit the Tab key. You should see it fill out to this:

$ tmux list-sessions

So thankfully, due to tab completion, there is little need to remember these long commands. Note that tab completion will only work in certain shells with certain configurations, so if the tab completion trick doesn't work, you may want to search the Web and find a way to enable tab completion for tmux.

Aliases

Most of the commands have an alias, which is a shorter form of each command that can be used. For example, the alias of list-sessions is ls. The alias of new-session is new. You can see them all readily by running the tmux command list-commands (alias lscm), as used in the following code snippet:

$ tmux list-commands

This will show you a list of all the tmux commands along with their aliases in parentheses after the full name. Throughout this text, we will always use the full form for clarity, but you could just as easily use the alias (or just tab complete, of course). One thing you'll most likely notice is that only the last few lines are visible in your terminal.
If you go for your mouse and try to scroll up, that won't work either! How can you view the text that is placed above? We will need to move into something called the Copy mode.

Renaming windows

Let's say you want to give a more descriptive name to a window. If you had three different windows, each with the nano editor open, seeing nano for each window wouldn't be all that helpful. Thankfully, it's very easy to rename a window. Just switch to the window you'd like to rename. Then <Prefix>, , will prompt you for a new name. Let's rename the nano window to masterpiece. See how the status line has been updated and now shows window 0 with the masterpiece title. Thankfully, tmux is not smart enough to check the contents of your window; otherwise, we're not sure whether the masterpiece title would make it through.

Killing windows

As the last stop on our virtual tour, let's kill a window we no longer need. Switch to window 1 with our find-window trick by entering <Prefix>, f, cat test, Enter, or of course we could use the less exciting <Prefix>, l command to move to the last opened window. Now let's say goodbye to this window. Press <Prefix>, & to kill it. You will receive a prompt asking you to confirm that you want to kill it. This is a destructive process, unlike detaching, so be sure anything you care about has been saved. Once you confirm it, window 1 will be gone. Poor window 1! You will see that now only window 0 and window 2 are left. You will also see that <Prefix>, f, cat test, Enter no longer loads window 1 but rather says No windows matching: cat test. So, window 1 is really no longer with us. Whenever we create a new window, it will take the lowest available index, which in this case will be 1. So window 1 can rise again, but this time as a new and different window with little memory of its past.
We can also renumber windows as we'll see later, so if window 1 being missing is offensive to your sense of aesthetics, fear not, it can be remedied!

Summary

In this article, we got to jump right in and get a whirlwind tour of some of the coolest features in tmux. Here is a quick summary of the features we covered in this article:

  • Starting tmux
  • Naming and renaming sessions
  • The window string and what each chunk means
  • Creating new windows
  • The prefix key
  • Multiple ways to switch back and forth between windows
  • Accessing the help documents for available key bindings
  • Detaching and attaching sessions
  • Renaming and killing windows

Resources for Article:

Further resources on this subject:

  • Getting Started with GnuCash [article]
  • Creating a Budget for your Business with Gnucash [article]
  • Apache CloudStack Architecture [article]
Packt
19 Sep 2014
5 min read
Handling SELinux-aware Applications

This article is written by Sven Vermeulen, the author of SELinux Cookbook. In this article, we will cover how to control D-Bus message flows. (For more resources related to this topic, see here.)

Controlling D-Bus message flows

The D-Bus implementation on Linux is an example of an SELinux-aware application, acting as a user space object manager. Applications can register themselves on a bus and can send messages between applications through D-Bus. These messages can be controlled through the SELinux policy as well.

Getting ready

Before looking at the SELinux access controls related to message flows, it is important to focus on a D-Bus service and see how its authentication is done (and how messages are relayed in D-Bus), as this is reflected in the SELinux integration. Go to /etc/dbus-1/system.d/ (which hosts the configuration files for D-Bus services) and take a look at a configuration file. For instance, the service configuration file for dnsmasq looks like the following:

<!DOCTYPE busconfig PUBLIC
 "-//freedesktop//DTD D-BUS Bus Configuration 1.0//EN"
 "http://www.freedesktop.org/standards/dbus/1.0/busconfig.dtd">
<busconfig>
  <policy user="root">
    <allow own="uk.org.thekelleys.dnsmasq"/>
    <allow send_destination="uk.org.thekelleys.dnsmasq"/>
  </policy>
  <policy context="default">
    <deny own="uk.org.thekelleys.dnsmasq"/>
    <deny send_destination="uk.org.thekelleys.dnsmasq"/>
  </policy>
</busconfig>

This configuration tells D-Bus that only the root Linux user is allowed to own the uk.org.thekelleys.dnsmasq service and send messages to it. Others (as managed through the default policy) are denied these operations. On a system with SELinux enabled, having root as the finest granularity doesn't cut it. So, let's look at how the SELinux policy can offer fine-grained access control in D-Bus.
How to do it…

To control D-Bus message flows with SELinux, perform the following steps:

1. Identify the domain of the application that will (or does) own the D-Bus service we are interested in. For the dnsmasq application, this would be dnsmasq_t:

~# ps -eZ | grep dnsmasq | awk '{print $1}'
system_u:system_r:dnsmasq_t:s0-s0:c0.c1023

2. Identify the domain of the application that wants to send messages to the service. For instance, this could be the sysadm_t user domain.

3. Allow the two domains to interact with each other through D-Bus messages as follows:

gen_require(`
  class dbus send_msg;
')

allow sysadm_t dnsmasq_t:dbus send_msg;
allow dnsmasq_t sysadm_t:dbus send_msg;

How it works…

When an application connects to D-Bus, the SELinux label of its connection is used as the label to check when sending messages. As there is no transition for such connections, the label of the connection is the context of the process itself (the domain); hence, the selection of dnsmasq_t in the example. When D-Bus receives a request to send a message to a service, D-Bus will check the SELinux policy for the send_msg permission. It does so by passing on the information about the session (source and target context and the permission that is requested) to the SELinux subsystem, which computes whether access should be allowed or not. The access control itself, however, is not enforced by SELinux (it only gives feedback) but by D-Bus itself, as governing the message flows is solely D-Bus' responsibility. This is also why, when developing D-Bus-related policies, both the class and the permission need to be explicitly mentioned in the policy module. Without this, the development environment might error out, claiming that dbus is not a valid class. D-Bus checks the context of the client that is sending a message as well as the context of the connection of the service (which are both domain labels) and sees if there is a send_msg permission allowed.
As most communication is two-fold (sending a message and then receiving a reply), the permission is checked in both directions. After all, sending a reply is just sending a message (policy-wise) in the reverse direction. It is possible to verify this behavior with dbus-send if the rule is on a user domain. For instance, to look at the objects provided by the service, the D-Bus introspection can be invoked against the service:

~# dbus-send --system --dest=uk.org.thekelleys.dnsmasq --print-reply /uk/org/thekelleys/dnsmasq org.freedesktop.DBus.Introspectable.Introspect

When SELinux does not have the proper send_msg allow rules in place, the following error will be logged by D-Bus in its service logs (but no AVC denial will show up, as it isn't the SELinux subsystem that denies the access):

Error org.freedesktop.DBus.Error.AccessDenied: An SELinux policy prevents this sender from sending this message to this recipient. 0 matched rules; type="method_call", sender=":1.17" (uid=0 pid=6738 comm="") interface="org.freedesktop.DBus.Introspectable" member="Introspect" error name="(unset)" requested_reply="0" destination="uk.org.thekelleys.dnsmasq" (uid=0 pid=6635 comm="")

When the policy does allow the send_msg permission, the introspection returns an XML output showing the provided methods and interfaces for this service.

There's more...

The current D-Bus implementation is a pure user space implementation. Because more applications are becoming dependent on D-Bus, work is being done to create a kernel-based D-Bus implementation called kdbus. The exact implementation details of this project are not finished yet, so it is unknown whether the SELinux access controls that are currently applicable to D-Bus will still be valid on kdbus.

Summary

In this article, we learned how to control D-Bus message flows. It also covers what happens when the policy has or doesn't have the send_msg permission in place.
Resources for Article:

Further resources on this subject:

  • An Introduction to the Terminal [Article]
  • Wireless and Mobile Hacks [Article]
  • Baking Bits with Yocto Project [Article]

Packt
19 Sep 2014
19 min read
Driving Visual Analyses with Automobile Data (Python)

This article, written by Tony Ojeda, Sean Patrick Murphy, Benjamin Bengfort, and Abhijit Dasgupta, authors of the book Practical Data Science Cookbook, will cover the following topics:

  • Getting started with IPython
  • Exploring IPython Notebook
  • Preparing to analyze automobile fuel efficiencies
  • Exploring and describing the fuel efficiency data with Python

(For more resources related to this topic, see here.)

The dataset, available at http://www.fueleconomy.gov/feg/epadata/vehicles.csv.zip, contains fuel efficiency performance metrics over time for all makes and models of automobiles in the United States of America. This dataset also contains numerous other features and attributes of the automobile models other than fuel economy, providing an opportunity to summarize and group the data so that we can identify interesting trends and relationships.

We will perform the entire analysis using Python. However, we will ask the same questions and follow the same sequence of steps as before, again following the data science pipeline. With study, this will allow you to see the similarities and differences between the two languages for a mostly identical analysis.

In this article, we will take a very different approach, using Python as a scripting language in an interactive fashion that is more similar to R. We will introduce the reader to the unofficial interactive environment of Python, IPython, and the IPython notebook, showing how to produce readable and well-documented analysis scripts. Further, we will leverage the data analysis capabilities of the relatively new but powerful pandas library and the invaluable data frame data type that it offers. pandas often allows us to complete complex tasks with fewer lines of code. The drawback to this approach is that while you don't have to reinvent the wheel for common data manipulation tasks, you do have to learn the API of a completely different package, which is pandas.
The goal of this article is not to guide you through an analysis project that you have already completed but to show you how that project can be completed in another language. More importantly, we want to get you, the reader, to become more introspective with your own code and analysis. Think not only about how something is done but why something is done that way in that particular language. How does the language shape the analysis?

Getting started with IPython

IPython is the interactive computing shell for Python that will change the way you think about interactive shells. It brings to the table a host of very useful functionalities that will most likely become part of your default toolbox, including magic functions, tab completion, easy access to command-line tools, and much more. We will only scratch the surface here and strongly recommend that you keep exploring what can be done with IPython.

Getting ready

If you have completed the installation, you should be ready to tackle the following recipes. Note that IPython 2.0, which is a major release, was launched in 2014.

How to do it…

The following steps will get you up and running with the IPython environment:

Open up a terminal window on your computer and type ipython. You should be immediately presented with the following text:

Python 2.7.5 (default, Mar 9 2014, 22:15:05)
Type "copyright", "credits" or "license" for more information.

IPython 2.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]:

Note that your version might be slightly different than what is shown in the preceding command-line output. Just to show you how great IPython is, type in ls, and you should be greeted with the directory listing! Yes, you have access to common Unix commands straight from your Python prompt inside the Python interpreter.
Now, let's try changing directories. Type cd at the prompt, hit space, and now hit Tab. You should be presented with a list of directories available from within the current directory. Start typing the first few letters of the target directory, and then, hit Tab again. If there is only one option that matches, hitting the Tab key automatically will insert that name. Otherwise, the list of possibilities will show only those names that match the letters that you have already typed. Each letter that is entered acts as a filter when you press Tab.

Now, type ?, and you will get a quick introduction to and overview of IPython's features.

Let's take a look at the magic functions. These are special functions that IPython understands and will always start with the % symbol. The %paste function is one such example and is amazing for copying and pasting Python code into IPython without losing proper indentation. We will try the %timeit magic function that intelligently benchmarks Python code. Enter the following commands:

n = 100000
%timeit range(n)
%timeit xrange(n)

We should get an output like this:

1000 loops, best of 3: 1.22 ms per loop
1000000 loops, best of 3: 258 ns per loop

This shows you how much faster xrange is than range (1.22 milliseconds versus 258 nanoseconds!) and helps show you the utility of generators in Python.

You can also easily run system commands by prefacing the command with an exclamation mark. Try the following command:

!ping www.google.com

You should see the following output:

PING google.com (74.125.22.101): 56 data bytes
64 bytes from 74.125.22.101: icmp_seq=0 ttl=38 time=40.733 ms
64 bytes from 74.125.22.101: icmp_seq=1 ttl=38 time=40.183 ms
64 bytes from 74.125.22.101: icmp_seq=2 ttl=38 time=37.635 ms

Finally, IPython provides an excellent command history. Simply press the up arrow key to access the previously entered command.
Continue to press the up arrow key to walk backwards through the command list of your session and the down arrow key to come forward. Also, the magic %history command allows you to jump to a particular command number in the session. Type the following command to see the first command that you entered:

%history 1

Now, type exit to drop out of IPython and back to your system command prompt.

How it works…

There isn't much to explain here, and we have just scratched the surface of what IPython can do. Hopefully, we have gotten you interested in diving deeper, especially with the wealth of new features offered by IPython 2.0, including dynamic and user-controllable data visualizations.

See also

  • IPython at http://ipython.org/
  • The IPython Cookbook at https://github.com/ipython/ipython/wiki?path=Cookbook
  • IPython: A System for Interactive Scientific Computing at http://fperez.org/papers/ipython07_pe-gr_cise.pdf
  • Learning IPython for Interactive Computing and Data Visualization, Cyrille Rossant, Packt Publishing, available at http://www.packtpub.com/learning-ipython-for-interactive-computing-and-data-visualization/book
  • The future of IPython at http://www.infoworld.com/print/236429

Exploring IPython Notebook

IPython Notebook is the perfect complement to IPython. As per the IPython website:

"The IPython Notebook is a web-based interactive computational environment where you can combine code execution, text, mathematics, plots and rich media into a single document."

While this is a bit of a mouthful, it is actually a pretty accurate description. In practice, IPython Notebook allows you to intersperse your code with comments and images and anything else that might be useful. You can use IPython Notebooks for everything from presentations (a great replacement for PowerPoint) to an electronic laboratory notebook or a textbook.

Getting ready

If you have completed the installation, you should be ready to tackle the following recipes.
How to do it…

These steps will get you started with exploring the incredibly powerful IPython Notebook environment. We urge you to go beyond this simple set of steps to understand the true power of the tool.

Type ipython notebook --pylab=inline in the command prompt. The --pylab=inline option should allow your plots to appear inline in your notebook. You should see some text quickly scroll by in the terminal window, and then, the following screen should load in the default browser (for me, this is Chrome). Note that the URL should be http://127.0.0.1:8888/, indicating that the browser is connected to a server running on the local machine at port 8888.

You should not see any notebooks listed in the browser (note that IPython Notebook files have a .ipynb extension) as IPython Notebook searches the directory you launched it from for notebook files. Let's create a notebook now. Click on the New Notebook button in the upper right-hand side of the page. A new browser tab or window should open up. From the top down, you can see the text-based menu followed by the toolbar for issuing common commands, and then, your very first cell, which should resemble the command prompt in IPython.

Place the mouse cursor in the first cell and type 5+5. Next, either navigate to Cell | Run or press Shift + Enter as a keyboard shortcut to cause the contents of the cell to be interpreted. Basically, we just executed a simple Python statement within the first cell of our first IPython Notebook.

Click on the second cell, and then, navigate to Cell | Cell Type | Markdown. Now, you can easily write markdown in the cell for documentation purposes.

Close the two browser windows or tabs (the notebook and the notebook browser). Go back to the terminal in which you typed ipython notebook, hit Ctrl + C, then hit Y, and press Enter.
This will shut down the IPython Notebook server.

How it works…

For those of you coming from either more traditional statistical software packages, such as Stata, SPSS, or SAS, or more traditional mathematical software packages, such as MATLAB, Mathematica, or Maple, you are probably used to the very graphical and feature-rich interactive environments provided by the respective companies. From this background, IPython Notebook might seem a bit foreign but hopefully much more user friendly and less intimidating than the traditional Python prompt. Further, IPython Notebook offers an interesting combination of interactivity and sequential workflow that is particularly well suited for data analysis, especially during the prototyping phases. R has a library called Knitr (http://yihui.name/knitr/) that offers the report-generating capabilities of IPython Notebook.

When you type in ipython notebook, you are launching a server running on your local machine, and IPython Notebook itself is really a web application that uses a client-server architecture. The IPython Notebook server, as per ipython.org, uses a two-process kernel architecture with ZeroMQ (http://zeromq.org/) and Tornado. ZeroMQ is an intelligent socket library for high-performance messaging, helping IPython manage distributed compute clusters among other tasks. Tornado is a Python web framework and asynchronous networking module that serves IPython Notebook's HTTP requests. The project is open source and you can contribute to the source code if you are so inclined.

IPython Notebook also allows you to export your notebooks, which are actually just text files filled with JSON, to a large number of alternative formats using the command-line tool called nbconvert (http://ipython.org/ipython-doc/rel-1.0.0/interactive/nbconvert.html). Available export formats include HTML, LaTeX, reveal.js HTML slideshows, Markdown, simple Python scripts, and reStructuredText for the Sphinx documentation.
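Since notebook files are just JSON text, they can be inspected with nothing more than the standard library. The sketch below builds a minimal notebook-like dictionary and round-trips it through the json module; the key names (cells, nbformat, and so on) are simplified assumptions about the notebook format, shown only for illustration:

```python
import json

# A minimal, hypothetical sketch of a notebook file's JSON skeleton.
# The key names here are illustrative assumptions, not the full schema.
notebook = {
    "nbformat": 4,
    "metadata": {"kernel": "python"},
    "cells": [
        {"cell_type": "code", "source": "5+5"},
        {"cell_type": "markdown", "source": "# My notes"},
    ],
}

# Serialize to the JSON text that would live in a .ipynb-style file...
text = json.dumps(notebook, indent=2)

# ...and parse it back, e.g. to count the code cells.
parsed = json.loads(text)
code_cells = [c for c in parsed["cells"] if c["cell_type"] == "code"]
print(len(code_cells))  # → 1
```

This plain-text representation is exactly why tools such as nbconvert (and version control systems) can work with notebooks so easily.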
Finally, there is IPython Notebook Viewer (nbviewer), which is a free web service where you can both post and go through static, HTML versions of notebook files hosted on remote servers (these servers are currently donated by Rackspace). Thus, if you create an amazing .ipynb file that you want to share, you can upload it to http://nbviewer.ipython.org/ and let the world see your efforts.

There's more…

We will try not to sing too loudly the praises of Markdown, but if you are unfamiliar with the tool, we strongly suggest that you try it out. Markdown is actually two different things: a syntax for formatting plain text in a way that can be easily converted to a structured document, and a software tool that converts said text into HTML and other languages. Basically, Markdown enables the author to use any desired simple text editor (VI, VIM, Emacs, Sublime editor, TextWrangler, Crimson Editor, or Notepad) that can capture plain text yet still describe relatively complex structures such as different levels of headers, ordered and unordered lists, and block quotes, as well as some formatting such as bold and italics. Markdown basically offers a very human-readable version of HTML, much as JSON offers a very human-readable data format.
See also

  • IPython Notebook at http://ipython.org/notebook.html
  • The IPython Notebook documentation at http://ipython.org/ipython-doc/stable/interactive/notebook.html
  • An interesting IPython Notebook collection at https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks
  • The IPython Notebook development retrospective at http://blog.fperez.org/2012/01/ipython-notebook-historical.html
  • Setting up a remote IPython Notebook server at http://nbviewer.ipython.org/github/Unidata/tds-python-workshop/blob/master/ipython-notebook-server.ipynb
  • The Markdown home page at https://daringfireball.net/projects/markdown/basics

Preparing to analyze automobile fuel efficiencies

In this recipe, we are going to start our Python-based analysis of the automobile fuel efficiencies data.

Getting ready

If you completed the first installation successfully, you should be ready to get started.

How to do it…

The following steps will see you through setting up your working directory and IPython for the analysis for this article:

Create a project directory called fuel_efficiency_python. Download the automobile fuel efficiency dataset from http://fueleconomy.gov/feg/epadata/vehicles.csv.zip and store it in the preceding directory. Extract the vehicles.csv file from the zip file into the same directory.

Open a terminal window and change the current directory (cd) to the fuel_efficiency_python directory. At the terminal, type the following command:

ipython notebook

Once the new page has loaded in your web browser, click on New Notebook. Click on the current name of the notebook, which is untitled0, and enter a new name for this analysis (mine is fuel_efficiency_python).

Let's use the top-most cell for import statements. Type in the following commands:

import pandas as pd
import numpy as np
from ggplot import *
%matplotlib inline

Then, hit Shift + Enter to execute the cell.
This imports both the pandas and numpy libraries, assigning them local names to save a few characters while typing commands. It also imports the ggplot library. Please note that using the from ggplot import * command line is not a best practice in Python and pours the ggplot package contents into our default namespace. However, we are doing this so that our ggplot syntax most closely resembles the R ggplot2 syntax, which is strongly not Pythonic. Finally, we use a magic command to tell IPython Notebook that we want matplotlib graphs to render in the notebook.

In the next cell, let's import the data and look at the first few records:

vehicles = pd.read_csv("vehicles.csv")
vehicles.head()

Then, press Shift + Enter, and the first few records should be shown. However, notice that a red warning message appears as follows:

/Library/Python/2.7/site-packages/pandas/io/parsers.py:1070: DtypeWarning: Columns (22,23,70,71,72,73) have mixed types. Specify dtype option on import or set low_memory=False.
  data = self._reader.read(nrows)

This tells us that columns 22, 23, 70, 71, 72, and 73 contain mixed data types. Let's find the corresponding names using the following commands:

column_names = vehicles.columns.values
column_names[[22, 23, 70, 71, 72, 73]]

array([cylinders, displ, fuelType2, rangeA, evMotor, mfrCode], dtype=object)

Mixed data types sound like they could be problematic, so make a mental note of these column names. Remember, data cleaning and wrangling often consume 90 percent of project time.

How it works…

With this recipe, we are simply setting up our working directory and creating a new IPython Notebook that we will use for the analysis. We have imported the pandas library and very quickly read the vehicles.csv data file directly into a data frame. Speaking from experience, pandas' robust data import capabilities will save you a lot of time.
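As a quick sketch of the dtype option mentioned in the warning, the snippet below reads a tiny in-memory CSV with a mixed-type column and pins it to a string dtype on import; the column names and values here are invented for illustration, not taken from the real dataset:

```python
import io

import pandas as pd

# A tiny, made-up CSV where the "displ" column mixes numbers and text,
# mimicking the mixed-type columns flagged by the DtypeWarning.
csv_text = """make,displ
Acme,2.0
Acme,NA-spec
Zeta,3.5
"""

# Specifying dtype on import makes the column's type explicit instead of
# letting the parser guess, which silences the warning for that column.
vehicles_sample = pd.read_csv(io.StringIO(csv_text), dtype={"displ": str})

print(vehicles_sample["displ"].tolist())  # → ['2.0', 'NA-spec', '3.5']
```

Alternatively, as the warning itself suggests, passing low_memory=False lets pandas inspect the whole file before choosing a type for each column.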
Although we imported data directly from a comma-separated value file into a data frame, pandas is capable of handling many other formats, including Excel, HDF, SQL, JSON, Stata, and even the clipboard, using the reader functions. We can also write out the data from data frames in just as many formats using writer functions accessed from the data frame object. Using the bound method head that is part of the DataFrame class in pandas, we have received a very informative summary of the data frame, including a per-column count of non-null values and a count of the various data types across the columns.

There's more…

The data frame is an incredibly powerful concept and data structure. Thinking in data frames is critical for many data analyses yet also very different from thinking in array or matrix operations (say, if you are coming from MATLAB or C as your primary development languages). With the data frame, each column represents a different variable or characteristic and can be a different data type, such as floats, integers, or strings. Each row of the data frame is a separate observation or instance with its own set of values. For example, if each row represents a person, the columns could be age (an integer) and gender (a category or string). Often, we will want to select the set of observations (rows) that match a particular characteristic (say, all males) and examine this subgroup. The data frame is conceptually very similar to a table in a relational database.

See also

  • Data structures in pandas at http://pandas.pydata.org/pandas-docs/stable/dsintro.html
  • Data frames in R at http://www.r-tutor.com/r-introduction/data-frame

Exploring and describing the fuel efficiency data with Python

Now that we have imported the automobile fuel efficiency dataset into IPython and witnessed the power of pandas, the next step is to replicate the preliminary analysis performed in R, getting your feet wet with some basic pandas functionality.
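Before diving in, here is a tiny illustration of the subgroup selection described in the data frame discussion above (selecting the rows that match a particular characteristic); the people data is invented:

```python
import pandas as pd

# Invented example data: each row is one observation (a person),
# each column one variable.
people = pd.DataFrame({
    "age": [34, 28, 45],
    "gender": ["M", "F", "M"],
})

# Select the subgroup of rows matching a characteristic (all males)
# with a boolean mask over the gender column.
males = people[people.gender == "M"]
print(len(males))        # → 2
print(males.age.mean())  # → 39.5
```

This mask-and-filter idiom is the data frame analogue of a SQL WHERE clause on a relational table.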
Getting ready

We will continue to grow and develop the IPython Notebook that we started in the previous recipe. If you've completed the previous recipe, you should have everything you need to continue.

How to do it…

First, let's find out how many observations (rows) are in our data using the following command:

len(vehicles)

34287

If you switch back and forth between R and Python, remember that in R, the function is length and in Python, it is len.

Next, let's find out how many variables (columns) are in our data using the following command:

len(vehicles.columns)

74

Let's get a list of the names of the columns using the following command:

print(vehicles.columns)

Index([u'barrels08', u'barrelsA08', u'charge120', u'charge240', u'city08', u'city08U', u'cityA08', u'cityA08U', u'cityCD', u'cityE', u'cityUF', u'co2', u'co2A', u'co2TailpipeAGpm', u'co2TailpipeGpm', u'comb08', u'comb08U', u'combA08', u'combA08U', u'combE', u'combinedCD', u'combinedUF', u'cylinders', u'displ', u'drive', u'engId', u'eng_dscr', u'feScore', u'fuelCost08', u'fuelCostA08', u'fuelType', u'fuelType1', u'ghgScore', u'ghgScoreA', u'highway08', u'highway08U', u'highwayA08', u'highwayA08U', u'highwayCD', u'highwayE', u'highwayUF', u'hlv', u'hpv', u'id', u'lv2', u'lv4', u'make', u'model', u'mpgData', u'phevBlended', u'pv2', u'pv4', u'range', u'rangeCity', u'rangeCityA', u'rangeHwy', u'rangeHwyA', u'trany', u'UCity', u'UCityA', u'UHighway', u'UHighwayA', u'VClass', u'year', u'youSaveSpend', u'guzzler', u'trans_dscr', u'tCharger', u'sCharger', u'atvType', u'fuelType2', u'rangeA', u'evMotor', u'mfrCode'], dtype=object)

The u letter in front of each string indicates that the strings are represented in Unicode (http://docs.python.org/2/howto/unicode.html).

Let's find out how many unique years of data are included in this dataset and what the first and last years are using the following commands:

len(pd.unique(vehicles.year))

31

min(vehicles.year)

1984

max(vehicles["year"])

2014

Note that again, we have used two different syntaxes to reference individual columns within the vehicles data frame.

Next, let's find out what types of fuel are used as the automobiles' primary fuel types. In R, we have the table function that will return a count of the occurrences of a variable's various values. In pandas, we use the following:

pd.value_counts(vehicles.fuelType1)

Regular Gasoline     24587
Premium Gasoline      8521
Diesel                1025
Natural Gas             57
Electricity             56
Midgrade Gasoline       41
dtype: int64

Now if we want to explore what types of transmissions these automobiles have, we immediately try the following command:

pd.value_counts(vehicles.trany)

However, this results in a bit of unexpected and lengthy output. What we really want to know is the number of cars with automatic and manual transmissions. We notice that the trany variable always starts with the letter A when it represents an automatic transmission and M for a manual transmission. Thus, we create a new variable, trany2, that contains the first character of the trany variable, which is a string:

vehicles["trany2"] = vehicles.trany.str[0]
pd.value_counts(vehicles.trany2)

The preceding command yields the answer that we wanted: roughly twice as many automatics as manuals:

A    22451
M    11825
dtype: int64

How it works…

In this recipe, we looked at some basic functionality in Python and pandas. We have used two different syntaxes (vehicles['trany'] and vehicles.trany) to access variables within the data frame. We have also used some of the core pandas functions to explore the data, such as the incredibly useful unique and value_counts functions.

There's more...

In terms of the data science pipeline, we have touched on two stages in a single recipe: data cleaning and data exploration.
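The exploration steps above can be replayed on a tiny, made-up frame if you don't have vehicles.csv at hand; this is just a sketch with invented values, not the real dataset:

```python
import pandas as pd

# A small, invented stand-in for the vehicles data frame.
vehicles_toy = pd.DataFrame({
    "year": [1984, 1984, 2000, 2014],
    "fuelType1": ["Regular Gasoline", "Diesel",
                  "Regular Gasoline", "Electricity"],
    "trany": ["Automatic 3-spd", "Manual 5-spd",
              "Automatic 4-spd", "Manual 6-spd"],
})

# Row and column counts, unique years, and the first/last year.
print(len(vehicles_toy))                  # → 4
print(len(vehicles_toy.columns))          # → 3
print(len(pd.unique(vehicles_toy.year)))  # → 3
print(min(vehicles_toy.year), max(vehicles_toy["year"]))  # → 1984 2014

# Tally fuel types; .value_counts() is the Series-method spelling of
# the pd.value_counts(...) helper used in the recipe.
fuel_counts = vehicles_toy.fuelType1.value_counts()
print(fuel_counts["Regular Gasoline"])    # → 2

# Derive trany2 from the first character of trany, then count A versus M.
vehicles_toy["trany2"] = vehicles_toy.trany.str[0]
trany_counts = vehicles_toy["trany2"].value_counts().to_dict()
print(trany_counts == {"A": 2, "M": 2})   # → True
```

Because the toy frame behaves just like the full one, each line maps one-to-one onto the commands run against the real dataset.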
Often, when working with smaller datasets, where the time to complete a particular action is quite short and the work can be completed on our laptop, we will very quickly go through multiple stages of the pipeline and then loop back, depending on the results. In general, the data science pipeline is a highly iterative process. The faster we can accomplish steps, the more iterations we can fit into a fixed time, and often, we can create a better final analysis.

See also

  • The pandas API overview at http://pandas.pydata.org/pandas-docs/stable/api.html

Summary

This article took you through the process of analyzing and visualizing automobile data to identify trends and patterns in fuel efficiency over time using the powerful programming language, Python.

Resources for Article:

Further resources on this subject:

  • Importing Dynamic Data [article]
  • MongoDB data modeling [article]
  • Report Data Filtering [article]