How-To Tutorials

Running Your Spark Job Executors In Docker Containers

Bernardo Gomez
27 May 2016
12 min read
The following post showcases a Dockerized Apache Spark application running in a Mesos cluster. In our example, the Spark Driver as well as the Spark Executors will be running in a Docker image based on Ubuntu with the addition of the SciPy Python packages. If you are already familiar with the reasons for using Docker as well as Apache Mesos, feel free to skip the next section and jump right to the tutorial, but if not, please carry on.

Rationale

Today, it's pretty common to find engineers and data scientists who need to run big data workloads on a shared infrastructure. In addition, the infrastructure can potentially be used not only for such workloads, but also for other important services required for business operations. A very common way to solve such problems is to virtualize the infrastructure and statically partition it in such a way that each development or business group in the company has its own resources to deploy and run their applications on. Hopefully, the maintainers of such infrastructure and services have a DevOps mentality and have automated, and continuously work on automating, the configuration and software provisioning tasks on that infrastructure.

The problem is, as Benjamin Hindman points out (backed by studies done at the University of California, Berkeley), static partitioning can be highly inefficient in the utilization of such infrastructure. This has prompted the development of resource schedulers that abstract CPU, memory, storage, and other computer resources away from machines, either physical or virtual, to enable the execution of applications across the infrastructure and achieve a higher utilization factor, among other things.

The concept of sharing infrastructure resources is not new for applications that entail the analysis of large datasets, in most cases through algorithms that favor parallelization of workloads. Today, the most common frameworks for developing such applications are Hadoop MapReduce and Apache Spark. In the case of Apache Spark, it can be deployed in clusters managed by resource schedulers such as Hadoop YARN or Apache Mesos.

Now, since different applications run inside a shared infrastructure, it's common to find applications that depend on different sets of software packages, and different versions of those packages, to function. As an operations engineer or infrastructure manager, you can force your users onto a predefined set of software libraries, along with their versions, that the infrastructure supports. Hopefully, if you follow that path, you also establish a procedure to upgrade such software libraries and add new ones. This tends to require an investment in time and might be frustrating to engineers and data scientists who are constantly installing new packages and libraries to facilitate their work. When you decide to upgrade, you might also have to refactor applications that have been running for a long time but have heavy dependencies on previous versions of the packages that are part of the upgrade. All in all, it's not simple.

Linux Containers, and especially Docker, offer an abstraction such that software can be packaged into lightweight images that can be executed as containers. The containers are executed with some level of isolation, and such isolation is mainly provided by cgroups. Each image can define the type of operating system that it requires along with the software packages.
This provides a fantastic mechanism to pass the burden of maintaining the software packages and libraries from infrastructure management and operations to the owners of the applications. With this, the infrastructure and operations teams can run multiple isolated applications, potentially with conflicting software libraries, within the same infrastructure. Apache Spark can leverage this as long as it's deployed on an Apache Mesos cluster that supports Docker. In the next sections, we will review how we can run Apache Spark applications within Docker containers.

Tutorial

For this post, we will use a CentOS 7.2 minimal image running on VirtualBox. However, this tutorial will not include the instructions to obtain such a CentOS image, make it available in VirtualBox, or configure its network interfaces. Additionally, we will be using a single node to keep this exercise as simple as possible. We can later explore deploying a similar setup on a set of nodes in the cloud; but for the sake of simplicity and time, our single node will be running the following services:

A Mesos master
A Mesos slave
A Zookeeper instance
A Docker daemon

Step 1: The Mesos cluster

To install Apache Mesos in your cluster, I suggest you follow the Mesosphere getting started guidelines. Since we are using CentOS 7.2, we first install the Mesosphere YUM repository as follows:

    # Add the repository
    sudo rpm -Uvh http://repos.mesosphere.com/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm

We then install the Apache Mesos and Apache Zookeeper packages:

    sudo yum -y install mesos mesosphere-zookeeper

Once the packages are installed, we need to configure Zookeeper as well as the Mesos master and slave.

Zookeeper

For Zookeeper, we need to create a Zookeeper node identity. We do this by setting the numerical identifier inside the /var/lib/zookeeper/myid file:

    echo "1" > /var/lib/zookeeper/myid

Since by default Zookeeper binds to all interfaces and exposes its services through port 2181, we do not need to change the /etc/zookeeper/conf/zoo.cfg file. Refer to the Mesosphere getting started guidelines if you have a Zookeeper ensemble (more than one node running Zookeeper). After that, we can start the Zookeeper service:

    sudo service zookeeper restart

Mesos master and slave

Before we describe the Mesos configuration, note that the location of the configuration files discussed here is specific to Mesosphere's Mesos package. If you don't have a strong reason to build your own Mesos packages, I suggest you use the ones that Mesosphere kindly provides.

We need to tell the Mesos master and slave the connection string they can use to reach Zookeeper, including its namespace. By default, Zookeeper will bind to all interfaces; you might want to change this behavior. In our case, we will make sure that the IP address that we want to use to connect to Zookeeper can be resolved within the containers. The node's public interface IP is 192.168.99.100, so we do the following:

    echo "zk://192.168.99.100:2181/mesos" > /etc/mesos/zk

Now, since in our setup we have several network interfaces associated with the node that will be running the Mesos master, we will pick an interface that will be reachable from within the Docker containers that will eventually be running the Spark Driver and Spark Executors. Knowing that the IP address that we want to bind to is 192.168.99.100, we do the following:

    echo "192.168.99.100" > /etc/mesos-master/ip

We do a similar thing for the Mesos slave.
Again, consider that in our example the Mesos slave is running on the same node as the Mesos master and we will bind it to the same network interface:

    echo "192.168.99.100" > /etc/mesos-slave/ip
    echo "192.168.99.100" > /etc/mesos-slave/hostname

The ip file defines the IP address that the Mesos slave will bind to, and the hostname file defines the hostname that the slave will use to report its availability; it is therefore the value that Mesos frameworks, in our case Apache Spark, will use to connect to it. Let's start the services:

    systemctl start mesos-master
    systemctl start mesos-slave

By default, the Mesos master will bind to port 5050 and the Mesos slave to port 5051. Let's confirm this, assuming that you have installed the net-tools package:

    netstat -pleno | grep -E "5050|5051"
    tcp  0  0 192.168.99.100:5050  0.0.0.0:*  LISTEN  0  127336  22205/mesos-master  off (0.00/0/0)
    tcp  0  0 192.168.99.100:5051  0.0.0.0:*  LISTEN  0  127453  22242/mesos-slave   off (0.00/0/0)

Let's run a test:

    MASTER=$(mesos-resolve $(cat /etc/mesos/zk))
    LIBPROCESS_IP=192.168.99.100 mesos-execute --master=$MASTER --name="cluster-test" \
        --command="echo 'Hello World' && sleep 5 && echo 'Good Bye'"

Step 2: Installing Docker

We followed the Docker documentation on installing Docker on CentOS, and I suggest that you do the same. In a nutshell, we executed the following:

    sudo yum update
    sudo tee /etc/yum.repos.d/docker.repo <<-'EOF'
    [dockerrepo]
    name=Docker Repository
    baseurl=https://yum.dockerproject.org/repo/main/centos/$releasever/
    enabled=1
    gpgcheck=1
    gpgkey=https://yum.dockerproject.org/gpg
    EOF
    sudo yum install docker-engine
    sudo service docker start

If the preceding commands succeeded, you should be able to run docker ps as well as docker search ipython/scipystack successfully.

Step 3: Creating a Spark image

Let's create the Dockerfile that will be used by the Spark Driver and Spark Executors. For our example, we will consider that the Docker image should provide the SciPy stack along with additional Python libraries. So, in a nutshell, the Docker image must have the following features:

The version of libmesos should be compatible with the version of the Mesos master and slave, for example, /usr/lib/libmesos-0.26.0.so
It should have a valid JDK
It should have the SciPy stack as well as the Python packages that we want
It should have a version of Spark; we will choose 1.6.0

The Dockerfile below satisfies the requirements mentioned above. Note that installing Mesos through the Mesosphere packages will install OpenJDK, in this case version 1.7.

Dockerfile:

    # Version 0.1
    FROM ipython/scipystack
    MAINTAINER Bernardo Gomez Palacio "bernardo.gomezpalacio@gmail.com"
    ENV REFRESHED_AT 2015-03-19
    ENV DEBIAN_FRONTEND noninteractive

    RUN apt-get update
    RUN apt-get dist-upgrade -y

    # Setup
    RUN sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
    RUN export OS_DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]') && \
        export OS_CODENAME=$(lsb_release -cs) && \
        echo "deb http://repos.mesosphere.io/${OS_DISTRO} ${OS_CODENAME} main" | \
            tee /etc/apt/sources.list.d/mesosphere.list && \
        apt-get -y update
    RUN apt-get -y install mesos
    RUN apt-get install -y python libnss3 curl

    RUN curl http://d3kbcqa49mib13.cloudfront.net/spark-1.6.0-bin-hadoop2.6.tgz | tar -xzC /opt && \
        mv /opt/spark* /opt/spark
    RUN apt-get clean

    # Fix the pyspark six error.
    RUN pip2 install -U six
    RUN pip2 install msgpack-python
    RUN pip2 install avro

    COPY spark-conf/* /opt/spark/conf/
    COPY scripts /scripts

    ENV SPARK_HOME /opt/spark

    ENTRYPOINT ["/scripts/run.sh"]

Let's explain some very important files that will be available in the Docker image according to the Dockerfile mentioned earlier.

The spark-conf/spark-env.sh file, as mentioned in the Spark docs, is used to set the location of the Mesos libmesos.so:

    export MESOS_NATIVE_JAVA_LIBRARY=${MESOS_NATIVE_JAVA_LIBRARY:-/usr/lib/libmesos.so}
    export SPARK_LOCAL_IP=${SPARK_LOCAL_IP:-"127.0.0.1"}
    export SPARK_PUBLIC_DNS=${SPARK_PUBLIC_DNS:-"127.0.0.1"}

The spark-conf/spark-defaults.conf file serves as the definition of the default configuration for our Spark jobs within the container; its contents are as follows:

    spark.master                        SPARK_MASTER
    spark.mesos.mesosExecutor.cores     MESOS_EXECUTOR_CORE
    spark.mesos.executor.docker.image   SPARK_IMAGE
    spark.mesos.executor.home           /opt/spark
    spark.driver.host                   CURRENT_IP
    spark.executor.extraClassPath       /opt/spark/custom/lib/*
    spark.driver.extraClassPath         /opt/spark/custom/lib/*

Note that the use of placeholders driven by environment variables such as SPARK_MASTER and SPARK_IMAGE is critical, since this allows us to customize how the Spark application interacts with the Mesos Docker integration.

We also have Docker's entry point script. The script, showcased below, populates the spark-defaults.conf file. Let's define the Dockerfile entry point such that it lets us pass some basic options to the Spark command, for example, spark-shell, spark-submit, or pyspark:

    #!/bin/bash

    SPARK_MASTER=${SPARK_MASTER:-local}
    MESOS_EXECUTOR_CORE=${MESOS_EXECUTOR_CORE:-0.1}
    SPARK_IMAGE=${SPARK_IMAGE:-sparkmesos:latest}
    CURRENT_IP=$(hostname -i)

    sed -i 's;SPARK_MASTER;'$SPARK_MASTER';g' /opt/spark/conf/spark-defaults.conf
    sed -i 's;MESOS_EXECUTOR_CORE;'$MESOS_EXECUTOR_CORE';g' /opt/spark/conf/spark-defaults.conf
    sed -i 's;SPARK_IMAGE;'$SPARK_IMAGE';g' /opt/spark/conf/spark-defaults.conf
    sed -i 's;CURRENT_IP;'$CURRENT_IP';g' /opt/spark/conf/spark-defaults.conf

    export SPARK_LOCAL_IP=${SPARK_LOCAL_IP:-${CURRENT_IP:-"127.0.0.1"}}
    export SPARK_PUBLIC_DNS=${SPARK_PUBLIC_DNS:-${CURRENT_IP:-"127.0.0.1"}}

    if [ $ADDITIONAL_VOLUMES ]; then
        echo "spark.mesos.executor.docker.volumes: $ADDITIONAL_VOLUMES" >> /opt/spark/conf/spark-defaults.conf
    fi

    exec "$@"

Let's build the image so we can start using it:

    docker build -t sparkmesos . && docker tag -f sparkmesos:latest sparkmesos:latest

Step 4: Running a Spark application with Docker

Now that the image is built, we just need to run it. We will call the PySpark application:

    docker run -it --rm \
        -e SPARK_MASTER="mesos://zk://192.168.99.100:2181/mesos" \
        -e SPARK_IMAGE="sparkmesos:latest" \
        -e PYSPARK_DRIVER_PYTHON=ipython2 \
        sparkmesos:latest /opt/spark/bin/pyspark

To make sure that SciPy is working, let's type the following into the PySpark shell:

    from scipy import special, optimize
    import numpy as np

    f = lambda x: -special.jv(3, x)
    sol = optimize.minimize(f, 1.0)
    x = np.linspace(0, 10, 5000)
    x

Now, let's try to calculate Pi as an example:

    docker run -it --rm \
        -e SPARK_MASTER="mesos://zk://192.168.99.100:2181/mesos" \
        -e SPARK_IMAGE="sparkmesos:latest" \
        -e PYSPARK_DRIVER_PYTHON=ipython2 \
        sparkmesos:latest /opt/spark/bin/spark-submit \
        --driver-memory 500M --executor-memory 500M \
        /opt/spark/examples/src/main/python/pi.py 10
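The bundled pi.py script estimates Pi with a simple Monte Carlo simulation. As a rough sketch of what that job is doing (not the exact contents of Spark's example script), you could run something like the following directly in the PySpark shell started above, where the SparkContext is already available as sc:

    import random
    from operator import add

    def inside(_):
        # Pick a random point in the unit square and check whether it
        # falls inside the quarter circle of radius 1.
        x, y = random.random(), random.random()
        return 1 if x * x + y * y <= 1.0 else 0

    num_samples = 100000
    # Distribute the samples across the Mesos executors, count the hits,
    # and scale the ratio to approximate Pi.
    count = sc.parallelize(range(num_samples)).map(inside).reduce(add)
    print("Pi is roughly %f" % (4.0 * count / num_samples))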
Conclusion and further notes

Although we were able to run a Spark application within a Docker container leveraging Apache Mesos, there is more work to do. We need to explore containerized Spark applications that spread across multiple nodes, along with providing a mechanism that enables network port mapping.

References

Apache Mesos, The Apache Software Foundation, 2015. Web. 27 Jan. 2016.
Apache Spark, The Apache Software Foundation, 2015. Web. 27 Jan. 2016.
Benjamin Hindman, "Apache Mesos NYC Meetup", August 20, 2013. Web. 27 Jan. 2016.
Docker, Docker Inc, 2015. Web. 27 Jan. 2016.
Hindman, Konwinski, Zaharia, Ghodsi, D. Joseph, Katz, Shenker, Stoica. "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center". Web. 27 Jan. 2016.
Mesosphere Inc, 2015. Web. 27 Jan. 2016.
SciPy, SciPy developers, 2015. Web. 28 Jan. 2016.
VirtualBox, Oracle Inc, 2015. Web. 28 Jan. 2016.
Wang Qiang, "Docker Spark Mesos". Web. 28 Jan. 2016.

About the Author

Bernardo Gomez Palacio is a consulting member of technical staff, Big Data Cloud Services at Oracle Cloud. He is an electronic systems engineer who has worked for more than 12 years developing software and more than 6 years on DevOps. His current work is developing infrastructure to aid the creation and deployment of big data applications. He is a supporter of open source software and has a particular interest in Apache Mesos, Apache Spark, distributed file systems, and Docker containerization and networking. His opinions are his own and do not reflect the opinions of his employer.

Wrappers

Packt
27 May 2016
13 min read
In this article by Erik Westra, author of the book Modular Programming with Python, we learn the concepts of wrappers.

A wrapper is essentially a group of functions that call other functions to do the work. Wrappers are used to simplify an interface, to make a confusing or badly designed API easier to use, to convert data formats into something more convenient, and to implement cross-language compatibility. Wrappers are also sometimes used to add testing and error-checking code to an existing API.

Let's take a look at a real-world application of a wrapper module. Imagine that you work for a large bank and have been asked to write a program to analyze fund transfers to help identify possible fraud. Your program receives information, in real time, about every inter-bank funds transfer that takes place. For each transfer, you are given:

The amount of the transfer
The ID of the branch in which the transfer took place
The identification code for the bank the funds are being sent to

Your task is to analyze the transfers over time to identify unusual patterns of activity. To do this, you need to calculate, for each of the last eight days, the total value of all transfers for each branch and destination bank. You can then compare the current day's totals against the average for the previous seven days, and flag any daily totals that are more than 50% above the average.

You start by deciding how to represent the total transfers for a day. Because you need to keep track of this for each branch and destination bank, it makes sense to store these totals in a two-dimensional array. In Python, this type of two-dimensional array is represented as a list of lists:

    totals = [[0, 307512, 1612, 0, 43902, 5602918],
              [79400, 3416710, 75, 23508, 60912, 5806],
              ...
             ]

You can then keep a separate list of the branch ID for each row and another list holding the destination bank code for each column:

    branch_ids = [125000249, 125000252, 125000371, ...]
    bank_codes = ["AMERUS33", "CERYUS33", "EQTYUS44", ...]

Using these lists, you can calculate the totals for a given day by processing the transfers that took place on that particular day:

    totals = []
    for branch in branch_ids:
        branch_totals = []
        for bank in bank_codes:
            branch_totals.append(0)
        totals.append(branch_totals)

    for transfer in transfers_for_day:
        branch_index = branch_ids.index(transfer['branch'])
        bank_index = bank_codes.index(transfer['dest_bank'])
        totals[branch_index][bank_index] += transfer['amount']

So far so good. Once you have these totals for each day, you can then calculate the average and compare it against the current day's totals to identify the entries that are higher than 150% of the average.

Let's imagine that you've written this program and managed to get it working. When you start using it, though, you immediately discover a problem: your bank has over 5,000 branches, and there are more than 15,000 banks worldwide that your bank can transfer funds to. That's a total of 75 million combinations that you need to keep totals for, and as a result, your program is taking far too long to calculate the totals.

To make your program faster, you need to find a better way of handling large arrays of numbers. Fortunately, there's a library designed to do just this: NumPy. NumPy is an excellent array-handling library. You can create huge arrays and perform sophisticated operations on an array with a single function call. Unfortunately, NumPy is also a dense and impenetrable library. It was designed and written for people with a deep understanding of mathematics.
While there are many tutorials available and you can generally figure out how to use it, the code that uses NumPy is often hard to comprehend. For example, to calculate the average across multiple matrices would involve the following:

    daily_totals = []
    for totals in totals_to_average:
        daily_totals.append(totals)
    average = numpy.mean(numpy.array(daily_totals), axis=0)

Figuring out what that last line does would require a trip to the NumPy documentation. Because of the complexity of the code that uses NumPy, this is a perfect example of a situation where a wrapper module can be used: the wrapper module can provide an easier-to-use interface to NumPy, so your code can use it without being cluttered with complex and confusing function calls.

To work through this example, we'll start by installing the NumPy library. NumPy (http://www.numpy.org) runs on Mac OS X, Windows, and Linux machines. How you install it depends on which operating system you are using:

For Mac OS X, you can download an installer from http://www.kyngchaos.com/software/python.
For MS Windows, you can download a Python "wheel" file for NumPy from http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy. Choose the pre-built version of NumPy that matches your operating system and the desired version of Python. To use the wheel file, use the pip install command, for example, pip install numpy-1.10.4+mkl-cp34-none-win32.whl. For more information about installing Python wheels, refer to https://pip.pypa.io/en/latest/user_guide/#installing-from-wheels.
If your computer runs Linux, you can use your Linux package manager to install NumPy. Alternatively, you can download and build NumPy in source code form.

To ensure that NumPy is working, fire up your Python interpreter and enter the following:

    import numpy
    a = numpy.array([[1, 2], [3, 4]])
    print(a)

All going well, you should see a 2 x 2 matrix displayed:

    [[1 2]
     [3 4]]

Now that we have NumPy installed, let's start working on our wrapper module. Create a new Python source file, named numpy_wrapper.py, and enter the following into this file:

    import numpy

That's all for now; we'll add functions to this wrapper module as we need them.

Next, create another Python source file, named detect_unusual_transfers.py, and enter the following into this file:

    import random
    import numpy_wrapper as npw

    BANK_CODES = ["AMERUS33", "CERYUS33", "EQTYUS44",
                  "LOYDUS33", "SYNEUS44", "WFBIUS6S"]

    BRANCH_IDS = ["125000249", "125000252", "125000371",
                  "125000402", "125000596", "125001067"]

As you can see, we are hardwiring the bank and branch codes for our example; in a real program, these values would be loaded from somewhere, such as a file or a database. Since we don't have any available data, we will use the random module to create some. We are also changing the name of the numpy_wrapper module to make it easier to access from our code.

Let's now create some funds transfer data to process, using the random module:

    days = [1, 2, 3, 4, 5, 6, 7, 8]

    transfers = []
    for i in range(10000):
        day = random.choice(days)
        bank_code = random.choice(BANK_CODES)
        branch_id = random.choice(BRANCH_IDS)
        amount = random.randint(1000, 1000000)
        transfers.append((day, bank_code, branch_id, amount))

Here, we randomly select a day, a bank code, a branch ID, and an amount, storing these values in the transfers list. Our next task is to collate this information into a series of arrays. This allows us to calculate the total value of the transfers for each day, grouped by the branch ID and destination bank.
To do this, we'll create a NumPy array for each day, where the rows in each array represent branches and the columns represent destination banks. We'll then go through the list of transfers, processing them one by one: first, we select the array for the day on which the transfer occurred, then we select the appropriate row and column based on the branch ID and the destination bank, and finally, we add the amount of the transfer to that item within the day's array.

Let's implement this logic. Our first task is to create a series of NumPy arrays, one for each day. Here, we immediately hit a snag: NumPy has many different options for creating arrays; in this case, we want to create an array that holds integer values and has its contents initialized to zero. If we used NumPy directly, our code would look like the following:

    array = numpy.zeros((num_rows, num_cols), dtype=numpy.int32)

This is not exactly easy to understand, so we're going to move this logic into our NumPy wrapper module. Edit the numpy_wrapper.py file, and add the following to the end of this module:

    def new(num_rows, num_cols):
        return numpy.zeros((num_rows, num_cols), dtype=numpy.int32)

Now, we can create a new array by calling our wrapper function (npw.new()) and not have to worry about the details of how NumPy works at all. We have simplified the interface to this particular aspect of NumPy.

Let's now use our wrapper function to create the eight arrays that we will need, one for each day. Add the following to the end of the detect_unusual_transfers.py file:

    transfers_by_day = {}
    for day in days:
        transfers_by_day[day] = npw.new(num_rows=len(BRANCH_IDS),
                                        num_cols=len(BANK_CODES))

Since the rows represent branches and the columns represent destination banks, the number of rows comes from BRANCH_IDS and the number of columns from BANK_CODES.

Now that we have our NumPy arrays, we can use them as if they were nested Python lists. For example:

    array[row][col] = array[row][col] + amount

We just need to choose the appropriate array, and calculate the row and column numbers to use. Here is the necessary code, which you should add to the end of your detect_unusual_transfers.py script:

    for day, bank_code, branch_id, amount in transfers:
        array = transfers_by_day[day]
        row = BRANCH_IDS.index(branch_id)
        col = BANK_CODES.index(bank_code)
        array[row][col] = array[row][col] + amount

Now that we've collated the transfers into eight NumPy arrays, we want to use all this data to detect any unusual activity. For each combination of branch ID and destination bank code, we will need to do the following:

Calculate the average of the first seven days' activity.
Multiply the calculated average by 1.5.
If the activity on the eighth day is greater than the average multiplied by 1.5, then we consider this activity to be unusual.

Of course, we need to do this for every row and column in our arrays, which would be very slow; this is why we're using NumPy. So, we need to calculate the average of multiple arrays of numbers, then multiply the array of averages by 1.5, and finally, compare the values within the multiplied array against the array for the eighth day of data. Fortunately, these are all things that NumPy can do for us. We'll start by collecting together the seven arrays we need to average, as well as the array for the eighth day.
To do this, add the following to the end of your program:

    latest_day = max(days)

    transfers_to_average = []
    for day in days:
        if day != latest_day:
            transfers_to_average.append(transfers_by_day[day])

    current = transfers_by_day[latest_day]

To calculate the average of a list of arrays, NumPy requires us to use the following function call:

    average = numpy.mean(numpy.array(arrays_to_average), axis=0)

Since this is confusing, we will move this function into our wrapper. Add the following code to the end of the numpy_wrapper.py module:

    def average(arrays_to_average):
        return numpy.mean(numpy.array(arrays_to_average), axis=0)

This lets us calculate the average of the seven days' activity using a single call to our wrapper function. To do this, add the following to the end of your detect_unusual_transfers.py script:

    average = npw.average(transfers_to_average)

As you can see, using the wrapper makes our code much easier to understand. Our next task is to multiply the array of calculated averages by 1.5, and compare the result against the current day's totals. Fortunately, NumPy makes this easy:

    unusual_transfers = current > average * 1.5

Because this code is so clear, there's no advantage in creating a wrapper function for it. The resulting array, unusual_transfers, will be the same size as our current and average arrays, where each entry in the array is either True or False.

We're almost done; our final task is to identify the array entries with a value of True, and tell the user about the unusual activity. While we could scan through every row and column to find the True entries, using NumPy is much faster. The following NumPy code will give us a list containing the row and column numbers for the True entries in the array:

    indices = numpy.transpose(array.nonzero())

True to form, though, this code is hard to understand, so it's a perfect candidate for another wrapper function. Go back to your numpy_wrapper.py module, and add the following to the end of the file:

    def get_indices(array):
        return numpy.transpose(array.nonzero())

This function returns a list (actually an array) of (row, col) values for all the True entries in the array. Back in our detect_unusual_transfers.py file, we can use this function to quickly identify the unusual activity:

    for row, col in npw.get_indices(unusual_transfers):
        branch_id = BRANCH_IDS[row]
        bank_code = BANK_CODES[col]
        average_amt = int(average[row][col])
        current_amt = current[row][col]

        print("Branch {} transferred ${:,d}".format(branch_id, current_amt) +
              " to bank {}, average = ${:,d}".format(bank_code, average_amt))

As you can see, we use the BRANCH_IDS and BANK_CODES lists to convert from the row and column numbers back to the relevant branch ID and bank code. We also retrieve the average and current amounts for the suspicious activity. Finally, we print out this information to warn the user about the unusual activity.

If you run your program, you should see an output that looks something like this:

    Branch 125000371 transferred $24,729,847 to bank WFBIUS6S, average = $14,954,617
    Branch 125000402 transferred $26,818,710 to bank CERYUS33, average = $16,338,043
    Branch 125001067 transferred $27,081,511 to bank EQTYUS44, average = $17,763,644

Because we are using random numbers for our financial data, the output will be random too. Try running the program a few times; you may not get any output at all if none of the randomly generated values are suspicious.
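At this point our wrapper module is complete. For reference, here is the whole of numpy_wrapper.py, with the three functions developed above pulled together:

    import numpy


    def new(num_rows, num_cols):
        # Create a zero-filled integer array with the given dimensions.
        return numpy.zeros((num_rows, num_cols), dtype=numpy.int32)


    def average(arrays_to_average):
        # Element-wise average of a list of equally-sized arrays.
        return numpy.mean(numpy.array(arrays_to_average), axis=0)


    def get_indices(array):
        # Return the (row, col) pairs of every non-zero (True) entry.
        return numpy.transpose(array.nonzero())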
Of course, we are not really interested in detecting suspicious financial activity; this example is just an excuse for working with NumPy. What is far more interesting is the wrapper module that we created, hiding the complexity of the NumPy interface so that the rest of our program can concentrate on the job to be done. If we were to continue developing our unusual activity detector, we would no doubt add more functionality to our numpy_wrapper.py module as we found more NumPy functions that we wanted to wrap.

Summary

This is just one example of a wrapper module. As we mentioned earlier, simplifying a complex and confusing API is just one use for a wrapper module; they can also be used to convert data from one format to another, add testing and error-checking code to an existing API, and call functions that are written in a different language. Note that, by definition, a wrapper is always thin: while there might be code in a wrapper (for example, to convert a parameter from an object into a dictionary), the wrapper function always ends up calling another function to do the actual work.

Security Considerations in Multitenant Environment

Packt
24 May 2016
8 min read
In this article by Zoran Pavlović and Maja Veselica, authors of the book Oracle Database 12c Security Cookbook, we will be introduced to common privileges and learn how to grant privileges and roles commonly. We'll also study the effects of plugging and unplugging operations on users, roles, and privileges.

Granting privileges and roles commonly

A common privilege is a privilege that can be exercised across all containers in a container database. Depending only on the way it is granted, a privilege becomes common or local. When you grant a privilege commonly (across all containers), it becomes a common privilege. Only common users or roles can have common privileges, and only a common role can be granted commonly.

Getting ready

For this recipe, you will need to connect to the root container as an existing common user who is able to grant a specific privilege or existing role (in our case: create session, select any table, c##role1, c##role2) to another existing common user (c##john). If you want to try out the examples in the How it works section given ahead, you should open pdb1 and pdb2. You will use:

Common users c##maja and c##zoran with the dba role granted commonly
Common user c##john
Common roles c##role1 and c##role2

How to do it...

Connect to the root container as a common user who can grant these privileges and roles (for example, c##maja or the system user):

    SQL> connect c##maja@cdb1

Grant a privilege (for example, create session) to a common user (for example, c##john) commonly:

    c##maja@CDB1> grant create session to c##john container=all;

Grant a privilege (for example, select any table) to a common role (for example, c##role1) commonly:

    c##maja@CDB1> grant select any table to c##role1 container=all;

Grant a common role (for example, c##role1) to a common role (for example, c##role2) commonly:

    c##maja@CDB1> grant c##role1 to c##role2 container=all;

Grant a common role (for example, c##role2) to a common user (for example, c##john) commonly:

    c##maja@CDB1> grant c##role2 to c##john container=all;

How it works...

You can grant privileges or common roles commonly only to a common user. You need to connect to the root container as a common user who is able to grant a specific privilege or role.

In step 2, the create session system privilege is granted to common user c##john commonly, by adding the container=all clause to the grant statement. This means that user c##john can connect (create a session) to the root or any pluggable database in this container database (including all pluggable databases that will be plugged in in the future).

Note that the container=all clause is NOT optional, even though you are connected to the root. Unlike during the creation of common users and roles (where, if you omit container=all, the user or role is still created in all containers, that is, commonly), if you omit this clause during a privilege or role grant, the privilege or role will be granted locally and it can be exercised only in the root container.

    SQL> connect c##john/oracle@cdb1
    c##john@CDB1> connect c##john/oracle@pdb1
    c##john@PDB1> connect c##john/oracle@pdb2
    c##john@PDB2>

In step 3, the select any table system privilege is granted to common role c##role1 commonly. This means that role c##role1 contains the select any table privilege in all containers (root and pluggable databases).
    c##zoran@CDB1> select * from role_sys_privs where role='C##ROLE1';

    ROLE       PRIVILEGE         ADM COM
    ---------- ----------------- --- ---
    C##ROLE1   SELECT ANY TABLE  NO  YES

    c##zoran@CDB1> connect c##zoran/oracle@pdb1
    c##zoran@PDB1> select * from role_sys_privs where role='C##ROLE1';

    ROLE       PRIVILEGE         ADM COM
    ---------- ----------------- --- ---
    C##ROLE1   SELECT ANY TABLE  NO  YES

    c##zoran@PDB1> connect c##zoran/oracle@pdb2
    c##zoran@PDB2> select * from role_sys_privs where role='C##ROLE1';

    ROLE       PRIVILEGE         ADM COM
    ---------- ----------------- --- ---
    C##ROLE1   SELECT ANY TABLE  NO  YES

In step 4, common role c##role1 is granted to another common role, c##role2, commonly. This means that role c##role2 has the granted role c##role1 in all containers.

    c##zoran@CDB1> select * from role_role_privs where role='C##ROLE2';

    ROLE       GRANTED_ROLE  ADM COM
    ---------- ------------- --- ---
    C##ROLE2   C##ROLE1      NO  YES

    c##zoran@CDB1> connect c##zoran/oracle@pdb1
    c##zoran@PDB1> select * from role_role_privs where role='C##ROLE2';

    ROLE       GRANTED_ROLE  ADM COM
    ---------- ------------- --- ---
    C##ROLE2   C##ROLE1      NO  YES

    c##zoran@PDB1> connect c##zoran/oracle@pdb2
    c##zoran@PDB2> select * from role_role_privs where role='C##ROLE2';

    ROLE       GRANTED_ROLE  ADM COM
    ---------- ------------- --- ---
    C##ROLE2   C##ROLE1      NO  YES

In step 5, common role c##role2 is granted to common user c##john commonly. This means that user c##john has c##role2 in all containers. Consequently, user c##john can use the select any table privilege in all containers in this container database.

    c##john@CDB1> select count(*) from c##zoran.t1;

      COUNT(*)
    ----------
             4

    c##john@CDB1> connect c##john/oracle@pdb1
    c##john@PDB1> select count(*) from hr.employees;

      COUNT(*)
    ----------
           107

    c##john@PDB1> connect c##john/oracle@pdb2
    c##john@PDB2> select count(*) from sh.sales;

      COUNT(*)
    ----------
        918843

Effects of plugging/unplugging operations on users, roles, and privileges

The purpose of this recipe is to show what happens to users, roles, and privileges when you unplug a pluggable database from one container database (cdb1) and plug it into another container database (cdb2).

Getting ready

To complete this recipe, you will need:

Two container databases (cdb1 and cdb2)
One pluggable database (pdb1) in container database cdb1
Local user mike in pluggable database pdb1 with the local create session privilege
Common user c##john with the create session common privilege and the create synonym local privilege on pluggable database pdb1

How to do it...

Connect to the root container of cdb1 as user sys:

    SQL> connect sys@cdb1 as sysdba

Unplug pdb1 by creating an XML metadata file:

    SQL> alter pluggable database pdb1 unplug into '/u02/oradata/pdb1.xml';

Drop pdb1 and keep the datafiles:

    SQL> drop pluggable database pdb1 keep datafiles;

Connect to the root container of cdb2 as user sys:

    SQL> connect sys@cdb2 as sysdba

Create (plug) pdb1 into cdb2 by using the previously created metadata file:

    SQL> create pluggable database pdb1 using '/u02/oradata/pdb1.xml' nocopy;

How it works...

By completing the previous steps, you unplugged pdb1 from cdb1 and plugged it into cdb2. After this operation, all local users and roles (in pdb1) are migrated with the pdb1 database. If you try to connect to pdb1 as a local user:

    SQL> connect mike@pdb1

It will succeed. All local privileges are migrated, even if they are granted to common users/roles.
However, if you try to connect to pdb1 as the previously created common user c##john, you'll get an error:

    SQL> connect c##john@pdb1
    ERROR:
    ORA-28000: the account is locked

    Warning: You are no longer connected to ORACLE.

This happens because, after migration, common users are migrated into the pluggable database as locked accounts. You can continue to use the objects in these users' schemas, or you can create these users in the root container of the new CDB. To do this, we first need to close pdb1:

    sys@CDB2> alter pluggable database pdb1 close;

    Pluggable database altered.

    sys@CDB2> create user c##john identified by oracle container=all;

    User created.

    sys@CDB2> alter pluggable database pdb1 open;

    Pluggable database altered.

If we try to connect to pdb1 as user c##john, we will get an error:

    SQL> conn c##john/oracle@pdb1
    ERROR:
    ORA-01045: user C##JOHN lacks CREATE SESSION privilege; logon denied

    Warning: You are no longer connected to ORACLE.

Even though c##john had the create session common privilege in cdb1, he cannot connect to the migrated PDB. This is because common privileges are not migrated! So we need to grant the create session privilege (either common or local) to user c##john:

    sys@CDB2> grant create session to c##john container=all;

    Grant succeeded.

Let's try exercising the create synonym local privilege in the migrated pdb1:

    c##john@PDB1> create synonym emp for hr.employees;

    Synonym created.

This proves that local privileges are always migrated.

Summary

In this article, we learned about common privileges and the methods to grant common privileges and roles to users. We also studied what happens to users, roles, and privileges when you unplug a pluggable database from one container database and plug it into another container database.
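As a final note, if you need to repeat this kind of verification across several containers, it can be scripted. The following is only a rough sketch using the cx_Oracle Python driver; the driver choice, connect strings, and password are assumptions made for this illustration and are not part of the recipe:

    import cx_Oracle

    # Containers to check; these names match the example environment above.
    containers = ["cdb2", "pdb1"]

    for container in containers:
        dsn = "localhost/{0}".format(container)  # placeholder connect string
        try:
            conn = cx_Oracle.connect("c##john", "oracle", dsn)
        except cx_Oracle.DatabaseError as exc:
            print("{0}: cannot connect - {1}".format(container, exc))
            continue
        cursor = conn.cursor()
        # List the system privileges visible to c##john in this container.
        cursor.execute("select privilege, common from user_sys_privs")
        for privilege, common in cursor:
            print("{0}: {1} (common={2})".format(container, privilege, common))
        conn.close()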

Mobile Forensics

Packt
24 May 2016
15 min read
In this article by Soufiane Tahiri, the author of Mastering Mobile Forensics, we will look at the basics of smartphone forensics. Smartphone forensics is a relatively new and quickly emerging field of interest within the digital forensic community and law enforcement, as today's mobile devices are getting smarter, cheaper, and more easily available for common daily use.

To investigate the growing number of digital crimes and complaints, researchers have put in a lot of effort to design the most affordable investigative model; in this article, we will emphasize the importance of paying real attention to the growing market of smartphones and the efforts made in this field from a digital forensic point of view, in order to design the most comprehensive investigation process.

Smartphone forensics models

Given the pace at which mobile technology grows and the variety of complexities produced by today's mobile data, forensic examiners face serious adaptation problems, so developing and adopting standards makes sense. The reliability of evidence depends directly on the adopted investigative processes; choosing to bypass a step, or bypassing one accidentally, may (and certainly will) lead to incomplete evidence and increase the risk of rejection in a court of law.

Today, there is no standard or unified model adapted to acquiring evidence from smartphones. The dramatic development of smart devices suggests that any forensic examiner will have to apply as many independent models as necessary in order to collect and preserve data. As with any forensic investigation, several approaches and techniques can be used to acquire, examine, and analyze data from a mobile device. This section provides a proposed process that summarizes guidelines from different standards and models (SWGDE Best Practices for Mobile Phone Forensics, NIST Guidelines on Mobile Device Forensics, and Developing Process for Mobile Device Forensics by Det. Cynthia A. Murphy). The overall process consists of the following stages:

Evidence intake: This triggers the examination process. This step should be documented.
Identification: The examiner needs to identify the device's capabilities and specifications, and should document everything that takes place during the identification process.
Preparation: The examiner should prepare the tools and methods to use, and must document them.
Securing and preserving evidence: The examiner should protect the evidence and secure the scene, as well as isolate the device from all networks. The examiner needs to be vigilant when documenting the scene.
Processing: At this stage, the examiner performs the actual (and technical) data acquisition and analysis, and documents the steps, the tools used, and all findings.
Verification and validation: The examiner should be sure of the integrity of the findings and must validate the acquired data and evidence. This step should be documented as well.
Reporting: The examiner produces a final report documenting the process and findings.
Presentation: This stage is meant to exhibit and present the findings.
Archiving: At the end of the forensic process, the examiner should preserve the data, report, tools, and all findings in common formats for eventual later use.
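In practice, the verification and validation step usually comes down to demonstrating integrity with cryptographic hashes. As a simple illustration (independent of any particular forensic tool, and using a hypothetical file name), an acquired image can be hashed at acquisition time and re-hashed before analysis:

    import hashlib

    def sha256_of(path, chunk_size=1024 * 1024):
        # Hash the file in chunks so large memory dumps do not need to fit in RAM.
        digest = hashlib.sha256()
        with open(path, "rb") as image:
            for chunk in iter(lambda: image.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    acquired = sha256_of("evidence/phone_image.dd")   # hypothetical image file
    print("SHA-256:", acquired)
    # Re-running the function later must return the same value;
    # otherwise, the evidence has been altered.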
Low-level techniques

Digital forensic examiners can neither always nor exclusively rely on commercially available tools; handling low-level techniques is a must. This section also covers techniques for extracting strings from different objects (for example, smartphone images). Any digital examiner should be familiar with concepts and techniques such as the following:

File carving: This is defined as the process of extracting a collection of data from a larger data set, applied to a digital investigation case. File carving is the process of extracting data from unallocated filesystem space using the inner structure of file types rather than the filesystem structure, meaning that the extraction process is principally based on file type headers and trailers.

Extracting metadata: Loosely speaking, metadata is data that describes data, or information about information. In general, metadata is hidden, extra information that is generated and embedded automatically in a digital file. The definition of metadata differs depending on the context in which it's used and the community that refers to it; metadata can be considered machine-understandable information, or a record that describes digital records. Metadata can be subdivided into three important types: descriptive (including elements such as author, title, abstract, keywords, and so on), structural (describing how an object is constituted and how the elements are arranged), and administrative (including elements such as date and time of creation, data type, and other technical details).

String dump and analysis: Most digital investigations rely on textual evidence, which is obviously due to the fact that most stored digital data is linguistic, for instance, logged conversations. A lot of important text-based evidence can be gathered while dumping strings from images (smartphone memory dumps) and can include emails, instant messaging, address books, browsing history, and so on. Most of the currently available digital forensic tools rely on matching and indexing algorithms to search textual evidence at the physical level, so that they search every byte to locate specific text strings.

Encryption versus encoding versus hashing: The important thing to keep in mind is that encoding, encrypting, and hashing do not mean the same thing at all. Encoding is meant for data usability; it can be reversed using the same algorithm and requires no key. Encrypting is meant for confidentiality; it is reversible and, depending on the algorithm, relies on key(s) to encrypt and decrypt. Hashing is meant for data integrity; it cannot (theoretically) be reversed and depends on no keys.

Decompiling and disassembling: These are types of reverse engineering processes that do the opposite of what a compiler and an assembler do. A decompiler translates a compiled binary's low-level, computer-readable code into human-readable, high-level code. The accuracy of decompilers depends on many factors, such as the amount of metadata present in the code being decompiled and the complexity of the code (not in terms of algorithms, but in terms of the sophistication of the high-level code used). A disassembler's output is, at some level, processor dependent: it maps processor instructions into mnemonics, in contrast to a decompiler's output, which is far more complicated to understand and edit.
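To make the header/trailer idea behind file carving concrete, the following is a minimal, illustrative carver that recovers JPEG files from a raw dump purely by scanning for the JPEG start-of-image and end-of-image markers. Real carving tools also handle fragmentation, validation, and many more file types, and the dump file name used here is only an example:

    def carve_jpegs(dump_path, out_prefix="carved"):
        # JPEG files start with FF D8 FF and end with FF D9.
        header, trailer = b"\xff\xd8\xff", b"\xff\xd9"
        with open(dump_path, "rb") as f:
            data = f.read()

        count, start = 0, data.find(header)
        while start != -1:
            end = data.find(trailer, start)
            if end == -1:
                break
            with open("{0}_{1}.jpg".format(out_prefix, count), "wb") as out:
                out.write(data[start:end + len(trailer)])
            count += 1
            start = data.find(header, end)
        return count

    # Example usage with a hypothetical dump file:
    # carve_jpegs("smartphone_dump.bin")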
iDevices forensics

Similar to all Apple operating systems, iOS is derived from Mac OS X; thus, iOS uses Hierarchical File System Plus (HFS+) as its primary file system. HFS+ replaces the earlier HFS filesystem and is considered an enhanced version of HFS, but the two are still architecturally very similar. The main improvements seen in HFS+ are:

A decrease in disk space usage on large volumes (efficient use of disk space)
Internationally friendly file names (through the use of Unicode instead of MacRoman)
The ability for future systems to use and extend file/folder metadata

HFS+ divides the total space on a volume (a file that contains data and the structure to access this data) into allocation blocks and uses 32-bit fields to identify them, which allows up to 2^32 blocks on a given volume; this simply means that a volume can hold more files. All HFS+ volumes respect a well-defined structure, and each volume contains a volume header, a catalog file, an extents overflow file, an attributes file, an allocation file, and a startup file.

In addition, all Apple iDevices have combined, built-in hardware and software security, which can be categorized according to Apple's official iOS Security Guide as:

System security: Integrated software and hardware platform
Encryption and data protection: Mechanisms implemented to protect data from unauthorized use
Application security: Application sandboxing
Network security: Secure data transmission
Apple Pay: Implementation of secure payments
Internet services: Apple's network of messaging, synchronizing, and backing up
Device controls: Remotely wiping the device if it is lost or stolen
Privacy control: Capabilities to control access to geolocation and user data

When dealing with seizure, it's important to turn on Airplane mode and, if the device is unlocked, set auto-lock to never and check whether a passcode was set (Settings | Passcode). If you are dealing with a passcode, try to keep the phone charged if you cannot acquire its content immediately; if no passcode was set, turn off the device.

There are four different acquisition methods for iDevices: Normal or Direct acquisition, the ideal case, where you can deal directly with a powered-on device; Logical Acquisition, where acquisition is done using an iTunes backup or a forensic tool that uses the AFC protocol, and which is in general incomplete, since emails, the geolocation database, apps' cache folders, and executables are missed; Advanced Logical Acquisition, a technique introduced by Jonathan Zdziarski (http://www.zdziarski.com/blog/) but no longer possible due to the introduction of iOS 8; and Physical Acquisition, which generates a forensic bit-by-bit image of both the system and data partitions. Before selecting a method (the right choice depends on several parameters), the examiner should answer three important questions: What is the device model? What iOS version is installed? Is the device passcode protected, and is it a simple or a complex passcode?

Android forensics

Android is an open source, Linux-based operating system. It was first developed by Android Inc. in 2003, acquired by Google in 2005, and unveiled in 2007. Like most operating systems, Android consists of a stack of software components, roughly divided into four main layers and five main sections (as shown in the diagram at https://upload.wikimedia.org/wikipedia/commons/a/af/Android-System-Architecture.svg), and each layer provides different services to the layer above.
Understanding every smartphone OS's security model is a big deal in a forensic context. All vendors and smartphone manufacturers care about securing their users' data, and in most cases the security model implemented can cause a real headache for every forensic examiner; Android is no exception to the rule. Android, as you know, is an open source OS built on the Linux kernel and provides an environment offering the ability to run multiple applications simultaneously. Each application is digitally signed and isolated in its very own sandbox, and each application sandbox defines the application's privileges. Above the kernel, all activities have constrained access to the system. Android implements many security components and has many considerations for its various layers, including, on ARM devices, TrustZone support.

Without any doubt, lock screens represent the very first starting point in every mobile forensic examination. As with all smartphone OSes, Android offers a way to control access to a given device by requiring user authentication. The problem with recent implementations of the lock screen in modern operating systems in general, and in Android in particular since it is the point of interest of this section, is that beyond controlling access to the system user interface and applications, lock screens have now been extended with more "fancy" features (showing widgets, switching users on multi-user devices, and so on) and more forensically challenging features, such as unlocking the system keystore to derive the key-encryption key (used, among other things, for the disk encryption key) as well as the credential storage encryption key. The problem with bypassing lock screens (also called keyguards) is that the applicable techniques are very version and device dependent, so there is neither a generalized method nor an always-working technique.

The Android keyguard is basically an Android application whose window lives on a high window layer with the ability to intercept navigation buttons, in order to produce the lock effect. Each unlock method (PIN, password, pattern, and face unlock) is a view component implementation hosted by the KeyguardHostView view container class. All of the methods/modes used to secure an Android device are activated by setting the currently selected mode in the SecurityMode enumeration of the KeyguardSecurityModel class. The following is the KeyguardSecurityModel.SecurityMode implementation, as seen in the Android Open Source Project:

    enum SecurityMode {
        Invalid,   // NULL state
        None,      // No security enabled
        Pattern,   // Unlock by drawing a pattern.
        Password,  // Unlock by entering an alphanumeric password
        PIN,       // Strictly numeric password
        Biometric, // Unlock with a biometric key (e.g. finger print or face unlock)
        Account,   // Unlock by entering an account's login and password.
        SimPin,    // Unlock by entering a sim pin.
        SimPuk     // Unlock by entering a sim puk
    }

Before starting our bypass and lock-cracking techniques, dealing with system files or "system protected" files assumes that the device you are handling meets some requirements:

Using Android Debug Bridge (ADB)
The device must be rooted
USB debugging should be enabled on the device
Booting into a custom recovery mode
JTAG/chip-off to acquire a physical bit-by-bit copy
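The first of these requirements can be checked quickly from a workstation before going any further. The sketch below simply shells out to the adb binary (assumed to already be on the PATH); the properties queried are standard Android build properties, while the su lookup is only a rough heuristic for detecting root:

    import subprocess

    def adb(*args):
        # Run an adb command and return its text output.
        return subprocess.check_output(("adb",) + args).decode("utf-8", "replace")

    print(adb("devices", "-l"))                        # is the device visible over ADB?
    print(adb("shell", "getprop", "ro.product.model"))
    print(adb("shell", "getprop", "ro.build.version.release"))

    # Heuristic root check: look for a su binary on the device.
    try:
        print(adb("shell", "which", "su") or "no su binary found")
    except subprocess.CalledProcessError:
        print("no su binary found")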
Windows Phone forensics

Based on the Windows NT kernel, Windows Phone 8.x uses the Core System to boot, manage hardware, authenticate, and communicate on networks. The Core System is a minimal Windows system that contains low-level security features and is supplemented by a set of Windows Phone specific binaries from Mobile Core to handle phone-specific tasks, which makes it the only architectural entity distinct from desktop Windows in Windows Phone. Windows and Windows Phone are completely aligned at the Windows Core System level and run exactly the same code at this level. The shared core actually consists of the Windows Core System and Mobile Core, where the APIs are the same but the code behind them is tuned to mobile needs.

Similar to most mobile operating systems, Windows Phone has a fairly layered architecture, organized by layer and ownership. The kernel and OS layers are mainly provided and supported by Microsoft, but some layers are provided by Microsoft's partners, depending on hardware properties, in the form of a board support package (BSP), which usually consists of a set of drivers and support libraries that ensure low-level hardware interaction, plus a boot process created by the CPU supplier. Then come the original equipment manufacturers (OEMs) and independent hardware vendors (IHVs) that write the drivers required to support the phone's hardware and specific components.

There are three main partitions on a Windows Phone that are forensically interesting: the MainOS, Data, and Removable User Data partitions. As their respective names suggest, the MainOS partition contains all Windows Phone operating system components, and the Data partition stores all of the user's data, third-party applications, and all application state. The Removable User Data partition is considered by Windows Phone to be a separate volume and refers to all data stored on the SD card (on devices that support SD cards; the Lumia 920, for instance, does not). Each of these partitions respects a folder layout and can be mapped to its root folders with predefined Access Control Lists (ACLs). Each ACL takes the form of a list of access control entries (ACEs), and each ACE identifies the user account to which it applies (the trustee) and specifies the access rights allowed, denied, or audited for that trustee.

Windows Phone 8.1 is extremely challenging and different; specific forensic tools and techniques must be used in order to gather evidence. One of the interesting techniques is side loading, where an agent is deployed to extract contacts and appointments from a WP8.1 device. To extract phonebook and appointment entries, we will use WP Logical, a contacts and appointments acquisition tool designed to run under Windows Phone 8.1. Once deployed and executed, it creates a folder named WPLogical_MDY__HMMSS_PM/AM under the public folder PhonePictures, where M=month, D=day, Y=year, H=hour, MM=minutes, and SS=seconds of the extraction date. Inside the created folder you can find appointments__MDY__HMMSS_PM/AM.html and contacts_MDY__HMMSS_PM/AM.html.

WP Logical will extract the following information (if found) regarding each appointment, starting from 01/01/CurrentYear at 00:00:00 to 31/12/CurrentYear at 00:00:00:

Subject
Location
Organizer
Invitees
Start time (UTC)
Original start time
Duration (in hours)
Sensitivity
Replay time
Is organized by user?
Is canceled?
More details

And the following information about each found contact:

Display name
First name
Middle name
Last name
Phones (types: personal, office, home, and numbers)
Important dates
Emails (types: personal, office, home, and numbers)
Websites
Job info
Addresses
Notes
Thumbnail

WP Logical also allows the extraction of some device-related information, such as the phone's time zone, the device's friendly name, the Stock Keeping Unit (SKU), and so on.

Windows Phone 8.1 is relatively strict regarding application deployment; WP Logical can be deployed in two ways:

Upload the compiled agent to the Windows Store and get it signed by Microsoft; after that, it will be available in the store for download.
Deploy the agent directly to a developer-unlocked device using the Windows Phone Application Deployment utility.

Summary

In this article, we looked at forensics for iOS and Android devices. We also looked at some low-level forensic techniques.

Bang Bang – Let's Make It Explode

Packt
23 May 2016
13 min read
In this article by Justin Plowman, author of the book 3D Game Design with Unreal Engine 4 and Blender, We started with a basic level that, when it comes right down to it, is simply two rooms connected by a hallway with a simple crate. From humble beginnings our game has grown, as have our skills. Our simple cargo ship leads the player to a larger space station level. This level includes scripted events to move the story along and a game asset that looks great and animates. However, we are not done. How do we end our journey? We blow things up, that's how! In this article we will cover the following topics: Using class blueprints to bring it all together Creating an explosion using sound effects Adding particle effects (For more resources related to this topic, see here.) Creating a class blueprint to tie it all together We begin with the first step to any type of digital destruction, creation. we have created a disturbing piece of ancient technology. The Artifact stands as a long forgotten terror weapon of another age, somehow brought forth by an unknown power. But we know the truth. That unknown power is us, and we are about to import all that we need to implement the Artifact on the deck of our space station. Players beware! Take a look at the end result. To get started we will need to import the Artifact body, the Tentacle, and all of the texture maps from Substance Painter. Let's start with exporting the main body of the Artifact. In Blender, open our file with the complete Artifact. The FBX file format will allow us to export both the completed 3d model and the animations we created, all in a single file. Select the Artifact only. Since it is now bound to the skeleton we created, the bones and the geometry should all be one piece. Now press Alt+S to reset the scale of our game asset. Doing this will make sure that we won't have any problems weird scaling problems when we import the Artifact into Unreal. Head to the File menu and select Export. Choose FBX as our file format. On the first tab of the export menu, select the check box for Selected Objects. This will make sure that we get just the Artifact and not the Tentacle. On the Geometries tab, change the Smoothing option to Faces. Name your file and click Export! Alright, we now have the latest version of the Artifact exported out as an FBX. With all the different processes described in its pages, it makes a great reference! Time to bring up Unreal. Open the game engine and load our space station level. It's been a while since we've taken a look at it and there is no doubt in my mind that you've probably thought of improvements and new sections you would love to add. Don't forget them! Just set them aside for now. Once we get our game assets in there and make them explode, you will have plenty of time to add on. Time to import into Unreal! Before we begin importing our pieces, let's create a folder to hold our custom assets. Click on the Content folder in the Content Browser and then right click. At the top of the menu that appears, select New Folder and name it CustomAssets. It's very important not to use spaces or special characters (besides the underscore). Select our new folder and click Import. Select the Artifact FBX file. At the top of the Import menu, make sure Import as Skeletal and Import Mesh are selected. Now click the small arrow at the bottom of the section to open the advanced options. Lastly, turn on the check box to tell Unreal to use T0 As Reference Pose. 
A Reference Pose is the starting point for any animations associated with a skeletal mesh. Next, take a look at the Animation section of the menu. Turn on Import Animations to tell Unreal to bring in our open animation for the Artifact. Once all that is done, it's time to click Import! Unreal will create a Skeletal Mesh, an Animation, a Physics Asset, and a Skeleton asset for the Artifact. Together, these pieces make up a fully functioning skeletal mesh that can be used within our game. Take a moment and repeat the process for the Tentacle, again being careful to make sure to export only selected objects from Blender. Next, we need to import all of our texture maps from Substance Painter. Locate the export folder for Substance Painter. By default it is located in the Documents folder, under Substance Painter, and finally under Export. Head back into Unreal and then bring up the Export folder from your computer's task bar. Click and drag each of the texture maps we need into the Content Browser. Unreal will import them automatically. Time to set them all up as a usable material! Right click in the Content Browser and select Material from the Create Basic Asset section of the menu. Name the material Artifact_MAT. This will open a Material Editor window. Creating Materials and Shaders for video games is an art form all its own. Here I will talk about creating Materials in basic terms but I would encourage you to check out the Unreal documentation and open up some of the existing materials in the Starter Content folder and begin exploring this highly versatile tool. So we need to add our textures to our new Material. An easy way to add texture maps to any Material is to click and drag them from the Content Browser into the Material Editor. This will create a node called a Texture Sample which can plug into the different sockets on the main Material node. Now to plug in each map. Drag a wire from each of the white connections on the right side of each Texture Sample to its appropriate slot on the main Material node. The Metallic and Roughness texture sample will be plugged into two slots on the main node. Let's preview the result. Back in the Content Browser, select the Artifact. Then in the Preview Window of the Material Editor, select the small button the far right that reads Set the Preview Mesh based on the current Content Browser selection. The material has come out just a bit too shiny. The large amount of shine given off by the material is called the specular highlight and is controlled by the Specular connection on the main Material node. If we check the documentation, we can see that this part of the node accepts a value between 0 and 1. How might we do this? Well, the Material Editor has a Constant node that allows us to input a number and then plug that in wherever we may need it. This will work perfectly! Search for a Constant in the search box of the Palette, located on the right side of the Material Editor. Drag it into the area with your other nodes and head over to the Details panel. In the Value field, try different values between 0 and 1 and preview the result. I ended up using 0.1. Save your changes. Time to try it out on our Artifact! Double click on the Skeletal Mesh to open the Skeletal Mesh editor window. On the left hand side, look for the LOD0 section of the menu. This section has an option to add a material (I have highlighted it in the image above). Head back to the content browser and select our Artifact_MAT material. 
Now select the small arrow in the LOD0 box to apply the selection to the Artifact. How does it look? Too shiny? Not shiny enough? Feel free to adjust our Constant node in the material until you are able to get the result you want. When you are happy, repeat the process for the Tentacle. You will import it as a static mesh (since it doesn't have any animations) and create a material for it made out of the texture maps you created in Substance Painter. Now we will use a Class Blueprint for final assembly. Class Blueprints are a form of standalone Blueprint that allows us to combine art assets with programming in an easy and, most importantly, reusable package. For example, the player is a class blueprint as it combines the player character skeletal mesh with blueprint code to help the player move around. So how and when might we use class blueprints vs just putting the code in the level blueprint? The level blueprint is great for anything that is specific to just that level. Such things would include volcanoes on a lava based level or spaceships in the background of our space station level. Class blueprints work great for building objects that are self-contained and repeatable, such as doors, enemies, or power ups. These types of items would be used frequently and would have a place in several levels of a game. Let's create a class blueprint for the Artifact. Click on the Blueprints button and select the New Empty Blueprints Tab. This will open the Pick Parent Class menu. Since we creating a prop and not something that the player needs to control directly, select the Actor Parent Class. The next screen will ask us to name our new class and for a location to save it. I chose to save it in my CustomAssets folder and named it Artifact_Blueprint. Welcome to the Class Blueprint Editor. Similar to other editor windows within Unreal, the Class Blueprint editor has both a Details panel and a Palette. However, there is a panel that is new to us. The Components panel contains a list of the art that makes up a class blueprint. These components are various pieces that make up the whole object. For our Artifact, this would include the main piece itself, any number of tentacles, and a collision box. Other components that can be added include particle emitters, audio, and even lights. Let's add the Artifact. In the Components section, click the Add Component button and select Skeletal Mesh from the drop down list. You can find it in the Common section. This adds a blank Skeletal Mesh to the Viewport and the Components list. With it selected, check out the Details panel. In the Mesh section is an area to assign the skeletal mesh you wish it to be. Back in the Content Browser, select the Artifact. Lastly, back in the Details panel of the Blueprint Editor click the small arrow next to the Skeletal Mesh option to assign the Artifact. It should now appear in the viewport. Back to the Components list. Let's add a Collision Box. Click Add Component and select Collision Box from the Collision section of the menu. Click it and in the Details panel, increase the Box Extents to a size that would allow the player to enter within its bounds. I used 180 for x, y, and z. Repeat the last few steps and add the Tentacles to the Artifact using the Add Component menu. We will use the Static Mesh option. The design calls for 3, but add more if you like. Time to give this class blueprint a bit of programming. We want the player to be able to walk up to the Artifact and press the E key to open it. 
We used a Gate to control the flow of information through the blueprint. However, Gates don't function the same within class blueprints, so we require a slightly different approach. The first step in the process is to use the Enable Input and Disable Input nodes to allow the player to use input keys when they are within our box collision. Using the search box located within our Palette, grab an Enable Input and a Disable Input. Now we need to add our trigger events. Click on the Box variable within the Variable section of the My Blueprint panel. This changes the Details panel to display a list of all the Events that can be created for this component. Click the + button next to the OnComponentBeginOverlap and the OnComponentEndOverlap events. Connect the OnComponentBeginOverlap event to the Enable Input node and the OnComponentEndOverlap event to the Disable Input node. Next, create an event for the player pressing the E key by searching for it and dragging it in from the Palette. To that, we will add a Do Once node. This node works similarly to a Gate in that it restricts the flow of information through the network, but it does allow the action to happen once before closing. This will make it so the player can press E to open the Artifact, but the animation will only play once. Without it, a player can press E as many times as they want, playing the animation over and over again. It's fun for a while, since it makes it look like a mouth trying to eat you, but it's not our original intention (I might have spent some time pressing it repeatedly and laughing hysterically). Do Once can be easily found in the Palette. Lastly, we will need a Play Animation node. There are two versions, so be sure to grab this node from the Skeletal Mesh section of your search so that its target is Skeletal Mesh Component. Connect the input E event to the Do Once node and the Do Once node to the Play Animation node. One last thing to complete this sequence: we need to set the target and the animation to play on the Play Animation node. The target will be our Skeletal Mesh component. Click on the Artifact component in the Components list, drag it into the Blueprint window, and plug it into the Target on our Play Animation. Lastly, click the drop down under the New Anim to Play option on the Play Animation node and select our animation of the Artifact opening. We're done! Let's save all of our files and test this out. Drag the Artifact into our Space Station and position it in the Import/Export Broker's shop. Build the level and then drop in and test. Did it open? Does it need more tentacles? Debug and refine it until it is exactly what you want. Summary This article provides an overview of using class blueprints, creating an explosion, and adding particle effects. Resources for Article: Further resources on this subject: Dynamic Graphics [article] Lighting basics [article] VR Build and Run [article]

Virtualizing Hosts and Applications

Packt
20 May 2016
19 min read
In this article by Jay LaCroix, the author of the book Mastering Ubuntu Server, you will learn how there have been a great number of advancements in the IT space in the last several decades, and a few technologies have come along that have truly revolutionized the technology industry. The author is sure few would argue that the Internet itself is by far the most revolutionary technology to come around, but another technology that has created a paradigm shift in IT is virtualization. It evolved the way we maintain our data centers, allowing us to segregate workloads into many smaller machines being run from a single server or hypervisor. Since Ubuntu features the latest advancements of the Linux kernel, virtualization is actually built right in. After installing just a few packages, we can create virtual machines on our Ubuntu Server installation without the need for a pricey license agreement or support contract. In this article, Jay will walk you through creating, running, and managing Docker containers. (For more resources related to this topic, see here.) Creating, running, and managing Docker containers Docker is a technology that seemed to come from nowhere and took the IT world by storm just a few years ago. The concept of containerization is not new, but Docker took this concept and made it very popular. The idea behind a container is that you can segregate an application you'd like to run from the rest of your system, keeping it sandboxed from the host operating system, while still being able to use the host's CPU and memory resources. Unlike a virtual machine, a container doesn't have a virtual CPU and memory of its own, as it shares resources with the host. This means that you will likely be able to run more containers on a server than virtual machines, since the resource utilization would be lower. In addition, you can store a container on a server and allow others within your organization to download a copy of it and run it locally. This is very useful for developers developing a new solution and would like others to test or run it. Since the Docker container contains everything the application needs to run, it's very unlikely that a systematic difference between one machine or another will cause the application to behave differently. The Docker server, also known as Hub, can be used remotely or locally. Normally, you'd pull down a container from the central Docker Hub instance, which will make various containers available, which are usually based on a Linux distribution or operating system. When you download it locally, you'll be able to install packages within the container or make changes to its files, just as if it were a virtual machine. When you finish setting up your application within the container, you can upload it back to Docker Hub for others to benefit from or your own local Hub instance for your local staff members to use. In some cases, some developers even opt to make their software available to others in the form of containers rather than creating distribution-specific packages. Perhaps they find it easier to develop a container that can be used on every distribution than build separate packages for individual distributions. Let's go ahead and get started. To set up your server to run or manage Docker containers, simply install the docker.io package: # apt-get install docker.io Yes, that's all there is to it. Installing Docker has definitely been the easiest thing we've done during this entire article. 
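Before moving on, it can be worth confirming that the installation actually worked. A quick sanity check, not part of the original walkthrough, is to print the client version and run the tiny hello-world image that Docker Hub publishes for exactly this purpose:

# docker --version
# docker run hello-world

If the second command prints its greeting message, the daemon is running and can pull images from Docker Hub.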
Ubuntu includes Docker in its default repositories, so it's only a matter of installing this one package. You'll now have a new service running on your machine, simply titled docker. You can inspect it with the systemctl command, as you would any other: # systemctl status docker Now that Docker is installed and running, let's take it for a test drive. Having Docker installed gives us the docker command, which has various subcommands to perform different functions. Let's try out docker search: # docker search ubuntu What we're doing with this command is searching Docker Hub for available containers based on Ubuntu. You could search for containers based on other distributions, such as Fedora or CentOS, if you wanted. The command will return a list of Docker images available that meet your search criteria. The search command was run as root. This is required, unless you make your own user account a member of the docker group. I recommend you do that and then log out and log in again. That way, you won't need to use root anymore. From this point on, I won't suggest using root for the remaining Docker examples. It's up to you whether you want to set up your user account with the docker group or continue to run docker commands as root. To pull down a docker image for our use, we can use the docker pull command, along with one of the image names we saw in the output of our search command: docker pull ubuntu With this command, we're pulling down the latest Ubuntu container image available on Docker Hub. The image will now be stored locally, and we'll be able to create new containers from it. To create a new container from our downloaded image, this command will do the trick: docker run -it ubuntu:latest /bin/bash Once you run this command, you'll notice that your shell prompt immediately changes. You're now within a shell prompt from your container. From here, you can run commands you would normally run within a real Ubuntu machine, such as installing new packages, changing configuration files, and so on. Go ahead and play around with the container, and then we'll continue on with a bit more theory on how it actually works. There are some potentially confusing aspects of Docker we should get out of the way first before we continue with additional examples. The most likely thing to confuse newcomers to Docker is how containers are created and destroyed. When you execute the docker run command against an image you've downloaded, you're actually creating a container. Each time you use the docker run command, you're not resuming the last container, but creating a new one. To see this in action, run a container with the docker run command provided earlier, and then type exit. Run it again, and then type exit again. You'll notice that the prompt is different each time you run the command. After the root@ portion of the bash prompt within the container is a portion of a container ID. It'll be different each time you execute the docker run command, since you're creating a new container with a new ID each time. To see the number of containers on your server, execute the docker info command. The first line of the output will tell you how many containers you have on your system, which should be the number of times you've run the docker run command. 
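Two small conveniences are worth a mention here; neither command comes from the article itself, so treat this as a hedged sketch with placeholder names (youruser, mycontainer). Membership of the docker group mentioned above is granted with usermod, and giving a container an explicit name with --name lets you start and attach to that same container later without hunting for its ID:

# usermod -aG docker youruser
docker run -it --name mycontainer ubuntu:latest /bin/bash
docker start -ai mycontainer

Remember to log out and back in after the usermod call; the last command resumes the named container interactively instead of creating a new one.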
To see a list of all of these containers, execute the docker ps -a command: docker ps -a The output will give you the container ID of each container, the image it was created from, the command being run, when the container was created, its status, and any ports you may have forwarded. The output will also display a randomly generated name for each container, and these names are usually quite wacky. As I was going through the process of creating containers while writing this section, the codenames for my containers were tender_cori, serene_mcnulty, and high_goldwasser. This is just one of the many quirks of Docker, and some of these can be quite hilarious. The important output of the docker ps -a command is the container ID, the command, and the status. The ID allows you to reference a specific container. The command lets you know what command was run. In our example, we executed /bin/bash when we started our containers. Using the ID, we can resume a container. Simply execute the docker start command with the container ID right after. Your command will end up looking similar to the following: docker start 353c6fe0be4d The output will simply return the ID of the container and then drop you back to your shell prompt. Not the shell prompt of your container, but that of your server. You might be wondering at this point, then, how you get back to the shell prompt for the container. We can use docker attach for that: docker attach 353c6fe0be4d You should now be within a shell prompt inside your container. If you remember from earlier, when you type exit to disconnect from your container, the container stops. If you'd like to exit the container without stopping it, press CTRL + P and then CTRL + Q on your keyboard. You'll return to your main shell prompt, but the container will still be running. You can see this for yourself by checking the status of your containers with the docker ps -a command. However, while these keyboard shortcuts work to get you out of the container, it's important to understand what a container is and what it isn't. A container is not a service running in the background, at least not inherently. A container is a collection of namespaces, such as a namespace for its filesystem or users. When you disconnect without a process running within the container, there's no reason for it to run, since its namespace is empty. Thus, it stops. If you'd like to run a container in a way that is similar to a service (it keeps running in the background), you would want to run the container in detached mode. Basically, this is a way of telling your container, "run this process, and don't stop running it until I tell you to." Here's an example of creating a container and running it in detached mode: docker run -dit ubuntu /bin/bash Normally, we use the -it options to create a container. This is what we used a few pages back. The -i option triggers interactive mode, while the -t option gives us a psuedo-TTY. At the end of the command, we tell the container to run the Bash shell. The -d option runs the container in the background. It may seem relatively useless to have another Bash shell running in the background that isn't actually performing a task. But these are just simple examples to help you get the hang of Docker. A more common use case may be to run a specific application. In fact, you can even run a website from a Docker container by installing and configuring Apache within the container, including a virtual host. 
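As a side note that goes slightly beyond the text above, Docker also offers docker exec, which starts an additional process inside a running container. It is often a more comfortable way to get a shell than docker attach, because exiting that shell does not stop the container; the container ID below is just the example ID used earlier:

docker exec -it 353c6fe0be4d /bin/bash

When you type exit from a shell opened this way, only that extra shell ends, and the container's main process keeps running. That aside, back to the idea of serving a website from a container.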
The question then becomes this: how do you access the container's instance of Apache within a web browser? The answer is port redirection, which Docker also supports. Let's give this a try. First, let's create a new container in detached mode. Let's also redirect port 80 within the container to port 8080 on the host: docker run -dit -p 8080:80 ubuntu /bin/bash The command will output a container ID. This ID will be much longer than you're accustomed to seeing, because when we run docker ps -a, it only shows shortened container IDs. You don't need to use the entire container ID when you attach; you can simply use part of it, so long as it's long enough to be different from other IDs—like this: docker attach dfb3e Here, I've attached to a container with an ID that begins with dfb3e. I'm now attached to a Bash shell within the container. Let's install Apache. We've done this before, but to keep it simple, just install the apache2 package within your container, we don't need to worry about configuring the default sample web page or making it look nice. We just want to verify that it works. Apache should now be installed within the container. In my tests, the apache2 daemon wasn't automatically started as it would've been on a real server instance. Since the latest container available on Docker Hub for Ubuntu hasn't yet been upgraded to 16.04 at the time of writing this (it's currently 14.04), the systemctl command won't work, so we'll need to use the legacy start command for Apache: # /etc/init.d/apache2 start We can similarly check the status, to make sure it's running: # /etc/init.d/apache2 status Apache should be running within the container. Now, press CTRL + P and then CTRL + Q to exit the container, but allow it to keep running in the background. You should be able to visit the sample Apache web page for the container by navigating to localhost:8080 in your web browser. You should see the default "It works!" page that comes with Apache. Congratulations, you're officially running an application within a container! Before we continue, think for a moment of all the use cases you can use Docker for. It may seem like a very simple concept (and it is), but it allows you to do some very powerful things. I'll give you a personal example. At a previous job, I worked with some embedded Linux software engineers, who each had their preferred Linux distribution to run on their workstation computers. Some preferred Ubuntu, others preferred Debian, and a few even ran Gentoo. For developers, this poses a problem—the build tools are different in each distribution, because they all ship different versions of all development packages. The application they developed was only known to compile in Debian, and newer versions of the GNU Compiler Collection (GCC) compiler posed a problem for the application. My solution was to provide each developer a Docker container based on Debian, with all the build tools baked in that they needed to perform their job. At this point, it no longer mattered which distribution they ran on their workstations. The container was the same no matter what they were running. I'm sure there are some clever use cases you can come up with. Anyway, back to our Apache container: it's now running happily in the background, responding to HTTP requests over port 8080 on the host. But, what should we do with it at this point? One thing we can do is create our own image from it. Before we do, we should configure Apache to automatically start when the container is started. 
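If you would rather verify the port mapping from the command line than from a browser, a couple of quick checks work too; this is just a small sketch, not part of the original walkthrough, and the container ID is a placeholder:

docker port <Container ID>
curl http://localhost:8080

The first command should report a mapping along the lines of 80/tcp -> 0.0.0.0:8080, and the second should print the HTML of the default Apache page. With that confirmed, back to making Apache start automatically.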
We'll do this a bit differently inside the container than we would on an actual Ubuntu server. Attach to the container, and open the /etc/bash.bashrc file in a text editor within the container. Add the following to the very end of the file: /etc/init.d/apache2 start Save the file, and exit your editor. Exit the container with the CTRL + P and CTRL + Q key combinations. We can now create a new image of the container with the docker commit command: docker commit <Container ID> ubuntu:apache-server This command will return to us the ID of our new image. To view all the Docker images available on our machine, we can run the docker images command to have Docker return a list. You should see the original Ubuntu image we downloaded, along with the one we just created. We'll first see a column for the repository the image came from. In our case, it's Ubuntu. Next, we can see the tag. Our original Ubuntu image (the one we used docker pull command to download) has a tag of latest. We didn't specify that when we first downloaded it, it just defaulted to latest. In addition, we see an image ID for both, as well as the size. To create a new container from our new image, we just need to use docker run but specify the tag and name of our new image. Note that we may already have a container listening on port 8080, so this command may fail if that container hasn't been stopped: docker run -dit -p 8080:80 ubuntu:apache-server /bin/bash Speaking of stopping a container, I should probably show you how to do that as well. As you could probably guess, the command is docker stop followed by a container ID. This will send the SIGTERM signal to the container, followed by SIGKILL if it doesn't stop on its own after a delay: docker stop <Container ID> To remove a container, issue the docker rm command followed by a container ID. Normally, this will not remove a running container, but it will if you add the -f option. You can remove more than one docker container at a time by adding additional container IDs to the command, with a space separating each. Keep in mind that you'll lose any unsaved changes within your container if you haven't committed the container to an image yet: docker rm <Container ID> The docker rm command will not remove images. If you want to remove a docker image, use the docker rmi command followed by an image ID. You can run the docker image command to view images stored on your server, so you can easily fetch the ID of the image you want to remove. You can also use the repository and tag name, such as ubuntu:apache-server, instead of the image ID. If the image is in use, you can force its removal with the -f option: docker rmi <Image ID> Before we conclude our look into Docker, there's another related concept you'll definitely want to check out: Dockerfiles. A Dockerfile is a neat way of automating the building of docker images, by creating a text file with a set of instructions for their creation. The easiest way to set up a Dockerfile is to create a directory, preferably with a descriptive name for the image you'd like to create (you can name it whatever you wish, though) and inside it create a file named Dockerfile. 
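In practice that is just a couple of shell commands; the directory name apache-server below is only a placeholder suggestion, and any text editor can stand in for nano:

mkdir apache-server
cd apache-server
nano Dockerfile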
Following is a sample—copy this text into your Dockerfile and we'll look at how it works:

FROM ubuntu
MAINTAINER Jay <jay@somewhere.net>

# Update the container's packages
RUN apt-get update; apt-get dist-upgrade -y

# Install apache2 and vim
RUN apt-get install -y apache2 vim

# Make Apache start automatically
RUN echo "/etc/init.d/apache2 start" >> /etc/bash.bashrc

Let's go through this Dockerfile line by line to get a better understanding of what it's doing:

FROM ubuntu

We need an image to base our new image on, so we're using Ubuntu as a base. This will cause Docker to download the ubuntu:latest image from Docker Hub if we don't already have it downloaded:

MAINTAINER Jay <jay@somewhere.net>

Here, we're setting the maintainer of the image. Basically, we're declaring its author:

# Update the container's packages

Lines beginning with a hash symbol (#) are ignored, so we are able to create comments within the Dockerfile. This is recommended to give others a good idea of what your Dockerfile does:

RUN apt-get update; apt-get dist-upgrade -y

With the RUN command, we're telling Docker to run a specific command while the image is being created. In this case, we're updating the image's repository index and performing a full package update to ensure the resulting image is as fresh as can be. The -y option is provided to suppress any requests for confirmation while the command runs:

RUN apt-get install -y apache2 vim

Next, we're installing both apache2 and vim. The vim package isn't required, but I personally like to make sure all of my servers and containers have it installed. I mainly included it here to show you that you can install multiple packages in one line:

RUN echo "/etc/init.d/apache2 start" >> /etc/bash.bashrc

Earlier, we copied the startup command for the apache2 daemon into the /etc/bash.bashrc file. We're including that here so that we won't have to do this ourselves when containers are created from the image. To build the image, we can use the docker build command, which can be executed from within the directory that contains the Dockerfile. What follows is an example of using the docker build command to create an image tagged packt:apache-server (the trailing dot tells Docker to use the current directory as the build context):

docker build -t packt:apache-server .

Once you run this command, you'll see Docker create the image for you, running each of the commands you asked it to. The image will be set up just the way you like. Basically, we just automated the entire creation of the Apache container we used as an example in this section. Once this is complete, we can create a container from our new image:

docker run -dit -p 8080:80 packt:apache-server /bin/bash

Almost immediately after running the container, the sample Apache site will be available on the host. With a Dockerfile, you'll be able to automate the creation of your Docker images. There's much more you can do with Dockerfiles though; feel free to peruse Docker's official documentation to learn more. Summary In this article, we took a look at virtualization as well as containerization. We began by walking through the installation of KVM as well as all the configuration required to get our virtualization server up and running. We also took a look at Docker, which is a great way of virtualizing individual applications rather than entire servers. We installed Docker on our server, and we walked through managing containers by pulling down an image from Docker Hub, customizing our own images, and creating Dockerfiles to automate the deployment of Docker images.
We also went over many of the popular Docker commands to manage our containers. Resources for Article: Further resources on this subject: Configuring and Administering the Ubuntu Server[article] Network Based Ubuntu Installations[article] Making the most of Ubuntu through Windows Proxies[article]

Visualizations Using CCC

Packt
20 May 2016
28 min read
In this article by Miguel Gaspar the author of the book Learning Pentaho CTools you will learn about the Charts Component Library in detail. The Charts Components Library is not really a Pentaho plugin, but instead is a Chart library that Webdetails created some years ago and that Pentaho started to use on the Analyzer visualizations. It allows a great level of customization by changing the properties that are applied to the charts and perfectly integrates with CDF, CDE, and CDA. (For more resources related to this topic, see here.) The dashboards that Webdetails creates make use of the CCC charts, usually with a great level of customization. Customizing them is a way to make them fancy and really good-looking, and even more importantly, it is a way to create a visualization that best fits the customer/end user's needs. We really should be focused on having the best visualizations for the end user, and CCC is one of the best ways to achieve this, but do this this you need to have a very deep knowledge of the library, and know how to get amazing results. I think I could write an entire book just about CCC, and in this article I will only be able to cover a small part of what I like, but I will try to focus on the basics and give you some tips and tricks that could make a difference.I'll be happy if I can give you some directions that you follow, and then you can keep searching and learning about CCC. An important part of CCC is understanding some properties such as the series in rows or the crosstab mode, because that is where people usually struggle at the start. When you can't find a property to change some styling/functionality/behavior of the charts, you might find a way to extend the options by using something called extension points, so we will also cover them. I also find the interaction within the dashboard to be an important feature.So we will look at how to use it, and you will see that it's very simple. In this article,you will learn how to: Understand the properties needed to adapt the chart to your data source results Use the properties of a CCC chart Create a CCC chat by using the JavaScript library Make use of internationalization of CCC charts See how to handle clicks on charts Scale the base axis Customize the tooltips Some background on CCC CCC is built on top of Protovis, a JavaScript library that allows you to produce visualizations just based on simple marks such as bars, dots, and lines, among others, which are created through dynamic properties based on the data to be represented. You can get more information on this at: http://mbostock.github.io/protovis/. If you want to extend the charts with some elements that are not available you can, but it would be useful to have an idea about how Protovis works.CCC has a great website, which is available at http://www.webdetails.pt/ctools/ccc/, where you can see some samples including the source code. On the page, you can edit the code, change some properties, and click the apply button. If the code is valid, you will see your chart update.As well as that, it provides documentation for almost all of the properties and options that CCC makes available. Making use of the CCC library in a CDF dashboard As CCC is a chart library, you can use it as you would use it on any other webpage, by using it like the samples on CCC webpages. But CDF also provides components that you can implement to use a CCC chart on a dashboard and fully integrate with the life cycle of the dashboard. 
To use a CCC chart on a CDF dashboard, the HTML that is invoked from the XCDF file would look like the following (as we already covered how to build a CDF dashboard, I will not focus on that, and will mainly focus on the JavaScript code):

<div class="row">
  <div class="col-xs-12">
    <div id="chart"></div>
  </div>
</div>
<script language="javascript" type="text/javascript">
  require(['cdf/Dashboard.Bootstrap', 'cdf/components/CccBarChartComponent'],
    function(Dashboard, CccBarChartComponent) {
      var dashboard = new Dashboard();
      var chart = new CccBarChartComponent({
          type: "cccBarChart",
          name: "cccChart",
          executeAtStart: true,
          htmlObject: "chart",
          chartDefinition: {
              height: 200,
              path: "/public/…/queries.cda",
              dataAccessId: "totalSalesQuery",
              crosstabMode: true,
              seriesInRows: false,
              timeSeries: false,
              plotFrameVisible: false,
              compatVersion: 2
          }
      });
      dashboard.addComponent(chart);
      dashboard.init();
  });
</script>

The most important thing here is the use of the CCC chart component; in this example it's a bar chart. We can see that both from the object we are instantiating, CccBarChartComponent, and from the type, which is cccBarChart. The previous dashboard will execute the query specified as the dataAccessId of the CDA file set on the path property, and render the chart on the dashboard. We are also saying that its data comes from the query in crosstab mode, but that the base axis should not be a timeSeries. There are series in the columns, but don't worry about this, as we'll be covering it later. The existing CCC components that you are able to use out of the box inside CDF dashboards are listed below. Don't forget that CCC has plenty of charts, so the samples linked in the list are just one example of the type of charts you can achieve.

CccAreaChartComponent (cccAreaChart)
CccBarChartComponent (cccBarChart): http://www.webdetails.pt/ctools/ccc/#type=bar
CccBoxplotChartComponent (cccBoxplotChart): http://www.webdetails.pt/ctools/ccc/#type=boxplot
CccBulletChartComponent (cccBulletChart): http://www.webdetails.pt/ctools/ccc/#type=bullet
CccDotChartComponent (cccDotChart): http://www.webdetails.pt/ctools/ccc/#type=dot
CccHeatGridChartComponent (cccHeatGridChart): http://www.webdetails.pt/ctools/ccc/#type=heatgrid
CccLineChartComponent (cccLineChart): http://www.webdetails.pt/ctools/ccc/#type=line
CccMetricDotChartComponent (cccMetricDotChart): http://www.webdetails.pt/ctools/ccc/#type=metricdot
CccMetricLineChartComponent (cccMetricLineChart)
CccNormalizedBarChartComponent (cccNormalizedBarChart)
CccParCoordChartComponent (cccParCoordChart)
CccPieChartComponent (cccPieChart): http://www.webdetails.pt/ctools/ccc/#type=pie
CccStackedAreaChartComponent (cccStackedAreaChart): http://www.webdetails.pt/ctools/ccc/#type=stackedarea
CccStackedDotChartComponent (cccStackedDotChart)
CccStackedLineChartComponent (cccStackedLineChart): http://www.webdetails.pt/ctools/ccc/#type=stackedline
CccSunburstChartComponent (cccSunburstChart): http://www.webdetails.pt/ctools/ccc/#type=sunburst
CccTreemapAreaChartComponent (cccTreemapAreaChart): http://www.webdetails.pt/ctools/ccc/#type=treemap
CccWaterfallAreaChartComponent (cccWaterfallAreaChart): http://www.webdetails.pt/ctools/ccc/#type=waterfall

In the sample code, you will find a property called compatVersion that has a value of 2 set.
This will make CCC work as a revamped version that delivers more options, a lot of improvements, and makes it easier to use. Mandatory and desirable properties Besides properties such as name, datasource, and htmlObject, there are other properties of the charts that are mandatory. The height is really important, because if you don't set the height of the chart, the chart will not fit in the dashboard. The height should also be specified in pixels. If you don't set the width of the component, the chart will grab the width of the element where it's being rendered, that is, the width of the HTML element with the name specified in the htmlObject property. The seriesInRows, crosstabMode, and timeSeries properties are optional, but depending on the kind of chart you are generating, you might want to specify them. The use of these properties becomes clear if we can also see the output of the queries we are executing. We need to get deeper into the properties that are related to mapping data to visual elements. Mapping data We need to be aware of the way that data mapping is done in the chart. You can understand how it works if you can imagine the data input as a table. CCC can receive the data as two different structures: relational and crosstab. If CCC receives data as a crosstab, it will translate it to a relational structure. You can see this in the following examples. Crosstab The following table is an example of the crosstab data structure:

             Column Data 1     Column Data 2
Row Data 1   Measure Data 1.1  Measure Data 1.2
Row Data 2   Measure Data 2.1  Measure Data 2.2

Creating crosstab queries To create a crosstab query, usually you can do this with a GROUP BY when using SQL, or just use MDX, which allows us to easily specify a set for the columns and a set for the rows. Just by looking at the previous and following examples, you should be able to understand that in the crosstab structure (the previous one), columns and rows are part of the result set, while in the relational format (the following one), column headers are not part of the result set, but are part of the metadata that is returned from the query. The relational format is as follows:

Column         Row         Value
Column Data 1  Row Data 1  Measure Data 1.1
Column Data 2  Row Data 1  Measure Data 1.2
Column Data 1  Row Data 2  Measure Data 2.1
Column Data 2  Row Data 2  Measure Data 2.2

The preceding two data structures represent the options when setting the properties crosstabMode and seriesInRows. The crosstabMode property To better understand these concepts, we will make use of a real example. This property, crosstabMode, is easy to understand when comparing two tables that represent the results of two queries. Non-crosstab (relational):

Markets  Sales
APAC     1281705
EMEA     50028224
Japan    503957
NA       3852061

Crosstab:

Markets  2003   2004   2005
APAC     3529   5938   3411
EMEA     16711  23630  9237
Japan    2851   1692   380
NA       13348  18157  6447

In the previous tables, you can see that on the left-hand side you can find the values of sales for each of the territories. The values presented depend on only one variable, the territory. We can say that we are able to get all the information just by looking at the rows, where we can see a direct connection between markets and the sales value.
In the table presented on the right, you will find a value for each territory/year, meaning that the values presented in the sample matrix are dependent on two variables: the territory in the rows and the years in the columns. Here we need both the rows and the columns to know what each one of the values represents. Relevant information can be found in the rows and the columns, so this is a crosstab. Crosstabs display the joint distribution of two or more variables, and are usually represented in the form of a contingency table in a matrix. When the result of a query is dependent on only one variable, you should set the crosstabMode property to false. When it is dependent on two or more variables, you should set the crosstabMode property to true, otherwise CCC will just use the first two columns, like in the non-crosstab example. The seriesInRows property Now let's use the same example where we have a crosstab: The previous image shows two charts: the one on the left is a crosstab with the series in the rows, and the one on the right is also a crosstab, but the series are not in the rows (the series are in the columns). When the crosstab is set to true, it means that the measure column title can be translated as a series or a category, and that's determined by the seriesInRows property. If this property is set to true, then it will read the series from the rows, otherwise it will read the series from the columns. If the crosstab is set to false, the community chart component expects a row to correspond exactly to one data point, and two or three columns can be returned. When three columns are returned, they can be category, series, and data, or series, category, and data, and that's determined by the seriesInRows property. When set to true, CCC will expect the structure to have three columns in the order category, series, and data. When it is set to false, it will expect them to be series, category, and data. A simple table should give you a quicker reference, so here goes:

crosstabMode  seriesInRows  Description
true          true          The column titles act as category values, while the series values are read from the first column.
true          false         The column titles act as series values, while the category values are read from the first column.
false         true          The three columns are expected to be category, series, and data.
false         false         The three columns are expected to be series, category, and data.

The timeSeries and timeSeriesFormat properties The timeSeries property defines whether the data to be represented by the chart is discrete or continuous. If we want to present some values over time, then the timeSeries property should be set to true. When we set the chart to be a timeSeries, we also need to set another property to tell CCC how it should interpret the dates that come from the query. Check out the following image for timeSeries and timeSeriesFormat: The result of one of the queries has the year and the abbreviated month name separated by -, like 2015-Nov. For the chart to understand it as a date, we need to specify the format by setting the timeSeriesFormat property, which in our example would be %Y-%b, where %Y means the year is represented by four digits, and %b is the abbreviated month name.
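To make these options concrete, here is a minimal sketch (an illustration of my own, not taken from the book; the htmlObject, path, and dataAccessId values are placeholders) of how a CccLineChartComponent could combine crosstabMode, seriesInRows, timeSeries, and timeSeriesFormat for dates such as 2015-Nov:

var timeSeriesChart = new CccLineChartComponent({
    type: "cccLineChart",
    name: "cccTimeSeriesChart",
    executeAtStart: true,
    htmlObject: "timeChart",                // placeholder element id
    chartDefinition: {
        height: 300,
        path: "/public/sample/queries.cda", // placeholder CDA file
        dataAccessId: "salesByMonthQuery",  // placeholder query name
        crosstabMode: true,        // values depend on two variables
        seriesInRows: false,       // series are read from the column titles
        timeSeries: true,          // continuous (date) base axis
        timeSeriesFormat: "%Y-%b", // parses values such as "2015-Nov"
        compatVersion: 2
    }
});
dashboard.addComponent(timeSeriesChart);

The same four data-mapping properties work for the other chart types listed earlier; only the component and its type change.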
The format should be specified using the Protovis format, which follows the same format as strftime in the C programming language, aside from some unsupported options. To find out what options are available, you should take a look at the documentation, which you will find at: https://mbostock.github.io/protovis/jsdoc/symbols/pv.Format.date.html. Making use of CCC in CDE There are a lot of properties that will use a default value, and you can find out about them by looking at the documentation or inspecting the code that is generated by CDE when you use the chart components. By looking at the console log of your browser, you should also be able to understand and get some information about the properties being used by default, and/or see whether you are using a property that does not fit your needs. The use of CCC charts in CDE is simpler, just because you may not need to code. I am only saying may because, to achieve quicker results, you may apply some code and make it easier to share properties among different charts or types of charts. To use a CCC chart, you just need to select the property that you need to change and set its value by using the dropdown or by just setting the value: The previous image shows a group of properties with the respective values on the right side. One of the best ways to start to get used to the CCC properties is to use the CCC page available as part of the Webdetails page: http://www.webdetails.pt/ctools/ccc. There you will find samples and the properties that are being used for each of the charts. You can use the dropdown to select different kinds of charts from all those that are available inside CCC. You also have the ability to change the properties and update the chart to check the result immediately. What I usually do, as it's easier and faster, is to change the properties here and check the results, and then apply the necessary values for each of the properties in the CCC charts inside the dashboards. In the following samples, you will also find documentation about the properties, see where the properties are separated by sections of the chart, and after that you will find the extension points. On the site, when you click on a property/option, you will be redirected to another page where you will find the documentation and how to use it. Changing properties in the preExecution or postFetch We are able to change the properties for the charts, as with any other component. Inside the preExecution, this refers to the component itself, so we will have access to the chart's main object, which we can also manipulate to add, remove, and change options. For instance, you can apply the following code:

function() {
    var cdProps = {
        dotsVisible: true,
        plotFrame_strokeStyle: '#bbbbbb',
        colors: ['#005CA7', '#FFC20F', '#333333', '#68AC2D']
    };
    $.extend(true, this.chartDefinition, cdProps);
}

What we are doing is creating an object with all the properties that we want to add or change for the chart, and then extending the chartDefinition (where the properties or options are). This is what we are doing with the jQuery extend function. Use the CCC website and make your life easier This way of applying options makes it easier to set the properties. Just change or add the properties that you need, test it, and when you're happy with the result, you just need to copy them into the object that will extend/overwrite the chart options.
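The section title also mentions postFetch, which the example above doesn't show. As a hedged sketch of my own (not from the book), a postFetch function receives the result set returned by the query and must return it, so it can be used to tweak the data before CCC maps it; the column rename below is purely illustrative:

function(data) {
    // 'data' is the CDA result object handed to the component:
    // metadata describes the columns, resultset holds the rows.
    if (data && data.metadata && data.metadata.length > 0) {
        // Illustrative tweak: give the first column a friendlier label.
        data.metadata[0].colName = "Territory";
    }
    // Always return the (possibly modified) data.
    return data;
}

Anything heavier, such as recalculating values or filtering rows, follows the same pattern: modify the resultset array and return the object.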
Just keep in mind that the properties you change directly in the editor will be overwritten by the ones defined in the preExecution, if they match each other, of course. Why is this important? It's because not all the properties that you can apply to CCC are exposed in CDE, so you can use the preExecution to use or set those properties. Handling the click event One important thing about the charts is that they allow interaction. CCC provides a way to handle some events in the chart, and click is one of those events. To have it working, we need to change two properties: clickable, which needs to be set to true, and clickAction, where we need to write a function with the code to be executed when a click happens. The function receives one argument that is usually referred to as a scene. The scene is an object that has a lot of information about the context where the event happened. From the object you will have access to vars, another object where we can find the series and the categories where the click happened. We can use the function to get the series/categories being clicked and perform a fireChange that can trigger updates on other components:

function(scene) {
    var series = "Series:" + scene.atoms.series.label;
    var category = "Category:" + scene.vars.category.label;
    var value = "Value:" + scene.vars.value.label;
    Logger.log(category + "&" + value);
    Logger.log(series);
}

In the previous code example, you can find the function to handle the click action for a CCC chart. When the click happens, the code is executed, and a variable with the clicked series is taken from scene.atoms.series.label. As well as this, the clicked category comes from scene.vars.category.label, and the value that crosses the same series/category from scene.vars.value.value. This is valid for a crosstab, but you will not find the series when it's a non-crosstab. You can think of a scene as describing one instance of visual representation. It is generally local to each panel or section of the chart, and it's represented by a group of variables that are organized hierarchically. Depending on the scene, it may contain one or many datums. And you must be asking, what the hell is a datum? A datum represents a row, so it contains values for multiple columns. We can also see from the example that we are referring to atoms, which hold at least a value, a label, and a key of a column. To get a better understanding of what I am talking about, you should set a breakpoint anywhere in the code of the previous function and explore the scene object. In the previous example, you would be able to access the category, the series label, and the value, as summarized below (the expressions are the same for crosstab and non-crosstab queries, except that the series is only available in the crosstab case):

Value: scene.vars.value.label or scene.getValue();
Category: scene.vars.category.label or scene.getCategoryLabel();
Series (crosstab only): scene.atoms.series.label or scene.getSeriesLabel()

For instance, if you add the previous function code to a chart that is a crosstab where the categories are the years and the series are the territories, and you click on the chart, the output would be something like:

[info] WD: Category:2004 & Value:23630
[info] WD: Series:EMEA

This means that you clicked on the year 2004 for EMEA. EMEA sales for the year 2004 were 23,630.
If you replace the Logger functions with a fireChange as follows, you will be able to make use of the label/value of the clicked category to render other components and some details about them:

this.dashboard.fireChange("parameter", scene.vars.category.label);

Internationalization of CCC charts We already saw that all the values coming from the database should not need to be translated. There are some ways in Pentaho to do this, but we may still need to set the title of a chart, where the title should also be internationalized. Another case is when you have dates where the month is represented by numbers on the base axis, but you want to display the month's abbreviated name. This name could also be translated to different languages, which is not hard. For the title, sub-title, and legend, the way to do it is to use the instructions on how to set properties on preExecution. First, you will need to define the properties files for the internationalization and set the properties/translations:

var cd = this.chartDefinition;
cd.title = this.dashboard.i18nSupport.prop('BOTTOMCHART.TITLE');

To change the title of the chart based on the language defined, we will need to define a function, but we can't use the property on the chart, because that will only allow you to define a string, so you will not be able to use a JavaScript instruction to get the text. If you set the previous example code on the preExecution of the chart, then you will be able to. It may also make sense to change not only the titles but, for instance, to also internationalize the month names. If you are getting data like 2004-02, this may correspond to a time series format of %Y-%m. If that's the case and you want to display the abbreviated month name, then you may use the baseAxisTickFormatter and the dateFormat function from the dashboard utilities, also known as Utils. The code to write inside the preExecution would be like:

var cd = this.chartDefinition;
cd.baseAxisTickFormatter = function(label) {
    return Utils.dateFormat(moment(label, 'YYYY-MM'), 'MMM/YYYY');
};

The preceding code uses the baseAxisTickFormatter, which allows you to write a function that receives an argument, identified in the code as label, because it will store the label for each one of the base axis ticks. We are using the dateFormat method and moment to format and return the abbreviated month name followed by the year. You can get information about the language defined and being used by running the instruction moment.locale(); and, if you need to, you can change the language. Format a base axis label based on the scale When you are working with a time series chart, you may want to set a different format for the base axis labels. Let's suppose you want to have a chart that is listening to a time selector. If you select one year of data to be displayed on the chart, you are certainly not interested in seeing the minutes on the date label. However, if you want to display the last hour, the ticks of the base axis need to be presented in minutes. There is an extension point we can use to get a conditional format based on the scale of the base axis.
Format a base axis label based on the scale

When you are working with a time series chart, you may want to set a different format for the base axis labels. Let's suppose you have a chart that is listening to a time selector. If you select one year of data to be displayed on the chart, you are certainly not interested in seeing the minutes on the date label. However, if you want to display the last hour, the ticks of the base axis need to be presented in minutes. There is an extension point we can use to get a conditional format based on the scale of the base axis.

The extension point is baseAxisScale_tickFormatter, and it can be used as in the following code:

baseAxisScale_tickFormatter: function(value, dateTickPrecision) {
  switch(dateTickPrecision) {
    case pvc.time.intervals.y:
      return format_date_year_tick(value);
      break;
    case pvc.time.intervals.m:
      return format_date_month_tick(value);
      break;
    case pvc.time.intervals.H:
      return format_date_hour_tick(value);
      break;
    default:
      return format_date_default_tick(value);
  }
}

It accepts a function with two arguments, the value to be formatted and the tick precision, and it should return the formatted label to be presented on each label of the base axis. The previous code shows how the function is used. You can see a switch that, based on the base axis scale, applies a different format by calling a function. The functions in the code are not pre-defined; we need to write the functions or code that creates the formatting. As one example, such a function could use the Utils dateFormat function to return the formatted value to the chart.

The following table shows the intervals that can be used when verifying which time intervals are being displayed on the chart:

Interval   Description    Number representing the interval
y          Year           31536e6
m          Month          2592e6
d30        30 days        2592e6
d7         7 days         6048e5
d          Day            864e5
H          Hour           36e5
m          Minute         6e4
s          Second         1e3
ms         Milliseconds   1

Customizing tooltips

CCC provides the ability to change the tooltip format that comes by default, and it can be changed using the tooltipFormat property. We can change it, making it look like the following image on the right side. You can also compare it to the one on the left, which is the default one.

The tooltip default format might change depending on the chart type, but also on some options that you apply to the chart, mainly crosstabMode and seriesInRows. The property accepts a function that receives one argument, the scene, which has a structure similar to the one already covered for the click event. You should return the HTML to be shown on the dashboard when we hover over the chart. In the previous image, you see the default tooltip on the chart on the left side, and a different tooltip on the right. That's because the following code was applied:

tooltipFormat: function(scene) {
  var year = scene.atoms.series.label;
  var territory = scene.atoms.category.value;
  var sales = Utils.numberFormat(scene.vars.value.value, "#.00A");
  var html = '<html>' +
    '<div>Sales for ' + year + ' at ' + territory + ': ' + sales + '</div>' +
    '</html>';
  return html;
}

The code is pretty self-explanatory. First we set some variables, such as the year, the territory, and the sales value, which we need to present inside the tooltip. Like in the click event, we are getting the labels/values from the scene, which might depend on the properties we set for the chart. For the sales, we are also abbreviating the number, using two decimal places. Last, we build the HTML to be displayed when we hover over the chart.

You can also change the base axis tooltip. Like we did for the tooltip shown when hovering over the values represented in the chart, we can also use baseAxisTooltip; just don't forget that baseAxisTooltipVisible must be set to true (which is the default value). Getting the values to show will be pretty similar.

It can get more complex, though not much more, when we also want, for instance, to display the total value of sales for one year or for the territory. Based on that, we could also present the percentage relative to the total.
We should use the property as explained earlier. The previous image is one example of how we can customize a tooltip. In this case, we are showing the value, but also the percentage that the hovered over territory represents (as a percentage of all the years) and the percentage for the hovered over year (as a percentage of all the territories):

tooltipFormat: function(scene) {
  var year = scene.getSeriesLabel();
  var territory = scene.getCategoryLabel();
  var value = scene.getValue();
  var sales = Utils.numberFormat(value, "#.00A");
  var totals = {};
  _.each(scene.chart().data._datums, function(element) {
    var value = element.atoms.value.value;
    totals[element.atoms.category.label] =
      (totals[element.atoms.category.label] || 0) + value;
    totals[element.atoms.series.label] =
      (totals[element.atoms.series.label] || 0) + value;
  });
  var categoryPerc = Utils.numberFormat(value / totals[territory], "0.0%");
  var seriesPerc = Utils.numberFormat(value / totals[year], "0.0%");
  var html = '<html>' +
    '<div class="value">' + sales + '</div>' +
    '<div class="dValue">Sales for ' + territory + ' in ' + year + '</div>' +
    '<div class="bar">' +
      '<div class="pPerc">' + categoryPerc + ' of ' + territory + '</div>' +
      '<div class="partialBar" style="width:' + categoryPerc + '"></div>' +
    '</div>' +
    '<div class="bar">' +
      '<div class="pPerc">' + seriesPerc + ' of ' + year + '</div>' +
      '<div class="partialBar" style="width:' + seriesPerc + '"></div>' +
    '</div>' +
    '</html>';
  return html;
}

The first lines of the code are pretty similar, except that we are using scene.getSeriesLabel() in place of scene.atoms.series.label. They do the same thing, so these are just different ways to get the values/labels. Then come the total calculations, which are computed by iterating over all the elements of scene.chart().data._datums; this returns the logical/relational table, a combination of territory, year, and value. The last part just builds the HTML with all the values and labels that we already got from the scene. There are multiple ways to get the values you need, for instance to customize the tooltip; you just need to explore the hierarchical structure of the scene and get used to it. The image that you are seeing also presents a different style, and that is done using CSS. You can add CSS to your dashboard and change the style of the tooltip, not just the format.

Styling tooltips

When we want to style a tooltip, we may want to use the browser's developer tools to check the classes, names, and CSS properties already applied, but that is hard because the popup does not stay still. We can change the tooltipDelayOut property and increase its default value from 80 to 1000 or more, depending on the time you need. When you want to apply some styles to the tooltips of a particular chart, you can do so by setting a CSS class on the tooltip. For that, you should use the tooltipClassName property and set the class name to be added and later used in the CSS.

Summary

In this article, we provided a quick overview of how to use CCC in CDF and CDE dashboards and showed you what kinds of charts are available. We covered some of the base options as well as some advanced options that you might use to get a more customized visualization.

Resources for Article:

Further resources on this subject:

Diving into OOP Principles [article]
Python Scripting Essentials [article]
Building a Puppet Module Skeleton [article]
Gradle with the Java Plugin

Packt
19 May 2016
18 min read
In this article by Hubert Klein Ikkink, author of the book Gradle Effective Implementations Guide, Second Edition, we will discuss how the Java plugin provides a lot of useful tasks and properties that we can use for building a Java application or library. If we follow the convention-over-configuration support of the plugin, we don't have to write a lot of code in our Gradle build file to use it. If we want to, we can still add extra configuration options to override the default conventions defined by the plugin.

(For more resources related to this topic, see here.)

Let's start with a new build file and use the Java plugin. We only have to apply the plugin for our build:

apply plugin: 'java'

That's it! Just by adding this simple line, we now have a lot of tasks that we can use to work with in our Java project. To see the tasks that have been added by the plugin, we run the tasks command on the command line and look at the output:

$ gradle tasks
:tasks

------------------------------------------------------------
All tasks runnable from root project
------------------------------------------------------------

Build tasks
-----------
assemble - Assembles the outputs of this project.
build - Assembles and tests this project.
buildDependents - Assembles and tests this project and all projects that depend on it.
buildNeeded - Assembles and tests this project and all projects it depends on.
classes - Assembles main classes.
clean - Deletes the build directory.
jar - Assembles a jar archive containing the main classes.
testClasses - Assembles test classes.

Build Setup tasks
-----------------
init - Initializes a new Gradle build. [incubating]
wrapper - Generates Gradle wrapper files. [incubating]

Documentation tasks
-------------------
javadoc - Generates Javadoc API documentation for the main source code.

Help tasks
----------
components - Displays the components produced by root project 'getting_started'. [incubating]
dependencies - Displays all dependencies declared in root project 'getting_started'.
dependencyInsight - Displays the insight into a specific dependency in root project 'getting_started'.
help - Displays a help message.
model - Displays the configuration model of root project 'getting_started'. [incubating]
projects - Displays the sub-projects of root project 'getting_started'.
properties - Displays the properties of root project 'getting_started'.
tasks - Displays the tasks runnable from root project 'getting_started'.

Verification tasks
------------------
check - Runs all checks.
test - Runs the unit tests.

Rules
-----
Pattern: clean<TaskName>: Cleans the output files of a task.
Pattern: build<ConfigurationName>: Assembles the artifacts of a configuration.
Pattern: upload<ConfigurationName>: Assembles and uploads the artifacts belonging to a configuration.

To see all tasks and more detail, run gradle tasks --all
To see more detail about a task, run gradle help --task <task>

BUILD SUCCESSFUL

Total time: 0.849 secs

If we look at the list of tasks, we can see the number of tasks that are now available to us, which we didn't have before; all this is done just by adding a simple line to our build file. We have several task groups, each with their own individual tasks. We have tasks related to building source code and packaging in the Build tasks section. The javadoc task is used to generate Javadoc documentation and is in the Documentation tasks section. The tasks for running tests and checking code quality are in the Verification tasks section. Finally, we have several rule-based tasks to build, upload, and clean artifacts or tasks in our Java project.
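The bottom of the listing already hints at how to drill down further. For example, to see more detail about one of these tasks, you could run the command below; the output is omitted here because its exact text depends on your Gradle version:

$ gradle help --task javadoc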
The tasks added by the Java plugin are the visible part of the newly added functionality to our project. However, the plugin also adds a so-called convention object to our project. A convention object has several properties and methods, which are used by the tasks of the plugin. These properties and methods are added to our project and can be accessed like normal project properties and methods. So, with the convention object, we can not only look at the properties used by the tasks in the plugin, but we can also change the value of the properties to reconfigure certain tasks.

Using the Java plugin

To work with the Java plugin, we are first going to create a very simple Java source file. We can then use the plugin's tasks to build the source file. You can make this application as complex as you want, but in order to stay on topic, we will make it as simple as possible.

By applying the Java plugin, we must now follow some conventions for our project directory structure. To build the source code, our Java source files must be in the src/main/java directory, relative to the project directory. If we have non-Java source files that need to be included in the JAR file, we must place them in the src/main/resources directory. Our test source files need to be in the src/test/java directory, and any non-Java source files required for testing can be placed in src/test/resources. These conventions can be changed if we want or need to, but it is a good idea to stick with them so that we don't have to write any extra code in our build file, which could lead to errors.

Our sample Java project is a Java class that uses an external property file to receive a welcome message. The source file with the name Sample.java is located in the src/main/java directory, as follows:

// File: src/main/java/gradle/sample/Sample.java
package gradle.sample;

import java.util.ResourceBundle;

/**
 * Read welcome message from external properties file
 * <code>messages.properties</code>.
 */
public class Sample {

    public Sample() {
    }

    /**
     * Get <code>messages.properties</code> file
     * and read the value for <em>welcome</em> key.
     *
     * @return Value for <em>welcome</em> key
     * from <code>messages.properties</code>
     */
    public String getWelcomeMessage() {
        final ResourceBundle resourceBundle = ResourceBundle.getBundle("messages");
        final String message = resourceBundle.getString("welcome");
        return message;
    }
}
In the code, we use ResourceBundle.getBundle() to read our welcome message. The welcome message itself is defined in a properties file with the name messages.properties, which goes in the src/main/resources directory:

# File: src/main/resources/gradle/sample/messages.properties
welcome = Welcome to Gradle!

To compile the Java source file and process the properties file, we run the classes task. Note that the classes task has been added by the Java plugin. This is a so-called life cycle task in Gradle. The classes task is actually dependent on two other tasks: compileJava and processResources. We can see this task dependency when we run the tasks command with the --all command-line option:

$ gradle tasks --all
...
classes - Assembles main classes.
compileJava - Compiles main Java source.
processResources - Processes main resources.
...

Let's run the classes task from the command line:

$ gradle classes
:compileJava
:processResources
:classes

BUILD SUCCESSFUL

Total time: 1.08 secs

Here, we can see that the compileJava and processResources tasks are executed because the classes task depends on them. The compiled class file and properties file are now in the build/classes/main and build/resources/main directories. The build directory is the default directory that Gradle uses for build output files.

If we execute the classes task again, we will notice that the tasks support the incremental build feature of Gradle. As we haven't changed the Java source file or the properties file, and the output is still present, all the tasks can be skipped as they are up-to-date:

$ gradle classes
:compileJava UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE

BUILD SUCCESSFUL

Total time: 0.595 secs

To package our class file and properties file, we invoke the jar task. This task is also added by the Java plugin and depends on the classes task. This means that if we run the jar task, the classes task is also executed. Let's try and run the jar task, as follows:

$ gradle jar
:compileJava UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:jar

BUILD SUCCESSFUL

Total time: 0.585 secs

The default name of the resulting JAR file is the name of our project. So if our project is called sample, then the JAR file is called sample.jar. We can find the file in the build/libs directory. If we look at the contents of the JAR file, we see our compiled class file and the messages.properties file. Also, a manifest file is added automatically by the jar task:

$ jar tvf build/libs/sample.jar
     0 Wed Oct 21 15:29:36 CEST 2015 META-INF/
    25 Wed Oct 21 15:29:36 CEST 2015 META-INF/MANIFEST.MF
     0 Wed Oct 21 15:26:58 CEST 2015 gradle/
     0 Wed Oct 21 15:26:58 CEST 2015 gradle/sample/
   685 Wed Oct 21 15:26:58 CEST 2015 gradle/sample/Sample.class
    90 Wed Oct 21 15:26:58 CEST 2015 gradle/sample/messages.properties

We can also execute the assemble task to create the JAR file. The assemble task, another life cycle task, is dependent on the jar task and can be extended by other plugins.
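As a side note, the jar task itself can be configured from the build file, for example to add entries to the generated manifest. The following snippet is not part of the original example and the attribute values are purely illustrative:

apply plugin: 'java'

jar {
    manifest {
        // These entries end up in META-INF/MANIFEST.MF inside the archive
        attributes 'Implementation-Title': 'Gradle Sample',
                   'Implementation-Version': '1.0'
    }
}

Running gradle jar again after this change rebuilds the archive with the extra manifest attributes.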
We could also add dependencies on other tasks that create packages for a project other than the JAR file, such as a WAR file or ZIP archive file: $ gradle assemble :compileJava UP-TO-DATE :processResources UP-TO-DATE :classes UP-TO-DATE :jar UP-TO-DATE :assemble UP-TO-DATE BUILD SUCCESSFUL Total time: 0.607 secs To start again and clean all the generated output from the previous tasks, we can use the clean task. This task deletes the project build directory and all the generated files in this directory. So, if we execute the clean task from the command line, Gradle will delete the build directory: $ gradle clean :clean BUILD SUCCESSFUL Total time: 0.583 secs Note that the Java plugin also added some rule-based tasks. One of them was clean<TaskName>. We can use this task to remove the output files of a specific task. The clean task deletes the complete build directory; but with clean<TaskName>, we only delete the files and directories created by the named task. For example, to clean the generated Java class files of the compileJava task, we execute the cleanCompileJava task. As this is a rule-based task, Gradle will determine that everything after clean must be a valid task in our project. The files and directories created by this task are then determined by Gradle and deleted: $ gradle cleanCompileJava :cleanCompileJava UP-TO-DATE BUILD SUCCESSFUL Total time: 0.578 secs   Working with source sets The Java plugin also adds a new concept to our project—source sets. A source set is a collection of source files that are compiled and executed together. The files can be Java source files or resource files. Source sets can be used to group files together with a certain meaning in our project, without having to create a separate project. For example, we can separate the location of source files that describe the API of our Java project in a source set, and run tasks that only apply to the files in this source set. Without any configuration, we already have the main and test source sets, which are added by the Java plugin. For each source set, the plugin also adds the following three tasks: compile<SourceSet>Java, process<SourceSet>Resources, and <SourceSet>Classes. When the source set is named main, we don't have to provide the source set name when we execute a task. For example, compileJava applies to the main source test, but compileTestJava applies to the test source set. Each source set also has some properties to access the directories and files that make up the source set. The following table shows the properties that we can access in a source set: Source set property Type Description java org.gradle.api.file.SourceDirectorySet These are the Java source files for this project. Only files with the.java extension are in this collection. allJava SourceDirectorySet By default, this is the same as the java property, so it contains all the Java source files. Other plugins can add extra source files to this collection. resources SourceDirectorySet These are all the resource files for this source set. This contains all the files in the resources source directory, excluding any files with the.java extension. allSource SourceDirectorySet By default, this is the combination of the resources and Java properties. This includes all the source files of this source set, both resource and Java source files. output SourceSetOutput These are the output files for the source files in the source set. This contains the compiled classes and processed resources. 
java.srcDirs Set<File> These are the directories with Java source files. resources.srcDirs Set<File> These are the directories with the resource files for this source set. output.classesDir File This is the output directory with the compiled class files for the Java source files in this source set. output.resourcesDir File This is the output directory with the processed resource files from the resources in this source set. name String This is the read-only value with the name of the source set. We can access these properties via the sourceSets property of our project. In the following example, we will create a new task to display values for several properties: apply plugin: 'java' task sourceSetJavaProperties << { sourceSets { main { println "java.srcDirs = ${java.srcDirs}" println "resources.srcDirs = ${resources.srcDirs}" println "java.files = ${java.files.name}" println "allJava.files = ${allJava.files.name}" println "resources.files = ${resources.files.name}" println "allSource.files = ${allSource.files.name}" println "output.classesDir = ${output.classesDir}" println "output.resourcesDir = ${output.resourcesDir}" println "output.files = ${output.files}" } } } When we run the sourceSetJavaProperties task, we get the following output: $ gradle sourceSetJavaproperties :sourceSetJavaProperties java.srcDirs = [/gradle-book/Chapter4/Code_Files/sourcesets/src/main/java] resources.srcDirs = [/gradle-book/Chapter4/Code_Files/sourcesets/src/main/resources] java.files = [Sample.java] allJava.files = [Sample.java] resources.files = [messages.properties] allSource.files = [messages.properties, Sample.java] output.classesDir = /gradle-book/Chapter4/Code_Files/sourcesets/build/classes/main output.resourcesDir = /gradle-book/Chapter4/Code_Files/sourcesets/build/resources/main output.files = [/gradle-book/Chapter4/Code_Files/sourcesets/build/classes/main, /gradle-book/Chapter4/Code_Files/sourcesets/build/resources/main] BUILD SUCCESSFUL Total time: 0.594 secs Creating a new source set We can create our own source set in a project. A source set contains all the source files that are related to each other. In our example, we will add a new source set to include a Java interface. Our Sample class will then implement the interface; however, as we use a separate source set, we can use this later to create a separate JAR file with only the compiled interface class. We will name the source set api as the interface is actually the API of our example project, which we can share with other projects. To define this source set, we only have to put the name in the sourceSets property of the project, as follows: apply plugin: 'java' sourceSets { api } Gradle will create three new tasks based on this source set—apiClasses, compileApiJava, and processApiResources. We can see these tasks after we execute the tasks command: $ gradle tasks --all ... Build tasks ----------- apiClasses - Assembles api classes. compileApiJava - Compiles api Java source. processApiResources - Processes api resources. We have created our Java interface in the src/api/java directory, which is the source directory for the Java source files for the api source set. The following code allows us to see the Java interface: // File: src/api/java/gradle/sample/ReadWelcomeMessage.java package gradle.sample; /** * Read welcome message from source and return value. 
*/ public interface ReadWelcomeMessage { /** * @return Welcome message */ String getWelcomeMessage(); } To compile the source file, we can execute the compileApiJava or apiClasses task: $ gradle apiClasses :compileApiJava :processApiResources UP-TO-DATE :apiClasses BUILD SUCCESSFUL Total time: 0.595 secs The source file is compiled in the build/classes/api directory. We will now change the source code of our Sample class and implement the ReadWelcomeMessage interface, as shown in the following code: // File: src/main/java/gradle/sample/Sample.java package gradle.sample; import java.util.ResourceBundle; /** * Read welcome message from external properties file * <code>messages.properties</code>. */ public class Sample implements ReadWelcomeMessage { public Sample() { } /** * Get <code>messages.properties</code> file * and read the value for <em>welcome</em> key. * * @return Value for <em>welcome</em> key * from <code>messages.properties</code> */ public String getWelcomeMessage() { final ResourceBundle resourceBundle = ResourceBundle.getBundle("messages"); final String message = resourceBundle.getString("welcome"); return message; } } Next, we run the classes task to recompile our changed Java source file: $ gradle classes :compileJava /gradle-book/Chapter4/src/main/java/gradle/sample/Sample.java:10: error: cannot find symbol public class Sample implements ReadWelcomeMessage { ^ symbol: class ReadWelcomeMessage 1 error :compileJava FAILED FAILURE: Build failed with an exception. * What went wrong: Execution failed for task ':compileJava'. > Compilation failed; see the compiler error output for details. * Try: Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. BUILD FAILED Total time: 0.608 secs We get a compilation error! The Java compiler cannot find the ReadWelcomeMessage interface. However, we just ran the apiClasses task and compiled the interface without errors. To fix this, we must define a dependency between the classes and apiClasses tasks. The classes task is dependent on the apiClasses tasks. First, the interface must be compiled and then the class that implements the interface. Next, we must add the output directory with the compiled interface class file to the compileClasspath property of the main source set. Once we have done this, we know for sure that the Java compiler for compiling the Sample class picks up the compiled class file. To do this, we will change the build file and add the task dependency between the two tasks and the main source set configuration, as follows: apply plugin: 'java' sourceSets { api main { compileClasspath += files(api.output.classesDir) } } classes.dependsOn apiClasses   Now we can run the classes task again, without errors: $ gradle classes :compileApiJava :processApiResources UP-TO-DATE :apiClasses :compileJava :processResources :classes BUILD SUCCESSFUL Total time: 0.648 secs   Custom configuration If we use Gradle for an existing project, we might have a different directory structure than the default structure defined by Gradle, or it may be that we want to have a different structure for another reason. We can account for this by configuring the source sets and using different values for the source directories. Consider that we have a project with the following source directory structure: . 
├── resources │ ├── java │ └── test ├── src │ └── java ├── test │ ├── integration │ │ └── java │ └── unit │ └── java └── tree.txt We will need to reconfigure the main and test source sets, but we must also add a new integration-test source set. The following code reflects the directory structure for the source sets: apply plugin: 'java' sourceSets { main { java { srcDir 'src/java' } resources { srcDir 'resources/java' } } test { java { srcDir 'test/unit/java' } resources { srcDir 'resources/test' } } 'integeration-test' { java { srcDir 'test/integration/java' } resources { srcDir 'resources/test' } } } Notice how we must put the name of the integration-test source set in quotes; this is because we use a hyphen in the name. Gradle then converts the name of the source set into integrationTest (without the hyphen and with a capital T). To compile, for example, the source files of the integration test source set, we use the compileIntegrationTestJava task. Summary In this article, we discussed the support for a Java project in Gradle. With a simple line needed to apply the Java plugin, we get masses of functionality, which we can use for our Java code. Resources for Article: Further resources on this subject: Working with Gradle [article] Speeding up Gradle builds for Android [article] Developing a JavaFX Application for iOS [article]
Exploring Scala Performance

Packt
19 May 2016
19 min read
In this article by Michael Diamant and Vincent Theron, author of the book Scala High Performance Programming, we look at how Scala features get compiled with bytecode. (For more resources related to this topic, see here.) Value classes The domain model of the order book application included two classes, Price and OrderId. We pointed out that we created domain classes for Price and OrderId to provide contextual meanings to the wrapped BigDecimal and Long. While providing us with readable code and compilation time safety, this practice also increases the amount of instances that are created by our application. Allocating memory and generating class instances create more work for the garbage collector by increasing the frequency of collections and by potentially introducing additional long-lived objects. The garbage collector will have to work harder to collect them, and this process may severely impact our latency. Luckily, as of Scala 2.10, the AnyVal abstract class is available for developers to define their own value classes to solve this problem. The AnyVal class is defined in the Scala doc (http://www.scala-lang.org/api/current/#scala.AnyVal) as, "the root class of all value types, which describe values not implemented as objects in the underlying host system." The AnyVal class can be used to define a value class, which receives special treatment from the compiler. Value classes are optimized at compile time to avoid the allocation of an instance, and instead, they use the wrapped type. Bytecode representation As an example, to improve the performance of our order book, we can define Price and OrderId as value classes: case class Price(value: BigDecimal) extends AnyVal case class OrderId(value: Long) extends AnyVal To illustrate the special treatment of value classes, we define a dummy function taking a Price value class and an OrderId value class as arguments: def printInfo(p: Price, oId: OrderId): Unit = println(s"Price: ${p.value}, ID: ${oId.value}") From this definition, the compiler produces the following method signature: public void printInfo(scala.math.BigDecimal, long); We see that the generated signature takes a BigDecimal object and a long object, even though the Scala code allows us to take advantage of the types defined in our model. This means that we cannot use an instance of BigDecimal or Long when calling printInfo because the compiler will throw an error. An interesting thing to notice is that the second parameter of printInfo is not compiled as Long (an object), but long (a primitive type, note the lower case 'l'). Long and other objects matching to primitive types, such as Int,Float or Short, are specially handled by the compiler to be represented by their primitive type at runtime. Value classes can also define methods. Let's enrich our Price class, as follows: case class Price(value: BigDecimal) extends AnyVal { def lowerThan(p: Price): Boolean = this.value < p.value } // Example usage val p1 = Price(BigDecimal(1.23)) val p2 = Price(BigDecimal(2.03)) p1.lowerThan(p2) // returns true Our new method allows us to compare two instances of Price. At compile time, a companion object is created for Price. This companion object defines a lowerThan method that takes two BigDecimal objects as parameters. 
In reality, when we call lowerThan on an instance of Price, the code is transformed by the compiler from an instance method call to a static method call that is defined in the companion object: public final boolean lowerThan$extension(scala.math.BigDecimal, scala.math.BigDecimal); Code: 0: aload_1 1: aload_2 2: invokevirtual #56 // Method scala/math/BigDecimal.$less:(Lscala/math/BigDecimal;)Z 5: ireturn If we were to write the pseudo-code equivalent to the preceding Scala code, it would look something like the following: val p1 = BigDecimal(1.23) val p2 = BigDecimal(2.03) Price.lowerThan(p1, p2) // returns true   Performance considerations Value classes are a great addition to our developer toolbox. They help us reduce the count of instances and spare some work for the garbage collector, while allowing us to rely on meaningful types that reflect our business abstractions. However, extending AnyVal comes with a certain set of conditions that the class must fulfill. For example, a value class may only have one primary constructor that takes one public val as a single parameter. Furthermore, this parameter cannot be a value class. We saw that value classes can define methods via def. Neither val nor var are allowed inside a value class. A nested class or object definitions are also impossible. Another limitation prevents value classes from extending anything other than a universal trait, that is, a trait that extends Any, only has defs as members, and performs no initialization. If any of these conditions is not fulfilled, the compiler generates an error. In addition to the preceding constraints that are listed, there are special cases in which a value class has to be instantiated by the JVM. Such cases include performing a pattern matching or runtime type test, or assigning a value class to an array. An example of the latter looks like the following snippet: def newPriceArray(count: Int): Array[Price] = { val a = new Array[Price](count) for(i <- 0 until count){ a(i) = Price(BigDecimal(Random.nextInt())) } a } The generated bytecode is as follows: public highperfscala.anyval.ValueClasses$$anonfun$newPriceArray$1(highperfscala.anyval.ValueClasses$Price[]); Code: 0: aload_0 1: aload_1 2: putfield #29 // Field a$1:[Lhighperfscala/anyval/ValueClasses$Price; 5: aload_0 6: invokespecial #80 // Method scala/runtime/AbstractFunction1$mcVI$sp."<init>":()V 9: return public void apply$mcVI$sp(int); Code: 0: aload_0 1: getfield #29 // Field a$1:[Lhighperfscala/anyval/ValueClasses$Price; 4: iload_1 5: new #31 // class highperfscala/anyval/ValueClasses$Price // omitted for brevity 21: invokevirtual #55 // Method scala/math/BigDecimal$.apply:(I)Lscala/math/BigDecimal; 24: invokespecial #59 // Method highperfscala/anyval/ValueClasses$Price."<init>":(Lscala/math/BigDecimal;)V 27: aastore 28: return Notice how mcVI$sp is invoked from newPriceArray, and this creates a new instance of ValueClasses$Price at the 5 instruction. As turning a single field case class into a value class is as trivial as extending the AnyVal trait, we recommend that you always use AnyVal wherever possible. The overhead is quite low, and it generate high benefits in terms of garbage collection's performance. To learn more about value classes, their limitations and use cases, you can find detailed descriptions at http://docs.scala-lang.org/overviews/core/value-classes.html. Tagged types – an alternative to value classes Value classes are an easy to use tool, and they can yield great improvements in terms of performance. 
However, they come with a constraining set of conditions, which can make them impossible to use in certain cases. We will conclude this section with a glance at an interesting alternative to leveraging the tagged type feature that is implemented by the Scalaz library. The Scalaz implementation of tagged types is inspired by another Scala library, named shapeless. The shapeless library provides tools to write type-safe, generic code with minimal boilerplate. While we will not explore shapeless, we encourage you to learn more about the project at https://github.com/milessabin/shapeless. Tagged types are another way to enforce compile-type checking without incurring the cost of instance instantiation. They rely on the Tagged structural type and the @@ type alias that is defined in the Scalaz library, as follows: type Tagged[U] = { type Tag = U } type @@[T, U] = T with Tagged[U] Let's rewrite part of our code to leverage tagged types with our Price object: object TaggedTypes { sealed trait PriceTag type Price = BigDecimal @@ PriceTag object Price { def newPrice(p: BigDecimal): Price = Tag[BigDecimal, PriceTag](p) def lowerThan(a: Price, b: Price): Boolean = Tag.unwrap(a) < Tag.unwrap(b) } } Let's perform a short walkthrough of the code snippet. We will define a PriceTag sealed trait that we will use to tag our instances, a Price type alias is created and defined as a BigDecimal object tagged with PriceTag. The Price object defines useful functions, including the newPrice factory function that is used to tag a given BigDecimal object and return a Price object (that is, a tagged BigDecimal object). We will also implement an equivalent to the lowerThan method. This function takes two Price objects (that is two tagged BigDecimal objects), extracts the content of the tag that are two BigDecimal objects, and compares them. Using our new Price type, we rewrite the same newPriceArray function that we previously looked at (the code is omitted for brevity, but you can refer to it in the attached source code), and print the following generated bytecode: public void apply$mcVI$sp(int); Code: 0: aload_0 1: getfield #29 // Field a$1:[Ljava/lang/Object; 4: iload_1 5: getstatic #35 // Field highperfscala/anyval/TaggedTypes$Price$.MODULE$:Lhighperfscala/anyval/TaggedTypes$Price$; 8: getstatic #40 // Field scala/package$.MODULE$:Lscala/package$; 11: invokevirtual #44 // Method scala/package$.BigDecimal:()Lscala/math/BigDecimal$; 14: getstatic #49 // Field scala/util/Random$.MODULE$:Lscala/util/Random$; 17: invokevirtual #53 // Method scala/util/Random$.nextInt:()I 20: invokevirtual #58 // Method scala/math/BigDecimal$.apply:(I)Lscala/math/BigDecimal; 23: invokevirtual #62 // Method highperfscala/anyval/TaggedTypes$Price$.newPrice:(Lscala/math/BigDecimal;)Ljava/lang/Object; 26: aastore 27: return In this version, we no longer see an instantiation of Price, even though we are assigning them to an array. The tagged Price implementation involves a runtime cast, but we anticipate that the cost of this cast will be less than the instance allocations (and garbage collection) that was observed in the previous value class Price strategy. Specialization To understand the significance of specialization, it is important to first grasp the concept of object boxing. The JVM defines primitive types (boolean, byte, char, float, int, long, short, and double) that are stack allocated rather than heap allocated. 
When a generic type is introduced, for example, scala.collection.immutable.List, the JVM references an object equivalent, instead of a primitive type. In this example, an instantiated list of integers would be heap allocated objects rather than integer primitives. The process of converting a primitive to its object equivalent is called boxing, and the reverse process is called unboxing. Boxing is a relevant concern for performance-sensitive programming because boxing involves heap allocation. In performance-sensitive code that performs numerical computations, the cost of boxing and unboxing can create an order of magnitude or larger performance slowdowns. Consider the following example to illustrate boxing overhead: List.fill(10000)(2).map(_* 2) Creating the list via fill yields 10,000 heap allocations of the integer object. Performing the multiplication in map requires 10,000 unboxings to perform multiplication and then 10,000 boxings to add the multiplication result into the new list. From this simple example, you can imagine how critical section arithmetic will be slowed down due to boxing or unboxing operations. As shown in Oracle's tutorial on boxing at https://docs.oracle.com/javase/tutorial/java/data/autoboxing.html, boxing in Java and also in Scala happens transparently. This means that without careful profiling or bytecode analysis, it is difficult to discern where you are paying the cost for object boxing. To ameliorate this problem, Scala provides a feature named specialization. Specialization refers to the compile-time process of generating duplicate versions of a generic trait or class that refer directly to a primitive type instead of the associated object wrapper. At runtime, the compiler-generated version of the generic class, or as it is commonly referred to, the specialized version of the class, is instantiated. This process eliminates the runtime cost of boxing primitives, which means that you can define generic abstractions while retaining the performance of a handwritten, specialized implementation. Bytecode representation Let's look at a concrete example to better understand how the specialization process works. Consider a naive, generic representation of the number of shares purchased, as follows: case class ShareCount[T](value: T) For this example, let's assume that the intended usage is to swap between an integer or long representation of ShareCount. With this definition, instantiating a long-based ShareCount instance incurs the cost of boxing, as follows: def newShareCount(l: Long): ShareCount[Long] = ShareCount(l) This definition translates to the following bytecode: public highperfscala.specialization.Specialization$ShareCount<java.lang.Object> newShareCount(long); Code: 0: new #21 // class orderbook/Specialization$ShareCount 3: dup 4: lload_1 5: invokestatic #27 // Method scala/runtime/BoxesRunTime.boxToLong:(J)Ljava/lang/Long; 8: invokespecial #30 // Method orderbook/Specialization$ShareCount."<init>":(Ljava/lang/Object;)V 11: areturn In the preceding bytecode, it is clear in the 5 instruction that the primitive long value is boxed before instantiating the ShareCount instance. By introducing the @specialized annotation, we are able to eliminate the boxing by having the compiler provide an implementation of ShareCount that works with primitive long values. It is possible to specify which types you wish to specialize by supplying a set of types. 
As defined in the Specializables trait (http://www.scala-lang.org/api/current/index.html#scala.Specializable), you are able to specialize for all JVM primitives, such as Unit and AnyRef. For our example, let's specialize ShareCount for integers and longs, as follows: case class ShareCount[@specialized(Long, Int) T](value: T) With this definition, the bytecode now becomes the following: public highperfscala.specialization.Specialization$ShareCount<java.lang.Object> newShareCount(long); Code: 0: new #21 // class highperfscala.specialization/Specialization$ShareCount$mcJ$sp 3: dup 4: lload_1 5: invokespecial #24 // Method highperfscala.specialization/Specialization$ShareCount$mcJ$sp."<init>":(J)V 8: areturn The boxing disappears and is curiously replaced with a different class name, ShareCount $mcJ$sp. This is because we are invoking the compiler-generated version of ShareCount that is specialized for long values. By inspecting the output of javap, we see that the specialized class generated by the compiler is a subclass of ShareCount: public class highperfscala.specialization.Specialization$ShareCount$mcI$sp extends highperfscala.specialization.Specialization$ShareCount<java .lang.Object> Bear this specialization implementation detail in mind as we turn to the Performance considerations section. The use of inheritance forces tradeoffs to be made in more complex use cases. Performance considerations At first glance, specialization appears to be a simple panacea for JVM boxing. However, there are several caveats to consider when using specialization. A liberal use of specialization leads to significant increases in compile time and resulting code size. Consider specializing Function3, which accepts three arguments as input and produces one result. To specialize four arguments across all types (that is, Byte, Short, Int, Long, Char, Float, Double, Boolean, Unit, and AnyRef) yields 10^4 or 10,000 possible permutations. For this reason, the standard library conserves application of specialization. In your own use cases, consider carefully which types you wish to specialize. If we specialize Function3 only for Int and Long, the number of generated classes shrinks to 2^4 or 16. Specialization involving inheritance requires extra attention because it is trivial to lose specialization when extending a generic class. Consider the following example: class ParentFoo[@specialized T](t: T) class ChildFoo[T](t: T) extends ParentFoo[T](t) def newChildFoo(i: Int): ChildFoo[Int] = new ChildFoo[Int](i) In this scenario, you likely expect that ChildFoo is defined with a primitive integer. However, as ChildFoo does not mark its type with the @specialized annotation, zero specialized classes are created. Here is the bytecode to prove it: public highperfscala.specialization.Inheritance$ChildFoo<java.lang.Object> newChildFoo(int); Code: 0: new #16 // class highperfscala/specialization/Inheritance$ChildFoo 3: dup 4: iload_1 5: invokestatic #22 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer; 8: invokespecial #25 // Method highperfscala/specialization/Inheritance$ChildFoo."<init>":(Ljava/lang/Object;)V 11: areturn The next logical step is to add the @specialized annotation to the definition of ChildFoo. In doing so, we stumble across a scenario where the compiler warns about use of specialization, as follows: class ParentFoo must be a trait. 
Specialized version of class ChildFoo will inherit generic highperfscala.specialization.Inheritance.ParentFoo[Boolean]
class ChildFoo[@specialized T](t: T) extends ParentFoo[T](t)

The compiler indicates that you have created a diamond inheritance problem, where the specialized versions of ChildFoo extend both ChildFoo and the associated specialized version of ParentFoo. This issue can be resolved by modeling the problem with a trait, as follows:

trait ParentBar[@specialized T] {
  def t(): T
}

class ChildBar[@specialized T](val t: T) extends ParentBar[T]

def newChildBar(i: Int): ChildBar[Int] = new ChildBar(i)

This definition compiles using a specialized version of ChildBar, as we were originally hoping for, as seen in the following code:

public highperfscala.specialization.Inheritance$ChildBar<java.lang.Object> newChildBar(int);
  Code:
    0: new           #32  // class highperfscala/specialization/Inheritance$ChildBar$mcI$sp
    3: dup
    4: iload_1
    5: invokespecial #35  // Method highperfscala/specialization/Inheritance$ChildBar$mcI$sp."<init>":(I)V
    8: areturn

An analogous and equally error-prone scenario is when a generic function is defined around a specialized type. Consider the following definition:

class Foo[T](t: T)

object Foo {
  def create[T](t: T): Foo[T] = new Foo(t)
}

def boxed: Foo[Int] = Foo.create(1)

Here, the definition of create is analogous to the child class from the inheritance example. Instances of Foo wrapping a primitive that are instantiated from the create method will be boxed. The following bytecode demonstrates how boxed leads to heap allocations:

public highperfscala.specialization.MethodReturnTypes$Foo<java.lang.Object> boxed();
  Code:
    0: getstatic     #19  // Field highperfscala/specialization/MethodReturnTypes$Foo$.MODULE$:Lhighperfscala/specialization/MethodReturnTypes$Foo$;
    3: iconst_1
    4: invokestatic  #25  // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
    7: invokevirtual #29  // Method highperfscala/specialization/MethodReturnTypes$Foo$.create:(Ljava/lang/Object;)Lhighperfscala/specialization/MethodReturnTypes$Foo;
    10: areturn

The solution is to apply the @specialized annotation at the call site, as follows:

def createSpecialized[@specialized T](t: T): Foo[T] = new Foo(t)

One final interesting scenario is when specialization is used with multiple types and one of the types extends AnyRef or is a value class. To illustrate this scenario, consider the following example:

case class ShareCount(value: Int) extends AnyVal
case class ExecutionCount(value: Int)

class Container2[@specialized X, @specialized Y](x: X, y: Y)

def shareCount = new Container2(ShareCount(1), 1)
def executionCount = new Container2(ExecutionCount(1), 1)
def ints = new Container2(1, 1)

In this example, which methods do you expect to box the second argument to Container2? For brevity, we omit the bytecode, but you can easily inspect it yourself. As it turns out, shareCount and executionCount box the integer. The compiler does not generate a specialized version of Container2 that accepts a primitive integer and a value extending AnyVal (for example, ExecutionCount). The shareCount variable also causes boxing due to the order in which the compiler removes the value class type information from the source code. In both scenarios, the workaround is to define a case class that is specific to a set of types (for example, ShareCount and Int).
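A minimal sketch of that workaround could look like the following; the class and method names are hypothetical, as the original text does not show this code:

// A container fixed to the concrete types we actually need,
// instead of the generic Container2[X, Y].
case class ShareCountContainer(shareCount: ShareCount, count: Int)

def shareCountFixed = ShareCountContainer(ShareCount(1), 1)

Because the field types are spelled out, there is no type parameter left for the compiler to erase to java.lang.Object, so the second field can stay a primitive int.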
Removing the generics allows the compiler to select the primitive types. The conclusion to draw from these examples is that specialization requires extra focus to be used throughout an application without boxing. As the compiler is unable to infer scenarios where you accidentally forgot to apply the @specialized annotation, it fails to raise a warning. This places the onus on you to be vigilant about profiling and inspecting bytecode to detect scenarios where specialization is incidentally dropped. To combat some of the shortcomings that specialization brings, there is a compiler plugin under active development, named miniboxing, at http://scala-miniboxing.org/. This compiler plugin applies a different strategy that involves encoding all primitive types into a long value and carrying metadata to recall the original type. For example, boolean can be represented in long using a single bit to signal true or false. With this approach, performance is qualitatively similar to specialization while producing orders of magnitude for fewer classes for large permutations. Additionally, miniboxing is able to more robustly handle inheritance scenarios and can warn when boxing will occur. While the implementations of specialization and miniboxing differ, the end user usage is quite similar. Like specialization, you must add appropriate annotations to activate the miniboxing plugin. To learn more about the plugin, you can view the tutorials on the miniboxing project site. The extra focus to ensure specialization produces heap-allocation free code is worthwhile because of the performance wins in performance-sensitive code. To drive home the value of specialization, consider the following microbenchmark that computes the cost of a trade by multiplying share count with execution price. For simplicity, primitive types are used directly instead of value classes. Of course, in production code this would never happen: @BenchmarkMode(Array(Throughput)) @OutputTimeUnit(TimeUnit.SECONDS) @Warmup(iterations = 3, time = 5, timeUnit = TimeUnit.SECONDS) @Measurement(iterations = 30, time = 10, timeUnit = TimeUnit.SECONDS) @Fork(value = 1, warmups = 1, jvmArgs = Array("-Xms1G", "-Xmx1G")) class SpecializationBenchmark { @Benchmark def specialized(): Double = specializedExecution.shareCount.toDouble * specializedExecution.price @Benchmark def boxed(): Double = boxedExecution.shareCount.toDouble * boxedExecution.price } object SpecializationBenchmark { class SpecializedExecution[@specialized(Int) T1, @specialized(Double) T2]( val shareCount: Long, val price: Double) class BoxingExecution[T1, T2](val shareCount: T1, val price: T2) val specializedExecution: SpecializedExecution[Int, Double] = new SpecializedExecution(10l, 2d) val boxedExecution: BoxingExecution[Long, Double] = new BoxingExecution(10l, 2d) } In this benchmark, two versions of a generic execution class are defined. SpecializedExecution incurs zero boxing when computing the total cost because of specialization, while BoxingExecution requires object boxing and unboxing to perform the arithmetic. The microbenchmark is invoked with the following parameterization: sbt 'project chapter3' 'jmh:run SpecializationBenchmark -foe true' We configure this JMH benchmark via annotations that are placed at the class level in the code. Annotations have the advantage of setting proper defaults for your benchmark, and simplifying the command-line invocation. It is still possible to override the values in the annotation with command-line arguments. 
We use the -foe command-line argument to enable failure on error because there is no annotation to control this behavior. In the rest of this book, we will parameterize JMH with annotations and omit the annotations in the code samples because we always use the same values. The results are summarized in the following table: Benchmark Throughput (ops per second) Error as percentage of throughput boxed 251,534,293.11 ±2.23 specialized 302,371,879.84 ±0.87 This microbenchmark indicates that the specialized implementation yields approximately 17% higher throughput. By eliminating boxing in a critical section of the code, there is an order of magnitude performance improvement available through judicious usage of specialization. For performance-sensitive arithmetic, this benchmark provides justification for the extra effort that is required to ensure that specialization is applied properly. Summary This article talk about different Scala constructs and features. It also explained different features and how they get compiled with bytecode. Resources for Article: Further resources on this subject: Differences in style between Java and Scala code [article] Integrating Scala, Groovy, and Flex Development with Apache Maven [article] Cluster Computing Using Scala [article]
First Person Shooter Part 1 – Creating Exterior Environments

Packt
19 May 2016
13 min read
In this article by John P. Doran, the author of the book Unity 5.x Game Development Blueprints, we will be creating a first-person shooter; however, instead of shooting a gun to damage our enemies, we will be shooting a picture in a survival horror environment, similar to the Fatal Frame series of games and the recent indie title DreadOut. To get started on our project, we're first going to look at creating our level or, in this case, our environments starting with the exterior. In the game industry, there are two main roles in level creation: an environment artist and a level designer. An environment artist is a person who builds the assets that go into the environment. He/she uses tools such as 3Ds Max or Maya to create the model and then uses other tools such as Photoshop to create textures and normal maps. The level designer is responsible for taking the assets that the environment artist created and assembling them in an environment for players to enjoy. He/she designs the gameplay elements, creates the scripted events, and tests the gameplay. Typically, a level designer will create environments through a combination of scripting and using a tool that may or may not be in development as the game is being made. In our case, that tool is Unity. One important thing to note is that most companies have their own definition for different roles. In some companies, a level designer may need to create assets and an environment artist may need to create a level layout. There are also some places that hire someone to just do lighting or just to place meshes (called a mesher) because they're so good at it. (For more resources related to this topic, see here.) Project overview In this article, we take on the role of an environment artist who has been tasked to create an outdoor environment. We will use assets that I've placed in the example code as well as assets already provided to us by Unity for mesh placement. In addition, you will also learn some beginner-level design. Your objectives This project will be split into a number of tasks. It will be a simple step-by-step process from the beginning to end. Here is the outline of our tasks: Creating the exterior environment—terrain Beautifying the environment—adding water, trees, and grass Building the atmosphere Designing the level layout and background Project setup At this point, I assume that you have a fresh installation of Unity and have started it. You can perform the following steps: With Unity started, navigate to File | New Project. Select a project location of your choice somewhere on your hard drive and ensure that you have Setup defaults for set to 3D. Then, put in a Project name (I used First Person Shooter). Once completed, click on Create project. Here, if you see the Welcome to Unity popup, feel free to close it as we won't be using it. Level design 101 – planning Now just because we are going to be diving straight into Unity, I feel that it's important to talk a little more about how level design is done in the game industry. Although you may think a level designer will just jump into the editor and start playing, the truth is that you normally would need to do a ton of planning ahead of time before you even open up your tool. In general, a level begins with an idea. This can come from anything; maybe you saw a really cool building, or a photo on the Internet gave you a certain feeling; maybe you want to teach the player a new mechanic. Turning this idea into a level is what a level designer does. 
Taking all of these ideas, the level designer will create a level design document, which will outline exactly what you're trying to achieve with the entire level from start to end. A level design document will describe everything inside the level, listing all of the possible encounters, puzzles, and so on that the player will need to complete, as well as any side quests that the player will be able to achieve. To prepare for this, you should include as many references as you can, with maps, images, and movies similar to what you're trying to achieve. If you're working with a team, making this document available on a website or wiki will be a great asset so that you know exactly what is being done in the level, what the team can use in their levels, and how difficult their encounters can be. In general, you'll also want a top-down layout of your level done either on a computer or with graph paper, with a line showing a player's general route for the level with encounters and missions planned out. Of course, you don't want to be too tied down to your design document and it will change as you playtest and work on the level, but the documentation process will help solidify your ideas and give you a firm basis to work from.
For those of you interested in seeing some level design documents, feel free to check out Adam Reynolds (Level Designer on Homefront and Call of Duty: World at War) at http://wiki.modsrepository.com/index.php?title=Level_Design:_Level_Design_Document_Example. If you want to learn more about level design, I'm a big fan of Beginning Game Level Design, John Feil (previously my teacher) and Marc Scattergood, Cengage Learning PTR. For more of an introduction to all of game design from scratch, check out Level Up!: The Guide to Great Video Game Design, Scott Rogers, Wiley and The Art of Game Design, Jesse Schell, CRC Press. For some online resources, Scott has a neat GDC talk named Everything I Learned About Level Design I Learned from Disneyland, which can be found at http://mrbossdesign.blogspot.com/2009/03/everything-i-learned-about-game-design.html, and World of Level Design (http://worldofleveldesign.com/) is a good source for learning about level design, though it does not talk about Unity specifically.
Introduction to terrain
Terrain is basically used for non-manmade ground: things such as hills, deserts, and mountains. Unity's way of dealing with terrain is different from what most engines use in that there are two ways to make terrains: one is using a height map and the other is sculpting from scratch.
Height maps
Height maps are a common way for game engines to support terrains. Rather than creating tools to build a terrain within the level, you use a piece of graphics software to create an image, and that image is then translated into a terrain using its grayscale values to drive the different height levels, hence the name height map. The lighter in color an area is, the higher its terrain, so in this instance, black represents the terrain's lowest areas, whereas white represents the highest. The Terrain's Terrain Height property sets how high white actually is compared with black. In order to apply a height map to a terrain object, inside an object's Terrain component, click on the Settings button and scroll down to Import Raw…. For more information on Unity's Height tools, check out http://docs.unity3d.com/Manual/terrain-Height.html.
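If you prefer to see the grayscale-to-height idea expressed in code, the following C# sketch shows one way you might copy a readable grayscale texture into a terrain from a script. This is an illustrative example only and is not part of the book's project: the HeightMapApplier component name and the heightMapTexture field are assumptions, the texture must be marked as readable in its import settings, and the editor's Import Raw… option remains the usual workflow.

using UnityEngine;

// Illustrative helper: copies a readable grayscale texture into the terrain's height data.
// Attach to the same GameObject that holds the Terrain component.
public class HeightMapApplier : MonoBehaviour
{
    public Texture2D heightMapTexture; // assumed to be marked readable in its import settings

    void Start()
    {
        TerrainData data = GetComponent<Terrain>().terrainData;
        int resolution = data.heightmapResolution;
        float[,] heights = new float[resolution, resolution];

        for (int y = 0; y < resolution; y++)
        {
            for (int x = 0; x < resolution; x++)
            {
                // Grayscale 0 (black) maps to the lowest point, 1 (white) to the
                // terrain's full Terrain Height value.
                float u = (float)x / (resolution - 1);
                float v = (float)y / (resolution - 1);
                heights[y, x] = heightMapTexture.GetPixelBilinear(u, v).grayscale;
            }
        }

        data.SetHeights(0, 0, heights);
    }
}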
If you want to learn more about creating your own HeightMaps using Photoshop while this tutorial is for UDK, the area in Photoshop is the same: http://worldofleveldesign.com/categories/udk/udk-landscape-heightmaps-photoshop-clouds-filter.php  Others also use software such as Terragen to create HeightMaps. More information on that is at http://planetside.co.uk/products/terragen3. Exterior environment – terrain When creating exterior environments, we cannot use straight floors for the most part unless you're creating a highly urbanized area. Our game takes place in a haunted house in the middle of nowhere, so we're going to create a natural landscape. In Unity, the best tool to use to create a natural landscape is the Terrain tool. Unity's Terrain system lets us add landscapes, complete with bushes, trees, and fading materials to our game. To show how easy it is to use the terrain tool, let's get started. The first thing that we're going to do is actually create the terrain we'll be placing for the world. Let's first create a Terrain by selecting GameObject | 3D Object | Terrain. At this point, you should see the terrain on the screen. If for some reason you have problems seeing the terrain object, go to the Hierarchy tab and double-click on the Terrain object to focus your camera on it and move in as needed. Right now, it's just a flat plane, but we'll be doing a lot to it to make it shine. If you look to the right with the Terrain object selected, you'll see the Terrain editing tools, which do the following (from left to right): Raise/Lower Height—This will allow us to raise or lower the height of our terrain in a certain radius to create hills, rivers, and more. Paint Height—If you already know exactly the height that a part of your terrain needs to be, this tool will allow you to paint a spot to that location. Smooth Height—This averages out the area that it is in, attempts to smooth out areas, and reduces the appearance of abrupt changes. Paint Texture—This allows us to add textures to the surface of our terrain. One of the nice features of this is the ability to lay multiple textures on top of each other. Place Trees—This allows us to paint objects in our environment that will appear on the surface. Unity attempts to optimize these objects by billboarding distant trees so we can have dense forests without having a horrible frame rate. By billboarding, I mean that the object will be simplified and its direction usually changes constantly as the object and camera move, so it always faces the camera direction. Paint Details—In addition to trees, you can also have small things like rocks or grass covering the surface of your environment, using 2D images to represent individual clumps with bits of randomization to make it appear more natural. Terrain Settings—Settings that will affect the overall properties of the particular Terrain, options such as the size of the terrain and wind can be found here. By default, the entire Terrain is set to be at the bottom, but we want to have ground above us and below us so we can add in things like lakes. With the Terrain object selected, click on the second button from the left on the Terrain component (Paint height mode). From there, set the Height value under Settings to 100 and then press the Flatten button. At this point, you should note the plane moving up, so now everything is above by default. Next, we are going to create some interesting shapes to our world with some hills by "painting" on the surface. 
With the Terrain object selected, click on the first button on the left of our Terrain component (the Raise/Lower Terrain mode). Once this is completed, you should see a number of different brushes and shapes that you can select from. Our use of terrain is to create hills in the background of our scene, so it does not seem like the world is completely flat. Under the Settings, change the Brush Size and Opacity of your brush to 100 and left-click around the edges of the world to create some hills. You can increase the height of the current hills if you click on top of the previous hill. When creating hills, it's a good idea to look at multiple angles while you're building them, so you can make sure that none are too high or too low In general, you want to have taller hills as you go further back, or else you cannot see the smaller ones since they're blocked. In the Scene view, to move your camera around, you can use the toolbar at the top-right corner or hold down the right mouse button and drag it in the direction you want the camera to move around in, pressing the W, A, S, and D keys to pan. In addition, you can hold down the middle mouse button and drag it to move the camera around. The mouse wheel can be scrolled to zoom in and out from where the camera is. Even though you should plan out the level ahead of time on something like a piece of graph paper to plan out encounters, you will want to avoid making the level entirely from the preceding section, as the player will not actually see the game with a bird's eye view in the game at all (most likely). Referencing the map from the same perspective as your character will help ensure that the map looks great. To see many different angles at one time, you can use a layout with multiple views of the scene, such as the 4 Split. Once we have our land done, we now want to create some holes in the ground, which we will fill with water later. This will provide a natural barrier to our world that players will know they cannot pass, so we will create a moat by first changing the Brush Size value to 50 and then holding down the Shift key, and left-clicking around the middle of our texture. In this case, it's okay to use the Top view; remember that this will eventually be water to fill in lakes, rivers, and so on, as shown in the following screenshot:   To make this easier to see, you can click on the sun-looking light icon from the Scene tab to disable lighting for the time being. At this point, we have done what is referred to in the industry as "grayboxing," making the level in the engine in the simplest way possible but without artwork (also known as "whiteboxing" or "orangeboxing" depending on the company you're working for). At this point in a traditional studio, you'd spend time playtesting the level and iterating on it before an artist or you will take the time to make it look great. However, for our purposes, we want to create a finished project as soon as possible. When doing your own games, be sure to play your level and have others play your level before you polish it. For more information on grayboxing, check out http://www.worldofleveldesign.com/categories/level_design_tutorials/art_of_blocking_in_your_map.php. For an example with images of a graybox to the final level, PC Gamer has a nice article available at http://www.pcgamer.com/2014/03/18/building-crown-part-two-layout-design-textures-and-the-hammer-editor/. Summary With this, we now have a great-looking exterior level for our game! 
In addition, we covered a lot of features that exist in Unity for you to be able to use in your own future projects.   Resources for Article: Further resources on this subject: Learning NGUI for Unity [article] Components in Unity [article] Saying Hello to Unity and Android [article]

Diving into OOP Principles

Packt
17 May 2016
21 min read
In this article by Andrea Chiarelli, the author of the book Mastering JavaScript Object-Oriented Programming, we will discuss the OOP nature of JavaScript by showing that it complies with the OOP principles. It will also explain the main differences from classical OOP. The following topics will be addressed in the article:
What are the principles of the OOP paradigm?
Support of abstraction and modeling
How JavaScript implements Aggregation, Association, and Composition
The Encapsulation principle in JavaScript
How JavaScript supports the inheritance principle
Support of the polymorphism principle
What are the differences between classical OOP and JavaScript's OOP
(For more resources related to this topic, see here.)
Object-Oriented Programming principles
Object-Oriented Programming (OOP) is one of the most popular programming paradigms. Many developers use languages based on this programming model, such as C++, Java, C#, Smalltalk, Objective-C, and many others. One of the keys to the success of this programming approach is that it promotes a modular design and code reuse—two important features when developing complex software. However, the Object-Oriented Programming paradigm is not based on a formal standard specification. There is no technical document that defines what OOP is and what it is not. The OOP definition is mainly based on a common sense taken from the papers published by early researchers such as Kristen Nygaard, Alan Kay, William Cook, and others. An interesting discussion about various attempts to define Object-Oriented Programming can be found online at the following URL: http://c2.com/cgi/wiki?DefinitionsForOo
Anyway, a widely accepted definition to classify a programming language as Object Oriented is based on two requirements—its capability to model a problem through objects and its support of a few principles that grant modularity and code reuse. In order to satisfy the first requirement, a language must enable a developer to describe reality using objects and to define relationships among objects such as the following:
Association: This is the object's capability to refer to another independent object
Aggregation: This is the object's capability to embed one or more independent objects
Composition: This is the object's capability to embed one or more dependent objects
Commonly, the second requirement is satisfied if a language supports the following principles:
Encapsulation: This is the capability to concentrate into a single entity data and the code that manipulates it, hiding its internal details
Inheritance: This is the mechanism by which an object acquires some or all features from one or more other objects
Polymorphism: This is the capability to process objects differently based on their data type or structure
Meeting these requirements is what usually allows us to classify a language as Object Oriented.
Is JavaScript Object Oriented?
Once we have established the principles commonly accepted for defining a language as Object Oriented, can we affirm that JavaScript is an OOP language? Many developers do not consider JavaScript a true Object-Oriented language due to its lack of the class concept and because it does not enforce compliance with OOP principles. However, we can see that our informal definition makes no explicit reference to classes. Features and principles are required for objects. Classes are not a real requirement, but they are sometimes a convenient way to abstract sets of objects with common properties.
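As a quick, illustrative sketch of this last point (an example added here for clarity, not taken from the book), JavaScript lets us build an object directly as a literal and derive another object from it without any class being involved:

var person = {
  name: "John",
  surname: "Smith",
  fullName: function() {
    return this.name + " " + this.surname;
  }
};

// A second object created directly from the first one; no class is involved.
var programmer = Object.create(person);
programmer.knownLanguage = "JavaScript";

console.log(programmer.fullName());     // "John Smith" (inherited through the prototype chain)
console.log(programmer.knownLanguage);  // "JavaScript"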
So, a language can be Object Oriented if it supports objects even without classes, as in JavaScript. Moreover, the OOP principles required for a language are intended to be supported. They should not be mandatory in order to do programming in a language. The developer can choose to use constructs that allow him to create Object Oriented code or not. Many criticize JavaScript because developers can write code that breaches the OOP principles. But this is just a choice of the programmer, not a language constraint. It also happens with other programming languages, such as C++. We can conclude that lack of abstract classes and leaving the developer free to use or not features that support OOP principles are not a real obstacle to consider JavaScript an OOP language. So, let's analyze in the following sections how JavaScript supports abstraction and OOP principles. Abstraction and modeling support The first requirement for us to consider a language as Object Oriented is its support to model a problem through objects. We already know that JavaScript supports objects, but here we should determine whether they are supported in order to be able to model reality. In fact, in Object-Oriented Programming we try to model real-world entities and processes and represent them in our software. We need a model because it is a simplification of reality, it allows us to reduce the complexity offering a vision from a particular perspective and helps us to reason about relationship among entities. This simplification feature is usually known as abstraction, and it is sometimes considered as one of the principles of OOP. Abstraction is the concept of moving the focus from the details and concrete implementation of things to the features that are relevant for a specific purpose, with a more general and abstract approach. In other words, abstraction is the capability to define which properties and actions of a real-world entity have to be represented by means of objects in a program in order to solve a specific problem. For example, thanks to abstraction, we can decide that to solve a specific problem we can represent a person just as an object with name, surname, and age, since other information such as address, height, hair color, and so on are not relevant for our purpose. More than a language feature, it seems a human capability. For this reason, we prefer not to consider it an OOP principle but a (human) capability to support modeling. Modeling reality not only involves defining objects with relevant features for a specific purpose. It also includes the definition of relationships between objects, such as Association, Aggregation, and Composition. Association Association is a relationship between two or more objects where each object is independent of each other. This means that an object can exist without the other and no object owns the other. Let us clarify with an example. In order to define a parent–child relationship between persons, we can do so as follows: function Person(name, surname) { this.name = name; this.surname = surname; this.parent = null; } var johnSmith = new Person("John", "Smith"); var fredSmith = new Person("Fred", "Smith"); fredSmith.parent = johnSmith; The assignment of the object johnSmith to the parent property of the object fredSmith establishes an association between the two objects. Of course, the object johnSmith lives independently from the object fredSmith and vice versa. Both can be created and deleted independently each other. 
As we can see from the example, JavaScript allows us to define an association between objects using a simple object reference through a property.
Aggregation
Aggregation is a special form of the association relationship where one object has a more important role than the other. Usually, this major role determines a sort of ownership of one object in relation to the other. The owner object is often called the aggregate and the owned object is called the component. However, each object has an independent life. An example of an aggregation relationship is the one between a company and its employees, as in the following example:
var company = {
  name: "ACME Inc.",
  employees: []
};
var johnSmith = new Person("John", "Smith");
var marioRossi = new Person("Mario", "Rossi");
company.employees.push(johnSmith);
company.employees.push(marioRossi);
The person objects added to the employees collection help to define the company object, but they are independent from it. If the company object is deleted, each single person still lives. However, the real meaning of a company is bound to the presence of its employees. Again, the code shows us that the aggregation relationship is supported by JavaScript by means of object references.
It is important not to confuse Association with Aggregation. Even if the support for the two relationships is syntactically identical, that is, the assignment or attachment of an object to a property, from a conceptual point of view they represent different situations. Aggregation is the mechanism that allows you to create an object consisting of several objects, while association relates autonomous objects. In any case, JavaScript exercises no control over the way in which we associate or aggregate objects; the constraint raised by Association and Aggregation is more conceptual than technical.
Composition
Composition is a strong type of Aggregation, where each component object has no independent life without its owner, the aggregate. Consider the following example:
var person = {
  name: "John",
  surname: "Smith",
  address: {
    street: "123 Duncannon Street",
    city: "London",
    country: "United Kingdom"
  }
};
This code defines a person with his address represented as an object. The address property is strictly bound to the person object. Its life depends on the life of the person and it cannot have an independent life without the person. If the person object is deleted, the address object is deleted as well. In this case, the strict relation between the person and his address is expressed in JavaScript by assigning the literal representing the address directly to the address property.
We do not need to know how the car works in details, how its motor burns fuel or transmits movement to the wheels. To understand the importance of this principle also in software development, consider the following code: var company = { name: "ACME Inc.", employees: [], sortEmployeesByName: function() {...} }; It creates a company object with a name, a list of employees and a method to sort the list of employees using their name property. If we need to get a sorted list of employees of the company, we simply need to know that the sortEmployeesByName() method accomplishes this task. We do not need to know how this method works, which algorithm it implements. That is an implementation detail that encapsulation hides to us. Hiding internal details and complexity has two main reasons: The first reason is to provide a simplified and understandable way to use an object without the need to understand the complexity inside. In our example, we just need to know that to sort employees, we have to call a specific method. The second reason is to simplify change management. Changes to the internal sort algorithm do not affect our way to order employees by name. We always continue to call the same method. Maybe we will get a more efficient execution, but the expected result will not change. We said that encapsulation hides internal details in order to simplify both the use of an object and the change of its internal implementation. However, when internal implementation depends on publicly accessible properties, we risk to frustrate the effort of hiding internal behavior. For example, what happens if you assign a string to the property employees of the object company? company.employees = "this is a joke!"; company.sortEmployeesByName(); The assignment of a string to a property whose value is an array is perfectly legal in JavaScript, since it is a language with dynamic typing. But most probably, we will get an exception when calling the sort method after this assignment, since the sort algorithm expects an array. In this case, the encapsulation principle has not been completely implemented. A general approach to prevent direct access to relevant properties is to replace them with methods. For example, we can redefine our company object as in the following: function Company(name) { var employees = []; this.name = name; this.getEmployees = function() { return employees; }; this.addEmployee = function(employee) { employees.push(employee); }; this.sortEmployeesByName = function() { ... }; } var company = new Company("ACME Inc."); With this approach, we cannot access directly the employees property, but we need to use the getEmployees() method to obtain the list of employees of the company and addEmployee() to add an employee to the list. This guarantees that the internal state remains really hidden and consistent. The way we created methods for the Company() constructor is not the best one. This is just one possible approach to enforce encapsulation by protecting the internal state of an object. This kind of data protection is usually called information hiding and, although often linked to encapsulation, it should be considered as an autonomous principle. Information hiding deals with the accessibility to an object's members, in particular to properties. While encapsulation concerns hiding details, the information hiding principle usually allows different access levels to the members of an object. 
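To illustrate what different access levels can look like in practice, here is a small sketch added for clarity (it is not part of the book's Company example): a constructor that mixes a closure-private variable with an ordinary public property.

function Counter(label) {
  var count = 0;           // private: reachable only through the methods defined below
  this.label = label;      // public: freely readable and writable from outside

  this.increment = function() {
    count += 1;
  };
  this.getCount = function() {
    return count;
  };
}

var counter = new Counter("visits");
counter.increment();
console.log(counter.label);      // "visits" (a public member)
console.log(counter.getCount()); // 1 (private state exposed only through a method)
console.log(counter.count);      // undefined (the private variable is not a property)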
Inheritance In Object-Oriented Programming, inheritance enables new objects to acquire the properties of existing objects. This relationship between two objects is very common and can be found in many situations in real life. It usually refers to creating a specialized object starting from a more general one. Let's consider, for example, a person: he has some features such as name, surname, height, weight, and so on. The set of features describes a generic entity that represents a person. Using abstraction, we can select the features needed for our purpose and represent a person as an object: If we need a special person who is able to program computers, that is a programmer, we need to create an object that has all the properties of a generic person plus some new properties that characterize the programmer object. For instance, the new programmer object can have a property describing which programming language he knows. Suppose we choose to create the new programmer object by duplicating the properties of the person object and adding to it the programming language knowledge as follows: This approach is in contrast with the Object-Oriented Programming goals. In particular, it does not reuse existing code, since we are duplicating the properties of the person object. A more appropriate approach should reuse the code created to define the person object. This is where the inheritance principle can help us. It allows to share common features between objects avoiding code duplication. Inheritance is also called subclassing in languages that support classes. A class that inherits from another class is called subclass, while the class from which it is derived is called superclass. Apart from the naming, the inheritance concept is the same, although of course it does not seem suited to JavaScript. We can implement inheritance in JavaScript in various ways. Consider, for example, the following constructor of person objects: function Person() { this.name = ""; this.surname = ""; } In order to define a programmer as a person specialized in computer programming, we will add a new property describing its knowledge about a programming language: knownLanguage. A simple approach to create the programmer object that inherits properties from person is based on prototype. Here is a possible implementation: function Programmer() { this.knownLanguage = ""; } Programmer.prototype = new Person(); We will create a programmer with the following code: var programmer = new Programmer(); We will obtain an object that has the properties of the person object (name and surname) and the specific property of the programmer (knownLanguage), that is, the programmer object inherits the person properties. This is a simple example to demonstrate that JavaScript supports the inheritance principle of Object-Oriented Programming at its basic level. Inheritance is a complex concept that has many facets and several variants in programming, many of them dependent on the used language. Polymorphism In Object-Oriented Programming, polymorphism is understood in different ways, even if the basis is a common notion—the ability to handle multiple data types uniformly. Support of polymorphism brings benefits in programming that go toward the overall goal of OOP. Mainly, it reduces coupling in our application, and in some cases, allows to create more compact code. 
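For instance, one widely used JavaScript variant of the prototype chaining shown above (offered here as an additional sketch, not as the book's prescribed approach) links the prototypes with Object.create() and explicitly reuses the parent constructor, so no Person instance needs to be created just to build the prototype:

function Person(name, surname) {
  this.name = name;
  this.surname = surname;
}

function Programmer(name, surname, knownLanguage) {
  Person.call(this, name, surname);   // reuse the parent constructor
  this.knownLanguage = knownLanguage;
}

Programmer.prototype = Object.create(Person.prototype);
Programmer.prototype.constructor = Programmer;

var programmer = new Programmer("Mario", "Rossi", "JavaScript");
console.log(programmer.name);               // "Mario"
console.log(programmer.knownLanguage);      // "JavaScript"
console.log(programmer instanceof Person);  // true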
Most common ways to support polymorphism by a programming language include: Methods that take parameters with different data types (overloading) Management of generic types, not known in advance (parametric polymorphism) Expressions whose type can be represented by a class and classes derived from it (subtype polymorphism or inclusion polymorphism) In most languages, overloading is what happens when you have two methods with the same name but different signature. At compile time, the compiler works out which method to call based on matching between types of invocation arguments and method's parameters. The following is an example of method overloading in C#: public int CountItems(int x) { return x.ToString().Length; } public int CountItems(string x) { return x.Length; } The CountItems()method has two signatures—one for integers and one for strings. This allows to count the number of digits in a number or the number of characters in a string in a uniform manner, just calling the same method. Overloading can also be expressed through methods with different number of arguments, as shown in the following C# example: public int Sum(int x, int y) { return Sum(x, y, 0); } public int Sum(int x, int y, int z) { return x+ y + z; } Here, we have the Sum()method that is able to sum two or three integers. The correct method definition will be detected on the basis of the number of arguments passed. As JavaScript developers, we are able to replicate this behavior in our scripts. For example, the C# CountItems() method become in JavaScript as follows: function countItems(x) { return x.toString().length; } While the Sum() example will be as follows: function sum(x, y, z) { x = x?x:0; y = y?y:0; z = z?z:0; return x + y + z; } Or, using the more convenient ES6 syntax: function sum(x = 0, y = 0, z = 0) { return x + y + z; } These examples demonstrate that JavaScript supports overloading in a more immediate way than strong-typed languages. In strong-typed languages, overloading is sometimes called static polymorphism, since the correct method to invoke is detected statically by the compiler at compile time. Parametric polymorphism allows a method to work on parameters of any type. Often it is also called generics and many languages support it in built-in methods. For example, in C#, we can define a list of items whose type is not defined in advance using the List<T> generic type. This allows us to create lists of integers, strings, or any other type. We can also create our generic class as shown by the following C# code: public class Stack<T> { private T[] items; private int count; public void Push(T item) { ... } public T Pop() { ... } } This code defines a typical stack implementation whose item's type is not defined. We will be able to create, for example, a stack of strings with the following code: var stack = new Stack<String>(); Due to its dynamic data typing, JavaScript supports parametric polymorphism implicitly. In fact, the type of function's parameters is inherently generic, since its type is set when a value is assigned to it. The following is a possible implementation of a stack constructor in JavaScript: function Stack() { this.stack = []; this.pop = function(){ return this.stack.pop(); } this.push = function(item){ this.stack.push(item); } } Subtype polymorphism allows to consider objects of different types, but with an inheritance relationship, to be handled consistently. This means that wherever I can use an object of a specific type, here I can use an object of a type derived from it. 
Let's see a C# example to clarify this concept: public class Person { public string Name {get; set;} public string SurName {get; set;} } public class Programmer:Person { public String KnownLanguage {get; set;} } public void WriteFullName(Person p) { Console.WriteLine(p.Name + " " + p.SurName); } var a = new Person(); a.Name = "John"; a.SurName = "Smith"; var b = new Programmer(); b.Name = "Mario"; b.SurName = "Rossi"; b.KnownLanguage = "C#"; WriteFullName(a); //result: John Smith WriteFullName(b); //result: Mario Rossi In this code, we again present the definition of the Person class and its derived class Programmer and define the method WriteFullName() that accepts argument of type Person. Thanks to subtype polymorphism, we can pass to WriteFullName() also objects of type Programmer, since it is derived from Person. In fact, from a conceptual point of view a programmer is also a person, so subtype polymorphism fits to a concrete representation of reality. Of course, the C# example can be easily reproduced in JavaScript since we have no type constraint. Let's see the corresponding code: function Person() { this.name = ""; this.surname = ""; } function Programmer() { this.knownLanguage = ""; } Programmer.prototype = new Person(); function writeFullName(p) { console.log(p.name + " " + p.surname); } var a = new Person(); a.name = "John"; a.surname = "Smith"; var b = new Programmer(); b.name = "Mario"; b.surname = "Rossi"; b.knownLanguage = "JavaScript"; writeFullName(a); //result: John Smith writeFullName(b); //result: Mario Rossi As we can see, the JavaScript code is quite similar to the C# code and the result is the same. JavaScript OOP versus classical OOP The discussion conducted so far shows how JavaScript supports the fundamental Object-Oriented Programming principles and can be considered a true OOP language as many other. However, JavaScript differs from most other languages for certain specific features which can create some concern to the developers used to working with programming languages that implement the classical OOP. The first of these features is the dynamic nature of the language both in data type management and object creation. Since data types are dynamically evaluated, some features of OOP, such as polymorphism, are implicitly supported. Moreover, the ability to change an object structure at runtime breaks the common sense that binds an object to a more abstract entity like a class. The lack of the concept of class is another big difference with the classical OOP. Of course, we are talking about the class generalization, nothing to do with the class construct introduced by ES6 that represents just a syntactic convenience for standard JavaScript constructors. Classes in most Object-Oriented languages represent a generalization of objects, that is, an extra level of abstraction upon the objects. So, classical Object-Oriented programming has two types of abstractions—classes and objects. An object is an abstraction of a real-world entity while a class is an abstraction of an object or another class (in other words, it is a generalization). Objects in classical OOP languages can only be created by instantiating classes. JavaScript has a different approach in object management. It has just one type of abstraction—the objects. Unlike the classical OOP approach, an object can be created directly as an abstraction of a real-world entity or as an abstraction of another object. In the latter case the abstracted object is called prototype. 
As opposed to the classical OOP approach, the JavaScript approach is sometimes called Prototypal Object-Oriented Programming. Of course, the lack of a notion of class in JavaScript affects the inheritance mechanism. In fact, while in classical OOP inheritance is an operation allowed on classes, in prototypal OOP inheritance is an operation on objects. That does not mean that classical OOP is better than prototypal OOP or vice versa. They are simply different approaches. However, we cannot ignore that these differences have an impact on the way we manage objects. At the very least, we note that while in classical OOP classes are immutable, that is, we cannot add, change, or remove properties or methods at runtime, in prototypal OOP objects and prototypes are extremely flexible. Moreover, classical OOP adds an extra level of abstraction with classes, leading to more verbose code, while prototypal OOP is more immediate and requires more compact code.
Summary
In this article, we explored the basic principles of the Object-Oriented Programming paradigm. We focused on abstraction to define objects; on association, aggregation, and composition to define relationships between objects; and on the encapsulation, inheritance, and polymorphism principles required by OOP. We have seen how JavaScript supports all the features that allow us to define it as a true OOP language, and we have compared classical OOP with prototypal OOP, establishing that JavaScript is a true Object-Oriented language like Java, C#, and C++.
Resources for Article:
Further resources on this subject: Just Object Oriented Programming (Object Oriented Programming, explained) [article] Introducing Object Oriented Programming with TypeScript [article] Python 3 Object Oriented Programming: Managing objects [article]

Expert Python Programming: Interfaces

Packt
17 May 2016
19 min read
This article by, Michał Jaworski and Tarek Ziadé, the authors of the book, Expert Python Programming - Second Edition, will mainly focus on interfaces. (For more resources related to this topic, see here.) An interface is a definition of an API. It describes a list of methods and attributes a class should have to implement with the desired behavior. This description does not implement any code but just defines an explicit contract for any class that wishes to implement the interface. Any class can then implement one or several interfaces in whichever way it wants. While Python prefers duck-typing over explicit interface definitions, it may be better to use them sometimes. For instance, explicit interface definition makes it easier for a framework to define functionalities over interfaces. The benefit is that classes are loosely coupled, which is considered as a good practice. For example, to perform a given process, a class A does not depend on a class B, but rather on an interface I. Class B implements I, but it could be any other class. The support for such a technique is built-in in many statically typed languages such as Java or Go. The interfaces allow the functions or methods to limit the range of acceptable parameter objects that implement a given interface, no matter what kind of class it comes from. This allows for more flexibility than restricting arguments to given types or their subclasses. It is like an explicit version of duck-typing behavior: Java uses interfaces to verify a type safety at compile time rather than use duck-typing to tie things together at run time. Python has a completely different typing philosophy to Java, so it does not have native support for interfaces. Anyway, if you would like to have more explicit control on application interfaces, there are generally two solutions to choose from: Use some third-party framework that adds a notion of interfaces Use some of the advanced language features to build your methodology for handling interfaces. Using zope.interface There are a few frameworks that allow you to build explicit interfaces in Python. The most notable one is a part of the Zope project. It is the zope.interface package. Although, nowadays, Zope is not as popular as it used to be, the zope.interface package is still one of the main components of the Twisted framework. The core class of the zope.interface package is the Interface class. It allows you to explicitly define a new interface by subclassing. Let's assume that we want to define the obligatory interface for every implementation of a rectangle: from zope.interface import Interface, Attribute class IRectangle(Interface): width = Attribute("The width of rectangle") height = Attribute("The height of rectangle") def area(): """ Return area of rectangle """ def perimeter(): """ Return perimeter of rectangle """ Some important things to remember when defining interfaces with zope.interface are as follows: The common naming convention for interfaces is to use I as the name suffix. The methods of the interface must not take the self parameter. As the interface does not provide concrete implementation, it should consist only of empty methods. You can use the pass statement, raise NotImplementedError, or provide a docstring (preferred). An interface can also specify the required attributes using the Attribute class. When you have such a contract defined, you can then define new concrete classes that provide implementation for our IRectangle interface. 
In order to do that, you need to use the implementer() class decorator and implement all of the defined methods and attributes: @implementer(IRectangle) class Square: """ Concrete implementation of square with rectangle interface """ def __init__(self, size): self.size = size @property def width(self): return self.size @property def height(self): return self.size def area(self): return self.size ** 2 def perimeter(self): return 4 * self.size @implementer(IRectangle) class Rectangle: """ Concrete implementation of rectangle """ def __init__(self, width, height): self.width = width self.height = height def area(self): return self.width * self.height def perimeter(self): return self.width * 2 + self.height * 2 It is common to say that the interface defines a contract that a concrete implementation needs to fulfill. The main benefit of this design pattern is being able to verify consistency between contract and implementation before the object is being used. With the ordinary duck-typing approach, you only find inconsistencies when there is a missing attribute or method at runtime. With zope.interface, you can introspect the actual implementation using two methods from the zope.interface.verify module to find inconsistencies early on: verifyClass(interface, class_object): This verifies the class object for existence of methods and correctness of their signatures without looking for attributes verifyObject(interface, instance): This verifies the methods, their signatures, and also attributes of the actual object instance Since we have defined our interface and two concrete implementations, let's verify their contracts in an interactive session: >>> from zope.interface.verify import verifyClass, verifyObject >>> verifyObject(IRectangle, Square(2)) True >>> verifyClass(IRectangle, Square) True >>> verifyObject(IRectangle, Rectangle(2, 2)) True >>> verifyClass(IRectangle, Rectangle) True Nothing impressive. The Rectangle and Square classes carefully follow the defined contract so there is nothing more to see than a successful verification. But what happens when we make a mistake? Let's see an example of two classes that fail to provide full IRectangle interface implementation: @implementer(IRectangle) class Point: def __init__(self, x, y): self.x = x self.y = y @implementer(IRectangle) class Circle: def __init__(self, radius): self.radius = radius def area(self): return math.pi * self.radius ** 2 def perimeter(self): return 2 * math.pi * self.radius The Point class does not provide any method or attribute of the IRectangle interface, so its verification will show inconsistencies already on the class level: >>> verifyClass(IRectangle, Point) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "zope/interface/verify.py", line 102, in verifyClass return _verify(iface, candidate, tentative, vtype='c') File "zope/interface/verify.py", line 62, in _verify raise BrokenImplementation(iface, name) zope.interface.exceptions.BrokenImplementation: An object has failed to implement interface <InterfaceClass __main__.IRectangle> The perimeter attribute was not provided. The Circle class is a bit more problematic. It has all the interface methods defined but breaks the contract on the instance attribute level. 
This is the reason why, in most cases, you need to use the verifyObject() function to completely verify the interface implementation: >>> verifyObject(IRectangle, Circle(2)) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "zope/interface/verify.py", line 105, in verifyObject return _verify(iface, candidate, tentative, vtype='o') File "zope/interface/verify.py", line 62, in _verify raise BrokenImplementation(iface, name) zope.interface.exceptions.BrokenImplementation: An object has failed to implement interface <InterfaceClass __main__.IRectangle> The width attribute was not provided. Using zope.inteface is an interesting way to decouple your application. It allows you to enforce proper object interfaces without the need for the overblown complexity of multiple inheritance, and it also allows to catch inconsistencies early. However, the biggest downside of this approach is the requirement that you explicitly define that the given class follows some interface in order to be verified. This is especially troublesome if you need to verify instances coming from external classes of built-in libraries. zope.interface provides some solutions for that problem, and you can of course handle such issues on your own by using the adapter pattern, or even monkey-patching. Anyway, the simplicity of such solutions is at least arguable. Using function annotations and abstract base classes Design patterns are meant to make problem solving easier and not to provide you with more layers of complexity. The zope.interface is a great concept and may greatly fit some projects, but it is not a silver bullet. By using it, you may soon find yourself spending more time on fixing issues with incompatible interfaces for third-party classes and providing never-ending layers of adapters instead of writing the actual implementation. If you feel that way, then this is a sign that something went wrong. Fortunately, Python supports for building lightweight alternative to the interfaces. It's not a full-fledged solution like zope.interface or its alternatives but it generally provides more flexible applications. You may need to write a bit more code, but in the end you will have something that is more extensible, better handles external types, and may be more future proof. Note that Python in its core does not have explicit notions of interfaces, and probably will never have, but has some of the features that allow you to build something that resembles the functionality of interfaces. The features are: Abstract base classes (ABCs) Function annotations Type annotations The core of our solution is abstract base classes, so we will feature them first. As you probably know, the direct type comparison is considered harmful and not pythonic. You should always avoid comparisons as follows: assert type(instance) == list Comparing types in functions or methods that way completely breaks the ability to pass a class subtype as an argument to the function. The slightly better approach is to use the isinstance() function that will take the inheritance into account: assert isinstance(instance, list) The additional advantage of isinstance() is that you can use a larger range of types to check the type compatibility. For instance, if your function expects to receive some sort of sequence as the argument, you can compare against the list of basic types: assert isinstance(instance, (list, tuple, range)) Such a way of type compatibility checking is OK in some situations but it is still not perfect. 
It will work with any subclass of list, tuple, or range, but will fail if the user passes something that behaves exactly the same as one of these sequence types but does not inherit from any of them. For instance, let's relax our requirements and say that you want to accept any kind of iterable as an argument. What would you do? The list of basic types that are iterable is actually pretty long. You need to cover list, tuple, range, str, bytes, dict, set, generators, and a lot more. The list of applicable built-in types is long, and even if you cover all of them, it will still not allow you to check against a custom class that defines the __iter__() method but inherits directly from object. And this is the kind of situation where abstract base classes (ABC) are the proper solution.
ABC is a class that does not need to provide a concrete implementation but instead defines a blueprint of a class that may be used to check against type compatibility. This concept is very similar to the concept of abstract classes and virtual methods known in the C++ language. Abstract base classes are used for two purposes:
Checking for implementation completeness
Checking for implicit interface compatibility
So, let's assume we want to define an interface which ensures that a class has a push() method. We need to create a new abstract base class using a special ABCMeta metaclass and an abstractmethod() decorator from the standard abc module:
from abc import ABCMeta, abstractmethod

class Pushable(metaclass=ABCMeta):

    @abstractmethod
    def push(self, x):
        """ Push argument no matter what it means """
The abc module also provides an ABC base class that can be used instead of the metaclass syntax:
from abc import ABC, abstractmethod

class Pushable(ABC):

    @abstractmethod
    def push(self, x):
        """ Push argument no matter what it means """
Once it is done, we can use that Pushable class as a base class for a concrete implementation and it will guard us from the instantiation of objects that would have an incomplete implementation. Let's define DummyPushable, which implements all interface methods, and IncompletePushable, which breaks the expected contract:
class DummyPushable(Pushable):
    def push(self, x):
        return

class IncompletePushable(Pushable):
    pass
If you want to obtain the DummyPushable instance, there is no problem because it implements the only required push() method:
>>> DummyPushable()
<__main__.DummyPushable object at 0x10142bef0>
But if you try to instantiate IncompletePushable, you will get TypeError because of the missing implementation of the push() method:
>>> IncompletePushable()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class IncompletePushable with abstract methods push
The preceding approach is a great way to ensure implementation completeness of base classes but is as explicit as the zope.interface alternative. The DummyPushable instances are of course also instances of Pushable because DummyPushable is a subclass of Pushable. But how about other classes with the same methods that are not descendants of Pushable? Let's create one and see:
>>> class SomethingWithPush:
...     def push(self, x):
...         pass
...
>>> isinstance(SomethingWithPush(), Pushable)
False
Something is still missing. The SomethingWithPush class definitely has a compatible interface but is not considered an instance of Pushable yet. So, what is missing?
The answer is the __subclasshook__(subclass) method, which allows you to inject your own logic into the procedure that determines whether an object is an instance of a given class. Unfortunately, you need to provide it by yourself, as the abc creators did not want to constrain developers by overriding the whole isinstance() mechanism. We get full power over it, but we are forced to write some boilerplate code. Although you can do whatever you want to, usually the only reasonable thing to do in the __subclasshook__() method is to follow the common pattern. The standard procedure is to check whether the set of defined methods is available somewhere in the MRO of the given class:
from abc import ABCMeta, abstractmethod

class Pushable(metaclass=ABCMeta):

    @abstractmethod
    def push(self, x):
        """ Push argument no matter what it means """

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Pushable:
            if any("push" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented
With the __subclasshook__() method defined that way, you can now confirm that instances that implement the interface implicitly are also considered instances of the interface:
>>> class SomethingWithPush:
...     def push(self, x):
...         pass
...
>>> isinstance(SomethingWithPush(), Pushable)
True
Unfortunately, this approach to the verification of type compatibility and implementation completeness does not take into account the signatures of class methods. So, if the number of expected arguments differs in the implementation, it will still be considered compatible. In most cases, this is not an issue, but if you need such fine-grained control over interfaces, the zope.interface package allows for that. As already said, the __subclasshook__() method does not prevent you from adding more complexity to the isinstance() function's logic to achieve a similar level of control.
The two other features that complement abstract base classes are function annotations and type hints. Function annotation is a syntax element that allows you to annotate functions and their arguments with arbitrary expressions. This is only a feature stub that does not provide any syntactic meaning. There is no utility in the standard library that uses this feature to enforce any behavior. Anyway, you can use it as a convenient and lightweight way to inform the developer of the expected argument interface. For instance, consider this IRectangle interface rewritten from zope.interface to an abstract base class:
from abc import (
    ABCMeta,
    abstractmethod,
    abstractproperty
)

class IRectangle(metaclass=ABCMeta):

    @abstractproperty
    def width(self):
        return

    @abstractproperty
    def height(self):
        return

    @abstractmethod
    def area(self):
        """ Return rectangle area """

    @abstractmethod
    def perimeter(self):
        """ Return rectangle perimeter """

    @classmethod
    def __subclasshook__(cls, C):
        if cls is IRectangle:
            if all([
                any("area" in B.__dict__ for B in C.__mro__),
                any("perimeter" in B.__dict__ for B in C.__mro__),
                any("width" in B.__dict__ for B in C.__mro__),
                any("height" in B.__dict__ for B in C.__mro__),
            ]):
                return True
        return NotImplemented
If you have a function that works only on rectangles, let's say draw_rectangle(), you could annotate the interface of the expected argument as follows:
def draw_rectangle(rectangle: IRectangle):
    ...
This adds nothing more than information for the developer about the expected argument interface, and even this is done through an informal contract because, as we know, bare annotations contain no syntactic meaning.
However, they are accessible at runtime, so we can do something more. Here is an example implementation of a generic decorator that is able to verify interface from function annotation if it is provided using abstract base classes: def ensure_interface(function): signature = inspect.signature(function) parameters = signature.parameters @wraps(function) def wrapped(*args, **kwargs): bound = signature.bind(*args, **kwargs) for name, value in bound.arguments.items(): annotation = parameters[name].annotation if not isinstance(annotation, ABCMeta): continue if not isinstance(value, annotation): raise TypeError( "{} does not implement {} interface" "".format(value, annotation) ) function(*args, **kwargs) return wrapped Once it is done, we can create some concrete class that implicitly implements the IRectangle interface (without inheriting from IRectangle) and update the implementation of the draw_rectangle() function to see how the whole solution works: class ImplicitRectangle: def __init__(self, width, height): self._width = width self._height = height @property def width(self): return self._width @property def height(self): return self._height def area(self): return self.width * self.height def perimeter(self): return self.width * 2 + self.height * 2 @ensure_interface def draw_rectangle(rectangle: IRectangle): print( "{} x {} rectangle drawing" "".format(rectangle.width, rectangle.height) ) If we feed the draw_rectangle() function with an incompatible object, it will now raise TypeError with a meaningful explanation: >>> draw_rectangle('foo') Traceback (most recent call last): File "<input>", line 1, in <module> File "<input>", line 101, in wrapped TypeError: foo does not implement <class 'IRectangle'> interface But if we use ImplicitRectangle or anything else that resembles the IRectangle interface, the function executes as it should: >>> draw_rectangle(ImplicitRectangle(2, 10)) 2 x 10 rectangle drawing Our example implementation of ensure_interface() is based on the typechecked() decorator from the typeannotations project that tries to provide run-time checking capabilities (refer to https://github.com/ceronman/typeannotations). Its source code might give you some interesting ideas about how to process type annotations to ensure run-time interface checking. The last feature that can be used to complement this interface pattern landscape are type hints. Type hints are described in detail by PEP 484 and were added to the language quite recently. They are exposed in the new typing module and are available from Python 3.5. Type hints are built on top of function annotations and reuse this slightly forgotten syntax feature of Python 3. They are intended to guide type hinting and check for various yet-to-come Python type checkers. The typing module and PEP 484 document aim to provide a standard hierarchy of types and classes that should be used for describing type annotations. Still, type hints do not seem to be something revolutionary because this feature does not come with any type checker built-in into the standard library. If you want to use type checking or enforce strict interface compatibility in your code, you need to create your own tool because there is none worth recommendation yet. This is why we won't dig into details of PEP 484. Anyway, type hints and the documents describing them are worth mentioning because if some extraordinary solution emerges in the field of type checking in Python, it is highly probable that it will be based on PEP 484. 
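To give a flavor of what these hints look like in practice, here is a minimal sketch (an added illustration assuming Python 3.5 or later, reusing the IRectangle class defined above simply as an annotation; the collect_areas() function is an invented example, not from the book):

from typing import Iterable, List

def collect_areas(rectangles: Iterable[IRectangle]) -> List[float]:
    # The annotations are hints for tools and readers only;
    # nothing is checked or enforced at runtime by Python itself.
    return [rectangle.area() for rectangle in rectangles]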
Using collections.abc

Abstract base classes are like small building blocks for creating a higher level of abstraction. They allow you to implement really usable interfaces, but they are very generic and designed to handle a lot more than this single design pattern. You can unleash your creativity and do magical things, but building something generic and really usable may require a lot of work. Work that may never pay off.

This is why custom abstract base classes are not used so often. Despite that, the collections.abc module provides a lot of predefined ABCs that allow you to verify the interface compatibility of many basic Python types. With the base classes provided in this module, you can check, for example, whether a given object is callable, is a mapping, or supports iteration. Using them with the isinstance() function is way better than comparing objects against the base Python types. You should definitely know how to use these base classes even if you don't want to define your own custom interfaces with ABCMeta.

The most common abstract base classes from collections.abc that you will use from time to time are:

Container: This interface means that the object supports the in operator and implements the __contains__() method
Iterable: This interface means that the object supports iteration and implements the __iter__() method
Callable: This interface means that it can be called like a function and implements the __call__() method
Hashable: This interface means that the object is hashable (can be included in sets and used as a key in dictionaries) and implements the __hash__() method
Sized: This interface means that the object has a size (can be a subject of the len() function) and implements the __len__() method

A full list of the available abstract base classes from the collections.abc module is available in the official Python documentation (refer to https://docs.python.org/3/library/collections.abc.html).

Summary

Design patterns are reusable, somewhat language-specific solutions to common problems in software design. They are a part of the culture of all developers, no matter what language they use. We learned about a small part of design patterns in this article. We covered what interfaces are and how they can be used in Python.

Resources for Article:

Further resources on this subject:

Creating User Interfaces [article]
Graphical User Interfaces for OpenSIPS 1.6 [article]

article-image-python-scripting-essentials
Packt
17 May 2016
15 min read

Python Scripting Essentials

In this article by Rejah Rehim, author of the book Mastering Python Penetration Testing, we will cover:

Setting up the scripting environment in different operating systems
Installing third-party Python libraries
Working with virtual environments
Python language basics

(For more resources related to this topic, see here.)

Python is still the leading language in the world of penetration testing (pentesting) and information security. Python-based tools include all kinds of tools, from those used for inputting massive amounts of random data to find errors and security loopholes, to proxies and even exploit frameworks. If you are interested in tinkering with pentesting tasks, Python is the best language to learn because of its large number of reverse engineering and exploitation libraries.

Over the years, Python has received numerous updates and upgrades. For example, Python 2 was released in 2000 and Python 3 in 2008. Unfortunately, Python 3 is not backward compatible, hence most of the programs written in Python 2 will not work in Python 3. Even though Python 3 was released in 2008, most of the libraries and programs still use Python 2. To do better penetration testing, a tester should be able to read, write, and rewrite Python scripts.

Security experts have preferred Python as a scripting language for developing security toolkits. Its human-readable code, modular design, and large number of libraries provide a starting point for security experts and researchers to create sophisticated tools with it. Python comes with a vast standard library that accommodates almost everything, from simple I/O to platform-specific API calls. Many of the default and user-contributed libraries and modules can help us in penetration testing by building tools to achieve interesting tasks.

Setting up the scripting environment

Your scripting environment is basically the computer you use for your daily work, combined with all the tools in it that you use to write and run Python programs. The best system to learn on is the one you are using right now. This section will help you to configure the Python scripting environment on your computer so that you can create and run your own programs.

If you are using a Mac OS X or Linux installation on your computer, you may have a Python interpreter pre-installed. To find out if you have one, open a terminal and type python. You will probably see something like this:

$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

From the preceding output, we can see that Python 2.7.6 is installed on this system. By issuing python in your terminal, you started the Python interpreter in interactive mode. Here, you can play around with Python commands; what you type will run and you'll see the output immediately.

You can use your favorite text editor to write your Python programs. If you do not have one, try installing Geany or Sublime Text, and it should be perfect for you. These are simple editors and offer a straightforward way to write as well as run your Python programs. In Geany, the output is shown in a separate terminal window, whereas Sublime Text uses an embedded terminal window. Sublime Text is not free, but it has a flexible trial policy that allows you to use the editor without restriction. It is one of the few cross-platform text editors that is quite apt for beginners and has a full range of functions targeting professionals.
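If you want a quick way to confirm that the interpreter and your editor are wired up correctly, the following one-line script is enough; it is a generic illustration rather than something from the original article. Save it as hello.py and run it from your editor or with python hello.py:

#!/usr/bin/python
# A minimal script to verify the scripting environment works
print "Hello, scripting world!"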
Setting up in Linux

Linux systems are built in a way that makes it smooth for users to get started with Python programming. Most Linux distributions already have Python installed. For example, the latest versions of Ubuntu and Fedora come with Python 2.7. Also, the latest versions of Red Hat Enterprise Linux (RHEL) and CentOS come with Python 2.6. Just for the record, you might want to check it.

If it is not installed, the easiest way to install Python is to use the default package manager of your distribution, such as apt-get, yum, and so on. Install Python by issuing the following commands in the terminal.

For Debian / Ubuntu Linux / Kali Linux users:

sudo apt-get install python2

For Red Hat / RHEL / CentOS Linux users:

sudo yum install python

To install Geany, leverage your distribution's package manager.

For Debian / Ubuntu Linux / Kali Linux users:

sudo apt-get install geany geany-common

For Red Hat / RHEL / CentOS Linux users:

sudo yum install geany

Setting up in Mac

Even though Macintosh is a good platform to learn Python, many people using Macs actually run some Linux distribution or other on their computer, or run Python within a virtual Linux machine. The latest version of Mac OS X, Yosemite, comes with Python 2.7 preinstalled. Once you verify that it is working, install Sublime Text.

For Python to run on your Mac, you have to install GCC, which can be obtained by downloading Xcode or its smaller command-line tools package. Also, we need to install Homebrew, a package manager. To install Homebrew, open Terminal and run the following:

$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

After installing Homebrew, you have to insert the Homebrew directory into your PATH environment variable. You can do this by including the following line in your ~/.profile file:

export PATH=/usr/local/bin:/usr/local/sbin:$PATH

Now we are ready to install Python 2.7; run the following command in your terminal, which will do the rest:

$ brew install python

To install Sublime Text, go to Sublime Text's downloads page at http://www.sublimetext.com/3 and click on the OS X link. This will get you the Sublime Text installer for your Mac.

Setting up in Windows

Windows does not have Python preinstalled. To check whether it is installed, open a command prompt, type the word python, and press Enter. In most cases, you will get a message that says Windows does not recognize python as a command. We have to download an installer that will set up Python for Windows. Then, we have to install and configure Geany to run Python programs.

Go to Python's download page at https://www.python.org/downloads/windows/ and download the Python 2.7 installer that is compatible with your system. If you are not sure of your operating system's architecture, download the 32-bit installer, which will work on both architectures; the 64-bit installer will only work on 64-bit systems.

To install Geany, go to Geany's download page at http://www.geany.org/Download/Releases and download the full installer variant, which is described as Full Installer including GTK 2.16. By default, Geany doesn't know where Python resides on your system, so we need to configure it manually. For this, write a Hello World program in Geany, save it anywhere on your system as hello.py, and run it.

There are three ways you can run a Python program in Geany:

Select Build | Execute
Press F5
Click the icon with three gears on it

When you have a running hello.py program in Geany, go to Build | Set Build Commands.
Then, fill in the compile command with C:\Python27\python -m py_compile "%f" and the execute command with C:\Python27\python "%f". Now, you can run your Python programs while coding in Geany.

It is recommended to run a Kali Linux distribution as a virtual machine and use this as your scripting environment. Kali Linux comes with a number of tools preinstalled and is based on Debian Linux, so you'll also be able to install a wide variety of additional tools and libraries. Also, some of the libraries will not work properly on Windows systems.

Installing third-party libraries

We will be using many Python libraries, and this section will help you install and use third-party libraries.

Setuptools and pip

One of the most useful pieces of third-party Python software is Setuptools. With Setuptools, you can download and install any compliant Python library with a single command. The best way to install Setuptools on any system is to download the ez_setup.py file from https://bootstrap.pypa.io/ez_setup.py and run this file with your Python installation.

In Linux, run this in a terminal with the correct path to the ez_setup.py script:

sudo python path/to/ez_setup.py

For Windows 8, or older versions of Windows with PowerShell 3 installed, start PowerShell with administrative privileges and run this command in it:

> (Invoke-WebRequest https://bootstrap.pypa.io/ez_setup.py).Content | python -

For Windows systems without PowerShell 3 installed, download the ez_setup.py file from the link provided previously using your web browser and run that file with your Python installation.

pip is a package management system used to install and manage software packages written in Python. After the successful installation of Setuptools, you can install pip by simply opening a command prompt and running the following:

$ easy_install pip

Alternatively, you can also install pip using your distribution's default package manager.

On Debian, Ubuntu, and Kali Linux:

sudo apt-get install python-pip

On Fedora:

sudo yum install python-pip

Now, you can run pip from the command line. Try installing a package with pip:

$ pip install packagename

Working with virtual environments

A virtual environment helps separate the dependencies required for different projects; working inside a virtual environment also helps to keep our global site-packages directory clean.

Using virtualenv and virtualenvwrapper

virtualenv is a Python module that helps create isolated Python environments for each of our scripting experiments; it creates a folder with all the necessary executable files and modules for a basic Python project. You can install virtualenv with the following command:

sudo pip install virtualenv

To create a new virtual environment, create a folder and enter it from the command line:

$ cd your_new_folder
$ virtualenv name-of-virtual-environment

This will initiate a folder with the provided name in your current working directory, containing all the Python executable files and the pip library, which will then help you install other packages in your virtual environment. You can select a Python interpreter of your choice by providing more parameters, as in the following command:

$ virtualenv -p /usr/bin/python2.7 name-of-virtual-environment

This will create a virtual environment with Python 2.7. We have to activate it before we start using this virtual environment:

$ source name-of-virtual-environment/bin/activate

Now, on the left-hand side of the command prompt, the name of the active virtual environment will appear.
Any package that you install inside this prompt using pip will belong to the active virtual environment, which is isolated from all the other virtual environments and from the global installation. You can deactivate and exit from the current virtual environment using this command:

$ deactivate

virtualenvwrapper provides a better way to use virtualenv. It also organizes all of your virtual environments in one place. To install it, we can use pip, but let's make sure we have installed virtualenv before installing virtualenvwrapper.

Linux and OS X users can install it with the following method:

$ pip install virtualenvwrapper

Also, add these three lines to your shell startup file, such as .bashrc or .profile:

export WORKON_HOME=$HOME/.virtualenvs
export PROJECT_HOME=$HOME/Devel
source /usr/local/bin/virtualenvwrapper.sh

This will set the Devel folder in your home directory as the location of your virtual environment projects. Windows users can use another package, virtualenvwrapper-win, which can also be installed with pip:

pip install virtualenvwrapper-win

Create a virtual environment with virtualenvwrapper:

$ mkvirtualenv your-project-name

This creates a folder with the provided name inside ~/Envs. To activate this environment, we can use the workon command:

$ workon your-project-name

These two commands can be combined into a single one, as follows:

$ mkproject your-project-name

We can deactivate the virtual environment with the same deactivate command as in virtualenv. To delete a virtual environment, we can use the following command:

$ rmvirtualenv your-project-name

Python language essentials

In this section, we will go through the ideas of variables, strings, data types, networking, and exception handling. If you are an experienced programmer, this section will just be a summary of what you already know about Python.

Variables and types

Python is brilliant in the case of variables: a variable points to data stored in a memory location. This memory location may contain different values, such as integers, real numbers, Booleans, strings, lists, and dictionaries. Python interprets and declares a variable when you assign a value to it. For example, if we set:

a = 1
b = 2

Then we can print the sum of these two variables with:

print (a+b)

The result will be 3, as Python figures out that both a and b are numbers. However, if we had assigned:

a = "1"
b = "2"

Then the output will be 12, since both a and b will be treated as strings. Here, we do not have to declare variables or their types before using them, as each variable is an object. The type() method can be used to get the variable type.

Strings

As in any other programming language, strings are one of the important things in Python. They are immutable, so they cannot be changed once they are defined. There are many Python methods that can modify a string; they do nothing to the original one, but create a copy and return it after the modifications. Strings can be delimited with single quotes or double quotes, or, in the case of multiple lines, we can use the triple-quote syntax. We can use the \ character to escape additional quotes that come inside a string.
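To make the point about immutability concrete, here is a small illustrative snippet (not from the original article). A string method such as upper() returns a modified copy and leaves the original object untouched:

name = "python"
upper_name = name.upper()   # returns a new string object
print name        # python  (the original is unchanged)
print upper_name  # PYTHON  (a modified copy)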
Commonly used string methods are:

string.count('x'): This returns the number of occurrences of 'x' in the string
string.find('x'): This returns the position of the character 'x' in the string
string.lower(): This converts the string into lowercase
string.upper(): This converts the string into uppercase
string.replace('a', 'b'): This replaces all 'a' with 'b' in the string

Also, we can get the number of characters, including whitespace, in a string with the len() method:

#!/usr/bin/python
a = "Python"
b = "Python\n"
c = "Python"
print len(a)
print len(b)
print len(c)

You can read more about string functions at https://docs.python.org/2/library/string.html.

Lists

Lists allow us to store more than one value inside them and provide a better method for sorting arrays of objects in Python. They also have methods that help to manipulate the values inside them:

list = [1,2,3,4,5,6,7,8]
print (list[1])

This will print 2, as Python indexes start from 0. To print out the whole list:

list = [1,2,3,4,5,6,7,8]
for x in list:
    print (x)

This will loop through all the elements and print them. Useful list methods are:

.append(value): This appends an element at the end of the list
.count('x'): This gets the number of 'x' in the list
.index('x'): This returns the index of 'x' in the list
.insert('y','x'): This inserts 'x' at location 'y'
.pop(): This returns the last element and also removes it from the list
.remove('x'): This removes the first 'x' from the list
.reverse(): This reverses the elements in the list
.sort(): This sorts the list alphabetically or numerically in ascending order

Dictionaries

A Python dictionary is a storage method for key:value pairs. In Python, dictionaries are enclosed in curly braces, {}. For example:

dictionary = {'item1': 10, 'item2': 20}
print(dictionary['item2'])

This will output 20. We cannot create multiple values with the same key; this will overwrite the previous value of the duplicate key. Operations on dictionaries are unique; slicing is not supported in dictionaries.

We can combine two distinct dictionaries into one by using the update() method. The update() method will also overwrite the values of conflicting keys:

a = {'apples': 1, 'mango': 2, 'orange': 3}
b = {'orange': 4, 'lemons': 2, 'grapes ': 4}
a.update(b)
print a

This will return:

{'mango': 2, 'apples': 1, 'lemons': 2, 'grapes ': 4, 'orange': 4}

To delete elements from a dictionary, we can use the del keyword:

del a['mango']
print a

This will return:

{'apples': 1, 'lemons': 2, 'grapes ': 4, 'orange': 4}

Networking

Sockets are the basic blocks behind all of a computer's network communications. All network communication goes through a socket, so sockets are the virtual endpoints of any communication channel between two applications, which may reside on the same or on different computers. The socket module in Python provides us with a way to create network connections. To make use of this module, we have to import it in our script:

import socket
socket.setdefaulttimeout(3)
newSocket = socket.socket()
newSocket.connect(("localhost",22))
response = newSocket.recv(1024)
print response

This script will get the response header from the server.

Handling exceptions

Even though we write syntactically correct scripts, there will be some errors while executing them, so we have to handle the errors properly.
The simplest way to handle exceptions in Python is with a try-except block. Try dividing a number by zero in your Python interpreter:

>>> 10/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero

So, we can rewrite this with a try-except block:

try:
    answer = 10/0
except ZeroDivisionError, e:
    answer = e
print answer

This will return the error integer division or modulo by zero.

Summary

Now we have an idea about the basic installations and configurations that we have to do before coding. Also, we have gone through the basics of Python, which will help us speed up our scripting.

Resources for Article:

Further resources on this subject:

Exception Handling in MySQL for Python [article]
An Introduction to Python Lists and Dictionaries [article]
Python LDAP applications - extra LDAP operations and the LDAP URL library [article]
article-image-creating-horizon-desktop-pools
Packt
16 May 2016
17 min read

Creating Horizon Desktop Pools

A Horizon desktop pool is a collection of desktops that users select when they log in using the Horizon client. A pool can be created based on a subset of users, such as finance, but this is not explicitly required unless you will be deploying multiple virtual desktop master images. The pool can be thought of as a central point of desktop management within Horizon; from it you create, manage, and entitle access to Horizon desktops. This article by Jason Ventresco, author of the book Implementing VMware Horizon View 6.X, will discuss how to create a desktop pool using the Horizon Administrator console, an important administrative task. (For more resources related to this topic, see here.) Creating a Horizon desktop pool This section will provide an example of how to create two different Horizon dedicated assignment desktop pools, one based on Horizon Composer linked clones and another based on full clones. Horizon Instant Clone pools only support floating assignment, so they have fewer options compared to the other types of desktop pools. Also discussed will be how to use the Horizon Administrator console and the vSphere client to monitor the provisioning process. The examples provided for full clone and linked clone pools created dedicated assignment pools, although floating assignment may be created as well. The options will be slightly different for each, so refer the information provided in the Horizon documentation (https://www.vmware.com/support/pubs/view_pubs.html), to understand what each setting means. Additionally, the Horizon Administrator console often explains each setting within the desktop pool configuration screens. Creating a pool using Horizon Composer linked clones The following steps outline how to use the Horizon Administrator console to create a dedicated assignment desktop pool using Horizon Composer linked clones. As discussed previously, it is assumed that you already have a virtual desktop master image that you have created a snapshot of. During each stage of the pool creation process, a description of many of the settings is displayed in the right-hand side of the Add Desktop Pool window. In addition, a question mark appears next to some of the settings; click on it to read important information about the specified setting. Log on to the Horizon Administrator console using an AD account that has administrative permissions within Horizon. Open the Catalog | Desktop Pools window within the console. Click on the Add… button in the Desktop Pools window to open the Add Desktop Pool window. In the Desktop Pool Definition | Type window, select the Automated Desktop Pool radio button as shown in the following screenshot, and then click on Next >: In the Desktop Pool Definition | User Assignment window, select the Dedicated radio button and check the Enable automatic assignment checkbox as shown in the following screenshot, and then click on Next >: In the Desktop Pool Definition | vCenter Server window, select the View Composer linked clones radio button, highlight the vCenter server as shown in the following screenshot, and then click on Next >: In the Setting | Desktop Pool Identification window, populate the pool ID: as shown in the following screenshot, and then click on Next >. Optionally, configure the Display Name: field. When finished, click on Next >: In the Setting | Desktop Pool Settings window, configure the various settings for the desktop pool. These settings can also be adjusted later if desired. 
When finished, click on Next >: In the Setting | Provisioning Settings window, configure the various provisioning options for the desktop pool that include the desktop naming format, the number of desktops, and the number of desktops that should remain available during Horizon Composer maintenance operations. When finished, click on Next >: When creating a desktop naming pattern, use a {n} to instruct Horizon to insert a unique number in the desktop name. For example, using Win10x64{n} as shown in the preceding screenshot will name the first desktop Win10x641, the next Win10x642, and so on. In the Setting | View Composer Disks window, configure the settings for your optional linked clone disks. By default, both a Persistent Disk for user data and a non-persistent disk for Disposable File Redirection are created. When finished, click on Next >: In the Setting | Storage Optimization window, we configure whether or not our desktop storage is provided by VMware Virtual SAN, and if not whether or not to separate our Horizon desktop replica disks from the individual desktop OS disks. In our example, we have checked the Use VMware Virtual SAN radio button as that is what our destination vSphere cluster is using. When finished, click on Next >: As all-flash storage arrays or all-flash or flash-dependent Software Defined Storage (SDS) platforms become more common, there is less of a need to place the shared linked clone replica disks on separate, faster datastores than the individual desktop OS disks. In the Setting | vCenter Settings window, we will need to configure six different options that include selecting the parent virtual machine, which snapshot of that virtual machine to use, what vCenter folder to place the desktops in, what vSphere cluster and resource pool to deploy the desktops to, and what datastores to use. Click on the Browse… button next to the Parent VM: field to begin the process and open the Select Parent VM window: In the Select Parent VM window, highlight the virtual desktop master image that you wish to deploy desktops from, as shown in the following screenshot. Click on OK when the image is selected to return to the previous window: The virtual machine will only appear if a snapshot has been created. In the Setting | vCenter Settings window, click on the Browse… button next to the Snapshot: field to open the Select default image window. Select the desired snapshot, as shown in the following screenshot, and click on OK to return to the previous window: In the Setting | vCenter Settings window, click on the Browse… button next to the VM folder location: field to open the VM Folder Location window, as shown in the following screenshot. Select the folder within vCenter where you want the desktop virtual machines to be placed, and click on OK to return to the previous window: In the Setting | vCenter Settings window, click on the Browse… button next to the Host or cluster: field to open the Host or Cluster window, as shown in the following screenshot. Select the cluster or individual ESXi server within vCenter where you want the desktop virtual machines to be created, and click on OK to return to the previous window: In the Setting | vCenter Settings window, click on the Browse… button next to the Resource pool: field to open the Resource Pool window, as shown in the following screenshot. If you intend to place the desktops within a resource pool you would select that here; if not select the same cluster or ESXi server you chose in the previous step. 
Once finished, click on OK to return to the previous window: In the Setting | vCenter Settings window, click on the Browse… button next to the Datastores: field to open the Select Linked Clone Datastores window, as shown in the following screenshot. Select the datastore or datastores where you want the desktops to be created, and click on OK to return to the previous window: If you were using storage other than VMware Virtual SAN, and had opted to use separate datastores for your OS and replica disks in step 11, you would have had to select unique datastores for each here instead of just one. Additionally, you would have had the option to configure the storage overcommit level. The Setting | vCenter Settings window should now have all options selected, enabling the Next > button. When finished, click on Next >: In the Setting | Advanced Storage Options window, if desired select and configure the Use View Storage Accelerator and Other Options check boxes to enable those features. In our example, we have enabled both the Use View Storage Accelerator and Reclaim VM disk space options, and configured Blackout Times to ensure that these operations do not occur between 8 A.M. (08:00) and 5 P.M. (17:00) on weekdays. When finished, click on Next >: The Use native NFS snapshots (VAAI) feature enables Horizon to leverage features of the a supported NFS storage array to offload the creation of linked clone desktops. If you are using an external array with your Horizon ESXi servers, consult the product documentation to understand if it supports this feature. Since we are using VMware Virtual SAN, this and other options under Other Options are greyed out as these settings are not needed. Additionally, if View Storage Accelerator is not enabled in the vCenter Server settings the option to use it would be greyed out here. In the Setting | Guest Customization window, select the Domain: where the desktops will be created, the AD container: where the computer accounts will be placed, whether to Use QuickPrep or Use a customization specification (Sysprep), and any other options as required. When finished, click on Next >: In the Setting | Ready to Complete window, verify that the settings we selected were correct, using the < Back button if needed to go back and make changes. If all the settings are correct, click on Finish to initiate the creation of the desktop pool. The Horizon desktop pool and virtual desktops will now be created. Creating a pool using Horizon Instant Clones The process used to create an Instant Clone desktop pool is similar to that used to create a linked clone pool. As discussed previously, it is assumed that you already have a virtual desktop master image that has the Instant Clone option enabled in the Horizon agent, and that you have taken a snapshot of that master image. A master image can have either the Horizon Composer (linked clone) option or Instant Clone option enabled in the Horizon agent, but not both. To get around this restriction you can configure one snapshot of the master image with the View Composer option installed, and a second with the Instant Clone option installed. The following steps outline the process used to create the Instant Clone desktop pool. Screenshots are included only when the step differs significantly from the same step in the Creating a pool using Horizon Composer linked clones section. Log on to the Horizon Administrator console using an AD account that has administrative permissions within Horizon. 
Open the Catalog | Desktop Pools window within the console. Click on the Add… button in the Desktop Pools window to open the Add Desktop Pool window. In the Desktop Pool Definition | Type window, select the Automated Desktop Pool radio button as shown in the following screenshot, and then click on Next >. In the Desktop Pool Definition | User Assignment window, select the Floating radio button (mandatory for Instant Clone desktops), and then click on Next >. In the Desktop Pool Definition | vCenter Server window, select the Instant Clones radio button as shown in the following screenshot, highlight the vCenter server, and then click on Next >: If Instant Clones is greyed out here, it is usually because you did not select Floating in the previous step. In the Setting | Desktop Pool Identification window, populate the pool ID: field, and then click on Next >. Optionally, configure the Display Name: field. In the Setting | Desktop Pool Settings window, configure the various settings for the desktop pool. These settings can also be adjusted later if desired. When finished, click on Next >. In the Setting | Provisioning Settings window, configure the various provisioning options for the desktop pool that include the desktop naming format, the number of desktops, and the number of desktops that should remain available during maintenance operations. When finished, click on Next >. Instant Clones are required to always be powered on, so some options available to linked clones will be greyed out here. In the Setting | Storage Optimization window, we configure whether or not our desktop storage is provided by VMware Virtual SAN, and if not, whether or not to separate our Horizon desktop replica disks from the individual desktop OS disks. When finished, click on Next >. In the Setting | vCenter Settings window, we will need to configure six different options that include selecting the parent virtual machine, which snapshot of that virtual machine to use, what vCenter folder to place the desktops in, what vSphere cluster and resource pool to deploy the desktops to, and what datastores to use. Click on the Browse… button next to the Parent VM: field to begin the process and open the Select Parent VM window. In the Select Parent VM window, highlight the virtual desktop master image that you wish to deploy desktops from. Click on OK when the image is selected to return to the previous window. In the Setting | vCenter Settings window, click on the Browse… button next to the Snapshot: field to open the Select default image window. Select the desired snapshot, and click on OK to return to the previous window. In the Setting | vCenter Settings window, click on the Browse… button next to the VM folder location: field to open the VM Folder Location window. Select the folder within vCenter where you want the desktop virtual machines to be placed, and click on OK to return to the previous window. In the Setting | vCenter Settings window, click on the Browse… button next to the Host or cluster: field to open the Host or Cluster window. Select the cluster or individual ESXi server within vCenter where you want the desktop virtual machines to be created, and click on OK to return to the previous window. In the Setting | vCenter Settings window, click on the Browse… button next to the Resource pool: field to open the Resource Pool window.
If you intend to place the desktops within a resource pool you would select that here; if not select the same cluster or ESXi server you chose in the previous step. Once finished, click on OK to return to the previous window. In the Setting | vCenter Settings window, click on the Browse… button next to the Datastores: field to open the Select Instant Clone Datastores window. Select the datastore or datastores where you want the desktops to be created, and click on OK to return to the previous window. The Setting | vCenter Settings window should now have all options selected, enabling the Next > button. When finished, click on Next >. In the Setting | Guest Customization window, select the Domain: where the desktops will be created, the AD container: where the computer accounts will be placed, and any other options as required. When finished, click on Next >. Instant Clones only support ClonePrep for customization, so there are fewer options here than seen when deploying a linked clone desktop pool. In the Setting | Ready to Complete window, verify that the settings we selected were correct, using the < Back button if needed to go back and make changes. If all the settings are correct, click on Finish to initiate the creation of the desktop pool. The Horizon desktop pool and Instant Clone virtual desktops will now be created. Creating a pool using full clones The process used to create full clone desktops pool is similar to that used to create a linked clone pool. As discussed previously, it is assumed that you already have a virtual desktop master image that you have converted to a vSphere template. In addition, if you wish for Horizon to perform the virtual machine customization, you will need to create a Customization Specification using the vCenter Customization Specifications Manager. The Customization Specification is used by the Windows Sysprep utility to complete the guest customization process. Visit the VMware vSphere virtual machine administration guide (http://pubs.vmware.com/vsphere-60/index.jsp) for instructions on how to create a Customization Specification. The following steps outline the process used to create the full clone desktop pool. Screenshots are included only when the step differs significantly from the same step in the Creating a pool using Horizon Composer linked clones section. Log on to the Horizon Administrator console using an AD account that has administrative permissions within Horizon. Open the Catalog | Desktop Pools window within the console. Click on the Add… button in the Desktop Pools window to open the Add Desktop Pool window. In the Desktop Pool Definition | Type window select the Automated Pool radio button and then click on Next. In the Desktop Pool Definition | User Assignment window, select the Dedicated radio button, check the Enable automatic assignment checkbox, and then click on Next. In the Desktop Pool Definition | vCenter Server window, click the Full virtual machines radio button, highlight the desired vCenter server, and then click on Next. In the Setting | Desktop Pool Identification window, populate the pool ID: and Display Name: fields and then click on Next. In the Setting | Desktop Pool Settings window, configure the various settings for the desktop pool. These settings can also be adjusted later if desired. When finished, click on Next >. In the Setting | Provisioning Settings window, configure the various provisioning options for the desktop pool that include the desktop naming format and number of desktops. 
When finished, click on Next >. In the Setting | Storage Optimization window, we configure whether or not our desktop storage is provided by VMware Virtual SAN. When finished, click on Next >. In the Setting | vCenter Settings window, we will need to configure settings that set the virtual machine template, what vSphere folder to place the desktops in, which ESXi server or cluster to deploy the desktops to, and which datastores to use. Other than the Template setting described in the next step, each of these settings is identical to those seen when creating a Horizon Composer linked clone pool. Click on the Browse… button next to each of the settings in turn and select the appropriate options. To configure the Template: setting, select the vSphere template that you created from your virtual desktop master image as shown in the following screenshot, and then click OK to return to the previous window: A template will only appear if one is present within vCenter. Once all the settings in the Setting | vCenter Settings window have been configured, click on Next >. In the Setting | Advanced Storage Options window, if desired select and configure the Use View Storage Accelerator radio buttons and configure Blackout Times. When finished, click on Next >. In the Setting | Guest Customization window, select either the None | Customization will be done manually or Use this customization specification radio button, and if applicable select a customization specification. When finished, click on Next >. In the following screenshot, we have selected the Win10x64-HorizonFC customization specification that we previously created within vCenter: Manual customization is typically used when the template has been configured to run Sysprep automatically upon start up, without requiring any interaction from either Horizon or VMware vSphere. In the Setting | Ready to Complete window, verify that the settings we selected were correct, using the < Back button if needed to go back and make changes. If all the settings are correct, click on Finish to initiate the creation of the desktop pool. The desktop pool and virtual desktops will now be created. Summary In this article, we have learned about Horizon desktop pools. In addition to learning how to create three different types of desktop pools, we were introduced to a number of key concepts that are part of the pool creation process. Resources for Article: Further resources on this subject: Essentials of VMware vSphere [article] Cloning and Snapshots in VMware Workstation [article] An Introduction to VMware Horizon Mirage [article]

article-image-programming-raspberry-pi-robots-javascript
Anna Gerber
16 May 2016
6 min read

Programming Raspberry-Pi Robots with JavaScript

The Raspberry Pi Foundation recently announced a smaller, cheaper single-board computer: the Raspberry Pi Zero. Priced at $5 and measuring about half the size of the Model A+, the new Pi Zero is ideal for embedded applications and robotics projects. Johnny-Five supports programming Raspberry Pi-based robots via a Firmata-compatible interface that is implemented via the raspi-io IO Plugin for Node.js. This post steps you through building a robot with Raspberry Pi and Johnny-Five.

What you'll need

Raspberry Pi (for example, B+, 2, or Zero)
Robot chassis. We're using a laser-cut acrylic "Smart Robot Car" kit that includes two DC motors with wheels and a castor. You can find these on eBay for around $10.
5V power supply (USB battery packs used for charging mobile phones are ideal)
4 x AA battery holder for the motors
Texas Instruments L293NE motor driver IC
Solderless breadboard and jumper wires
USB keyboard and mouse
Monitor or TV with HDMI cable
USB Wi-Fi adapter
For Pi Zero only: mini-HDMI-to-HDMI adaptor or cable, USB On-The-Go connector, and powered USB hub

A laser-cut robot chassis

Attach peripherals

If you are using a Raspberry Pi B+ or 2, you can attach a monitor or TV screen via HDMI, and plug in a USB keyboard, a USB Wi-Fi adapter, and a mouse directly. The Raspberry Pi Zero doesn't have as many ports as the older Raspberry Pi models, so you'll need to use a USB On-The-Go cable and a powered USB hub to attach the peripherals. You'll also need a mini-HDMI-to-HDMI cable (or adapter) for the monitor. The motors for the robot wheels will be connected via the GPIO pins, but first we'll install the operating system.

Prepare the micro SD card

Raspberry Pi runs the Linux operating system, which you can install on an 8 GB or larger micro SD card:

Use a USB adapter or built-in SD card reader to format your micro SD card using SD Formatter for Windows or Mac.
Download the "New Out Of the Box Software" install manager (NOOBS), unzip it, and copy the extracted files to your micro SD card.
Remove the micro SD card from your PC and insert it into the Raspberry Pi.

Power

The Raspberry Pi requires 5V power supplied via the micro-USB power port. If the power supplied drops suddenly, the Pi may restart, which can lead to corruption of the micro SD card. Use a 5V power bank or an external USB power adaptor to ensure that there will be an uninterrupted supply. When we plug in the motors, we'll use separate batteries so that they don't draw power from the board, which can potentially damage the Raspberry Pi.

Install Raspbian OS

Power up the Raspberry Pi and follow the on-screen prompts to install Raspbian Linux. This process takes about half an hour, and the Raspberry Pi will reboot after the OS has finished installing. The latest version of Raspbian should log you in automatically and launch the graphical UI by default. If not, sign in using the username pi and password raspberry, then type startx at the command prompt to start the X Windows graphical UI.

Set up networking

The Raspberry Pi will need to be online to install the Johnny-Five framework. Connect the Wi-Fi adapter, select your access point from the network menu at the top right of the graphical UI, and then enter your network password and connect. We'll be running the Raspberry Pi headless (without a screen) for the robot, so if you want to be able to connect to your Raspberry Pi desktop later, now would be a good time to enable remote access via VNC.
Make sure you have the latest versions of the installed packages by running the following commands from the terminal:

sudo apt-get update
sudo apt-get upgrade

Install Node.js and Johnny-Five

Raspbian comes with a legacy version of Node.js installed, but we'll need a more recent version. Launch a terminal to remove the legacy version, and then download and update to the latest by running the following commands:

sudo apt-get remove nodejs-legacy
cd ~
wget http://node-arm.herokuapp.com/node_latest_armhf.deb
sudo dpkg -i node_latest_armhf.deb

If npm is not installed, you can install it with sudo apt-get install npm.

Create a folder for your code and install the johnny-five framework and the raspi-io IO Plugin from npm:

mkdir ~/myrobot
cd myrobot
npm install johnny-five
npm install raspi-io

Make the robot move

A motor converts electricity into movement. You can control the speed by changing the voltage supplied and control the direction by switching the polarity of the voltage. Connect the motors as shown, with an H-bridge circuit:

Pins 32 and 35 support PWM, so we'll use these to control the motor speed. We can use any of the digital IO pins to control the direction of each motor, in this case pins 13 and 15. See Raspi-io Pin Information for more details on pins.

Use a text editor (for example, nano myrobot.js) to create the JavaScript program:

var raspi = require('raspi-io');
var five = require('johnny-five');

var board = new five.Board({
  io: new raspi()
});

board.on('ready', function() {
  var leftMotor = new five.Motor({
    pins: {pwm: "P1-35", dir: "P1-13"},
    invertPWM: true
  });

  var rightMotor = new five.Motor({
    pins: {pwm: "P1-32", dir: "P1-15"},
    invertPWM: true
  });

  board.repl.inject({
    l: leftMotor,
    r: rightMotor
  });

  leftMotor.forward(150);
  rightMotor.forward(150);
});

Accessing GPIO requires root permissions, so run the program using sudo:

sudo node myrobot.js

Use differential drive to propel the robot by controlling the motors on either side of the chassis independently. Experiment with driving each wheel using the Motor API functions (stop, start, forward, and reverse, providing different speed parameters) via the REPL. If both motors have the same speed and direction, the robot will move in a straight line. You can turn the robot by moving the wheels at different rates.

Go wireless

Now you can unplug the screen, keyboard, and mouse from the Raspberry Pi. You can attach it, the batteries, and the breadboard to the chassis using double-sided tape. Power the Raspberry Pi using the 5V power pack. Connect to your Raspberry Pi via ssh or VNC over Wi-Fi to run or modify the program. Eventually, you might want to add sensors and program some line-following or obstacle-avoidance behavior to make the robot autonomous. The raspi-io plugin supports 3.3V digital and I2C sensors.

About the author

Anna Gerber is a full-stack developer with 15 years of experience in the university sector. She was a technical project manager at The University of Queensland (ITEE eResearch). She specializes in digital humanities and is a research scientist at the Distributed System Technology Centre (DSTC). Anna is a JavaScript robotics enthusiast and maker who enjoys tinkering with soft circuits and 3D printers.