Starting with YARN Basics

Packt
01 Sep 2015
15 min read
In this article by Akhil Arora and Shrey Mehrotra, authors of the book Learning YARN, we will discuss how Hadoop was developed as a solution to handle big data in a cost-effective and simple way. Hadoop consists of a storage layer, the Hadoop Distributed File System (HDFS), and the MapReduce framework for managing resource utilization and job execution on a cluster. With the ability to deliver high-performance parallel data analysis and to work with commodity hardware, Hadoop is used for big data analysis and batch processing of historical data through MapReduce programming.

With the exponential increase in the usage of social networking sites such as Facebook, Twitter, and LinkedIn and e-commerce sites such as Amazon, there was a need for a framework to support not only MapReduce batch processing, but real-time and interactive data analysis as well. Enterprises should be able to execute other applications over the cluster to ensure that cluster capabilities are utilized to the fullest. The data storage framework of Hadoop was able to counter the growing data size, but resource management became a bottleneck. The resource management framework for Hadoop needed a new design to solve the growing needs of big data.

YARN, an acronym for Yet Another Resource Negotiator, was introduced as a second-generation resource management framework for Hadoop and was added as a subproject of Apache Hadoop. With MapReduce focusing only on batch processing, YARN is designed to provide a generic processing platform for data stored across a cluster and a robust cluster resource management framework.

In this article, we will cover the following topics:
- Introduction to MapReduce v1
- Shortcomings of MapReduce v1
- An overview of the YARN components
- The YARN architecture
- How YARN satisfies big data needs
- Projects powered by YARN

Introduction to MapReduce v1

MapReduce is a software framework used to write applications that simultaneously process vast amounts of data on large clusters of commodity hardware in a reliable, fault-tolerant manner. It is a batch-oriented model where a large amount of data is stored in the Hadoop Distributed File System (HDFS), and the computation on that data is performed as MapReduce phases. The basic principle of the MapReduce framework is to move the computation to the data rather than move the data over the network for computation. The MapReduce tasks are scheduled to run on the same physical nodes on which the data resides. This significantly reduces the network traffic and keeps most of the I/O on the local disk or within the same rack.

The high-level architecture of the MapReduce framework has three main modules:
- MapReduce API: This is the end-user API used for programming the MapReduce jobs to be executed on the HDFS data.
- MapReduce framework: This is the runtime implementation of various phases in a MapReduce job such as the map, sort/shuffle/merge aggregation, and reduce phases.
- MapReduce system: This is the backend infrastructure required to run the user's MapReduce application, manage cluster resources, schedule thousands of concurrent jobs, and so on.

The MapReduce system consists of two components—JobTracker and TaskTracker. JobTracker is the master daemon within Hadoop that is responsible for resource management, job scheduling, and job management.
The responsibilities of the JobTracker are as follows:
- Hadoop clients communicate with the JobTracker to submit or kill jobs and poll for a job's progress.
- The JobTracker validates the client request and, if it is valid, allocates TaskTracker nodes for map-reduce task execution.
- The JobTracker monitors TaskTracker nodes and their resource utilization, that is, how many tasks are currently running and the count of map-reduce task slots available, decides whether a TaskTracker node needs to be marked as a blacklisted node, and so on.
- The JobTracker monitors the progress of jobs and, if a job or task fails, it automatically reinitializes the job or task on a different TaskTracker node.
- The JobTracker also keeps the history of the jobs executed on the cluster.

TaskTracker is a per-node daemon responsible for the execution of map-reduce tasks. A TaskTracker node is configured to accept a number of map-reduce tasks from the JobTracker, that is, the total map-reduce tasks a TaskTracker can execute simultaneously. Its responsibilities are as follows:
- The TaskTracker initializes a new JVM process to perform the MapReduce logic. Running a task on a separate JVM ensures that a task failure does not harm the health of the TaskTracker daemon.
- The TaskTracker monitors these JVM processes and updates the task progress to the JobTracker at regular intervals.
- The TaskTracker also sends a heartbeat signal and its current resource utilization metric (available task slots) to the JobTracker every few minutes.

Shortcomings of MapReduce v1

Though the Hadoop MapReduce framework was widely used, the following limitations were found with the framework:
- Batch processing only: The resources across the cluster are tightly coupled with map-reduce programming. The framework does not support the integration of other data processing frameworks and forces everything to look like a MapReduce job. Emerging customer requirements demand support for real-time and near real-time processing of the data stored on the distributed file system.
- Nonscalability and inefficiency: The MapReduce framework completely depends on the master daemon, that is, the JobTracker. It manages the cluster resources, the execution of jobs, and fault tolerance as well. It has been observed that Hadoop cluster performance degrades drastically when the cluster size increases above 4,000 nodes or the count of concurrent tasks crosses 40,000. The centralized handling of the job control flow resulted in endless scalability concerns for the scheduler.
- Unavailability and unreliability: Availability and reliability are considered to be critical aspects of a framework such as Hadoop. A single point of failure for the MapReduce framework is the failure of the JobTracker daemon. The JobTracker manages the jobs and resources across the cluster. If it goes down, information related to the running or queued jobs and the job history is lost. The queued and running jobs are killed if the JobTracker fails. The MapReduce v1 framework doesn't have any provision to recover the lost data or jobs.
- Partitioning of resources: The MapReduce framework divides a job into multiple map and reduce tasks. The nodes running the TaskTracker daemon are considered as resources, and the capability of a resource to execute MapReduce jobs is expressed as the number of map-reduce tasks it can execute simultaneously. The framework forced the cluster resources to be partitioned into map and reduce task slots, and such partitioning resulted in lower utilization of the cluster resources.
If you have a running Hadoop 1.x cluster, you can refer to the JobTracker web interface to view the map and reduce task slots of the active TaskTracker nodes. The link for the active TaskTracker list is as follows: http://JobTrackerHost:50030/machines.jsp?type=active

- Management of user logs and job resources: The user logs refer to the logs generated by a MapReduce job. These logs can be used to validate the correctness of a job or to perform log analysis to tune the job's performance. In MapReduce v1, the user logs are generated and stored on the local file system of the slave nodes. Accessing logs on the slaves is a pain, as users might not have the required permissions. Since the logs are stored on the local file system of a slave, if the disk goes down, the logs are lost. A MapReduce job might also require some extra resources for job execution. In the MapReduce v1 framework, the client copies job resources to HDFS with a replication factor of 10. Accessing resources remotely or through HDFS is not efficient, so there was a need for localization of resources and a robust framework to manage job resources.

In January 2008, Arun C. Murthy logged a bug in JIRA against the MapReduce architecture, which resulted in a generic resource scheduler and a per-job, user-defined component that manages the application execution. You can see this at https://issues.apache.org/jira/browse/MAPREDUCE-279

An overview of YARN components

YARN divides the responsibilities of the JobTracker into separate components, each having a specified task to perform. In Hadoop 1, the JobTracker takes care of resource management, job scheduling, and job monitoring. YARN divides these responsibilities of the JobTracker between a ResourceManager and an ApplicationMaster. Instead of the TaskTracker, it uses the NodeManager as the worker daemon for the execution of map-reduce tasks. The ResourceManager and the NodeManager form the computation framework for YARN, and the ApplicationMaster is an application-specific framework for application management.

ResourceManager

A ResourceManager is a per-cluster service that manages the scheduling of compute resources to applications. It optimizes cluster utilization in terms of memory, CPU cores, fairness, and SLAs. To allow different policy constraints, it provides pluggable schedulers such as the Capacity and Fair schedulers, which allow resources to be allocated in a particular way. The ResourceManager has two main components:
- Scheduler: This is a pure pluggable component that is only responsible for allocating resources to applications submitted to the cluster, applying constraints of capacities and queues. The Scheduler does not provide any guarantee of job completion or any monitoring; it only allocates the cluster resources governed by the nature of the job and its resource requirements.
- ApplicationsManager (AsM): This is a service used to manage application masters across the cluster. It is responsible for accepting application submissions, providing the resources needed to start an ApplicationMaster, monitoring the application's progress, and restarting the ApplicationMaster in case of application failure.

NodeManager

The NodeManager is a per-node worker service that is responsible for the execution of containers based on the node capacity. Node capacity is calculated from the installed memory and the number of CPU cores. The NodeManager service sends a heartbeat signal to the ResourceManager to update its health status. The NodeManager service is similar to the TaskTracker service in MapReduce v1.
The NodeManager also sends status updates to the ResourceManager, which can be the status of the node on which it is running or the status of the tasks executing on it.

ApplicationMaster

An ApplicationMaster is a per-application, framework-specific library that manages each instance of an application running within YARN. YARN treats the ApplicationMaster as a third-party library responsible for negotiating resources from the ResourceManager scheduler and working with the NodeManagers to execute the tasks. The ResourceManager allocates containers to the ApplicationMaster, and these containers are then used to run the application-specific processes. The ApplicationMaster also tracks the status of the application and monitors the progress of the containers. When the execution of a container is complete, the ApplicationMaster unregisters the container with the ResourceManager, and it unregisters itself after the execution of the application is complete.

Container

A container is a logical bundle of resources in terms of memory, CPU, disk, and so on that is bound to a particular node. In the first version of YARN, a container is equivalent to a block of memory. The ResourceManager scheduler service dynamically allocates resources as containers. A container grants an ApplicationMaster the right to use a specific amount of resources on a specific host. An ApplicationMaster is considered the first container of an application, and it manages the execution of the application logic on the allocated containers.

The YARN architecture

In the previous topic, we discussed the YARN components. Here we'll discuss the high-level architecture of YARN and look at how the components interact with each other.

The ResourceManager service runs on the master node of the cluster. A YARN client submits an application to the ResourceManager. An application can be a single MapReduce job, a directed acyclic graph of jobs, a Java application, or any shell script. The client also defines an ApplicationMaster and a command to start the ApplicationMaster on a node. The ApplicationsManager service of the ResourceManager validates and accepts the application request from the client. The scheduler service of the ResourceManager allocates a container for the ApplicationMaster on a node, and the NodeManager service on that node uses the command to start the ApplicationMaster service.

Each YARN application has a special container called the ApplicationMaster. The ApplicationMaster container is the first container of an application. The ApplicationMaster requests resources from the ResourceManager; the resource request includes the location of the node and the memory and CPU cores required. The ResourceManager allocates the resources as containers on a set of nodes. The ApplicationMaster connects to the NodeManager services and requests the NodeManagers to start the containers. The ApplicationMaster manages the execution of the containers and notifies the ResourceManager once the application execution is over. Application execution and progress monitoring is the responsibility of the ApplicationMaster rather than the ResourceManager.

The NodeManager service runs on each slave of the YARN cluster. It is responsible for running the application's containers. The resources specified for a container are taken from the NodeManager's resources. Each NodeManager periodically updates the ResourceManager with the set of available resources. The ResourceManager scheduler service uses this resource information to allocate new containers to an ApplicationMaster or to start the execution of a new application.
How YARN satisfies big data needs

We talked about the MapReduce v1 framework and some of its limitations. Let's now discuss how YARN solves these issues:
- Scalability and higher cluster utilization: Scalability is the ability of a software product to perform well under an expanding workload. In YARN, the responsibility of resource management and job scheduling/monitoring is divided into separate daemons, allowing the YARN daemons to scale the cluster without degrading its performance. With a flexible and generic resource model in YARN, the scheduler handles an overall resource profile for each type of application. This structure makes the communication and storage of resource requests efficient for the scheduler, resulting in higher cluster utilization.
- High availability for components: Fault tolerance is a core design principle for any multitenancy platform such as YARN. This responsibility is delegated to the ResourceManager and the ApplicationMaster. The application-specific framework, the ApplicationMaster, handles the failure of a container. The ResourceManager handles the failure of the NodeManager and the ApplicationMaster.
- Flexible resource model: In MapReduce v1, resources are defined as the number of map and reduce task slots available for the execution of a job, and not every resource request can be mapped to map/reduce slots. In YARN, a resource request is defined in terms of memory, CPU, locality, and so on, which results in a generic definition of a resource request by an application. The NodeManager node is the worker node, and its capability is calculated from the installed memory and CPU cores.
- Multiple data processing algorithms: The MapReduce framework is limited to batch processing only. YARN was developed out of a need to perform a wide variety of data processing over the data stored in HDFS. YARN is a framework for generic resource management and allows users to execute multiple data processing algorithms over the data.
- Log aggregation and resource localization: As discussed earlier, accessing and managing user logs is difficult in the Hadoop 1.x framework. To manage user logs, YARN introduced the concept of log aggregation. In YARN, once the application has finished, the NodeManager service aggregates the user logs related to the application, and these aggregated logs are written out to a single log file in HDFS. To access the logs, users can use the YARN command-line options or the YARN web interface, or can fetch them directly from HDFS. A container might require external resources such as jars, files, or scripts on the local file system, and these are made available to containers before they are started. An ApplicationMaster defines a list of resources that are required to run the containers. For efficient disk utilization and access security, the NodeManager ensures the availability of the specified resources and their deletion after use.

Projects powered by YARN

Efficient and reliable resource management is a basic need of a distributed application framework. YARN provides a generic resource management framework to support data analysis through multiple data processing algorithms. A lot of projects have started using YARN for resource management. We've listed a few of these projects here and discussed how YARN integration solves their business requirements:
- Apache Giraph: Giraph is a framework for offline batch processing of semistructured graph data stored using Hadoop.
With the Hadoop 1.x version, Giraph had no control over the scheduling policies, the heap memory of the mappers, or locality awareness for the running job. Also, defining a Giraph job on the basis of mapper/reducer slots was a bottleneck. YARN's flexible resource allocation model, locality awareness principle, and ApplicationMaster framework ease Giraph's job management and the allocation of resources to tasks.
- Apache Spark: Spark enables iterative data processing and machine learning algorithms to perform analysis over data available through HDFS, HBase, or other storage systems. Spark uses YARN's resource management capabilities and framework to submit the DAG of a job. Spark users can focus more on data analytics use cases than on how Spark is integrated with Hadoop or how jobs are executed.

Some other projects powered by YARN are as follows:
- MapReduce: https://issues.apache.org/jira/browse/MAPREDUCE-279
- Giraph: https://issues.apache.org/jira/browse/GIRAPH-13
- Spark: http://spark.apache.org/
- OpenMPI: https://issues.apache.org/jira/browse/MAPREDUCE-2911
- HAMA: https://issues.apache.org/jira/browse/HAMA-431
- HBase: https://issues.apache.org/jira/browse/HBASE-4329
- Storm: http://hortonworks.com/labs/storm/

A page on the Hadoop wiki lists a number of projects/applications that are migrating to or using YARN as their resource management tool. You can see this at http://wiki.apache.org/hadoop/PoweredByYarn.

Summary

This article covered an introduction to YARN, its components, its architecture, and different projects powered by YARN. It also explained how YARN satisfies big data needs.

Connecting to Open Ports

Packt
31 Aug 2015
6 min read
Miroslav Vitula, the author of the book Learning zANTI2 for Android Pentesting, penned this article on connecting to open ports, focusing on cracking passwords and setting up a remote desktop connection. Let's delve into the topics.

Cracking passwords

THC Hydra is one of the best-known login crackers; it supports numerous protocols, is flexible, and is very fast. Hydra supports more than 30 protocols, including HTTP GET, HTTP HEAD, Oracle, pcAnywhere, rlogin, Telnet, SSH (v1 and v2), and many, many more. As you might guess, THC Hydra is also implemented in zANTI2, where it has become an integral part of the app thanks to its high functionality and usability. The zANTI2 developers named this section Password Complexity Audit, and it is located under Attack Actions after a target is selected.

After selecting this option, you've probably noticed there are several types of attack. First, there are multiple dictionaries: Small, Optimized, Big, and a Huge dictionary that contains the highest number of usernames and passwords. To clarify, a dictionary attack is a method of breaking into a password-protected computer, service, or server by entering every word in a dictionary file as the username/password. Unlike a brute force attack, where all possible combinations are tried, a dictionary attack uses only those possibilities that are deemed most likely to succeed. Files used for dictionary attacks (also called wordlists) can be found all over the Internet, ranging from basic ones to huge ones containing more than 900,000,000 words for WPA2 WiFi cracking. zANTI2 also lets you use a custom wordlist for the attack.

Apart from dictionary attacks, there is an Incremental option, which is used for brute force attacks. This attempts to guess the right combination using a custom range of letters/numbers. To set up the method properly, ensure the cracking options are correctly set. The area of searched combinations is defined by min-max charset, where min stands for the minimum length of the password, max for the maximum length, and charset for the character set, which in our case will be defined as lowercase letters.

The Automatic Mode, as the description says, automatically matches the list of protocols with the open ports on the target. To select a custom protocol manually, simply disable the Automatic Mode and select the protocol you want to perform the attack on. In our case that would be the SSH protocol, for cracking the password used to establish the connection on port 22.

Since incremental is a brute force method, it might take an extremely long time to find the right combination. For instance, the password zANTI2-hacks would take about 350 thousand years for a desktop PC to crack; with a 77-character set there are about 43 sextillion possible combinations for a password of this length. Therefore, it is generally better to use dictionary attacks for cracking passwords that might be longer than just a few characters. However, if you have a few thousand years to spare, feel free to use the brute force method.

If everything went fine, you should now be able to view the access password with the username. You can easily connect to the target by tapping the finished result using one of the installed SSH clients. When connected, it's all yours. All Linux commands can be executed using the app, and you now have the power to list directories, change the password, and more. Although connecting to port 22 might sound spicy, there is more to be discovered.
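To put the brute-force estimate above in perspective, here is a minimal back-of-the-envelope sketch in Python. The 77-character set matches the figure quoted above; the guess rate is an assumption for illustration and depends heavily on the protocol and hardware:

    charset_size = 77            # assumed size of the character set, as quoted above
    password_length = 12         # "zANTI2-hacks" is 12 characters long
    guesses_per_second = 4e9     # assumed rate for a desktop PC (illustrative only)

    keyspace = charset_size ** password_length
    seconds = keyspace / guesses_per_second
    years = seconds / (60 * 60 * 24 * 365)

    print("Possible combinations: %.2e" % keyspace)    # roughly 4.3e+22
    print("Worst-case time: about %.0f years" % years) # on the order of 350 thousand years

This is why the incremental mode is only practical for very short passwords, while dictionary attacks remain the realistic option for anything longer.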
A remote desktop connection

Microsoft has made a handy feature called remote desktop. As the title suggests, this lets an ordinary user access his home computer when he is away, or it can be used for managing a server over a network. This is a great sign that we can intercept this connection and exploit an open port to set up a remote desktop connection between our mobile phone and a target.

There is, however, one requirement. Since the RDP (Remote Desktop Protocol) port 3389 isn't open by default, a user has to allow connections from other computers. This option can be set in the Windows control panel, and only then is port 3389 accessible. If the option Allow remote connections to this computer is ticked on the victim's machine, we're good to go. This will leave port 3389 open and listening for incoming broadcasts, including the ones from malicious attackers. If we run a quick port discovery on the target, the remote desktop port with number 3389 will pop up. This is a good sign for us, indicating that this port is open and listening.

Tap the port (ms-wbt-server). You will be asked for login credentials once again. Tap GO. Now, if you haven't got any remote desktop clients installed, zANTI2 will redirect you to Google Play to download one—the Parallels 2X RDP. This application, as you can tell, is capable of establishing remote desktop access from your Android device. It is stable, fast, and works very well. After downloading the application, go back to zANTI2 and connect to the port once again. You will now be redirected directly to the app and a connection will be established immediately. As you can see in the following screenshot, here's my computer—I'm currently working on the article!

Apart from a simplified Windows user interface (using a basic XP look with no transparent bars and such), it is basically the same, and you can take control of the whole system. The Parallels 2X RDP client offers a comfortable and easy way to move the mouse and use the keyboard. However, unlike port 445, where a victim has no idea about an intruder accessing the files on his computer, connecting to this port will log the current user out of the current session. If the remote desktop is set to allow multiple sessions at once, though, it is possible for the victim to see what the attacker currently controls. The quality seems to be good, although the resolution is only 804 x 496 pixels at 32-bit color depth. Despite these conditions, it is still easy to access folders, view files, or open applications.

As we can see from this practical demonstration, service ports should be accessible only by authorized systems, not by anyone else. It is also a good reminder to secure the login credentials on your machine, to protect yourself not only from people behind your back but mainly from people on the network.

Summary

In this article, we showed how a connection to these ports is established, how to crack password-protected ports, and how to access them afterwards using tools like ConnectBot or the remote desktop client.

How It All Fits Together

Packt
31 Aug 2015
4 min read
In this article by Jonathan Hayward, author of the book Reactive Programming with JavaScript, he explains that Google Maps was a big hit when it came out, and it remains quite important, but the new functionality it introduced was pretty much nothing. The contribution Google made with its maps site was taking things previously only available with a steep learning cliff and giving them its easy trademark simplicity. And that was quite a lot.

Similar things might be said about ReactJS. No one at Facebook invented functional reactive programming. No one at Facebook appears to have significantly expanded functional reactive programming. But ReactJS markedly lowered the bar to entry. Previously, with respect to functional reactive programming, there were repeated remarks among seasoned C++ programmers; they said, "I guess I'm just stupid, or at least, I don't have a PhD in computational mathematics." And it might be suggested that proficiency in C++ is no mean feat; getting something to work in Python is less of a feat than getting the same thing to work in C++, just as scaling the local park's winter sledding hill is less of an achievement than scaling Mount Everest.

Also, ReactJS introduces enough changes that competent C++ programmers who do not have any kind of degree in math, computational or otherwise, stand a fair chance of using ReactJS and being productive with it. Perhaps they may be less effective than pure JavaScript programmers who are particularly interested in functional programming. But learning to program C++ effectively is a real achievement, and most good C++ programmers have a fair chance of usefully implementing functional reactive programming with ReactJS. The same cannot be said for following the computer math papers on Wikipedia and implementing something in the academic authors' generally preferred language of Haskell. Here we'll explore a very important topic: ReactJS is just a view—but what a view!

ReactJS is just a view, but what a view!

Paul Cézanne famously said, "Monet is just an eye, but what an eye!" Monet didn't try to show off his knowledge of structure and anatomy, but just copied what his eye saw. The consensus judgment of his work holds on to both "just an eye" and "what an eye!" And indeed, the details may be indistinct in Monet, who rebelled against artistry that tried to impress with deep knowledge of anatomy and structure far beyond what jumps out to the eye.

ReactJS is a framework rather than a library, which means that you are supposed to build a solution within the structure provided by ReactJS instead of plugging ReactJS into a solution that you structure yourself. The canonical example of a library is jQuery, where you build a solution your way and call on jQuery as it fits into a structure that you design. However, ReactJS is specialized as a view. It's not that this is necessarily good or bad, but ReactJS is not a complete web development framework, and it does not have even the intention of being the only tool you will ever need. It focuses on being a view, and in Facebook's offering this does not include any form of AJAX call. This is not a monumental oversight in developing ReactJS; the expectation is that you use ReactJS as a View to provide the user interface functionality, and other tools to meet other needs as appropriate.
This text hasn't covered using ReactJS together with your favorite tools, but do combine your favorite tools with ReactJS if they are not going to step on each other's feet. ReactJS may or may not collide with other Views, but it is meant to work with non-View technologies.

Summary

In this article, we looked at ReactJS as a view and also learned that ReactJS is not a complete web development framework.

Building a "Click-to-Go" Robot

Packt
28 Aug 2015
16 min read
In this article by Özen Özkaya and Giray Yıllıkçı, authors of the book Arduino Computer Vision Programming, you will learn how to approach computer vision applications, how to divide an application development process into basic steps, how to realize these design steps, and how to combine a vision system with the Arduino. Now it is time to connect all the pieces into one!

In this article you will learn about building a vision-assisted robot that can go to any point you want within the boundaries of the camera's sight. In this scenario there will be a camera attached to the ceiling and, once you get the video stream from the robot and click on any place in the view, the robot will go there. This application will give you an all-in-one development experience.

Before getting started, let's try to draw the application scheme and define the potential steps. We want to build a vision-enabled robot which can be controlled via a camera attached to the ceiling and, when we click on any point in the camera view, we want our robot to go to this specific point. This operation requires a mobile robot that can communicate with the vision system. The vision system should be able to detect or recognize the robot and calculate the position and orientation of the robot. The vision system should also give us the opportunity to click on any point in the view, and it should calculate the path and the robot movements needed to get to the destination. This scheme requires a communication line between the robot and the vision controller. In the following illustration, you can see the physical scheme of the application setup on the left hand side and the user application window on the right hand side.

After interpreting the application scheme, the next step is to divide the application into small steps by using the computer vision approach. In the data acquisition phase, we'll only use the scene's video stream. There won't be an external sensor on the robot because, for this application, we don't need one. Camera selection is important, and the camera distance (the height from the robot plane) should be enough to see the whole area. We'll use the blue and red circles above the robot to detect the robot and calculate its orientation. We don't need smaller details. A resolution of about 640x480 pixels is sufficient for a camera distance of 120 cm. We need an RGB camera stream because we'll use the color properties of the circles. We will use the Logitech C110, which is an affordable webcam. Any other OpenCV-compatible webcam will work because this application is not very demanding in terms of vision input. If you need more cable length you can use a USB extension cable.

In the preprocessing phase, the first step is to remove the small details from the surface. Blurring is a simple and effective operation for this purpose. If you need to, you can resize your input image to reduce the image size and processing time. Do not forget that, if you resize to too small a resolution, you won't be able to extract useful information. The following picture is of the Logitech C110 webcam:

The next step is processing. There are two main steps in this phase. The first step is to detect the circles in the image. The second step is to calculate the robot orientation and the path to the destination point. The robot can then follow the path and reach its destination. In the color processing step, we can apply color filters to the image to get the image masks of the red circle and the blue circle, as shown in the following picture.
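The following is a minimal sketch of this color-filtering step and of the blob and orientation processing described in the next paragraphs, written with OpenCV's Python bindings. The HSV threshold values, camera index, destination point, angle thresholds, and turn-direction convention are assumptions for illustration only; the actual values depend on your camera, lighting, and setup, and the book's own implementation may differ:

    import math
    import cv2
    import numpy as np

    def largest_blob_center(mask):
        # Centroid of the largest contour in a binary mask, or None if empty.
        contours = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                    cv2.CHAIN_APPROX_SIMPLE)[-2]
        if not contours:
            return None
        c = max(contours, key=cv2.contourArea)
        m = cv2.moments(c)
        if m["m00"] == 0:
            return None
        return (int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"]))

    destination = (320, 240)       # assumed "clicked" point in the image
    cap = cv2.VideoCapture(0)      # assumed camera index

    ret, frame = cap.read()
    if ret:
        blurred = cv2.GaussianBlur(frame, (5, 5), 0)      # preprocessing step
        hsv = cv2.cvtColor(blurred, cv2.COLOR_BGR2HSV)

        # Approximate HSV ranges; red wraps around the hue axis, so two ranges.
        red_mask = cv2.bitwise_or(
            cv2.inRange(hsv, np.array([0, 120, 70]), np.array([10, 255, 255])),
            cv2.inRange(hsv, np.array([170, 120, 70]), np.array([180, 255, 255])))
        blue_mask = cv2.inRange(hsv, np.array([100, 120, 70]),
                                np.array([130, 255, 255]))

        red = largest_blob_center(red_mask)     # robot head
        blue = largest_blob_center(blue_mask)   # robot tail
        if red is not None and blue is not None:
            center = ((red[0] + blue[0]) // 2, (red[1] + blue[1]) // 2)
            heading = math.degrees(math.atan2(red[1] - blue[1], red[0] - blue[0]))
            target = math.degrees(math.atan2(destination[1] - center[1],
                                             destination[0] - center[0]))
            error = (target - heading + 180) % 360 - 180   # normalize to [-180, 180)
            distance = math.hypot(destination[0] - center[0],
                                  destination[1] - center[1])

            # Simple decision: turn until roughly aligned, then go forward.
            # The Left!/Right! mapping depends on the camera orientation.
            if distance < 20:
                command = "Stop!"
            elif error > 15:
                command = "Right!"
            elif error < -15:
                command = "Left!"
            else:
                command = "Go!"
            print(center, heading, command)
    cap.release()

In the full application, the resulting command string would be written to the USB serial link of the transmitter Arduino described later in this article, which forwards it to the robot over RF.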
Then we can use contour detection or blob analysis to detect the circles and extract useful features. It is important to keep it simple and logical. Blob analysis detects the bounding boxes of the two circles on the robot and, if we draw a line between the centers of the circles, once we calculate the line's angle, we get the orientation of the robot itself. The mid-point of this line will be the center of the robot. If we draw a line from the center of the robot to the destination point, we obtain the straightest route.

The circles on the robot could also be detected by using the Hough transform for circles but, because it is a relatively slow algorithm and it is hard to extract image statistics from its results, the blob analysis-based approach is better. Another approach is using SURF, SIFT, or ORB features, but these methods probably won't provide fast real-time behavior, so blob analysis will probably work better. After detecting blobs, we can apply post-filtering to remove the unwanted blobs. We can use the diameter of the circles, the area of the bounding box, and the color information to filter out the unwanted blobs. By using the properties of the blobs (the extracted features), it is possible to detect or recognize the circles, and then the robot. To be able to check whether the robot has reached the destination or not, a distance calculation from the center of the robot to the destination point is useful. In this scenario, the robot will be detected by our vision controller, and detecting the center of the robot is sufficient to track it.

Once we calculate the robot's position and orientation, we can combine this information with the distance and orientation to the destination point, and we can send the robot the commands to move it! Efficient planning algorithms can be applied in this phase, but we'll implement a simple path planning approach. Firstly, the robot will orientate itself towards the destination point by turning right or left, and then it will go forward to reach the destination. This scenario will work for environments without obstacles. If you want to extend the application to a complex environment with obstacles, you should implement an obstacle detection mechanism and an efficient path planning algorithm.

We can send commands such as Left!, Right!, Go!, or Stop! to the robot over a wireless link. RF communication is an efficient solution for this problem. In this scenario, we need two nRF24L01 modules—the first module is connected to the robot controller and the other is connected to the vision controller. The Arduino is the perfect means to control the robot and communicate with the vision controller. The vision controller can be built on any hardware platform such as a PC, tablet, or smartphone. The vision controller application can be implemented on lots of operating systems, as OpenCV is platform-independent. We preferred Windows and a laptop to run our vision controller application. As you can see, we have divided our application into small and easy-to-implement parts. Now it is time to build them all!

Building a robot

It is time to explain how to build our Click-to-Go robot. Before going any further we would like to boldly say that robotic projects can teach us the fundamental fields of science such as mechanics, electronics, and programming. As we go through the building process of our Click-to-Go robot, you will see that we have kept it as simple as possible.
Moreover, instead of buying a ready-to-use robot kit, we have built our own simple and robust robot. Of course, if you are planning to buy a robot kit or already have a kit available, you can simply adapt your existing robot to this project.

Our robot design is relatively simple in terms of mechanics. We will use only a box-shaped container platform, two gear motors with two individual wheels, a battery to drive the motors, one nRF24L01 Radio Frequency (RF) transceiver module, a bunch of jumper wires, an L293D IC and, of course, one Arduino Uno board. We will use one more nRF24L01 and one more Arduino Uno for the vision controller communication circuit.

Our Click-to-Go robot will be operated by a simplified version of a differential drive. A differential drive can be summarized as a relative speed change on the wheels, which assigns a direction to the robot. In other words, if both wheels spin at the same rate, the robot goes forward. To drive in reverse, the wheels spin in the opposite direction. To turn left, the left wheel turns backwards and the right wheel stays still or turns forwards. Similarly, to turn right, the right wheel turns backwards and the left stays still or turns forwards. You can get curved paths by varying the rotation speeds of the wheels. Yet, to keep this comprehensive project manageable, we will drive both motors forward to go forwards. To turn left, the left wheel stays still and the right wheel turns forward. Symmetrically, to turn right, the right wheel stays still and the left wheel runs forward. We will not run the motors in reverse to go backwards; instead, we will change the direction of the robot by turning right or left.

Building mechanics

As we stated earlier, the mechanics of the robot are fairly simple. First of all we need a small box-shaped container to use both as a rigid surface and as storage for the battery and electronics. For this purpose, we will use a simple plywood box. We will attach the gear motors to the front of the plywood box and some kind of support surface to the bottom of the box. As can be seen in the following picture, we used a small wooden rod to support the back of the robot and level the box.

If you think that the wooden rod support is dragging, we recommend adding a small ball support similar to Pololu's ball caster, shown at https://www.pololu.com/product/950. It is not a very expensive component and it significantly improves the mobility of the robot. You may want to drill two holes next to the motor wirings to keep the platform tidy. The easiest way to attach the motors and the support rod is by using two-sided tape. Just make sure that the tape is not too thin; it is much better to use two-sided foam tape.

The top side of the robot can be covered with a black shell to enhance the contrast between the red and blue circles. We will use these circles to ascertain the orientation of the robot during the operation, as mentioned earlier. For now, don't worry too much about this detail; just be aware that we need to cover the top of the robot with a flat surface. We will explain in detail how these red and blue circles are used. It is worth mentioning that we used large water bottle lids, and that it is better to use matt surfaces instead of shiny surfaces to avoid glare in the image. The finished Click-to-Go robot should be similar to the robot shown in the following picture.
The robot's head is on the side with the red circle. Now that we have covered building the mechanics of our robot, we can move on to the electronics.

Building the electronics

We will use two separate Arduino Unos for this vision-enabled robot project, one each for the robot and the transmitter system. The electronic setup needs a little more attention than the mechanics. The electronic components of the robot and the transmitter units are similar; however, the robot needs more work. We have selected nRF24L01 modules for wireless communication. These modules are reliable and easy to find, both on the Internet and in local hobby stores. It is possible to use any pair of wireless connectivity modules but, for this project, we will stick with the nRF24L01 modules shown in this picture.

For driving the motors we will need a quadruple half-H driver, the L293D. Again, every electronics shop should have these ICs. As a reminder, you may want to buy a couple of spare L293D ICs in case you burn the IC by mistake. Following is a picture of the L293D IC.

We will need a bunch of jumper wires to connect the components together. It is nice to have a small breadboard for the robot/receiver to wire up the L293D. The transmitter part is very simple, so a breadboard is not essential.

Robot/receiver and transmitter drawings

The drawings of both the receiver and the transmitter have two common modules: the Arduino Uno and the nRF24L01 connectivity module. The connections of the nRF24L01 modules on both sides are the same. In addition to these connectivity modules, for the receiver we need to put some effort into connecting the L293D IC and the battery to power up the motors. In the following picture, we can see a drawing of the transmitter. As it will always be connected to the OpenCV platform via the USB cable, there is no need to feed the system with an external battery.

As shown in the following picture of the receiver and the robot, it is a good idea to separate the motor battery from the battery that feeds the Arduino Uno board, because the motors may draw or induce high loads, which can easily damage the Arduino board's pin outs. Another reason is to keep the Arduino working even if the motor battery has drained. Separating the feeder batteries is a very good practice to follow if you are planning to use more than one 12V battery. To keep everything safe, we fed the Arduino Uno with a 6V battery pack and the motors with a 9V battery.

Drawings of receiver systems can be a little bit confusing and lead to errors. It is a good idea to open the drawings and investigate how the connections are made by using Fritzing. You can download the Fritzing drawings of this project from https://github.com/ozenozkaya/click_to_go_robot_drawings. To download the Fritzing application, visit the Fritzing download page: http://fritzing.org/download/

Building the robot controller and communications

We are now ready to go through the software implementation of the robot and the transmitter. Basically, what we are doing here is building the required connectivity to send data continuously from OpenCV to the remote robot via a transmitter. OpenCV will send commands through a USB cable to the first Arduino board (the transmitter), which will then send the data to the unit on the robot over the RF module. Follow these steps:

Before explaining the code, we need to import the RF24 library.
To download the RF24 library, go to the GitHub link at https://github.com/maniacbug/RF24. After downloading the library, go to Sketch | Include Library | Add .ZIP Library… to include the library in the Arduino IDE environment. After clicking Add .ZIP Library…, a window will appear. Go into the downloads directory and select the RF24-master folder that you just downloaded. Now you are ready to use the RF24 library. As a reminder, including a library in the Arduino IDE is pretty much the same on every platform.

It is time to move on to the explanation of the code! It is important to mention that we use the same code for both the robot and the transmitter, with a small trick: the same code works differently for the robot and for the transmitter. The receiver mode needs the analog 4 pin grounded. The idea behind the operation is simple: we set role_pin to high through its internal pull-up resistor, so it reads high even if you don't connect it, but you can still safely connect it to ground and it will read low. Basically, the analog 4 pin reads 0 if there is a connection to a ground pin; if there is no connection to ground, the analog 4 pin value stays at 1. By doing this at the beginning, we determine the role of the board and can use the same code on both sides. Here is the code:

    #include <SPI.h>
    #include "nRF24L01.h"
    #include "RF24.h"

    #define MOTOR_PIN_1 3
    #define MOTOR_PIN_2 5
    #define MOTOR_PIN_3 6
    #define MOTOR_PIN_4 7
    #define ENABLE_PIN 4
    #define SPI_ENABLE_PIN 9
    #define SPI_SELECT_PIN 10

    const int role_pin = A4;
    typedef enum {transmitter = 1, receiver} e_role;
    unsigned long motor_value[2];
    String input_string = "";
    boolean string_complete = false;
    RF24 radio(SPI_ENABLE_PIN, SPI_SELECT_PIN);
    const uint64_t pipes[2] = { 0xF0F0F0F0E1LL, 0xF0F0F0F0D2LL };
    e_role role = receiver;

    void setup() {
      // The role pin decides whether this board acts as transmitter or receiver.
      pinMode(role_pin, INPUT);
      digitalWrite(role_pin, HIGH);
      delay(20);
      radio.begin();
      radio.setRetries(15, 15);
      Serial.begin(9600);
      Serial.println(" Setup Finished");
      if (digitalRead(role_pin)) {
        Serial.println(digitalRead(role_pin));
        role = transmitter;
      } else {
        Serial.println(digitalRead(role_pin));
        role = receiver;
      }
      if (role == transmitter) {
        radio.openWritingPipe(pipes[0]);
        radio.openReadingPipe(1, pipes[1]);
      } else {
        pinMode(MOTOR_PIN_1, OUTPUT);
        pinMode(MOTOR_PIN_2, OUTPUT);
        pinMode(MOTOR_PIN_3, OUTPUT);
        pinMode(MOTOR_PIN_4, OUTPUT);
        pinMode(ENABLE_PIN, OUTPUT);
        digitalWrite(ENABLE_PIN, HIGH);
        radio.openWritingPipe(pipes[1]);
        radio.openReadingPipe(1, pipes[0]);
      }
      radio.startListening();
    }

    void loop() {
      // TRANSMITTER CODE BLOCK //
      if (role == transmitter) {
        Serial.println("Transmitter");
        if (string_complete) {
          if (input_string == "Right!") {
            motor_value[0] = 0;
            motor_value[1] = 120;
          } else if (input_string == "Left!") {
            motor_value[0] = 120;
            motor_value[1] = 0;
          } else if (input_string == "Go!") {
            motor_value[0] = 120;
            motor_value[1] = 110;
          } else {
            motor_value[0] = 0;
            motor_value[1] = 0;
          }
          input_string = "";
          string_complete = false;
        }
        radio.stopListening();
        radio.write(motor_value, 2 * sizeof(unsigned long));
        radio.startListening();
        delay(20);
      }

      // RECEIVER CODE BLOCK //
      if (role == receiver) {
        Serial.println("Receiver");
        if (radio.available()) {
          bool done = false;
          while (!done) {
            done = radio.read(motor_value, 2 * sizeof(unsigned long));
            delay(20);
          }
          Serial.println(motor_value[0]);
          Serial.println(motor_value[1]);
          analogWrite(MOTOR_PIN_1, motor_value[1]);
          digitalWrite(MOTOR_PIN_2, LOW);
          analogWrite(MOTOR_PIN_3, motor_value[0]);
          digitalWrite(MOTOR_PIN_4, LOW);
          radio.stopListening();
          radio.startListening();
        }
      }
    }

    void serialEvent() {
      while (Serial.available()) {
        // get the new byte:
        char inChar = (char)Serial.read();
        // add it to the input string:
        input_string += inChar;
        // if the incoming character is a command terminator ('!' or '?'),
        // set a flag so the main loop can do something about it:
        if (inChar == '!' || inChar == '?') {
          string_complete = true;
          Serial.print("data_received");
        }
      }
    }

This example code is taken from one of the examples in the RF24 library. We have changed it in order to serve our needs in this project. The original example can be found in the RF24-master/Examples/pingpair directory.

Summary

We have combined everything we have learned up to now and built an all-in-one application. By designing and building the Click-to-Go robot from scratch you have embraced the concepts. You can see that the vision approach works very well, even for complex applications. You now know how to divide a computer vision application into small pieces, how to design and implement each design step, and how to efficiently use the tools you have.

Asynchronous Programming with Python

Packt
26 Aug 2015
20 min read
In this article by Giancarlo Zaccone, the author of the book Python Parallel Programming Cookbook, we will cover the following topics:
- Introducing Asyncio
- GPU programming with Python
- Introducing PyCUDA
- Introducing PyOpenCL

An asynchronous model is of fundamental importance along with the concept of event programming. The execution model of asynchronous activities can be implemented using a single stream of main control, both in uniprocessor systems and multiprocessor systems. In the asynchronous model of concurrent execution, various tasks intersect with each other along the timeline, and all of this happens under the action of a single flow of control (single-threaded). The execution of a task can be suspended and then resumed, alternating in time with any other task.

The asynchronous programming model

As you can see in the preceding figure, the tasks (each with a different color) are interleaved with one another, but they are in a single thread of control. This implies that when one task is in execution, the other tasks are not. A key difference between the multithreaded programming model and the single-threaded asynchronous concurrent model is that in the first case, the operating system decides on the timeline whether to suspend the activity of a thread and start another, while in the second case, the programmer must assume that a thread may be suspended and replaced with another at almost any time.

Introducing Asyncio

The Python module Asyncio provides facilities to manage events, coroutines, tasks and threads, and synchronization primitives to write concurrent code. When a program becomes very long and complex, it is convenient to divide it into subroutines, each of which realizes a specific task, for which the program implements a suitable algorithm. A subroutine cannot be executed independently, but only at the request of the main program, which is then responsible for coordinating the use of subroutines.

Coroutines are a generalization of the subroutine. Like a subroutine, a coroutine computes a single computational step, but unlike subroutines, there is no main program to coordinate the results. This is because the coroutines link themselves together to form a pipeline without any supervising function responsible for calling them in a particular order. In a coroutine, the execution point can be suspended and resumed later, with its local state kept track of in the intervening time.

In this example, we see how to use the coroutine mechanism of Asyncio to simulate a finite state machine of five states. A finite-state automaton (FSA) is a mathematical model that is widely used not only in engineering disciplines but also in sciences such as mathematics and computer science. The automaton whose behavior we want to simulate is as follows:

Finite State Machine

We have indicated with S0, S1, S2, S3, and S4 the states of the system, with 0 and 1 as the values for which the automaton can pass from one state to the next (this operation is called a transition). So, for example, the state S0 can pass to the state S1 only for the value 1, and S0 can pass to the state S2 only for the value 0.
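As a warm-up before the full state machine, here is a minimal sketch of two coroutines chained with yield from, written in the same pre-async/await style (Python 3.4) used by the listing that follows; the names are illustrative only and are not part of the book's example:

    import asyncio

    @asyncio.coroutine
    def worker(value):
        # Suspension point: control returns to the event loop for 0.1 s.
        yield from asyncio.sleep(0.1)
        return value * 2

    @asyncio.coroutine
    def caller():
        # caller() is suspended here until worker() completes, then resumed.
        result = yield from worker(21)
        print("Result:", result)

    loop = asyncio.get_event_loop()
    loop.run_until_complete(caller())
    loop.close()

The same chaining pattern, repeated across five coroutines, is what drives the finite state machine below.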
The Python code that follows simulates a transition of the automaton from the state S0, the so-called Start State, up to the state S4, the End State:

    # Asyncio Finite State Machine
    import asyncio
    import time
    from random import randint

    @asyncio.coroutine
    def StartState():
        print("Start State called \n")
        input_value = randint(0, 1)
        time.sleep(1)
        if (input_value == 0):
            result = yield from State2(input_value)
        else:
            result = yield from State1(input_value)
        print("Resume of the Transition : \nStart State calling " + result)

    @asyncio.coroutine
    def State1(transition_value):
        outputValue = str(("State 1 with transition value = %s \n" % (transition_value)))
        input_value = randint(0, 1)
        time.sleep(1)
        print("...Evaluating...")
        if (input_value == 0):
            result = yield from State3(input_value)
        else:
            result = yield from State2(input_value)
        result = "State 1 calling " + result
        return (outputValue + str(result))

    @asyncio.coroutine
    def State2(transition_value):
        outputValue = str(("State 2 with transition value = %s \n" % (transition_value)))
        input_value = randint(0, 1)
        time.sleep(1)
        print("...Evaluating...")
        if (input_value == 0):
            result = yield from State1(input_value)
        else:
            result = yield from State3(input_value)
        result = "State 2 calling " + result
        return (outputValue + str(result))

    @asyncio.coroutine
    def State3(transition_value):
        outputValue = str(("State 3 with transition value = %s \n" % (transition_value)))
        input_value = randint(0, 1)
        time.sleep(1)
        print("...Evaluating...")
        if (input_value == 0):
            result = yield from State1(input_value)
        else:
            result = yield from EndState(input_value)
        result = "State 3 calling " + result
        return (outputValue + str(result))

    @asyncio.coroutine
    def EndState(transition_value):
        outputValue = str(("End State with transition value = %s \n" % (transition_value)))
        print("...Stop Computation...")
        return (outputValue)

    if __name__ == "__main__":
        print("Finite State Machine simulation with Asyncio Coroutine")
        loop = asyncio.get_event_loop()
        loop.run_until_complete(StartState())

After running the code, we have an output similar to this:

    C:\Python CookBook\Chapter 4- Asynchronous Programming\codes - Chapter 4>python asyncio_state_machine.py
    Finite State Machine simulation with Asyncio Coroutine
    Start State called
    ...Evaluating...
    ...Evaluating...
    ...Evaluating...
    ...Evaluating...
    ...Evaluating...
    ...Evaluating...
    ...Evaluating...
    ...Evaluating...
    ...Evaluating...
    ...Evaluating...
    ...Evaluating...
    ...Evaluating...
    ...Stop Computation...
    Resume of the Transition :
    Start State calling State 1 with transition value = 1
    State 1 calling State 3 with transition value = 0
    State 3 calling State 1 with transition value = 0
    State 1 calling State 2 with transition value = 1
    State 2 calling State 3 with transition value = 1
    State 3 calling State 1 with transition value = 0
    State 1 calling State 2 with transition value = 1
    State 2 calling State 1 with transition value = 0
    State 1 calling State 3 with transition value = 0
    State 3 calling State 1 with transition value = 0
    State 1 calling State 2 with transition value = 1
    State 2 calling State 3 with transition value = 1
    State 3 calling End State with transition value = 1

Each state of the automaton has been defined with the @asyncio.coroutine annotation.
For example, the state S0 is:

    @asyncio.coroutine
    def StartState():
        print("Start State called \n")
        input_value = randint(0, 1)
        time.sleep(1)
        if (input_value == 0):
            result = yield from State2(input_value)
        else:
            result = yield from State1(input_value)

The transition to the next state is determined by input_value, which is defined by the randint(0,1) function of Python's random module. This function randomly provides the value 0 or 1, and thereby randomly determines the state to which the finite-state machine will pass:

    input_value = randint(0, 1)

After determining the state to which the finite-state machine will pass, the coroutine calls the next coroutine using the yield from statement:

    if (input_value == 0):
        result = yield from State2(input_value)
    else:
        result = yield from State1(input_value)

The variable result is the value that each coroutine returns. It is a string, and at the end of the computation we can reconstruct the transition from the initial state of the automaton, the Start State, up to the final state, the End State. The main program starts the evaluation inside the event loop:

    if __name__ == "__main__":
        print("Finite State Machine simulation with Asyncio Coroutine")
        loop = asyncio.get_event_loop()
        loop.run_until_complete(StartState())

GPU programming with Python

A graphics processing unit (GPU) is an electronic circuit that specializes in processing data to render images from polygonal primitives. Although they were designed to carry out the rendering of images, GPUs have continued to evolve, becoming more complex and efficient in serving both the real-time and offline rendering communities and in performing any kind of scientific computation. Each GPU is indeed composed of several processing units called streaming multiprocessors (SM), representing the first logic level of parallelism; each SM works simultaneously and independently from the others.

The GPU architecture

Each SM is in turn divided into a group of Stream Processors (SP), each of which has a real execution core and can run a thread sequentially. The SP represents the smallest unit of execution logic and the finest level of parallelism. The division into SMs and SPs is structural in nature, but it is possible to outline a further logical organization of the SPs of a GPU: they are grouped together in logical blocks characterized by a particular mode of execution—all the cores that make up a group run the same instructions at the same time. This is just the SIMD (Single Instruction, Multiple Data) model. The programming paradigm that characterizes GPU computing is also called stream processing, because the data can be viewed as a homogeneous flow of values to which the same operations are applied synchronously. Currently, the most efficient solutions for exploiting the computing power provided by GPU cards are the software libraries CUDA and OpenCL.

Introducing PyCUDA

PyCUDA is a Python wrapper for CUDA (Compute Unified Device Architecture), the software library developed by NVIDIA for GPU programming. The PyCUDA programming model is designed for the common execution of a program on the CPU and GPU, so as to allow you to perform the sequential parts on the CPU and the more computationally intensive numeric parts on the GPU. The phases to be performed in sequential mode are implemented and executed on the CPU (host), while the steps to be performed in parallel are implemented and executed on the GPU (device).
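Before walking through the execution model, a quick way to confirm that PyCUDA is installed and can see a CUDA-capable GPU is the following sketch (not part of the book's example; the device index 0 is an assumption):

    import pycuda.driver as cuda
    import pycuda.autoinit          # picks a GPU and creates a context

    dev = cuda.Device(0)
    print("Detected %d CUDA device(s)" % cuda.Device.count())
    print("Device 0: %s" % dev.name())
    print("Compute capability: %d.%d" % dev.compute_capability())
    print("Total memory: %d MB" % (dev.total_memory() // (1024 * 1024)))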
The functions to be performed in parallel on the device are called kernels. The skeleton general for the execution of a generic function kernel on the device is as follows: Allocation of memory on the device. Transfer of data from the host memory to that allocated on the device. Running the device: Running the configuration. Invocation of the kernel function. Transfer of the results from the memory on the device to the host memory. Release of the memory allocated on the device. The PyCUDA programming model To show the PyCuda workflow, let's consider a 5 × 5 random array and the following procedure: Create the array 5×5 on the CPU. Transfer the array to the GPU. Perform a Task[NV2]  on the array in the GPU (double all the items in the array). Transfer the array from the GPU to the CPU. Print the results. The code for this is as follows: import pycuda.driver as cuda import pycuda.autoinit from pycuda.compiler import SourceModule import numpy a = numpy.random.randn(5,5) a = a.astype(numpy.float32) a_gpu = cuda.mem_alloc(a.nbytes) cuda.memcpy_htod(a_gpu, a) mod = SourceModule(""" __global__ void doubleMatrix(float *a) { int idx = threadIdx.x + threadIdx.y*4; a[idx] *= 2; } """) func = mod.get_function("doubleMatrix") func(a_gpu, block=(5,5,1)) a_doubled = numpy.empty_like(a) cuda.memcpy_dtoh(a_doubled, a_gpu) print ("ORIGINAL MATRIX") print a print ("DOUBLED MATRIX AFTER PyCUDA EXECUTION") print a_doubled The example output should be like this : C:Python CookBookChapter 6 - GPU Programming with Python >python PyCudaWorkflow.py ORIGINAL MATRIX [[-0.59975582 1.93627465 0.65337795 0.13205571 -0.46468592] [ 0.01441949 1.40946579 0.5343408 -0.46614054 -0.31727529] [-0.06868593 1.21149373 -0.6035406 -1.29117763 0.47762445] [ 0.36176383 -1.443097 1.21592784 -1.04906416 -1.18935871] [-0.06960868 -1.44647694 -1.22041082 1.17092752 0.3686313 ]] DOUBLED MATRIX AFTER PyCUDA EXECUTION [[-1.19951165 3.8725493 1.3067559 0.26411143 -0.92937183] [ 0.02883899 2.81893158 1.0686816 -0.93228108 -0.63455057] [-0.13737187 2.42298746 -1.2070812 -2.58235526 0.95524889] [ 0.72352767 -1.443097 1.21592784 -1.04906416 -1.18935871] [-0.06960868 -1.44647694 -1.22041082 1.17092752 0.3686313 ]] The code starts with the following imports: import pycuda.driver as cuda import pycuda.autoinit from pycuda.compiler import SourceModule The pycuda.autoinit import automatically picks a GPU to run on based on the availability and number. It also creates a GPU context for subsequent code to run in. Both the chosen device and the created context are available from pycuda.autoinit as importable symbols if needed. While the SourceModule component is the object where a C-like code for the GPU must be written. The first step is to generate the input 5 × 5 matrix. Since most GPU computations involve large arrays of data, the NumPy module must be imported: import numpy a = numpy.random.randn(5,5) Then, the items in the matrix are converted in a single precision mode, many NVIDIA cards support only single precision: a = a.astype(numpy.float32) The first operation to be done in order to implement a GPU loads the input array from the host memory (CPU) to the device (GPU). This is done at the beginning of the operation and consists two steps that are performed by invoking two functions provided PyCuda[NV3] . Memory allocation on the device is done via the cuda.mem_alloc function. The device and host memory may not ever communicate while performing a function kernel. 
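As an aside, PyCUDA also ships a higher-level gpuarray module that bundles the allocation and transfer steps described next into single calls. The following is a minimal sketch of the same matrix-doubling task written that way; it assumes a working CUDA driver and the pycuda package, and it is not the workflow used in this article's listing:

# Sketch: doubling a matrix with pycuda.gpuarray instead of explicit memory calls
import numpy
import pycuda.autoinit          # creates the CUDA context
import pycuda.gpuarray as gpuarray

a = numpy.random.randn(5, 5).astype(numpy.float32)
a_gpu = gpuarray.to_gpu(a)      # allocation + host-to-device copy in one call
a_doubled = (2 * a_gpu).get()   # element-wise multiply on the GPU, then copy back

print("ORIGINAL MATRIX")
print(a)
print("DOUBLED MATRIX (gpuarray)")
print(a_doubled)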
Returning to the explicit workflow of the listing: to run a function in parallel on the device, the data it works on must be present in the memory of the device itself. Before you copy data from the host memory to the device memory, you must allocate the required memory on the device:

a_gpu = cuda.mem_alloc(a.nbytes)

The matrix is then copied from the host memory to the device memory with a call to cuda.memcpy_htod:

cuda.memcpy_htod(a_gpu, a)

Note that a_gpu is one dimensional, and on the device we need to handle it as such. None of these operations require the invocation of a kernel; they are performed directly by the main processor. The SourceModule entity serves to define the (C-like) kernel function doubleMatrix that multiplies each array entry by 2:

mod = SourceModule("""
  __global__ void doubleMatrix(float *a)
  {
    int idx = threadIdx.x + threadIdx.y*4;
    a[idx] *= 2;
  }
""")

The __global__ qualifier is a directive indicating that the doubleMatrix function will be processed on the device; it is the CUDA compiler, nvcc, that performs this task. Let's look at the function's body:

int idx = threadIdx.x + threadIdx.y*4;

The idx parameter is the matrix index, identified by the thread coordinates threadIdx.x and threadIdx.y. The matrix element with index idx is then multiplied by 2:

a[idx] *= 2;

Note that the kernel's index calculation uses a hard-coded row stride of 4, and the explanation and output shown earlier correspond to a 4 x 4 block of 16 threads, with threadIdx.x and threadIdx.y each ranging from 0 to 3 and a different pair for each thread; that is why only the first 16 elements of the matrix appear doubled in the output. The call in the listing, however, passes block=(5, 5, 1), that is, 25 threads, so for the full 5 x 5 matrix to be doubled the stride must also match the matrix width (a corrected sketch follows the block-size discussion below).

Thread scheduling is directly linked to the GPU architecture and its intrinsic parallelism. A block of threads is assigned to a single SM. There, threads are further divided into groups called warps. The size of a warp depends on the architecture under consideration. The threads of the same warp are managed by a control unit called the warp scheduler. To take full advantage of the inherent parallelism of the SM, the threads of the same warp must execute the same instruction. If this condition does not hold, we speak of thread divergence: when the threads of a warp execute different instructions, the control unit cannot handle the warp as a whole and must follow the sequences of instructions for every single thread (or for homogeneous subsets of threads) in serial mode. Let's observe how the thread block is divided into warps: threads are divided by the value of threadIdx. The threadIdx structure consists of three fields: threadIdx.x, threadIdx.y, and threadIdx.z.

Thread block subdivision: T(x,y), where x = threadIdx.x and y = threadIdx.y

The code in the kernel function will be compiled automatically by the nvcc CUDA compiler. If there are no errors, a pointer to the compiled function is created. In fact, mod.get_function("doubleMatrix") returns an identifier to the created function, func:

func = mod.get_function("doubleMatrix")

To execute a function on the device, you must first configure the execution appropriately. This means that we need to determine the size of the coordinates that identify and distinguish the threads belonging to different blocks. This is done using the block parameter inside the func call:

func(a_gpu, block = (5, 5, 1))

block = (5, 5, 1) tells us that we are calling the kernel function on the a_gpu linearized input matrix with a single thread block of 5 threads in the x direction, 5 threads in the y direction, and 1 thread in the z direction, 25 threads in total.
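As promised above, here is a hedged sketch of the kernel with the row stride derived from the block dimensions instead of being hard-coded to 4, so that every element of the 5 x 5 matrix is addressed exactly once; the rest of the listing stays the same:

# Sketch: kernel indexing that follows the block width (assumes pycuda installed)
import pycuda.autoinit          # a context must exist before compiling
from pycuda.compiler import SourceModule

mod = SourceModule("""
  __global__ void doubleMatrix(float *a)
  {
    // row stride taken from the block width instead of a hard-coded 4
    int idx = threadIdx.x + threadIdx.y * blockDim.x;
    a[idx] *= 2;
  }
""")
func = mod.get_function("doubleMatrix")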
This block structure is designed around the parallel implementation of the algorithm of interest. The division of the workload is an early form of parallelism that is both necessary and sufficient to make use of the computing resources provided by the GPU. Once you have configured the kernel's invocation, you can invoke the kernel function, which executes its instructions in parallel on the device; each thread executes the same kernel code. After the computation on the GPU device, we use an array to store the results:

a_doubled = numpy.empty_like(a)
cuda.memcpy_dtoh(a_doubled, a_gpu)

Introducing PyOpenCL

As with PyCUDA programming, the first step in building a program for PyOpenCL is the encoding of the host application. This runs on the host computer (typically, the user's PC) and dispatches the kernel application to the connected devices (GPU cards). The host application must contain five data structures, which are as follows:

Device: This identifies the hardware where the kernel code must be executed. A PyOpenCL application can be executed not only on CPU and GPU cards but also on embedded devices such as FPGAs (Field Programmable Gate Arrays).
Program: This is a group of kernels. A program selects which kernel must be executed on the device.
Kernel: This is the code to be executed on the device. A kernel is essentially a (C-like) function that can be compiled for execution on any device that supports OpenCL drivers. The kernel is the only way the host can call a function that will run on a device. When the host invokes a kernel, many work items start running on the device. Each work item runs the code of the kernel, but works on a different part of the dataset.
Command queue: Each device receives kernels through this data structure. A command queue orders the execution of kernels on the device.
Context: This is a group of devices. A context allows devices to receive kernels and transfer data.

PyOpenCL programming

The preceding figure shows how these data structures work in a host application. Remember that a program can contain multiple functions to be executed on the device, and that each kernel encapsulates only a single function from the program. In this example, we show the basic steps for building a PyOpenCL program. The task to be executed is the parallel sum of two vectors. In order to keep the output readable, we consider two vectors of 100 elements each: the ith element of the resulting vector is the sum of the ith element of vector_a and the ith element of vector_b.
Of course, to be able to appreciate the parallel execution of this code, you can also increase some orders of magnitude the size of the input vector_dimension:[NV7]  import numpy as np import pyopencl as cl import numpy.linalg as la vector_dimension = 100 vector_a = np.random.randint(vector_dimension, size=vector_dimension) vector_b = np.random.randint(vector_dimension, size=vector_dimension) platform = cl.get_platforms()[0] device = platform.get_devices()[0] context = cl.Context([device]) queue = cl.CommandQueue(context) mf = cl.mem_flags a_g = cl.Buffer(context, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=vector_a) b_g = cl.Buffer(context, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=vector_b) program = cl.Program(context, """ __kernel void vectorSum(__global const int *a_g, __global const int *b_g, __global int *res_g) { int gid = get_global_id(0); res_g[gid] = a_g[gid] + b_g[gid]; } """).build() res_g = cl.Buffer(context, mf.WRITE_ONLY, vector_a.nbytes) program.vectorSum(queue, vector_a.shape, None, a_g, b_g, res_g) res_np = np.empty_like(vector_a) cl.enqueue_copy(queue, res_np, res_g) print ("PyOPENCL SUM OF TWO VECTORS") print ("Platform Selected = %s" %platform.name ) print ("Device Selected = %s" %device.name) print ("VECTOR LENGTH = %s" %vector_dimension) print ("INPUT VECTOR A") print vector_a print ("INPUT VECTOR B") print vector_b print ("OUTPUT VECTOR RESULT A + B ") print res_np assert(la.norm(res_np - (vector_a + vector_b))) < 1e-5 The output from Command Prompt should be like this: C:Python CookBook Chapter 6 - GPU Programming with PythonChapter 6 - codes>python PyOpenCLParallellSum.py Platform Selected = NVIDIA CUDA Device Selected = GeForce GT 240 VECTOR LENGTH = 100 INPUT VECTOR A [ 0 29 88 46 68 93 81 3 58 44 95 20 81 69 85 25 89 39 47 29 47 48 20 86 59 99 3 26 68 62 16 13 63 28 77 57 59 45 52 89 16 6 18 95 30 66 19 29 31 18 42 34 70 21 28 0 42 96 23 86 64 88 20 26 96 45 28 53 75 53 39 83 85 99 49 93 23 39 1 89 39 87 62 29 51 66 5 66 48 53 66 8 51 3 29 96 67 38 22 88] INPUT VECTOR B [98 43 16 28 63 1 83 18 6 58 47 86 59 29 60 68 19 51 37 46 99 27 4 94 5 22 3 96 18 84 29 34 27 31 37 94 13 89 3 90 57 85 66 63 8 74 21 18 34 93 17 26 9 88 38 28 14 68 88 90 18 6 40 30 70 93 75 0 45 86 15 10 29 84 47 74 22 72 69 33 81 31 45 62 81 66 69 14 71 96 91 51 35 4 63 36 28 65 10 41] OUTPUT VECTOR RESULT A + B [ 98 72 104 74 131 94 164 21 64 102 142 106 140 98 145 93 108 90 84 75 146 75 24 180 64 121 6 122 86 146 45 47 90 59 114 151 72 134 55 179 73 91 84 158 38 140 40 47 65 111 59 60 79 109 66 28 56 164 111 176 82 94 60 56 166 138 103 53 120 139 54 93 114 183 96 167 45 111 70 122 120 118 107 91 132 132 74 80 119 149 157 59 86 7 92 132 95 103 32 129] In the first line of the code after the required module import, we defined the input vectors: vector_dimension = 100 vector_a = np.random.randint(vector_dimension, size= vector_dimension) vector_b = np.random.randint(vector_dimension, size= vector_dimension) Each vector contain 100 integers items that are randomly selected thought the NumPy function: np.random.randint(max integer , size of the vector) Then, we must select the device to run the kernel code. To do this, we must first select the platform using the get_platform() PyOpenCL statement: platform = cl.get_platforms()[0] This platform, as you can see from the output, corresponds to the NVIDIA CUDA platform. 
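If more than one OpenCL platform or device is installed, hard-coding index [0] as in the listing may pick a platform or device other than the one you expect. The following is a small sketch for listing what is available before choosing; the attribute names follow PyOpenCL's documented API:

# Sketch: enumerate the available OpenCL platforms and their devices
import pyopencl as cl

for p_index, platform in enumerate(cl.get_platforms()):
    print("Platform %d: %s" % (p_index, platform.name))
    for d_index, device in enumerate(platform.get_devices()):
        print("    Device %d: %s" % (d_index, device.name))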
Then, we must select the device using the platform's get_devices() method:

device = platform.get_devices()[0]

In the following steps, the context and the queue are defined; PyOpenCL provides the Context (built from the selected device) and CommandQueue (built from the selected context) classes:

context = cl.Context([device])
queue = cl.CommandQueue(context)

To perform the computation on the device, the input vectors must be transferred to the device's memory. So, two input buffers in the device memory must be created:

mf = cl.mem_flags
a_g = cl.Buffer(context, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=vector_a)
b_g = cl.Buffer(context, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=vector_b)

Also, we prepare the buffer for the resulting vector:

res_g = cl.Buffer(context, mf.WRITE_ONLY, vector_a.nbytes)

Finally, the core of the script: the kernel code is defined inside a program as follows:

program = cl.Program(context, """
__kernel void vectorSum(__global const int *a_g, __global const int *b_g, __global int *res_g) {
  int gid = get_global_id(0);
  res_g[gid] = a_g[gid] + b_g[gid];
}
""").build()

The kernel's name is vectorSum. The parameter list defines the data types of the input arguments (vectors of integers) and of the output (again, a vector of integers). Inside the kernel, the sum of the two vectors is simply defined as:

Initialize the vector index: int gid = get_global_id(0)
Sum the vectors' components: res_g[gid] = a_g[gid] + b_g[gid];

In OpenCL and PyOpenCL, buffers are attached to a context and are only moved to a device once the buffer is used on that device. Finally, we execute vectorSum on the device:

program.vectorSum(queue, vector_a.shape, None, a_g, b_g, res_g)

To visualize the results, an empty vector is built:

res_np = np.empty_like(vector_a)

Then, the result is copied into this vector:

cl.enqueue_copy(queue, res_np, res_g)

Finally, the results are displayed:

print ("VECTOR LENGTH = %s" %vector_dimension)
print ("INPUT VECTOR A")
print vector_a
print ("INPUT VECTOR B")
print vector_b
print ("OUTPUT VECTOR RESULT A + B ")
print res_np

To check the result, we use the assert statement. It tests the result and triggers an error if the condition is false:

assert(la.norm(res_np - (vector_a + vector_b))) < 1e-5

Summary

In this article, we discussed Asyncio, GPU programming with Python, PyCUDA, and PyOpenCL.

Resources for Article:

Further resources on this subject:
Bizarre Python[article]
Scientific Computing APIs for Python[article]
Optimization in Python [article]

Installing/upgrading PowerShell

Packt
26 Aug 2015
9 min read
In this article written by Michael Shepard, author of the book Getting Started with PowerShell, the author goes on to explain that if you don't have PowerShell installed or want a more recent version of PowerShell, you'll need to find the Windows Management Framework (WMF) download that matches the PowerShell version you want. WMF includes PowerShell as well as other related tools such as Windows Remoting (WinRM), Windows Management Instrumentation (WMI), and Desired State Configuration (DSC). The contents of the distribution change from version to version, so make sure to read the release notes included in the download. (For more resources related to this topic, see here.) Here are links to the installers: PowerShell Version URL 1.0 http://support.microsoft.com/kb/926139 2.0 http://support2.microsoft.com/kb/968929/en-us 3.0 http://www.microsoft.com/en-us/download/details.aspx?id=34595 4.0 http://www.microsoft.com/en-us/download/details.aspx?id=40855 5.0 (Feb. Preview) http://www.microsoft.com/en-us/download/details.aspx?id=45883 Note that PowerShell 5.0 has not been officially released, so the table lists the February 2015 preview, the latest at the time of writing. The PowerShell 1.0 installer was released as an executable (.exe), but since then the releases have all been as standalone Windows update installers (.msu). All of these are painless to execute. You can simply download the file and run it from the explorer or from the Run… option in the start menu. PowerShell installs don't typically require a reboot but it's best to plan on doing one, just in case. It's important to note that you can only have one version of PowerShell installed, and you can't install a lower version than the version that was shipped with your OS. Also, there are noted compatibility issues between various versions of PowerShell and Microsoft products such as Exchange, System Center, and Small Business Server, so make sure to read the system requirements section on the download page. Most of the conflicts can be resolved with a service pack of the software, but you should be sure of this before upgrading PowerShell on a server. Starting a PowerShell session We already started a PowerShell session earlier in the section on using PowerShell to find the installed version. So, what more is there to see? It turns out that there is more than one program used to run PowerShell, possibly more than one version of each of these programs, and finally, more than one way to start each of them. It might sound confusing but it will all make sense shortly. PowerShell hosts A PowerShell host is a program that provides access to the PowerShell engine in order to run PowerShell commands and scripts. The PowerShell.exe that we saw in the PSHOME directory is known as the console host. It is cosmetically similar to Command Prompt (cmd.exe) and only provides a command-line interface. Starting with Version 2.0 of PowerShell, a second host was provided. The Integrated Scripting Environment (ISE) is a graphical environment providing multiple editors in a tabbed interface along with menus and the ability to use plugins. While not as fully featured as an Integrated Development Environment (IDE), the ISE is a tremendous productivity tool used to build PowerShell scripts and is a great improvement over using an editor, such as notepad for development. The ISE executable is stored in PSHOME, and is named powershell_ise.exe. In Version 2.0 of the ISE, there were three sections, a tabbed editor, a console for input, and a section for output. 
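Whichever host you are in, a quick way to confirm the PowerShell version and the host you are currently running is to inspect the built-in automatic variables; the exact output will vary with your installation:

# Show the engine version and the name of the current host
$PSVersionTable.PSVersion
$Host.Name    # 'ConsoleHost' in powershell.exe, 'Windows PowerShell ISE Host' in the ISE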
Returning to the ISE's layout: starting with Version 3.0, the input and output sections were combined into a single console that is more similar to the interface of the console host. The Version 4.0 ISE is shown as follows:

I will be using the Light Console, Light Editor theme for the ISE in most of the screenshots for this book, because the dark console does not work well on the printed page. To switch to this theme, open the Options item in the Tools menu and select Manage Themes... in the options window:

Press the Manage Themes... button, select the Light Console, Light Editor option from the list, and press OK. Press OK again to exit the options screen, and your ISE should look something similar to the following:

Note that you can customize the appearance of the text in the editor and the console pane in other ways as well. Other than switching to the light console display, I will try to keep the settings at their defaults.

64-bit and 32-bit PowerShell

In addition to the console host and the ISE, if you have a 64-bit operating system, you will also have 64-bit and 32-bit PowerShell installations that include separate copies of both hosts. As mentioned before, the main installation directory, or PSHOME, is found at %WINDIR%\System32\WindowsPowerShell\v1.0. The version of PowerShell in PSHOME matches that of the operating system. In other words, on a 64-bit OS, the PowerShell in PSHOME is 64-bit. On a 32-bit system, PSHOME has a 32-bit PowerShell install. On a 64-bit system, a second 32-bit install is found in %WINDIR%\SysWOW64\WindowsPowerShell\v1.0.

Isn't that backward? It seems backward that the 64-bit install is in the System32 folder and the 32-bit install is in SysWOW64. The System32 folder is always the primary system directory on a Windows computer, and this name has remained for backward compatibility reasons. SysWOW64 is short for Windows on Windows 64-bit. It contains the 32-bit binaries required for 32-bit programs to run on a 64-bit system, since 32-bit programs can't use the 64-bit binaries in System32.

Looking in the Program Files\Accessories\Windows PowerShell menu in the start menu of a 64-bit Windows 7 install, we see the following:

Here, the 32-bit hosts are labeled as (x86) and the 64-bit versions are undesignated. When you run the 32-bit hosts on a 64-bit system, you will also see the (x86) designation in the title bar:

PowerShell as an administrator

When you run a PowerShell host, the session is not elevated. This means that even though you might be an administrator of the machine, the PowerShell session is not running with administrator privileges. This is a safety feature that helps prevent users from inadvertently running a script that damages the system. In order to run a PowerShell session as an administrator, you have a couple of options. First, you can right-click on the shortcut for the host and select Run as administrator from the context menu. When you do this, unless you have disabled UAC alerts, you will see a User Account Control (UAC) prompt verifying whether you want to allow the application to run as an administrator. Selecting Yes allows the program to run as an administrator, and the title bar reflects that this is the case:

The second way to run one of the hosts as an administrator is to right-click on the shortcut and choose Properties. On the Shortcut tab of the properties window, press the Advanced button.
In the Advanced Properties window that pops up, check the Run as administrator checkbox and press OK, and OK again to exit out of the properties window: Using this technique will cause the shortcut to always launch as an administrator, although the UAC prompt will still appear. If you choose to disable UAC, PowerShell hosts always run as administrators. Note that disabling UAC alerts is not recommended. Simple PowerShell commands Now that we know all the ways that can get a PowerShell session started, what can we do in a PowerShell session? I like to introduce people to PowerShell by pointing out that most of the command-line tools that they already know work fine in PowerShell. For instance, try using DIR, CD, IPCONFIG, and PING. Commands that are part of Command Prompt (think DOS commands) might work slightly different in PowerShell if you look closely, but typical command-line applications work exactly the same as they have always worked in Command Prompt: PowerShell commands, called cmdlets, are named with a verb-noun convention. Approved verbs come from a list maintained by Microsoft and can be displayed using the get-verb cmdlet: By controlling the list of verbs, Microsoft has made it easier to learn PowerShell. The list is not very long and it doesn't contain verbs that have the same meaning (such as Stop, End, Terminate, and Quit), so once you learn a cmdlet using a specific verb, you can easily guess the meaning of the cmdlet names that include the verb. Some other easy to understand cmdlets are: Clear-Host (clears the screen) Get-Date (outputs the date) Start-Service (starts a service) Stop-Process (stops a process) Get-Help (shows help about something) Note that these use several different verbs. From this list, you can probably guess what cmdlet you would use to stop a service. Since you know there's a Start-Service cmdlet, and you know from the Stop-Process cmdlet that Stop is a valid verb, it is logical that Stop-Service is what you would use. The consistency of PowerShell cmdlet naming is a tremendous benefit to learners of PowerShell, and it is a policy that is important as you write the PowerShell code. What is a cmdlet? The term cmdlet was coined by Jeffery Snover, the inventor of PowerShell to refer to the PowerShell commands. The PowerShell commands aren't particularly different from other commands, but by giving a unique name to them, he ensured that PowerShell users would be able to use search engines to easily find PowerShell code simply by including the term cmdlet. Summary Here we focused on figuring out what version of PowerShell was installed and the many ways to start a PowerShell session. A quick introduction to PowerShell cmdlets showed that a lot of the command-line knowledge we have from DOS can be used in PowerShell and that aliases make this transition easier. Resources for Article: Further resources on this subject: PowerShell Troubleshooting: Replacing the foreach loop with the foreach-object cmdlet[article] Administration of Configuration Manager through PowerShell[article] Managing Recipients [article]

How to Write Your First Fabfile

Liz Tom
26 Aug 2015
5 min read
Fabric is a Python library that makes it easy to run scripts over SSH. Fabric currently supports Python 2.5 - 2.7 but not Python 3 yet. Fabric has great documentation so you can also check out their site Why Use Fabric? Fabric is great to use because it makes executing commands over SSH super easy. I think the Fabric tutorial explains it best. Fabric is a Python (2.5-2.7) library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks. More specifically, Fabric is: A tool that lets you execute arbitrary Python functions via the command line; A library of subroutines (built on top of a lower-level library) to make executing shell commands over SSH easy and Pythonic. Naturally, most users combine these two things, using Fabric to write and execute Python functions, or tasks, to automate interactions with remote servers. What I Use Fabric For At my job, we use Fabric as an API to interact with our servers. We can deploy apps from any of our servers using a series of fab tasks. Installing Fabric The first thing you'll want to do when you start building your first Fabfile is to install Fabric. $ pip install fabric If you haven't used pip before you can find out more here But basically, pip is a package manager for Python libraries. Write Your First Fabfile Ok! Let's start writing this Fabfile. In your project's root directory (You can actually do this anywhere but I'm assuming you are using Fabfile for a specific project). $ touch fabfile.py Then in fabfile.py: def add(a, b): print int(a) + int(b) In your console, run: $ fab add:1,2 Congratulations! That's your very first fab command. One thing to notice is the way you pass arguments to the fab task. Now, in your console, run: $ fab --list You should see an output of your fab tasks you can run. This comes in handy when your Fabfile gets larger. This isn't very interesting yet... Write Your First More Useful Fabfile One of the very first things I learned to do with command line was ls. In order to run ls on using Fabfile we just do the following: from fabric.api import run, env def sub_list_files(): run("ls") Now, if I run: $ fab -H [host_name] sub_list_files This is the same as me doing: $ ssh [host_name] $ ls $ exit Ok, so it's not that exciting yet. But let's say I love adding and removing files and checking to make sure things happened the way I intended. from fabric.api import run def sub_list_files(): run("ls") def sub_create_file(name): run("touch " + name) def sub_remove_file(name): run("rm " + name) def create_file(name): sub_create_file(name) sub_list_files() def delete_file(name): sub_remove_file(name) sub_list_files() Instead of running: $ ssh [host_name] $ touch my_super_cool_file.py $ ls $ exit  I can just do: $ fab -H [host_name] create_file:my_super_cool_file.py OR: $ fab -H [host_name] sub_create_file:my_super_cool_file.py sub_list_files Fabric with Different Environments So let's say I have one virtual machine that I need to SSH into often and I don't want to have to keep using the -H flag. I can set the host name in my fabfile. from fabric.api import env, run env.hosts = ['nameof.server'] def sub_list_files(): run("ls") Now instead of having to set the -H flag I can just use: $ fab sub_list_files Now let's say I have multiple environments. I'll need a way to differentiate between which environment I want to work in. For this example, let's say you have 2 servers. You have 'staging' and 'production'. 
with something.staging.com and something.production.com associated with them. You'll want to be able to use: $ fab staging sub_list_files And: $ fab production sub_list_files In order to get this working we just have to add the following code to our file. from fabric.api import env, run env.hosts = ['staging.server', 'production.server'] def sub_list_files(): run("ls") Now when you run $ fab sub_list_files Fabric loops over all the servers and runs ls on all the servers in the env.hosts array. You probably don't want to run commands across all of your servers everytime you run fab commands. In order to specify which server you'd like to communicate with you'll just need to restructure slightly by replacing: env.hosts = ['staging.server', 'production.server'] with: def staging(): env.hosts = ['staging.server'] def production(): env.hosts = ['production.server'] Now, you can call:  $ fab staging create_file:my_cool_file.py Fabric Fun The documentation for Fabric is pretty good. So I do suggest reading through it to see what the Fabric API has to offer. One thing I found to be fun is the colors module. from fabric.colors import red def hello_world(): print red("hello world!") This will print a red 'hello world!' to your console. Neat! I encourage you to have fun with it. Try and use Fabric with anything that requires you to SSH. About the Author Liz Tom is a Creative Technologist at iStrategyLabs in Washington D.C. Liz’s passion for full stack development and digital media makes her a natural fit at ISL. Before joining iStrategyLabs, she worked in the film industry doing everything from mopping blood off of floors to managing budgets. When she’s not in the office, you can find Liz attempting parkour and going to check out interactive displays at museums.

Phalcon's ORM

Packt
25 Aug 2015
9 min read
In this article by Calin Rada, the author of the book Learning Phalcon PHP, we will go through a few of the ORM CRUD operations (update, and delete) and database transactions (For more resources related to this topic, see here.) By using the ORM, there is virtually no need to write any SQL in your code. Everything is OOP, and it is using the models to perform operations. The first, and the most basic, operation is retrieving data. In the old days, you would do this: $result = mysql_query("SELECT * FROM article"); The class that our models are extending is PhalconMvcModel. This  class has some very useful methods built in, such as find(), findFirst(), count(), sum(), maximum(), minimum(), average(), save(), create(), update(), and delete(). CRUD – updating data Updating data is as easy as creating it. The only thing that we need to do is find the record that we want to update. Open the article manager and add the following code: public function update($id, $data) { $article = Article::findFirstById($id); if (!$article) { throw new Exception('Article not found', 404); } $article->setArticleShortTitle($data[ 'article_short_title']); $article->setUpdatedAt(new PhalconDbRawValue('NOW()')); if (false === $article->update()) { foreach ($article->getMessages() as $message) { $error[] = (string) $message; } throw new Exception(json_encode($error)); } return $article; } As you can see, we are passing a new variable, $id, to the update method and searching for an article that has its ID equal to the value of the $id variable. For the sake of an example, this method will update only the article title and the updated_at field for now. Next, we will create a new dummy method as we did for the article, create. Open modules/Backoffice/Controllers/ArticleController.php and add the following code: public function updateAction($id) { $this->view->disable(); $article_manager = $this->getDI()->get( 'core_article_manager'); try { $article = $article_manager->update($id, [ 'article_short_title' => 'Modified article 1' ]); echo $article->getId(), " was updated."; } catch (Exception $e) { echo $e->getMessage(); } } If you access http://www.learning-phalcon.localhost/backoffice/article/update/1 now, you should be able to see the 1 was updated. response. Going back to the article list, you will see the new title, and the Updated column will have a new value. CRUD – deleting data Deleting data is easier, since we don't need to do more than calling the built-in delete() method. Open the article manager, and add the following code: public function delete($id) { $article = Article::findFirstById($id); if (!$article) { throw new Exception('Article not found', 404); } if (false === $article->delete()) { foreach ($article->getMessages() as $message) { $error[] = (string) $message; } throw new Exception(json_encode($error)); } return true; } We will once again create a dummy method to delete records. Open modules/Backoffice/Controllers/ArticleControllers.php, and add the following code: public function deleteAction($id) { $this->view->disable(); $article_manager = $this->getDI()->get('core_article_manager'); try { $article_manager->delete($id); echo "Article was deleted."; } catch (Exception $e) { echo $e->getMessage(); } } To test this, simply access http://www.learning-phalcon.localhost/backoffice/article/delete/1. If everything went well, you should see the Article was deleted. message. Going back to, article list, you won't be able to see the article with ID 1 anymore. 
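Before moving on, note that the examples above always look a record up by its ID through findFirstById(). When you need other criteria, the same base class accepts conditions with bound parameters; the following is a minimal sketch that reuses the Article model and the article_short_title column from the code above:

// Sketch: fetching an article by title with bound parameters
$article = Article::findFirst([
    'conditions' => 'article_short_title = :title:',
    'bind'       => ['title' => 'Modified article 1'],
]);

if ($article) {
    echo $article->getId();
}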
These are the four basic methods: create, read, update, and delete. Later in this book, we will use these methods a lot. If you need/want to, you can use the Phalcon Developer Tools to generate CRUD automatically. Check out https://github.com/phalcon/phalcon-devtools for more information. Using PHQL Personally, I am not a fan of PHQL. I prefer using ORM or Raw queries. But if you are going to feel comfortable with it, feel free to use it. PHQL is quite similar to writing raw SQL queries. The main difference is that you will need to pass a model instead of a table name, and use a models manager service or directly call the PhalconMvcModelQuery class. Here is a method similar to the built-in find() method: public function find() { $query = new PhalconMvcModelQuery("SELECT * FROM AppCoreModelsArticle", $this->getDI()); $articles = $query->execute(); return $articles; } To use the models manager, we need to inject this new service. Open the global services file, config/service.php, and add the following code: $di['modelsManager'] = function () { return new PhalconMvcModelManager(); }; Now let's rewrite the find() method by making use of the modelsManager service: public function find() { $query = $this->modelsManager->createQuery( "SELECT * FROM AppCoreModelsArticle"); $articles = $query->execute(); return $articles; } If we need to bind parameters, the method can look like this one: public function find() { $query = $this->modelsManager->createQuery( "SELECT * FROM AppCoreModelsArticle WHERE id = :id:"); $articles = $query->execute(array( 'id' => 2 )); return $articles; } We are not going to use PHQL at all in our project. If you are interested in it, you can find more information in the official documentation at http://docs.phalconphp.com/en/latest/reference/phql.html. Using raw SQL Sometimes, using raw SQL is the only way of performing complex queries. Let's see what a raw SQL will look like for a custom find() method and a custom update() method : <?php use PhalconMvcModelResultsetSimple as Resultset; class Article extends Base { public static function rawFind() { $sql = "SELECT * FROM robots WHERE id > 0"; $article = new self(); return new Resultset(null, $article, $article->getReadConnection()->query($sql)); } public static function rawUpdate() { $sql = "UPDATE article SET is_published = 1"; $this->getReadConnection()->execute($sql); } } As you can see, the rawFind() method returns an instance of PhalconMvcModelResultsetSimple. The rawUpdate() method just executes the query (in this example, we will mark all the articles as published). You might have noticed the getReadConnection() method. This method is very useful when you need to iterate over a large amount of data or if, for example, you use a master-slave connection. As an example, consider the following code snippet: <?php class Article extends Base { public function initialize() { $this->setReadConnectionService('a_slave_db_connection_service'); // By default is 'db' $this->setWriteConnectionService('db'); } } Working with models might be a complex thing. We cannot cover everything in this book, but we will work with many common techniques to achieve this part of our project. Please spare a little time and read more about working with models at http://docs.phalconphp.com/en/latest/reference/models.html. Database transactions If you need to perform multiple database operations, then in most cases you need to ensure that every operation is successful, for the sake of data integrity. 
A good database architecture in not always enough to solve potential integrity issues. This is the case where you should use transactions. Let's take as an example a virtual wallet that can be represented as shown in the next few tables. The User table looks like the following: ID NAME 1 John Doe The Wallet table looks like this: ID USER_ID BALANCE 1 1 5000 The Wallet transactions table looks like the following: ID WALLET_ID AMOUNT DESCRIPTION 1 1 5000 Bonus credit 2 1 -1800 Apple store How can we create a new user, credit their wallet, and then debit it as the result of a purchase action? This can be achieved in three ways using transactions: Manual transactions Implicit transactions Isolated transactions A manual transactions example Manual transactions are useful when we are using only one connection and the transactions are not very complex. For example, if any error occurs during an update operation, we can roll back the changes without affecting the data integrity: <?php class UserController extends PhalconMvcController { public function saveAction() { $this->db->begin(); $user = new User(); $user->name = "John Doe"; if (false === $user->save() { $this->db->rollback(); return; } $wallet = new Wallet(); $wallet->user_id = $user->id; $wallet->balance = 0; if (false === $wallet->save()) { $this->db->rollback(); return; } $walletTransaction = new WalletTransaction(); $walletTransaction->wallet_id = $wallet->id; $walletTransaction->amount = 5000; $walletTransaction->description = 'Bonus credit'; if (false === $walletTransaction1->save()) { $this->db->rollback(); return; } $walletTransaction1 = new WalletTransaction(); $walletTransaction1->wallet_id = $wallet->id; $walletTransaction1->amount = -1800; $walletTransaction1->description = 'Apple store'; if (false === $walletTransaction1->save()) { $this->db->rollback(); return; } $this->db->commit(); } } An implicit transactions example Implicit transactions are very useful when we need to perform operations on related tables / exiting relationships: <?php class UserController extends PhalconMvcController { public function saveAction() { $walletTransactions[0] = new WalletTransaction(); $walletTransactions[0]->wallet_id = $wallet->id; $walletTransactions[0]->amount = 5000; $walletTransactions[0]->description = 'Bonus credit'; $walletTransactions[1] = new WalletTransaction(); $walletTransactions[1]->wallet_id = $wallet->id; $walletTransactions[1]->amount = -1800; $walletTransactions[1]->description = 'Apple store'; $wallet = new Wallet(); $wallet->user_id = $user->id; $wallet->balance = 0; $wallet->transactions = $walletTransactions; $user = new User(); $user->name = "John Doe"; $user->wallet = $wallet; } } An isolated transactions example Isolated transactions are always executed in a separate connection, and they require a transaction manager: <?php use PhalconMvcModelTransactionManager as TxManager, PhalconMvcModelTransactionFailed as TxFailed; class UserController extends PhalconMvcController { public function saveAction() { try { $manager = new TxManager(); $transaction = $manager->get(); $user = new User(); $user->setTransaction($transaction); $user->name = "John Doe"; if ($user->save() == false) { $transaction->rollback("Cannot save user"); } $wallet = new Wallet(); $wallet->setTransaction($transaction); $wallet->user_id = $user->id; $wallet->balance = 0; if ($wallet->save() == false) { $transaction->rollback("Cannot save wallet"); } $walletTransaction = new WalletTransaction(); $walletTransaction->setTransaction($transaction);; 
$walletTransaction->wallet_id = $wallet->id; $walletTransaction->amount = 5000; $walletTransaction->description = 'Bonus credit'; if ($walletTransaction1->save() == false) { $transaction->rollback("Cannot create transaction"); } $walletTransaction1 = new WalletTransaction(); $walletTransaction1->setTransaction($transaction); $walletTransaction1->wallet_id = $wallet->id; $walletTransaction1->amount = -1800; $walletTransaction1->description = 'Apple store'; if ($walletTransaction1->save() == false) { $transaction->rollback("Cannot create transaction"); } $transaction->commit(); } catch(TxFailed $e) { echo "Error: ", $e->getMessage(); } } Summary In this article, you learned something about ORM in general and how to use some of the main built-in methods to perform CRUD operations. You also learned about database transactions and how to use PHQL or raw SQL queries. Resources for Article: Further resources on this subject: Using Phalcon Models, Views, and Controllers[article] Your first FuelPHP application in 7 easy steps[article] PHP Magic Features [article]

And now for something extra

Packt
25 Aug 2015
9 min read
 In this article by Paul F. Johnson, author of the book Cross-platform UI Development with Xamarin.Forms, we'll look at how to add a custom renderer for Windows Phone in particular. (For more resources related to this topic, see here.) This article doesn't depend on anything because there is no requirement to have a Xamarin subscription; the Xamarin Forms library is available for free via NuGet. All you require is Visual Studio 2013 (or higher) running on Windows 8 (or higher—this is needed for the Windows Phone 8 emulator). Let's make a start Before we can create a custom renderer, we have to create something to render. In this case, we need to create a Xamarin Forms application. For this, create a new project in Visual Studio, as shown in the following screenshot: Selecting the OK button creates the project. Once the project is created, you will see the following screenshot on the right-hand side: In the preceding screenshot, there are four projects created: Portable (also known as the PCL—portable class library) Droid (Android 4.0.3 or higher) iOS (iOS 7 or higher) Windows Phone (8 or higher). By default, it is 8.0, but it can be set to 8.1 If we expand the WinPhone profile and examine References, we will see the following screenshot: Here, you can see that Xamarin.Forms is already installed. You can also see the link to the PCL at the bottom. Creating a button Buttons are available natively in Xamarin Forms. You can perform some very basic operations on a button (such as assign text, a Click event, and so on). When built, the platform will render their own version of Button. This is how the code looks: var button = new Button { Text = "Hello" }; button.Click += delegate {…}; For our purposes, we don't want a dull standard button, but we want a button that looks similar to the following image: We may also want to do something really different by having a button with both text and an image, where the image and text positions can look similar to the following image on either side: Creating the custom button The first part to creating the button is to create an empty class that inherits Button, as shown in the following code: using Xamarin.Forms; namespace CustomRenderer { public class NewButton : Button { public NewButton() { } } } As NewButton inherits Button, it will have all the properties and events that a standard Button has. Therefore, we can use the following code: var btnLogin = new NewButton() { Text = "Login", }; btnLogin.Clicked += delegate { if (!string.IsNullOrEmpty(txtUsername.Text) && !string.IsNullOrEmpty(txtPassword.Text)) LoginUser(txtUsername.Text, txtPassword.Text); }; However, the difference here is that as we will use something that inherits a class, we can use the default renderer or define our own renderer. The custom renderer To start with, we need to tell the platform that we will use a custom renderer as follows: [assembly: ExportRenderer(typeof(NewButton), typeof(NewButtonRenderer))] namespace WinPhone { class NewButtonRenderer : ButtonRenderer We start by saying that we will use a renderer on the NewButton object from the PCL with the NewButtonRenderer class. The class itself has to inherit ButtonRenderer that contains the code we need to create the renderer. The next part is to override OnElementChanged. This method is triggered when an element from within the object being worked on changes. 
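Before looking at the platform side in detail, here is a hedged sketch of how the shared (PCL) project might place NewButton on a page; LoginPage and the layout used here are illustrative names and are not part of the original sample:

// Sketch: consuming the custom button from the shared Xamarin.Forms project
using Xamarin.Forms;

namespace CustomRenderer
{
    public class LoginPage : ContentPage
    {
        public LoginPage()
        {
            var btnLogin = new NewButton { Text = "Login" };
            btnLogin.Clicked += (sender, e) => DisplayAlert("Login", "Button tapped", "OK");

            Content = new StackLayout
            {
                Padding = new Thickness(20),
                Children = { btnLogin }
            };
        }
    }
}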
Considerations for Windows Phone A prime consideration on Windows Phone is that the ViewRenderer base is actually a Canvas that has the control (in this case, a button) on it as a child. This is an advantage for us. If we clear the child from the canvas, the canvas can be manipulated, and the button can be added back. It is important to remember that we are dealing with two distinct entities, and each has its own properties. For example, the white rectangle that surrounds a Windows Phone button is part of the control, whereas the color and styling are part of the canvas, as shown in the following code: protected override void OnElementChanged(ElementChangedEventArgs<Xamarin.Forms.Button> e) { base.OnElementChanged(e); if (Control != null) { // clear the children of the canvas. We are not deleting the button. Children.Clear(); // create the new background var border = new Border { CornerRadius = new System.Windows.CornerRadius(10), Background = new SolidColorBrush(System.Windows.Media.Color.FromArgb(255, 130, 186, 132)), BorderBrush = new SolidColorBrush(System.Windows.Media.Color.FromArgb(255,45,176,51)), BorderThickness = new System.Windows.Thickness(0.8), Child = Control // this adds the control back to the border }; Control.Foreground = new SolidColorBrush(Colors.White); // make the text white Control.BorderThickness = new System.Windows.Thickness(0); // remove the button border that is always there Children.Add(border); // add the border to the canvas. Remember, this also contains the Control } } When compiled, the UI will give you a button, as shown in the following image: I'm sure you'll agree; it's much nicer than the standard Windows Phone button. The sound of music An image button is also fairly simple to create. Again, create a new Xamarin.Forms project in Visual Studio. Once created, as we did before, create a new empty class that inherits Button. Why is it empty? Unfortunately, it's not that simple to pass additional properties with a custom renderer, so to ensure an easier life, the class just inherits the base class, and anything else that is needed to go to the renderer is accessed through the pointer to app. Setting up the PCL code In the PCL, we will have the following code: App.text = "This is a cow"; App.filename = "cow.png"; App.onTheLeft = true; var btnComposite = new NewCompositeButton(){ }; Text, filename, and onTheLeft are defined in the App class and are accessed from the PCL using CompositeUI.App.filename (CompositeUI is the namespace I've used). The PCL is now set up, so the renderer is needed. The Windows Phone renderer As before, we need to tell the platform that we will use our own renderer and override the default OnElementChanged event, as shown in the following code: [assembly: ExportRenderer(typeof(NewCompositeButton), typeof(NewCompositeButtonRenderer))] namespace WinPhone { class NewCompositeButtonRenderer :ButtonRenderer { protected override void OnElementChanged(ElementChangedEventArgs<Xamarin.Forms.Button> e) { base.OnElementChanged(e); As with the first example, we will deal with a base class that is a Canvas with a single child. This child needs to be removed from the canvas before it can be manipulated as follows: Children.Clear(); Our next problem is that we have an image and text. Accessing the image It is recommended that images are kept either in the Assets directory in the project or in the dedicated Images directory. For my example, my image is in assets. 
To create the image, we need to create a bitmap image, set the source, and finally assign it to an image (for good measure, a small amount of padding is also added) as follows: var bitmap = new BitmapImage(); bitmap.SetSource(App.GetResourceStream(new Uri(@"Assets/"+CompositeUI.App.filename, UriKind.Relative)).Stream); var image = new System.Windows.Controls.Image { Source = bitmap, Margin = new System.Windows.Thickness(8,0,8,0) }; Adding the image to the button We now have a problem. If we add the image directly to the canvas, we can't specify whether it is on the left-hand side or on the right-hand side of the text. Moreover, how do you add the image to the canvas? Yes, you can use the child property, but this still leads to the issue of position. Thankfully, Windows Phone provides a StackPanel class. If you think of a stack panel as a set of ladders, you will quickly understand how it works. A ladder can be vertical or horizontal. If it's vertical, each object is directly before or after each other. If it is horizontal, each object is either at the left-hand side or the right-hand side of each other. With the Orientation property of a StackPanel class, we can create a horizontal or vertical ladder for whatever we need. In the case of the button, we want the Panel to be horizontal, as shown in the following code: var panel = new StackPanel { Orientation = Orientation.Horizontal, }; Then, we can set the text for the button and any other attributes: Control.Foreground = new SolidColorBrush(Colors.White); Control.BorderThickness = new System.Windows.Thickness(0); Control.Content = CompositeUI.App.text; Note that there isn't a Text property for the button on Windows Phone. Its equivalent is Content. Our next step is to decide which side the image goes on and add it to the panel, as shown in the following code: if (CompositeUI.App.onTheLeft) { panel.Children.Add(image); panel.Children.Add(Control); } else { panel.Children.Add(Control); panel.Children.Add(image); } We can now create the border and add the panel as the child: var border = new Border { CornerRadius = new System.Windows.CornerRadius(10), Background = new SolidColorBrush(System.Windows.Media.Color.FromArgb(255, 130, 186, 132)), BorderBrush = new SolidColorBrush(System.Windows.Media.Color.FromArgb(255, 45, 176, 51)), BorderThickness = new System.Windows.Thickness(0.8), Child = panel }; Lastly, add the border to the canvas: Children.Add(border); We now have a button with an image and text on it, as shown in the following image: This rendering technique can also be applied to Lists and anywhere else required. It's not difficult; it's just not as obvious as it really should be. Summary Creating styled buttons is certainly for the platform to work on, but the basics are there in the PCL. The code is not difficult to understand, and once you've used it a few times, you'll find that the styling buttons to create attractive user interfaces is not such as big effort. Xamarin Forms will always help you create your UI, but at the end of the day, it's only you who can make it stand out. Resources for Article: Further resources on this subject: Configuring Your Operating System[article] Heads up to MvvmCross[article] Code Sharing Between iOS and Android [article]

Getting to Know LibGDX

Packt
25 Aug 2015
15 min read
In this article written by James Cook, author of the book LibGDX Game Development By Example, the author likes to state that, "Creating games is fun, and that is why I like to do it". The process of having an idea for a game to actually delivering it has changed over the years. Back in the 1980s, it was quite common that the top games around were created by either a single person or a very small team. However, anyone who is lucky enough (in my opinion) to see games grow from being quite a simplistic affair to the complex beast that the now AAA titles are, must have also seen the resources needed for these grow with them. The advent of mobile gaming reduced the barrier for entry; once again, the smaller teams could produce a game that could be a worldwide hit! Now, there are games of all genres and complexities available across major gaming platforms. Due to this explosion in the number of games being made, new general-purpose game-making tools appeared in the community. Previously, the in-house teams built and maintained very specific game engines for their games; however, this would have led to a lot of reinventing the wheel. I hate to think how much time I would have lost if for each of my games, I had to start from scratch. Now, instead of worrying about how to display a 2D image on the screen, I can focus on creating that fun player experience I have in my head. My tool of choice? LibGDX. (For more resources related to this topic, see here.) Before I dive into what LibGDX is, here is how LibGDX describes itself. From the LibGDX wiki—https://github.com/libgdx/libgdx/wiki/Introduction: LibGDX is a cross-platform game and visualization development framework. So what does that actually mean? What can LibGDX do for us game-makers that allows us to focus purely on the gameplay? To begin with, LibGDX is Java-based. This means you can reuse a lot, and I mean a lot, of tools that already exist in the Java world. I can imagine a few of you right now must be thinking, "But Java? For a game? I thought Java is supposed to be slow". To a certain extent, this can be true; after all, Java is still an interpreted language that runs in a virtual machine. However, to combat the need for the best possible performance, LibGDX takes advantage of the Java Native Interface (JNI) to implement native platform code and negate the performance disadvantage. One of the beauties of LibGDX is that it allows you to go as low-level as you would like. Direct access to filesystems, input devices, audio devices, and OpenGL (via OpenGL ES 2.0/3.0) is provided. However, the added edge LibGDX gives is that with the APIs that are built on top of these low-level facilities, displaying an image on the screen takes now a days only a few lines of code. A full list of the available features for LibGDX can be found here:http://libgdx.badlogicgames.com/features.html I am happy to wait here while you go and check it out. Impressive list of features, no? So, how cross-platform is this gaming platform? This is probably what you are thinking now. Well, as mentioned before, games are being delivered on many different platforms, be it consoles, PCs, or mobiles. LibGDX currently supports the following platforms: Windows Linux Mac OS X Android BlackBerry iOS HTML/WebGL That is a pretty comprehensive list. Being able to write your game once and have it delivered to all the preceding platforms is pretty powerful. At this point, I would like to mention that LibGDX is completely free and open source. 
You can go to https://github.com/libGDX/libGDX and check out all the code in all its glory. If the code does something and you would like to understand how, it is all possible; or, if you find a bug, you can make a fix and offer it back to the community. Along with the source code, there are plenty of tests and demos showcasing what LibGDX can do, and more importantly, how to do it. Check out the wiki for more information: https://github.com/libgdx/libgdx/wiki/Running-Demos https://github.com/libgdx/libgdx/wiki/Running-Tests "Who else uses LibGDX?" is quite a common query that comes up during a LibGDX discussion. Well it turns out just about everyone has used it. Google released a game called "Ingress" (https://play.google.com/store/apps/details?id=com.nianticproject.ingress&hl=en) on the play store in 2013, which uses LibGDX. Even Intel (https://software.intel.com/en-us/articles/getting-started-with-libgdx-a-cross-platform-game-development-framework) has shown an interest in LibGDX. Finally, I would like to end this section with another quote from the LibGDX website: LibGDX aims to be a framework rather than an engine, acknowledging that there is no one-size-fits-all solution. Instead we give you powerful abstractions that let you chose how you want to write your game or application. libGDX wiki—https://github.com/libgdx/libgdx/wiki/Introduction This means that you can use the available tools if you want to; if not, you can dive deeper into the framework and create your own! Setting up LibGDX We know by now that LibGDX is this awesome tool for creating games across many platforms with the ability to iterate on our code at superfast speeds. But how do we start using it? Thankfully, some helpful people have made the setup process quite easy. However, before we get to that part, we need to ensure that we have the prerequisites installed, which are as follows: Java Development Kit 7+ (at the time of writing, version 8 is available) Android SDK Not that big a list! Follow the given steps: First things first. Go to http://www.oracle.com/technetwork/java/javase/downloads/index.html. Download and install the latest JDK if you haven't already done so. Oracle developers are wonderful people and have provided a useful installation guide, which you can refer to if you are unsure on how to install the JDK, at http://docs.oracle.com/javase/8/docs/technotes/guides/install/install_overview.html. Once you have installed the JDK, open up the command line and run the following command: java -version If it is installed correctly, you should get an output similar to this: If you generate an error while doing this, consult the Oracle installation documentation and try again. One final touch would be to ensure that we have JAVA_HOME configured. On the command line, perform the following:    For Windows, set JAVA_HOME = C:PathToJDK    For Linux and Mac OSX, export JAVA_HOME = /Path/ToJDK/ Next, on to the Android SDK. At the time of writing, Android Studio has just been released. Android Studio is an IDE offered by Google that is built upon JetBrains IntelliJ IDEA Java IDE. If you feel comfortable using Android Studio as your IDE, and as a developer who has used IntelliJ for the last 5 years, I suggest that you at least give it a go. 
You can download Android Studio + Android SDK in a bundle from here: http://developer.android.com/sdk/index.html Alternatively, if you plan to use a different IDE (Eclipse or NetBeans, for example) you can just install the tools from the following URL: http://developer.android.com/sdk/index.html#Other You can find the installation instructions here: https://developer.android.com/sdk/installing/index.html?pkg=tools However, I would like to point out that the official IDE for Android is now Android Studio and no longer Eclipse with ADT. For the sake of simplicity, we will only focus on making games for desktops for the greater part of this article. We will look at exporting to Android and iOS later on. Once the Android SDK is installed, it would be well worth running the SDK manager application; so, finalize the set up. If you opt to use Android Studio, you can access this from the SDK Manager icon in the toolbar. Alternatively, you can also access it as follows: On Windows: Double-click on the SDK's Manager.exe file at the root of the Android SDK directory On Mac/Linux: Open a terminal and navigate to the tools/ directory in the location where the Android SDK is installed, then execute Android SDK. The following screen might appear: As a minimum configuration, select: Android SDK Tools Android SDK Platform-tools Android SDK Build-tools (latest available version) Latest version of SDK Platform Let them download and install the selected configuration. Then that's it! Well, not really. We just need to set the ANDROID_HOME environment variable. To do this, we can open up a command line and run the following command: On Windows: Set ANDROID_HOME=C:/Path/To/Your/Android/Sdk On Linux and Mac OS X: Export ANDROID_HOME=/Path/To/Your/Android/Sdk Phew! With that done, we can now move on to the best part—creating our first ever LibGDX game! Creating a project Follow the given steps to create your own project: As mentioned earlier, LibGDX comes with a really useful project setup tool. Download the application from here: http://libgdx.badlogicgames.com/download.html At the time of writing, it is the big red "Download Setup App" button in the middle of your screen. Once downloaded, open the command line and navigate to the location of the application. You will notice that it is a JAR file type. This means we need to use Java to run it. Running this will open the setup UI: Before we hit the Generate button, let's just take a look at what we are creating here: Name: This is the name of our game. Package: This is the Java package our game code will be developed in. Game class: This parameter sets the name of our game class, where the magic happens! Destination: This is the project's directory. You can change this to any location of your choice. Android SDK: This is the location of the SDK. If this isn't set correctly, we can change it here. Going forward, it might be worth setting the ANDROID_HOME environment variable. Next is the version of LibGDX we want to use. At time of writing, the version is 1.5.4. Now, let's move on to the subprojects. As we are only interested in desktops at the moment, let's deselect the others. Finally, we come to extensions. Feel free to uncheck any that are checked. We won't be needing any of them at this point in time. For more information on available extensions, check out the LibGDX wiki (https://github.com/libgdx/libgdx/wiki). Once all is set, let's hit the Generate button! There is a little window at the bottom of the UI that will now spring to life. 
Here, it will show you the setup progress as it downloads the necessary setup files. Once complete, open that command line, navigate to the directory, and run your preferred tree command (in Windows, it is just "tree").   Hopefully, you will have the same directory layout as the previous image shows. The astute among you will now ask, "What is this Gradle?" and quite rightly so. I haven't mentioned it yet, although it appears twice in our projects directory. What is Gradle? Well, Gradle is a very excellent build tool and LibGDX leverages its abilities to look after the dependencies, build process, and IDE integration. This is especially useful if you are going to be working in a team with a shared code base. Even if you are not, the dependency management aspect is worth it alone. Anyone who isn't familiar with dependency management may well be used to downloading Java JARs manually and placing them in a libs folder, but they might run into problems later when the JAR they just downloaded needs another JAR, and so on. The dependency management will take care of this for you and even better is that the LibGDX setup application takes care of this for you by already describing the dependencies that you need to run! Within LibGDX, there is something called the Gradle Wrapper. This is essentially the Gradle application embedded into the project. This allows portability of our project, as now if we want someone else to run it, they can. I guess this leads us to the question, how do we use Gradle to run our project? In the LibGDX wiki (https://github.com/libgdx/libgdx/wiki/Gradle-on-the-Commandline), you will find a comprehensive list of commands that can be used while developing your game. However, for now, we will only cover the desktop project. What you may not have noticed is that the setup application actually generates a very simple "Hello World" game for us. So, we have something we can run from the command line right away! Let's go for it! On our command line, let's run the following:    On Windows: gradlew desktop:run    On Linux and Mac OS X: ./gradlew desktop:run The following screen will appear once you execute the preceding command:   You will get an output similar to the preceding screenshot. Don't worry if it suddenly wants to start downloading the dependencies. This is our dependency management in action! All those JARs and native binaries are being downloaded and put on to classpaths. But, we don't care. We are here to create games! So, after the command prompt has finished downloading the files, it should then launch the "Hello World" game. Awesome! You have just launched your very first LibGDX game! Although, before we get too excited, you will notice that not much actually happens here. It is just a red screen with the Bad Logic Games logo. I think now is the time to look at the code! Importing a project So far, we have launched the "Hello World" game via the command line, and haven't seen a single line of code so far. Let's change that. To do this, I will use IntelliJ IDEA. If you are using Android Studio, the screenshots will look familiar. If you are using Eclipse, I am sure you will be able to see the common concepts. To begin with, we need to generate the appropriate IDE project files. Again, this is using Gradle to do the heavy lifting for us. 
Once again, on the command line, run the following (pick the one that applies):

On Windows: gradlew idea or gradlew eclipse
On Linux and Mac OS X: ./gradlew idea or ./gradlew eclipse

Now, Gradle will have generated some project files. Open your IDE of choice and open the project. If you require more help, check out the following wiki pages:

https://github.com/libgdx/libgdx/wiki/Gradle-and-Eclipse
https://github.com/libgdx/libgdx/wiki/Gradle-and-Intellij-IDEA
https://github.com/libgdx/libgdx/wiki/Gradle-and-NetBeans

Once the project is open, have a poke around and look at some of the files. I think our first port of call should be the build.gradle file in the root of the project. Here, you will see that the layout of our project is defined and the dependencies we require are on display. It is a good time to mention that, going forward, there will be new releases of LibGDX, and to update our project to the latest version, all we need to do is update the following property:

gdxVersion = '1.6.4'

Now, run your game and Gradle will kick in and download everything for you!

Next, we should look for our game class, remember the one we specified in the setup application—MyGdxGame.java? Find it, open it, and be in awe of how simple it is to display that red screen and Bad Logic Games logo. In fact, I am going to paste the code here for you to see how simple it is:

public class MyGdxGame extends ApplicationAdapter {
    SpriteBatch batch;
    Texture img;

    @Override
    public void create () {
        batch = new SpriteBatch();
        img = new Texture("badlogic.jpg");
    }

    @Override
    public void render () {
        Gdx.gl.glClearColor(1, 0, 0, 1);
        Gdx.gl.glClear(GL20.GL_COLOR_BUFFER_BIT);
        batch.begin();
        batch.draw(img, 0, 0);
        batch.end();
    }
}

Essentially, we can see that when the create() method is called, it sets up a SpriteBatch batch and creates a texture from a given JPEG file. Then there is the render() method, which is called on every iteration of the game loop; it covers the screen with the color red, then it draws the texture at the (0, 0) coordinate location.

Finally, we will look at the DesktopLauncher class, which is responsible for running the game in the desktop environment. Let's take a look at the following code snippet:

public class DesktopLauncher {
    public static void main (String[] arg) {
        LwjglApplicationConfiguration config = new LwjglApplicationConfiguration();
        new LwjglApplication(new MyGdxGame(), config);
    }
}

The preceding code shows how simple it is. We have a configuration object that will define how our desktop application runs, setting things like screen resolution and framerate, amongst others. In fact, this is an excellent time to utilize the open source aspect of LibGDX. In your IDE, click through to the LwjglApplicationConfiguration class. You will see all the properties that can be tweaked and notes on what they mean. The instance of the LwjglApplicationConfiguration class is then passed to the constructor of another class, LwjglApplication, along with an instance of our MyGdxGame class. Finally, those who have worked with Java a lot in the past will recognize that it is wrapped in a main method—a traditional entry point for a Java application. That is all that is needed to create and launch a desktop-only LibGDX game.

Summary

In this article, we looked at what LibGDX is about and how to go about creating a standard project, running it from the command line, and importing it into your preferred IDE ready for development.
Resources for Article: Further resources on this subject: 3D Modeling[article] Using Google's offerings[article] Animations in Cocos2d-x [article]

Releasing and Maintaining the Application

Packt
25 Aug 2015
11 min read
In this article by Andrey Kovalenko, author of the book PhoneGap by Example, we implemented several unit and integration tests with the Jasmine tool for our application. We used the headless browser PhantomJS, and we measured performance with Appium. All this is great and helps us automate the testing approach to find bugs in the early stages of application development. Once we finish creating our application and testing it, we can think of delivering our application to other people. We can distribute the application in several different ways. Once we finish these tasks, we will be ready to do a full cycle of the application creation and distribution processes. We already know how to set up development environments to develop for iOS and Android. We will reuse these skills in this article as well to prepare our builds for distribution. This article reads as a step-by-step tutorial for the setup of different tools.

(For more resources related to this topic, see here.)

We already know how to build our application using an IDE (Xcode or Android Studio). However, now, we will explore how to build the application for different platforms using the PhoneGap Build service. PhoneGap Build helps us stay away from different SDKs. It works for us by compiling in the cloud. First of all, we should register on https://build.phonegap.com. It is pretty straightforward. Once we register, we can log in, and under the apps menu section, we will see something like this:

We can either enter a link to our Git repository with the source files or upload a ZIP archive with the same source code. However, there is a specific requirement for the structure of the folders for upload. We should take only the www directory of the Cordova/PhoneGap application, add config.xml inside it, and compress this folder. Let's look at this approach using an example of the Crazy Bubbles application.

PhoneGap config.xml

In the root folder of the game, we will place the following config.xml file:

<?xml version="1.0" encoding="UTF-8" ?>
<widget id = "com.cybind.crazybubbles"
        versionCode = "10"
        version = "1.0.0" >
    <name>Crazy Bubbles</name>
    <description>
        Nice PhoneGap game
    </description>
    <author href="https://build.phonegap.com" email="support@phonegap.com">
        Andrew Kovalenko
    </author>
    <gap:plugin name="com.phonegap.plugin.statusbar" />
</widget>

This configuration file specifies the main setup for the PhoneGap Build application. The setup is made up of these elements:

widget is the root element of our XML file, based on the W3C specification, with the following attributes: id (the application name in the reverse-domain style), version (the version of the application in numbers format), and versionCode (optional, used only for Android)
name of the application
description of the application
name of the author with website link and e-mail
List of plugins if required by the application

We can use this XML file or enter the same information using a web interface. When we go to Settings | Configuration, we will see something like the following screenshot:

PhoneGap plugins

As you can see, we included one plugin in config.xml:

<gap:plugin name="com.phonegap.plugin.statusbar" />

There are several attributes that the gap:plugin tag has. They are as follows: name is required and is the plugin ID in the reverse-domain format; version is optional and is the plugin version; source is optional and can be pgb, npm, or plugins.cordova.io.
The default is pgb params: This is optional, configuration for plugin if needed We included the StatusBar plugin, which doesn't require JavaScript code. However, there are some other plugins that need JavaScript in the index.html file. So, we should not forget to add the code. Initial upload and build Once we finish the configuration steps and create a Zip archive of the www folder, we can upload it. Then, we will see the following screen:   Here, we can see generic information about the application, where we can enable remote debugging with Weinre. Weinre is a remote web inspector. It allows access to the DOM and JavaScript. Now, we can click on the Ready to build button, and it will trigger the build for us. Here, you can see that the iOS build has failed. Let's click on the application title and figure out what is going on. Once the application properties page loads, we will see the following screenshot: When we click on the Error button, we will see the reason why it failed:   So, we need to provide a signing key. Basically, you need a provisioning profile and certificate needed to build the application. We already downloaded the provisioning profile from the Apple Development portal, but we should export the certificate from the Keychain Access. We are going to open it, find our certificate in the list, and export it:   When we export it, we will be asked for the destination to store the .p12 file:   Add a password to protect the file:   Once we save the file, we can go back to the PhoneGap Build portal and create a signing key:   Just click on the No key selected button in the dropdown and upload the exported certificate and provisioning profile for the application. Once the upload is finished, the build will be triggered:   Now, we will get a successful result and can see all the build platforms:   Now, we can download the application for both iOS and Android and install it on the device. Alternatively, we can install the application by scanning the QR code on the application main page. We can do this with any mobile QR scanner application on our device. It will return a direct link for the build download for a specific platform. Once it is downloaded, we can install it and see it running on our device. Congratulations! We just successfully created the build with the PhoneGap Build service! Now, let's take a closer look at the versioning approach for the application. Beta release of the iOS application For the beta release of our application, we will use the TestFlight service from the Apple. As a developer, we need to be a member of the iOS Developer program. As a tester, we will need to install the application for beta testing and the TestFlight application from the App Store. After that, the tester can leave feedback about the application. First of all, let's go to https://itunesconnect.apple.com and login there. After that, we can go to the My Apps section and click on the plus sign in the top-left corner. We will get a popup with a request to enter some main information about the application. Let's add the information about our application so that it looks like this:   All the fields in the preceding screenshot are well known and do not require additional explanation. Once we click on the Create button, the application is created, and we can see the Versions tab of the application. Now, we need to build and upload our application. 
We can do this in two ways: Using Xcode Using Application Loader However, before submitting to beta testing, we need to generate a provisioning profile for distribution. Let's do it on the Developer portal. Generate a distribution provisioning profile Go to the Provisioning Profiles, and perform the following steps: Click on + to add a new provisioning profile and go to Distribution | App Store as presented in the following screenshot: Then, select the application ID. In my case, it is Travelly: After that, select the certificates to include in the provisioning profile. The certificate should be for distribution as well: Finally, generate the provisioning profile, set a name for the file, and download it: Now, we can build and upload our application to iTunes Connect. Upload to iTunes Connect with Xcode Let's open the Travelly application in Xcode. Go to cordova/platforms/ios and open Travelly.xcodeproj. After that, we have to select iOS Device to run our application. In this case, we will see the Archive option available. It would not be available if the emulator option is selected. Now, we can initiate archiving by going to Product | Archive:   Once the build is completed, we will see the list of archives:   Now, click on the Submit to App Store… button. It will ask us to select a development team if we have several teams:   At this stage, Xcode is looking for the provisioning profile we generated earlier. We would be notified if there is no distribution provisioning profile for our application. Once we click on Choose, we are redirected to the screen with binary and provisioning information:   When we click on the Submit button, Xcode starts to upload the application to iTunes Connect: Congratulations! We have successfully uploaded our build with Xcode: Upload to iTunes Connect with Application Loader Before the reviewing process of build upload with Application Loader, we need to install the tool first. Let's go to iTunes Connect | Resources and Help | App Preparation and Delivery and click on the Application Loader link. It will propose the installation file for download. We will download and install it. After that, we can review the upload process. Uploading with Application Loader is a little different than with XCode. We will follow the initial steps until we get the following screen:   In this case, on the screen, we will click on the Export button, where we can save the .ipa file. However, before that, we have to select the export method:   We are interested in distribution to the App Store, so we selected the first option. We need to save the generated file somewhere to the filesystem. Now, we will launch Application Loader and log in using our Apple Developer account:   After that, we will select Deliver Your App and pick the generated file:   In the following screenshot, we can see the application's generic information: name, version, and so on:   When we click on the Next button, we will trigger upload to iTunes Connect, which is successfully executed: During the process, the package will be uploaded to the iTunes Store, as shown here: Once the application is added, it will show you the following screenshot: Now, if we go to iTunes Connect | My Apps | Travelly | Prerelease | Builds, we will see our two uploaded builds:   As you can see, they are both inactive. We need to send our application to internal and external testers. Invite internal and external testers Let's work with version 0.0.2 of the application. 
First of all, we need to turn on the check box to the right of the TestFlight Beta Testing label. There are two types of testers we can invite: Internal testers are iTunes Connect users. It is possible to invite up to 25 internal testers. External testers are independent users who can install the application using the TestFlight mobile tool. To invite internal testers, let's go to the Internal Testers tab, add the e-mail of the desired tester, place the check mark, and click on the Invite button:   The user will receive an e-mail with the following content:   Users can click on the link and follow the instructions to install the application. To allow testing for external users, we will go to the External Testers tab. Before becoming available for external testing, the application should be reviewed. For the review, some generic information is needed. We need to add: Instructions for the testers on what to test Description of the application Feedback information Once this information is entered, we can click on the Next button and answer questions about cryptography usage in the application:   We do not use cryptography, so we select No and click on Submit. Now, our application is waiting for review approval:   Now, there is a button available to add external testers:   We can invite up to 1000 external testers. After the tester accepts the invite on their device, the invite will be linked to their current Apple ID. Once the application review is finished, it will become available for external testers. Summary In this article of the book, you learned how to release the PhoneGap application with the PhoneGap Build service. Also, we released the application through TestFlight for beta testing. Now, we will be able to develop different types of Cordova/PhoneGap applications, test them. I think it is pretty awesome, don't you? Resources for Article: Further resources on this subject: Geolocation – using PhoneGap features to improve an app's functionality, write once use everywhere[article] Getting Ready to Launch Your PhoneGap App in the Real World[article] Using Location Data with PhoneGap [article]

Designing an API from Scratch

Jonathan Pollack
24 Aug 2015
7 min read
Designing an API from scratch is a frustrating affair; there are a million best-practices to keep in mind, granularity to argue over, and the inherent struggle between consumer & developer. This article deals with all of the issues above, by concretely developing an API for a courier service–addressing the needs of the business, how they translate into data relationships & schema, and how they could/should be translated into consumable REST end-points.

Make your life easy

There is a shelf of software; take from it greedily! Instead of busting your hump to design your own API spec from scratch, make your life easy, and look to the world for already validated and accepted API specs. I advocate heavily for JSON:API, as I believe it offers the best combination of best-practices and granularity (I'd call it medium to medium-fine grained), thus we'll be writing our API out according to the JSON:API spec. If you are a masochist and/or truly wish to design your own API spec then I would highly suggest wisely brushing up on the state-of-the-art, via the following articles:

Using HTTP Methods for RESTful Services
REST API Design - Resource Modeling
CQRS summary

Design step 1: determine business concerns

Before we even begin coding, we need to identify our business concerns. These will lead us to our resources, and thus the resources our API consumers will want/need to deal with. It is very important that we do not confuse necessary business logic/resources with necessary consumer endpoints. As far as the API consumer is concerned, they should be given a tool to perform introspection on the state of their affairs, and no one else's. This means that they should be able to command new deliveries and inspect the state of current & past deliveries; they should not have insight into any other aspects of our operation.

Nouns and their actions

Because we're dealing with a courier service, try to estimate the consumer-facing nouns, and their relevant actions:

Customers: Command the delivery of a package, specifying origin and destination
Couriers: Receive and deliver packages
Packages: Get delivered from a sender to a recipient by a courier
Senders: Hand the package off to the Courier
Recipients: Receive the package from the Courier

In performing this exercise, we've hopefully made clear the relationship between customers, couriers, packages, senders, and recipients.

Not all nouns are equal

You may have noticed that couriers, senders, and recipients appear to be cyclically linked. When relationships like this occur, it likely means that these nouns (pieces of data) do not deserve to be independent resources, but rather properties of another resource–specifically, packages. Be sure to keep an eye out for this, as you no doubt will be tempted to make all of your nouns resources (albeit heavily related ones), a practice that will bloat your API.

Design step 2: formalize relationships

Armed with your list of resources, nouns, and their actions, formalizing the relationship between resources should follow naturally. In our case, because we identified a cyclical relationship, we were able to consolidate our nouns into only two resources: customers and packages. The relationship between these (customers and packages) is thankfully quite obvious: one-to-many. That is because each customer can have any number of packages while a package may only belong to one customer.
Internally, we’ve already collapsed the relationship between packages and couriers, senders, & recipients by making them simple properties–implicitly identifying these as one-to-one relationships. Design step 3: formalize the schema A dovetail to step 2, we now must concretely start listing the properties of each resource. customer: { id: "string", //UUID packages: ["string"] //Array of UUIDs } package: { id: "string", //UUID origin: "string", //Address destination: "string", //Address customer: "string", //UUID sender: "string", //Name recipient: "string", //Name courier: "string" //UUID } As we mentioned in step 2, the relationship between customers and packages is one-to-many–this is represented by the array of package ids–and each package has the consolidated nouns as properties. Design step 4: mock it out It is critical to use authorization tokens to avoid leaking package and customer information to the wrong parties. That said, for the case of this step, we are primarily concerned with mocking the API via JSON:API spec to see how it looks and feels. If you want a motivated example, just think of the case of whenever anyone queries http://example.com/customers. If the whole world has access to the collection of every customer you’ve got a huge problem. You can see an interactive example here–built off of a RAML file. Example: GET http://example.com/packages => { "links": { "self": "http://example.com/packages" }, "data": [{ "type": "packages", "id": "1", "origin": "1600 Pennsylvania Ave NW, Washington, DC 20500", "destination": "2 Lincoln Memorial Cir NW, Washington, DC 20037", "sender": "Barry Obama", "recipient": "Abraham Lincoln", "courier": "1", "links": { "self": "http://example.com/packages/1", "customer": { "self": "http://example.com/packages/1/links/customer", "related": "http://example.com/packages/1/customer", "linkage": { "type": "customers", "id": "1" } } } }, { "type": "packages", "id": "2", "origin": "437 N Wabash Ave., Chicago, 60611", "destination": "111 S Michigan Ave., Chicago, IL 60603", "sender": "Donald Trump", "recipient": "Marshal Fields", "courier": "1", "links": { "self": "http://example.com/packages/2", "customer": { "self": "http://example.com/packages/2/links/customer", "related": "http://example.com/packages/2/customer", "linkage": { "type": "customers", "id": "1" } } } }], "included": [{ "type": "customers", "id": "1", "links": { "self": "http://example.com/customers/1" } }] } GET http://example.com/customers => { "links": { "self": "http://example.com/customers" }, "data": [{ "type": "customers", "id": "1", "links": { "self": "http://example.com/customers/1", "packages": { "self": "http://example.com/customers/1/links/packages", "related": "http://example.com/customers/1/links/packages", "linkage": [ {"type": "packages", "id": 1}, {"type": "packages", "id": 2} ] } } }], "included": [{ "type": "packages", "id": "1", "origin": "1600 Pennsylvania Ave NW, Washington, DC 20500", "destination": "2 Lincoln Memorial Cir NW, Washington, DC 20037", "sender": "Barry Obama", "recipient": "Abraham Lincoln", "courier": "1", "links": { "self": "http://example.com/packages/1" } }, { "type": "packages", "id": "2", "origin": "437 N Wabash Ave., Chicago, 60611", "destination": "111 S Michigan Ave., Chicago, IL 60603", "sender": "Donald Trump", "recipient": "Marshal Fields", "courier": "1", "links": { "self": "http://example.com/packages/2" } }] } Yes, these responses seem fairly verbose, but they are designed to lower the number of necessary requests per 
interaction—making your server gather the necessary data you would have called for anyway, in anticipation of those requests rather than just in time. Believe me, after a few minutes of reading the JSON, your visceral reaction won't lean so strongly towards disgust.

Design step 5: get feedback

Once you have a mock written out that people can try for themselves, like the interactive example in step 4, it's time for you to ask the hard question: what are the pain points? Warning: unless you are the sole consumer of your API, try to avoid making changes that conflict with the JSON:API spec. Remember, the whole point of using the spec was to decrease your design overhead and the consumer's learning curve.

Design step 6: finalize with tests

While you are waiting for feedback, you should start writing tests. If you spec'd your API with something like RAML or API Blueprint, it's really easy to use your spec files to auto-generate mock API servers–so testing is a breeze.

Conclusion

By following the JSON:API spec, we do away with a lot of the head scratching and frustration typically involved in designing an API. What's left behind is simply the tedious task of implementation, and while that's not a terribly exciting conclusion, I'll take that over 10 hours of meetings on whether to use PUT or PATCH, and which color to paint the bike shed. And who knows? Maybe with your new-found time, you'll write a JSON:API generator (connected to your favorite framework) so that next time, you won't even have to think about implementation either!

About the author

Jonathan Pollack is a full stack developer living in Berlin. He previously worked as a web developer at a public shoe company, and prior to that, worked at a startup that's trying to build the world's best pan-cloud virtualization layer. He can be found on Twitter @murphydanger.

Rendering Stereoscopic 3D Models using OpenGL

Packt
24 Aug 2015
8 min read
In this article, by Raymond C. H. Lo and William C. Y. Lo, authors of the book OpenGL Data Visualization Cookbook, we will demonstrate how to visualize data with stunning stereoscopic 3D technology using OpenGL. Stereoscopic 3D devices are becoming increasingly popular, and the latest generation's wearable computing devices (such as the 3D vision glasses from NVIDIA, Epson, and more recently, the augmented reality 3D glasses from Meta) can now support this feature natively. The ability to visualize data in a stereoscopic 3D environment provides a powerful and highly intuitive platform for the interactive display of data in many applications. For example, we may acquire data from the 3D scan of a model (such as in architecture, engineering, and dentistry or medicine) and would like to visualize or manipulate 3D objects in real time. Unfortunately, OpenGL does not provide any mechanism to load, save, or manipulate 3D models. Thus, to support this, we will integrate a new library named Open Asset Import Library (Assimp) into our code. The main dependencies include the GLFW library that requires OpenGL version 3.2 and higher. (For more resources related to this topic, see here.) Stereoscopic 3D rendering 3D television and 3D glasses are becoming much more prevalent with the latest trends in consumer electronics and technological advances in wearable computing. In the market, there are currently many hardware options that allow us to visualize information with stereoscopic 3D technology. One common format is side-by-side 3D, which is supported by many 3D glasses as each eye sees an image of the same scene from a different perspective. In OpenGL, creating side-by-side 3D rendering requires asymmetric adjustment as well as viewport adjustment (that is, the area to be rendered) – asymmetric frustum parallel projection or equivalently to lens-shift in photography. This technique introduces no vertical parallax and widely adopted in the stereoscopic rendering. To illustrate this concept, the following diagram shows the geometry of the scene that a user sees from the right eye: The intraocular distance (IOD) is the distance between two eyes. As we can see from the diagram, the Frustum Shift represents the amount of skew/shift for asymmetric frustrum adjustment. Similarly, for the left eye image, we perform the transformation with a mirrored setting. The implementation of this setup is described in the next section. How to do it... The following code illustrates the steps to construct the projection and view matrices for stereoscopic 3D visualization. The code uses the intraocular distance, the distance of the image plane, and the distance of the near clipping plane to compute the appropriate frustum shifts value. 
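Before looking at the implementation, it may help to see the frustum shift written out on its own. The following is only a sketch derived from the similar triangles in the preceding diagram; the symbols IOD, nearZ, and depthZ correspond to the variables used in the code that follows:

\[
\text{frustumShift} = \frac{IOD}{2} \times \frac{near_Z}{depth_Z}
\]

For example, with the purely hypothetical values IOD = 0.5, nearZ = 1.0, and depthZ = 10.0, each eye's frustum is skewed by 0.025 units at the near clipping plane, with opposite signs for the left and right eyes.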
In the source file, common/controls.cpp, we add the implementation for the stereo 3D matrix setup: void computeStereoViewProjectionMatrices(GLFWwindow* window, float IOD, float depthZ, bool left_eye){ int width, height; glfwGetWindowSize(window, &width, &height); //up vector glm::vec3 up = glm::vec3(0,-1,0); glm::vec3 direction_z(0, 0, -1); //mirror the parameters with the right eye float left_right_direction = -1.0f; if(left_eye) left_right_direction = 1.0f; float aspect_ratio = (float)width/(float)height; float nearZ = 1.0f; float farZ = 100.0f; double frustumshift = (IOD/2)*nearZ/depthZ; float top = tan(g_initial_fov/2)*nearZ; float right = aspect_ratio*top+frustumshift*left_right_direction; //half screen float left = -aspect_ratio*top+frustumshift*left_right_direction; float bottom = -top; g_projection_matrix = glm::frustum(left, right, bottom, top, nearZ, farZ); // update the view matrix g_view_matrix = glm::lookAt( g_position-direction_z+ glm::vec3(left_right_direction*IOD/2, 0, 0), //eye position g_position+ glm::vec3(left_right_direction*IOD/2, 0, 0), //centre position up //up direction ); In the rendering loop in main.cpp, we define the viewports for each eye (left and right) and set up the projection and view matrices accordingly. For each eye, we translate our camera position by half of the intraocular distance, as illustrated in the previous figure: if(stereo){ //draw the LEFT eye, left half of the screen glViewport(0, 0, width/2, height); //computes the MVP matrix from the IOD and virtual image plane distance computeStereoViewProjectionMatrices(g_window, IOD, depthZ, true); //gets the View and Model Matrix and apply to the rendering glm::mat4 projection_matrix = getProjectionMatrix(); glm::mat4 view_matrix = getViewMatrix(); glm::mat4 model_matrix = glm::mat4(1.0); model_matrix = glm::translate(model_matrix, glm::vec3(0.0f, 0.0f, -depthZ)); model_matrix = glm::rotate(model_matrix, glm::pi<float>() * rotateY, glm::vec3(0.0f, 1.0f, 0.0f)); model_matrix = glm::rotate(model_matrix, glm::pi<float>() * rotateX, glm::vec3(1.0f, 0.0f, 0.0f)); glm::mat4 mvp = projection_matrix * view_matrix * model_matrix; //sends our transformation to the currently bound shader, //in the "MVP" uniform variable glUniformMatrix4fv(matrix_id, 1, GL_FALSE, &mvp[0][0]); //render scene, with different drawing modes if(drawTriangles) obj_loader->draw(GL_TRIANGLES); if(drawPoints) obj_loader->draw(GL_POINTS); if(drawLines) obj_loader->draw(GL_LINES); //Draw the RIGHT eye, right half of the screen glViewport(width/2, 0, width/2, height); computeStereoViewProjectionMatrices(g_window, IOD, depthZ, false); projection_matrix = getProjectionMatrix(); view_matrix = getViewMatrix(); model_matrix = glm::mat4(1.0); model_matrix = glm::translate(model_matrix, glm::vec3(0.0f, 0.0f, -depthZ)); model_matrix = glm::rotate(model_matrix, glm::pi<float>() * rotateY, glm::vec3(0.0f, 1.0f, 0.0f)); model_matrix = glm::rotate(model_matrix, glm::pi<float>() * rotateX, glm::vec3(1.0f, 0.0f, 0.0f)); mvp = projection_matrix * view_matrix * model_matrix; glUniformMatrix4fv(matrix_id, 1, GL_FALSE, &mvp[0][0]); if(drawTriangles) obj_loader->draw(GL_TRIANGLES); if(drawPoints) obj_loader->draw(GL_POINTS); if(drawLines) obj_loader->draw(GL_LINES); } The final rendering result consists of two separate images on each side of the display, and note that each image is compressed horizontally by a scaling factor of two. 
For some display systems, each side of the display is required to preserve the same aspect ratio, depending on the specifications of the display. Here are the final screenshots of the same models in true 3D using stereoscopic 3D rendering:

Here's the rendering of the architectural model in stereoscopic 3D:

How it works...

The stereoscopic 3D rendering technique is based on the parallel axis and asymmetric frustum perspective projection principle. In simpler terms, we rendered a separate image for each eye as if the object was seen at a different eye position but viewed on the same plane. Parameters such as the intraocular distance and frustum shift can be dynamically adjusted to provide the desired 3D stereo effects. For example, by increasing or decreasing the frustum asymmetry parameter, the object will appear to be moved in front of or behind the plane of the screen.

By default, the zero parallax plane is set to the middle of the view volume. That is, the object is set up so that the center position of the object is positioned at the screen level, and some parts of the object will appear in front of or behind the screen. By increasing the frustum asymmetry (that is, positive parallax), the scene will appear to be pushed behind the screen. Likewise, by decreasing the frustum asymmetry (that is, negative parallax), the scene will appear to be pulled in front of the screen.

The glm::frustum function sets up the projection matrix, and we implemented the asymmetric frustum projection concept illustrated in the drawing. Then, we use the glm::lookAt function to adjust the eye position based on the IOD value we have selected. To project the images side by side, we use the glViewport function to constrain the area within which the graphics can be rendered. The function basically performs an affine transformation (that is, scale and translation) that maps the normalized device coordinate to the window coordinate. Note that the final result is a side-by-side image in which the graphic is scaled by a factor of two vertically (or compressed horizontally). Depending on the hardware configuration, we may need to adjust the aspect ratio.

The current implementation supports side-by-side 3D, which is commonly used in most wearable Augmented Reality (AR) or Virtual Reality (VR) glasses. Fundamentally, the rendering technique, namely the asymmetric frustum perspective projection described in our article, is platform-independent. For example, we have successfully tested our implementation on the Meta 1 Developer Kit (https://www.getameta.com/products) and rendered the final results on the optical see-through stereoscopic 3D display. Here is the front view of the Meta 1 Developer Kit, showing the optical see-through stereoscopic 3D display and 3D range-sensing camera:

The result is shown as follows, with the stereoscopic 3D graphics rendered onto the real world (which forms the basis of augmented reality):

See also

In addition, we can easily extend our code to support shutter glasses-based 3D monitors by utilizing the Quad Buffered OpenGL APIs (refer to the GL_BACK_RIGHT and GL_BACK_LEFT flags in the glDrawBuffer function). Unfortunately, such 3D formats require specific hardware synchronization and often require a higher frame rate display (for example, 120Hz) as well as a professional graphics card. Further information on how to implement stereoscopic 3D in your application can be found at http://www.nvidia.com/content/GTC-2010/pdfs/2010_GTC2010.pdf.
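To make the quad-buffered path mentioned above a little more concrete, here is a minimal, hedged sketch of how the render loop could be restructured. It reuses the computeStereoViewProjectionMatrices function and the IOD, depthZ, width, and height variables from the earlier listings, and it assumes the driver actually grants a stereo-capable context (GLFW will simply fail to create the window otherwise):

// Sketch only: request a quad-buffered (stereo) default framebuffer before
// creating the window. This typically works only on professional GPUs/drivers.
glfwWindowHint(GLFW_STEREO, GL_TRUE);
GLFWwindow* window = glfwCreateWindow(1280, 720, "Stereo 3D", NULL, NULL);

// Inside the render loop, draw the full window twice instead of two half-width viewports.
glViewport(0, 0, width, height);

glDrawBuffer(GL_BACK_LEFT);                                    // left-eye back buffer
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
computeStereoViewProjectionMatrices(window, IOD, depthZ, true);
// ...compute the MVP matrix and draw the scene exactly as in the side-by-side code...

glDrawBuffer(GL_BACK_RIGHT);                                   // right-eye back buffer
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
computeStereoViewProjectionMatrices(window, IOD, depthZ, false);
// ...draw the scene again for the right eye...

glfwSwapBuffers(window);                                       // both eye buffers are presented in sync

Because the shutter glasses alternate eyes in step with the display, no horizontal compression of the images is needed in this mode.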
Summary In this article, we covered how to visualize data with stunning stereoscopic 3D technology using OpenGL. OpenGL does not provide any mechanism to load, save, or manipulate 3D models. Thus, to support this, we have integrated a new library named Assimp into the code. Resources for Article: Further resources on this subject: Organizing a Virtual Filesystem [article] Using OpenCL [article] Introduction to Modern OpenGL [article]

CSS Grids for RWD

Packt
24 Aug 2015
12 min read
In this article by the author, Ricardo Zea, of the book, Mastering Responsive Web Design, we're going to learn how to create a custom CSS grid. Responsive Web Design (RWD) has introduced a new layer of work for everyone building responsive websites and apps. When we have to test our work on different devices and in different dimensions, wherever the content breaks, we need to add a breakpoint and test again. (For more resources related to this topic, see here.) This can happen many, many times. So, building a website or app will take a bit longer than it used to. To make things a little more interesting, as web designers and developers, we need to be mindful of how the content is laid out at different dimensions and how a grid can help us structure the content to different layouts. Now that we have mentioned grids, have you ever asked yourself, "what do we use a grid for anyway?" To borrow a few terms from the design industry and answer that question, we use a grid to allow the content to have rhythm, proportion, and balance. The objective is that those who use our websites/apps will have a more pleasant experience with our content, since it will be easier to scan (rhythm), easier to read (proportion) and organized (balance). In order to speed up the design and build processes while keeping all the content properly formatted in different dimensions, many authors and companies have created CSS frameworks and CSS grids that contain not only a grid but also many other features and styles than can be leveraged by using a simple class name. As time goes by and browsers start supporting more and more CSS3 properties, such as Flexbox, it'll become easier to work with layouts. This will render the grids inside CSS frameworks almost unnecessary. Let's see what CSS grids are all about and how they can help us with RWD. In this article, we're going to learn how to create a custom CSS grid. Creating a custom CSS grid Since we're mastering RWD, we have the luxury of creating our own CSS grid. However, we need to work smart, not hard. Let's lay out our CSS grid requirements: It should have 12 columns. It should be 1200px wide to account for 1280px screens. It should be fluid, with relative units (percentages) for the columns and gutters. It should use the mobile-first approach. It should use the SCSS syntax. It should be reusable for other projects. It should be simple to use and understand. It should be easily scalable. Here's what our 1200 pixel wide and 12-column width 20px grid looks like: The left and right padding in black are 10px each. We'll convert those 10px into percentages at the end of this process. Doing the math We're going to use the RWD magic formula:  (target ÷ context) x 100 = result %. Our context is going to be 1200px. So let's convert one column:  80 ÷ 1200 x 100 = 6.67%. For two columns, we have to account for the gutter that is 20px. In other words, we can't say that two columns are exactly 160px. That's not entirely correct. Two columns are: 80px + 20px + 80px = 180px. Let's now convert two columns:  180 ÷ 1200 x 100 = 15%. For three columns, we now have to account for two gutters: 80px + 20px + 80px + 20px + 80px = 280px. Let's now convert three columns:  280 ÷ 1200 x 100 = 23.33%. Can you see the pattern now? Every time we add a column, all that we need to do is add 100 to the value. This value accounts for the gutters too! Check the screenshot of the grid we saw moments ago, you can see the values of the columns increment by 100. 
So, all the equations are as follows: 1 column: 80 ÷ 1200 x 100 = 6.67% 2 columns: 180 ÷ 1200 x 100 = 15% 3 columns: 280 ÷ 1200 x 100 = 23.33% 4 columns: 380 ÷ 1200 x 100 = 31.67% 5 columns: 480 ÷ 1200 x 100 = 40% 6 columns: 580 ÷ 1200 x 100 = 48.33% 7 columns: 680 ÷ 1200 x 100 = 56.67% 8 columns: 780 ÷ 1200 x 100 = 65% 9 columns: 880 ÷ 1200 x 100 = 73.33% 10 columns: 980 ÷ 1200 x 100 = 81.67% 11 columns:1080 ÷ 1200 x 100 = 90% 12 columns:1180 ÷ 1200 x 100 = 98.33% Let's create the SCSS for the 12-column grid: //Grid 12 Columns .grid { &-1 { width:6.67%; } &-2 { width:15%; } &-3 { width:23.33%; } &-4 { width:31.67%; } &-5 { width:40%; } &-6 { width:48.33%; } &-7 { width:56.67%; } &-8 { width:65%; } &-9 { width:73.33%; } &-10 { width:81.67%; } &-11 { width:90%; } &-12 { width:98.33%; } } Using hyphens (-) to separate words allows for easier selection of the terms when editing the code. Adding the UTF-8 character set directive and a Credits section Don't forget to include the UTF-8 encoding directive at the top of the file to let browsers know the character set we're using. Let's spruce up our code by adding a Credits section at the top. The code is as follows: @charset "UTF-8"; /* Custom Fluid & Responsive Grid System Structure: Mobile-first (min-width) Syntax: SCSS Grid: Float-based Created by: Your Name Date: MM/DD/YY */ //Grid 12 Columns .grid { &-1 { width:6.67%; } &-2 { width:15%; } &-3 { width:23.33%; } &-4 { width:31.67%; } &-5 { width:40%; } &-6 { width:48.33%; } &-7 { width:56.67%; } &-8 { width:65%; } &-9 { width:73.33%; } &-10 { width:81.67%; } &-11 { width:90%; } &-12 { width:98.33%; } } Notice the Credits are commented with CSS style comments: /* */. These types of comments, depending on the way we compile our SCSS files, don't get stripped out. This way, the Credits are always visible so that others know who authored the file. This may or may not work for teams. Also, the impact on file size of having the Credits display is imperceptible, if any. Including the box-sizing property and the mobile-first mixin Including the box-sizing property allows the browser's box model to account for the padding inside the containers; this means the padding gets subtracted rather than added, thus maintaining the defined width(s). Since the structure of our custom CSS grid is going to be mobile-first, we need to include the mixin that will handle this aspect: @charset "UTF-8"; /* Custom Fluid & Responsive Grid System Structure: Mobile-first (min-width) Syntax: SCSS Grid: Float-based Created by: Your Name Date: MM/DD/YY */ *, *:before, *:after { box-sizing: border-box; } //Moble-first Media Queries Mixin @mixin forLargeScreens($width) { @media (min-width: $width/16+em) { @content } } //Grid 12 Columns .grid { &-1 { width:6.67%; } &-2 { width:15%; } &-3 { width:23.33%; } &-4 { width:31.67%; } &-5 { width:40%; } &-6 { width:48.33%; } &-7 { width:56.67%; } &-8 { width:65%; } &-9 { width:73.33%; } &-10 { width:81.67%; } &-11 { width:90%; } &-12 { width:98.33%; } } The main container and converting 10px to percentage value Since we're using the mobile-first approach, our main container is going to be 100% wide by default; but we're also going to give it a maximum width of 1200px since the requirement is to create a grid of that size. We're also going to convert 10px into a percentage value, so using the RWD magic formula: 10 ÷ 1200 x 100 = 0.83%. However, as we've seen before, 10px, or in this case 0.83%, is not enough padding and makes the content appear too close to the edge of the main container. 
So we're going to increase the padding to 20px:  20 ÷ 1200 x 100 = 1.67%. We're also going to horizontally center the main container with margin:auto;. There's no need to declare zero values to the top and bottom margins to center horizontally. In other words, margin: 0 auto; isn't necessary. Just declaring margin: auto; is enough. Let's include these values now: @charset "UTF-8"; /* Custom Fluid & Responsive Grid System Structure: Mobile-first (min-width) Syntax: SCSS Grid: Float-based Created by: Your Name Date: MM/DD/YY */ *, *:before, *:after { box-sizing: border-box; } //Moble-first Media Queries Mixin @mixin forLargeScreens($width) { @media (min-width: $width/16+em) { @content } } //Main Container .container-12 { width: 100%; //Change this value to ANYTHING you want, no need to edit anything else. max-width: 1200px; padding: 0 1.67%; margin: auto; } //Grid 12 Columns .grid { &-1 { width:6.67%; } &-2 { width:15%; } &-3 { width:23.33%; } &-4 { width:31.67%; } &-5 { width:40%; } &-6 { width:48.33%; } &-7 { width:56.67%; } &-8 { width:65%; } &-9 { width:73.33%; } &-10 { width:81.67%; } &-11 { width:90%; } &-12 { width:98.33%; } } In the padding property, it's the same if we type 0.83% or .83%. We can omit the zero. It's always a good practice to keep our code as streamlined as possible. This is the same principle as when we use hexadecimal shorthand values: #3336699 is the same as #369. Making it mobile-first On small screens, all the columns are going to be 100% wide. Since we're working with a single column layout, we don't use gutters; this means we don't have to declare margins, at least yet. At 640px, the grid will kick in and assign corresponding percentages to each column, so we're going to include the columns in a 40em (640px) media query and float them to the left. At this point, we need gutters. Thus, we declare the margin with .83% to the left and right padding. I chose 40em (640px) arbitrarily and only as a starting point. Remember to create content-based breakpoints rather than device-based ones. The code is as follows: @charset "UTF-8"; /* Custom Fluid & Responsive Grid System Structure: Mobile-first (min-width) Syntax: SCSS Grid: Float-based Created by: Your Name Date: MM/DD/YY */ *, *:before, *:after { box-sizing: border-box; } //Moble-first Media Queries Mixin @mixin forLargeScreens($width) { @media (min-width: $width/16+em) { @content } } //Main Container .container-12 { width: 100%; //Change this value to ANYTHING you want, no need to edit anything else. max-width: 1200px; padding: 0 1.67%; margin: auto; } //Grid .grid { //Global Properties - Mobile-first &-1, &-2, &-3, &-4, &-5, &-6, &-7, &-8, &-9, &-10, &-11, &-12 { width: 100%; } @include forLargeScreens(640) { //Totally arbitrary width, it's only a starting point. //Global Properties - Large screens &-1, &-2, &-3, &-4, &-5, &-6, &-7, &-8, &-9, &-10, &-11, &-12 { float: left; margin: 0 .83%; } //Grid 12 Columns .grid { &-1 { width:6.67%; } &-2 { width:15%; } &-3 { width:23.33%; } &-4 { width:31.67%; } &-5 { width:40%; } &-6 { width:48.33%; } &-7 { width:56.67%; } &-8 { width:65%; } &-9 { width:73.33%; } &-10 { width:81.67%; } &-11 { width:90%; } &-12 { width:98.33%; } } } Adding the row and float clearing rules If we use rows in our HTML structure or add the class .clear to a tag, we can declare all the float clearing values in a single nested rule with the :before and :after pseudo-elements. It's the same thing to use single or double colons when declaring pseudo-elements. 
The double colon is a CSS3 syntax and the single colon is a CSS2.1 syntax. The idea was to be able to differentiate them at a glance so a developer could tell which CSS version they were written on. However, IE8 and below do not support the double-colon syntax. The float clearing technique is an adaptation of David Walsh's CSS snippet (http://davidwalsh.name/css-clear-fix). We're also adding a rule for the rows with a bottom margin of 10px to separate them from each other, while removing that margin from the last row to avoid creating unwanted extra spacing at the bottom. Finally, we add the clearing rule for legacy IEs. Let's include these rules now: @charset "UTF-8"; /* Custom Fluid & Responsive Grid System Structure: Mobile-first (min-width) Syntax: SCSS Grid: Float-based Created by: Your Name Date: MM/DD/YY */ *, *:before, *:after { box-sizing: border-box; } //Moble-first Media Queries Mixin @mixin forLargeScreens($width) { @media (min-width: $width/16+em) { @content } } //Main Container .container-12 { width: 100%; //Change this value to ANYTHING you want, no need to edit anything else. max-width: 1200px; padding: 0 1.67%; margin: auto; } //Grid .grid { //Global Properties - Mobile-first &-1, &-2, &-3, &-4, &-5, &-6, &-7, &-8, &-9, &-10, &-11, &-12 { width: 100%; } @include forLargeScreens(640) { //Totally arbitrary width, it's only a starting point. //Global Properties - Large screens &-1, &-2, &-3, &-4, &-5, &-6, &-7, &-8, &-9, &-10, &-11, &-12 { float: left; margin: 0 .83%; } //Grid 12 Columns .grid { &-1 { width:6.67%; } &-2 { width:15%; } &-3 { width:23.33%; } &-4 { width:31.67%; } &-5 { width:40%; } &-6 { width:48.33%; } &-7 { width:56.67%; } &-8 { width:65%; } &-9 { width:73.33%; } &-10 { width:81.67%; } &-11 { width:90%; } &-12 { width:98.33%; } } } //Clear Floated Elements - http://davidwalsh.name/css-clear-fix .clear, .row { &:before, &:after { content: ''; display: table; } &:after { clear: both; } } //Use rows to nest containers .row { margin-bottom: 10px; &:last-of-type { margin-bottom: 0; } } //Legacy IE .clear { zoom: 1; } Let's recap our CSS grid requirements: 12 columns: Starting from .grid-1 to .grid-12. 1200px wide to account for 1280px screens: The .container-12 container has max-width: 1200px; Fluid and relative units (percentages) for the columns and gutters: The percentages go from 6.67% to 98.33%. Mobile-first: We added the mobile-first mixin (using min-width) and nested the grid inside of it. The SCSS syntax: The whole file is Sass-based. Reusable: As long as we're using 12 columns and we're using the mobile-first approach, we can use this CSS grid multiple times. Simple to use and understand: The class names are very straightforward. The .grid-6 grid is used for an element that spans 6 columns, .grid-7 is used for an element that spans 7 columns, and so on. Easily scalable: If we want to use 980px instead of 1200px, all we need to do is change the value in the .container-12 max-width property. Since all the elements are using relative units (percentages), everything will adapt proportionally to the new width—to any width for that matter. Pretty sweet if you ask me. Summary A lot to digest here, eh? Creating our custom CSS with the traditional floats technique was a matter of identifying the pattern where the addition of a new column was a matter of increasing the value by 100. Now, we can create a 12-column grid at any width we want. 
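As a closing side note (this is only a sketch and not part of the grid file we just built), the same "add 100" pattern can be generated with a Sass loop instead of typing the twelve widths by hand, assuming the same 80px columns, 20px gutters, and 1200px context:

//Hypothetical alternative: generate .grid-1 through .grid-12 from the pattern
@for $i from 1 through 12 {
  .grid-#{$i} {
    //(columns * 80px + gutters * 20px) relative to the 1200px context
    width: percentage((80 * $i + 20 * ($i - 1)) / 1200);
  }
}

Compiling this produces the same class names and, up to rounding, the same 6.67% to 98.33% values we calculated by hand, which illustrates why changing the context width is the only edit the grid ever needs.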


Core Ephesoft Features

Packt
24 Aug 2015
15 min read
In this article by Pat Myers, author of the book Intelligent Document Capture with Ephesoft, Second Edition, we will cover the following:

Different classification types
Other techniques for exporting your documents and metadata

(For more resources related to this topic, see here.)

We already know how to configure search classification so that Ephesoft can recognize an invoice document. There are several other classification types available; we will explain these alternatives now.

Classification types

You can select the process that Ephesoft will use to classify documents by editing your batch class, editing the Document Assembly module, editing the Document Assembler plugin within that module, and then selecting a value for DA Classification Type.

Search

Search classification (also sometimes called Lucene classification) is the default classification method and is recommended for most content. When configured to perform search classification, Ephesoft compares the text on each input page to the text on the training documents to determine its confidence that a document is of a certain type.

Image

Image classification is the best option when classification cannot be made based on content. This occurs on forms that do not have a lot of text, or where the textual content is unpredictable but the physical appearance (such as layout, graphics, and formatting) is consistent. Credit card applications that are red dropout forms (where only the user-entered text is visible to the OCR engine) are candidates for this classification technique.

Barcodes

Barcodes can be used for documents that vary in content and layout, such as white mail (unformatted correspondence received in the mail). If a barcode is found on a page with a name that matches an Ephesoft document type, Ephesoft will set the current document's type to that type.

Automatic

The automatic classification type tells Ephesoft to use the scores of every classification plugin that is enabled. This may be necessary when no single classification technique will suffice for your batch class, but configuring multiple classification plugins will have a negative impact on Ephesoft's performance.

One document classification

One document classification is a variant of automatic classification. It assembles all the pages in the batch into a single document.

Confidence

Ephesoft calculates confidence scores for each page in a batch. The page scores represent Ephesoft's certainty that the page being considered is the first, middle, or last page of each document type. They are used to classify and assemble the pages into documents. Ephesoft also uses these page scores to create an aggregate score for each document. This score is compared to the confidence threshold for each document type in the batch class definition. Any document that receives a confidence score below the minimum threshold will be flagged for review, and a batch with one or more flagged documents will be placed in a queue for review by an operator. Confidence scores are calculated differently for each classification type.

Search classification

The default classification type is search classification. Search classification separates and classifies documents using a two-step process. The first step is to collect information about the pages; the Search Classification plugin of the Page Processing module performs this function. The second step is to separate the documents and determine their type; this is the responsibility of the Document Assembler plugin.
The Search Classification plugin calculates the initial page scores by comparing the text on the page to the text on the training documents. Multiple scores are generated for each page, as Ephesoft finds several matches from the samples for any given page. The page scores are then adjusted using weighted values that can be modified in the administrative interface by editing the Search Classification plugin of the Page Processing module. Pages can be weighted on the basis of the page type (first, middle, or last). By default, Ephesoft is configured to reduce the scores for the middle and last pages by 10 percent and 20 percent, respectively, as the first pages are more important when it comes to the separation of documents. This effectively biases Ephesoft in favor of using a page to create a new document (over using it as the middle or last page of a document).

The plugin properties of search classification

Using the page scores calculated in the previous step (and adjusted using the weighted values from the Search Classification plugin), Ephesoft calculates all possible document assemblies and selects the result with the highest score. The score is calculated as follows: first, the scores of the pages in the assembly are averaged; Ephesoft then adjusts the average by using a multiplier in the Document Assembler plugin. You will notice, looking at the following plugin settings screen, that there are several multipliers available. If the assembly has a first and a last page, for example, the DA Rule first-last Page multiplier will be chosen. An assembly with first, middle, and last pages will use the DA Rule First-middle-last Page multiplier.

The plugin properties of Document Assembler

Suppose, for example, that you have trained a batch class to recognize the first and middle pages of an invoice. If you run a three-page batch through Ephesoft, you might get the following results:

Page 1 is determined to be the first page of an invoice because Invoice_First_Page received the highest score:
Page 1 compared to Invoice_First_Page receives a score of 30.2
Page 1 compared to Invoice_Middle_Page receives a score of 4.2

Page 2 is determined to be the second page of an invoice because Invoice_Middle_Page received the highest score. Because of the order of this page in the batch, it is determined to be the second page of the invoice found on page 1:
Page 2 compared to Invoice_First_Page receives a score of 2.6
Page 2 compared to Invoice_Middle_Page receives a score of 12.2

Page 3 is determined to be the first page of an invoice because Invoice_First_Page received the highest score. Since it was determined to be a first page, it is the first page of a new document:
Page 3 compared to Invoice_First_Page receives a score of 31.6
Page 3 compared to Invoice_Middle_Page receives a score of 3.8

In this case, there is no score for Invoice_Last_Page, as no last page samples were used to train this Ephesoft instance. When using the drag-and-drop classification training in Batch Class Management, Ephesoft will automatically place a last page for any document having more than one page. If that is not the only possible last page of the document type, you will have to go into Folder Management and move all samples and files from the last page training for the document type into the middle pages. Once the files are moved, go back into Batch Class Management and click on the Learn Files button to retrain the system.
The first document assembled will be a two-page invoice because Ephesoft found a first page of an invoice followed by a middle page of an invoice. The second document assembled will be a one-page invoice since only the first page of an invoice was found. The confidence scores that each of these documents received are calculated as follows:

Document 1 (pages 1 and 2): ((30.2 + 12.2) / 2) × 50% = 21.2 × 50% = 10.6, that is, the average page score times the page weight factor, DA Rule First-middle Page
Document 2 (page 3): (31.6 / 1) × 50% = 31.6 × 50% = 15.8, that is, the average page score times the page weight factor, DA Rule First Page

If the Minimum Confidence Scores setting of the Invoice document type is set to 10, then this batch will skip the review step and move directly to extraction. If the Minimum Confidence Scores setting for the Invoice document type is set to 15, then this batch will stop in review, with the first document requiring review.

Barcode classification

Barcode classification is also a two-step process, similar to search classification. In the Page Processing module, pages with barcodes are processed using either the RecoStar plugin or the Barcode Reader plugin. In the Document Assembler plugin, Ephesoft creates a document when the first barcode is found, and all the other pages are appended to that document until a new page with a barcode is found. The barcode value found by the Barcode Reader or the RecoStar plugin has to match one of the document type names. On Linux, Ephesoft will always use the Barcode Reader plugin.

Image classification

Image classification compares the pixels on the provided documents to the pixels on the trained documents. The more pixels that match the trained document, the higher the confidence score the document will attain. This is in contrast to search classification, which OCRs the pages and then compares the text. When image classification is selected, the Document Assembler plugin uses the image confidence scores to separate and classify documents. The assembly is done using the same algorithm explained in the search classification section.

Automatic classification

Automatic classification uses all enabled classification types. The scores are combined to come up with an aggregate score per page, and this value is used for assembly and then for classification scoring.

Export

We use the Copy Batch XML plugin to export content to the Ephesoft server's file system. There are a number of additional export options. The CMIS and DB export plugins use standards-based interfaces to allow export to a large number of enterprise content management systems and relational databases. Let's take a look at how to configure these two plugins and then review the other plugins that are available.

CMIS export

The Content Management Interoperability Services (CMIS) API is an open standard for interacting with enterprise document repositories. You can use the CMIS Export plugin to export your scanned content (and associated metadata) to any repository that supports the CMIS standard, such as Alfresco, Documentum, FileNet, or SharePoint. Let's look at how to configure the CMIS Export plugin to send content to Alfresco, a popular open source enterprise content management system. Ephesoft 4.0 supports CMIS 1.0 and 1.1.

Establish a content model in your CMIS repository

Suppose that you have an Invoice document type in Ephesoft that has fields for Vendor Name, Invoice Date, and Invoice Total. The first thing that you will want to do is define a custom content model in Alfresco to represent your scanned content.
Alfresco defines custom content models in XML files that look like the following:

<type name="acme:invoice">
    <parent>cm:content</parent>
    <properties>
        <property name="acme:vendorName">
            <title>Vendor Name</title>
            <type>d:text</type>
            <mandatory enforced="false">false</mandatory>
            <index enabled="true">
                <atomic>true</atomic>
                <stored>false</stored>
                <tokenised>false</tokenised>
            </index>
        </property>

Alfresco document types and property names have prefixes to prevent namespace collisions in the content models. We have used an acme prefix in our examples, as would be the case if this implementation were for Acme Corporation. The example above shows a document type, acme:invoice, that extends Alfresco's base document type, cm:content. This custom type has a text property called acme:vendorName. Not shown here are the date property called acme:invoiceDate and the float property called acme:invoiceTotal.
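For illustration, those two remaining properties might look something like the following sketch, which simply follows the same pattern as the snippet above; the property names and types come from the description, while the titles and the omission of index settings are assumptions for this example:

<!-- Hypothetical continuation of the acme:invoice type -->
<property name="acme:invoiceDate">
    <title>Invoice Date</title>
    <type>d:date</type>
    <mandatory enforced="false">false</mandatory>
</property>
<property name="acme:invoiceTotal">
    <title>Invoice Total</title>
    <type>d:float</type>
    <mandatory enforced="false">false</mandatory>
</property>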
Configure the CMIS Export plugin

After creating the content model, you will need to configure Ephesoft to use CMIS to send the processed content to Alfresco. There are three places in Ephesoft where you need to configure the CMIS export:

The plugin settings in the administrative user interface
The mapping files, in your batch class cmis-plugin-mapping folder
The global configuration file, located in your Ephesoft installation folder at Application/WEB-INF/classes/META-INF/dcma-cmis/dcma-cmis.properties

Let's start with the plugin settings. From the batch class management interface, select and edit your batch class, the export module, and then the CMIS Export plugin. This comes configured by default with a disabled sample connection to Alfresco's public CMIS server.

The plugin properties of CMIS Export

The CMIS plugin can be configured as follows:

Root Folder Name: This is the name of the destination folder in the document repository where Ephesoft should load the exported documents. In Alfresco, this folder will be created underneath the root folder (which is typically named Company Home).
Upload File Extension: This setting controls whether the documents are uploaded to your document management system as PDF or TIF images.
Server URL: The services provided by CMIS are defined in an XML service document; this is the location of that document. Alfresco 4.0 hosts this file at /alfresco/service/cmis. Alfresco 5.0 hosts this file at /alfresco/api/-default-/public/cmis/versions/1.1/atom.
User Name and Password: This is the authentication information required to connect to the document management system.
Repository Id: Some document management systems are capable of hosting multiple repositories. When this is the case, each repository is listed in the service document with an associated identifier. You should examine the service document to find the identifier for your repository.
Server Switch: This can be used to enable and disable export to your document management system.
Aspect Switch: Alfresco manages dynamically assignable groups of properties called aspects. This switch enables support for aspects.
Export File Name: The naming convention for the exported documents.
Export Client Key, Secret Key, Refresh Token, Redirect URL, and Export Network: These properties are used to implement OAuth authentication.

Document type and property mapping

Next, you need to associate Ephesoft document types with Alfresco document types. Ephesoft's fields also need to be mapped to the properties of Alfresco documents. Edit this file in your batch class configuration area: cmis-plugin-mapping/DLF-Attribute-mapping.properties. This file contains some examples of content mapping. Delete the examples and set up your own mapping, as follows:

Invoice=D:acme:invoice
Invoice.VendorName=acme:vendorName
Invoice.InvoiceDate=acme:invoiceDate
Invoice.InvoiceTotal=acme:invoiceTotal

The first line of this property file associates the document types, and the last three lines associate the fields. When mapping document types, you will need to prepend D: to the beginning of your document repository's type name. This is the CMIS syntax for representing a document (as opposed to, for example, a folder) in Alfresco. Aspects are configured in the following batch class configuration file: cmis-plugin-mapping/aspects-mapping.properties.

Global CMIS configuration

The final area where CMIS is configured in Ephesoft is the following file: Application/WEB-INF/classes/META-INF/dcma-cmis/dcma-cmis.properties. This file affects the CMIS configuration of all batch classes. The most commonly modified setting in this file is the date format. When you map a date field, Ephesoft needs to parse the date in order to reformat the information to match the CMIS specification. The cmis.date_format parameter specifies how Ephesoft fields that will be exported using CMIS will be formatted. See the JavaDoc for the SimpleDateFormat class to learn how to specify date formats.

If your content management system uses Web Service Security (WSS) to secure its CMIS web services, you will need to adjust the value of the cmis.security.mode property. This specifies the security mode to use when attempting to connect to the CMIS web services. There are two possible values: basic and wssecurity. HTTP Basic Authentication is the default setting for the Ephesoft CMIS connection; this corresponds to the basic setting for the cmis.security.mode property. The cmis.security.mode property is set to wssecurity in order to have the CMIS credentials that are configured in the CMIS_EXPORT plugin included in the WS-Security SOAP header of the CMIS web service requests.

If your CMIS web services are not addressable from a single URL, you can configure the location of each service used by Ephesoft. You will see a set of properties that begin with cmis.url. These can be edited to specify where your content management system hosts each service's WSDL.

Database Export

DB Export allows document-level field values and metadata to be exported to relational databases using JDBC. Administrators can map the Ephesoft document fields to the database table columns. First, go to the system configuration area to create a new connection in Connection Manager:

Connection Manager with connection properties for database export

Next, return to the batch class management area and configure your batch class. If the DB Export plugin is configured into this batch class's workflow, then you will be able to configure the plugin from the Modules section. The configuration of the plugin is simple; there is just a switch to enable the plugin.

Plugin properties of database export

In Batch Class Management, under the document type, you can configure DB Export Configuration. Select the correct database connection, and then map the document type fields to the table and column. Click on Apply to save your changes.

Database export mapping

When the DB Export plugin runs, it will export the extracted field data for each document in the batch.

Sample results of database export

Other export plugins

Thus far, we have shown you how to export to the local file system or use CMIS and JDBC.
These are general-purpose plugins that can be used in a variety of situations. Ephesoft comes with a few other general-purpose plugins, such as the CSV plugin and the tabbed PDF plugin. Ephesoft also provides a handful of plugins to facilitate export into specific content management systems such as Docushare, HPII FileNet, and IBM CM. To see the list of available plugins, edit your batch class and then edit the export module.

Summary

In this article, you have learned about the different classification types that Ephesoft supports and about additional techniques for exporting your documents and metadata. At this point, you should be able to use Ephesoft to implement intelligent document capture for a wide variety of organizations.