
How-To Tutorials

Packt
09 Jul 2015
19 min read

Clustering and Other Unsupervised Learning Methods

In this article by Ferran Garcia Pagans, author of the book Predictive Analytics Using Rattle and Qlik Sense, we will define machine learning, introduce unsupervised and supervised methods, and focus in detail on K-means, a classic machine learning algorithm.

We'll create clusters of customers based on their annual spending. Grouping customers this way gives us a new insight: it allows us to see the profitability of each customer group and to deliver more profitable marketing campaigns or create tailored discounts. Finally, we'll see hierarchical clustering, different clustering methods, and association rules. Association rules are generally used for market basket analysis.

Machine learning – unsupervised and supervised learning

Machine Learning (ML) is a set of techniques and algorithms that gives computers the ability to learn. These techniques are generic and can be used in various fields. Data mining uses ML techniques to create insights and predictions from data. In data mining, we usually divide ML methods into two main groups – supervised learning and unsupervised learning. A computer can learn with the help of a teacher (supervised learning) or can discover new knowledge without the assistance of a teacher (unsupervised learning).

In supervised learning, the learner is trained with a set of examples (a dataset) that contains the right answer; we call it the training dataset. We call the dataset that contains the answers a labeled dataset, because each observation is labeled with its answer. In supervised learning, you are supervising the computer, giving it the right answers. For example, a bank can try to predict a borrower's chance of defaulting on a loan based on its experience with past loans. The training dataset would contain data from past loans, including whether or not each borrower defaulted.
In unsupervised learning, our dataset doesn't have the right answers and the learner tries to discover hidden patterns in the data. We call it unsupervised learning because we're not supervising the computer by giving it the right answers. A classic example is trying to create a classification of customers: the model tries to discover similarities between customers. In some machine learning problems, we don't have a dataset of past observations labeled with the correct answers; we call such datasets unlabeled datasets.

In traditional data mining, the terms descriptive analytics and predictive analytics are used for unsupervised learning and supervised learning, respectively. In unsupervised learning, there is no target variable. The objective of unsupervised learning, or descriptive analytics, is to discover the hidden structure of data. Rattle offers two main unsupervised learning techniques: cluster analysis and association analysis.

Cluster analysis

Sometimes, we have a group of observations and we need to split it into a number of subsets of similar observations. Cluster analysis is a group of techniques that help you discover these similarities between observations. Market segmentation is an example of cluster analysis. You can use cluster analysis when you have a lot of customers and you want to divide them into different market segments, but you don't know how to create these segments. Sometimes, especially with a large number of customers, we need some help to understand our data. Clustering can help us create different customer groups based on their buying behavior.

In Rattle's Cluster tab, there are four cluster algorithms: KMeans, EwKm, Hierarchical, and BiCluster. The two most popular families of cluster algorithms are hierarchical clustering and centroid-based clustering.

Centroid-based clustering using the K-means algorithm

I'm going to use K-means as an example of this family because it is the most popular.
With this algorithm, a cluster is represented by a point or center called the centroid. In the initialization step of K-means, we need to create k centroids; usually, the centroids are initialized randomly. In the following diagram, the observations or objects are represented with a point and three centroids are represented with three colored stars:

After this initialization step, the algorithm enters into an iteration with two operations. First, the computer associates each object with the nearest centroid, creating k clusters. Then, the computer recalculates each centroid's position: the new position is the mean of each attribute over every cluster member. This example is very simple, but in real life, when the algorithm associates the observations with the new centroids, some observations move from one cluster to another. The algorithm iterates by recalculating centroids and assigning observations to each cluster until some finalization condition is reached, as shown in this diagram:

The inputs of a K-means algorithm are the observations and the number of clusters, k. The final result of a K-means algorithm is a set of k centroids that represent each cluster, together with the observations associated with each cluster.

The drawbacks of this technique are:

You need to know or decide the number of clusters, k. The result of the algorithm depends heavily on k.
The result of the algorithm depends on where the centroids are initialized.
There is no guarantee that the result is the optimum result. The algorithm can get stuck in a local optimum.

In order to avoid a local optimum, you can run the algorithm many times, starting with different centroid positions. To compare the different runs, you can use the clustering's distortion – the sum of the squared distances between each observation and its centroid.
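The loop described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Rattle's implementation: the function names, the tiny two-attribute dataset, and the choice to initialize centroids by picking k random observations are all hypothetical. It shows the assignment step, the update step, the distortion, and the idea of keeping the best of several runs.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed, max_iter=100):
    """Run K-means once and return (centroids, labels, distortion)."""
    rng = random.Random(seed)
    # Initialization (assumption): use k randomly chosen observations.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every observation to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # finalization condition: nothing moved
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    distortion = sum(
        squared_distance(p, centroids[label]) for p, label in zip(points, labels)
    )
    return centroids, labels, distortion


def best_of_runs(points, k, seeds):
    """Run K-means once per seed and keep the run with the lowest distortion."""
    return min((kmeans(points, k, s) for s in seeds), key=lambda run: run[2])


# Hypothetical observations with two attributes each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels, distortion = best_of_runs(points, k=2, seeds=range(10))
print(centroids, labels, distortion)
```

Because each run depends only on its seed, re-running with the same seeds reproduces the same result, while changing the seeds explores different starting positions.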
Customer segmentation with K-means clustering We're going to use the wholesale customer dataset we downloaded from the Center for Machine Learning and Intelligent Systems at the University of California, Irvine. You can download the dataset from here – https://archive.ics.uci.edu/ml/datasets/Wholesale+customers#. The dataset contains 440 customers (observations) of a wholesale distributor. It includes the annual spend in monetary units on six product categories – Fresh, Milk, Grocery, Frozen, Detergents_Paper, and Delicatessen. We've created a new field called Food that includes all categories except Detergents_Paper, as shown in the following screenshot: Load the new dataset into Rattle and go to the Cluster tab. Remember that, in unsupervised learning, there is no target variable. I want to create a segmentation based only on buying behavior; for this reason, I set Region and Channel to Ignore, as shown here: In the following screenshot, you can see the options Rattle offers for K-means. The most important one is Number of clusters; as we've seen, the analyst has to decide the number of clusters before running K-means: We have also seen that the initial position of the centroids can have some influence on the result of the algorithm. The position of the centroids is random, but we need to be able to reproduce the same experiment multiple times. When we're creating a model with K-means, we'll iteratively re-run the algorithm, tuning some options in order to improve the performance of the model. In this case, we need to be able to reproduce exactly the same experiment. Under the hood, R has a pseudo-random number generator based on a starting point called Seed. If you want to reproduce the exact same experiment, you need to re-run the algorithm using the same Seed. Sometimes, the performance of K-means depends on the initial position of the centroids. 
For this reason, sometimes you need to be able to re-run the model using a different initial position for the centroids. To run the model with different initial positions, you need to run it with a different Seed.

After executing the model, Rattle will show some interesting information: the size of each cluster, the means of the variables in the dataset, the centroids' positions, and the Within cluster sum of squares value. This measure, also called distortion, is the sum of the squared differences between each point and its centroid. It's a measure of the quality of the model. Another interesting option is Runs; by using this option, Rattle will run the model the specified number of times and will choose the model with the best performance based on the Within cluster sum of squares value.

Deciding on the number of clusters can be difficult. To choose the number of clusters, we need a way to evaluate the performance of the algorithm. The sum of the squared distances between the observations and their associated centroids could be a performance measure. Each time we add a centroid to KMeans, this sum decreases. The difference in this measure between two numbers of centroids is the gain associated with the added centroids. Rattle provides an option to automate this test, called Iterate Clusters. If you set the Number of clusters value to 10 and check the Iterate Clusters option, Rattle will run KMeans iteratively, starting with 3 clusters and finishing with 10 clusters. To compare each iteration, Rattle provides an iteration plot. In the iteration plot, the blue line shows the sum of the squared differences between each observation and its centroid. The red line shows the difference between the current sum of squared distances and the sum of the squared distances of the previous iteration.
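The two quantities plotted for each number of clusters, the within-cluster sum of squares and its drop compared with the previous iteration, can be sketched as follows. This is a simplified illustration and not Rattle's code: it clusters a single hypothetical spend attribute, the function names and values are made up, and it keeps the best of several seeded runs for each k, in the spirit of the Runs option.

```python
import random


def within_ss(values, k, seed, max_iter=100):
    """Within-cluster sum of squares for one 1-D K-means run."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)  # assumption: start from k observations
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: (v - centroids[c]) ** 2)
            clusters[nearest].append(v)
        # New centroid = mean of the members; empty clusters stay in place.
        updated = [sum(m) / len(m) if m else centroids[c]
                   for c, m in enumerate(clusters)]
        if updated == centroids:  # centroids stopped moving
            break
        centroids = updated
    return sum(min((v - c) ** 2 for c in centroids) for v in values)


def iterate_clusters(values, k_values, runs=10):
    """For each k, return (k, best within-SS, drop versus the previous k)."""
    results = []
    previous = None
    for k in k_values:
        best = min(within_ss(values, k, seed) for seed in range(runs))
        gain = None if previous is None else previous - best
        results.append((k, best, gain))
        previous = best
    return results


# Hypothetical annual spend values for twelve customers.
spend = [120, 135, 150, 900, 950, 1010, 4000, 4200, 4100, 160, 980, 4050]
for k, best, gain in iterate_clusters(spend, range(1, 6)):
    print(k, round(best, 1), None if gain is None else round(gain, 1))
```

A large drop followed by small ones suggests that adding further centroids brings little extra gain.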
For example, for four clusters, the red line has a very low value; this is because the difference between the sum of the squared differences with three clusters and with four clusters is very small. In the following screenshot, the peak in the red line suggests that six clusters could be a good choice. This is because there is an important drop in the Sum of WithinSS value at this point:

To finish my model, I only need to set the Number of clusters to 6, uncheck the Re-Scale checkbox, and click on the Execute button:

Finally, Rattle returns the six centroids of my clusters:

Now we have the six centroids and we want Rattle to associate each observation with a centroid. Go to the Evaluate tab, select the KMeans option, select the Training dataset, mark All in the report type, and click on the Execute button as shown in the following screenshot. This process will generate a CSV file with the original dataset and a new column called kmeans. The content of this attribute is a label (a number) representing the cluster associated with the observation (customer), as shown in the following screenshot:

After clicking on the Execute button, you will need to choose a folder to save the resulting file to and will have to type in a filename. The generated data inside the CSV file will look similar to the following screenshot:

In the previous screenshot, you can see ten lines of the resulting file; note that the last column is kmeans.

Preparing the data in Qlik Sense

Our objective is to create the data model, but using the new CSV file with the kmeans column. We're going to update our application by replacing the customer data file with this new data file. Save the new file in the same folder as the original file, open the Qlik Sense application, and go to Data load editor. There are two differences between the original file and this one.
In the original file, we added a line to create a customer identifier called Customer_ID, and in this second file we have this field in the dataset. The second difference is that in this new file we have the kmeans column. From Data load editor, go to the Wholesale customer data sheet, modify line 2, and add line 3. In line 2, we just load the content of Customer_ID, and in line 3, we load the content of the kmeans field and rename it to Cluster, as shown in the following screenshot. Finally, update the name of the file to be the new one and click on the Load data button: When the data load process finishes, open the data model viewer to check your data model, as shown here: Note that you have the same data model with a new field called Cluster. Creating a customer segmentation sheet in Qlik Sense Now we can add a sheet to the application. We'll add three charts to see our clusters and how our customers are distributed in our clusters. The first chart will describe the buying behavior of each cluster, as shown here: The second chart will show all customers distributed in a scatter plot, and in the last chart we'll see the number of customers that belong to each cluster, as shown here: I'll start with the chart to the bottom-right; it's a bar chart with Cluster as the dimension and Count([Customer_ID]) as the measure. This simple bar chart has something special – colors. Each customer's cluster has a special color code that we use in all charts. In this way, cluster 5 is blue in the three charts. To obtain this effect, we use this expression to define the color as color(fieldindex('Cluster', Cluster)), which is shown in the following screenshot: You can find this color trick and more in this interesting blog by Rob Wunderlich – http://qlikviewcookbook.com/. My second chart is the one at the top. I copied the previous chart and pasted it onto a free place. 
I kept the dimension but I changed the measure by using six new measures: Avg([Detergents_Paper]), Avg([Delicassen]), Avg([Fresh]), Avg([Frozen]), Avg([Grocery]), and Avg([Milk]).

I placed my last chart at the bottom-left. I used a scatter plot to represent all of my 440 customers. I wanted to show the money spent by each customer on food and detergents, and its cluster. I used the y axis to show the money spent on detergents and the x axis for the money spent on food. Finally, I used colors to highlight the cluster. The dimension is Customer_ID and the measures are Delicassen+Fresh+Frozen+Grocery+Milk (or Food) and [Detergents_Paper]. As the final step, I reused the color expression from the earlier charts.

Now our first Qlik Sense application has two sheets – the original one is 100 percent Qlik Sense and helps us to understand our customers, channels, and regions. This new sheet uses clustering to give us a different point of view; this second sheet groups the customers by their similar buying behavior. All this information is useful to deliver better campaigns to our customers. Cluster 5 is our least profitable cluster, but is the biggest one with 227 customers. The main difference between cluster 5 and cluster 2 is the amount of money spent on fresh products. Can we deliver any offer to customers in cluster 5 to try to sell more fresh products? Select retail customers and ask yourself, who are our best retail customers? To which cluster do they belong? Are they buying all our product categories?

Hierarchical clustering

Hierarchical clustering tries to group objects based on their similarity. To explain how this algorithm works, we're going to start with seven points (or observations) lying in a straight line:

We start by calculating the distance between each pair of points. I'll come back later to the term distance; in this example, distance is the difference between two positions in the line.
The points D and E are the ones with the smallest distance in between, so we group them in a cluster, as shown in this diagram:

Now, we substitute point D and point E with their mean (red point) and we look for the two points with the next smallest distance in between. In this second iteration, the closest points are B and C, as shown in this diagram:

We continue iterating until we've grouped all observations in the dataset, as shown here:

Note that, in this algorithm, we can decide on the number of clusters after running the algorithm. If we divide the dataset into two clusters, the first cluster is point G and the second cluster is A, B, C, D, E, and F. This gives the analyst the opportunity to see the big picture before deciding on the number of clusters. The lowest level of clustering is a trivial one; in this example, seven clusters with one point in each one.

The chart I've created while explaining the algorithm is a basic form of a dendrogram. The dendrogram is a tree diagram used in Rattle and in other tools to illustrate the layout of the clusters produced by hierarchical clustering. In the following screenshot, we can see the dendrogram created by Rattle for the wholesale customer dataset. In Rattle's dendrogram, the y axis represents all observations or customers in the dataset, and the x axis represents the distance between the clusters:

Association analysis

Association rules or association analysis is also an important topic in data mining. This is an unsupervised method, so we start with an unlabeled dataset. An unlabeled dataset is a dataset without a variable that gives us the right answer. Association analysis attempts to find relationships between different entities. The classic example of association rules is market basket analysis. This means using a database of transactions in a supermarket to find items that are bought together. For example, a person who buys potatoes and burgers usually buys beer.
This insight could be used to optimize the supermarket layout. Online stores are also a good example of association analysis. They usually suggest a new item to you based on the items you have bought. They analyze online transactions to find patterns in buyers' behavior. These algorithms assume all variables are categorical; they perform poorly with numeric variables. Association methods need a lot of time to complete; they use a lot of CPU and memory. Remember that Rattle runs on R and the R engine loads all data into RAM.

Suppose we have a dataset such as the following:

Our objective is to discover items that are purchased together. We'll create rules and we'll represent these rules like this:

Chicken, Potatoes → Clothes

This rule means that when a customer buys Chicken and Potatoes, they tend to buy Clothes. As we'll see, the output of the model will be a set of rules. We need a way to evaluate the quality or interest of a rule. There are different measures, but we'll use only a few of them. Rattle provides three measures: Support, Confidence, and Lift.

Support indicates how often the rule appears in the whole dataset. In our dataset, the rule Chicken, Potatoes → Clothes has a support of 42.86 percent (3 occurrences / 7 transactions).

Confidence measures how strong rules or associations are between items. In this dataset, the rule Chicken, Potatoes → Clothes has a confidence of 1. The items Chicken and Potatoes appear together three times in the dataset and the items Chicken, Potatoes, and Clothes appear together three times in the dataset; and 3/3 = 1. A confidence close to 1 indicates a strong association.

In the following screenshot, I've highlighted the options on the Associate tab we have to choose from before executing an association method in Rattle:

The first option is the Baskets checkbox. Depending on the kind of input data, we'll decide whether or not to check this option.
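Returning briefly to the Support and Confidence measures defined above, here is a small sketch of how they can be computed by hand. The seven transactions are hypothetical (the article's own table is not reproduced here); they were chosen so that Chicken and Potatoes appear together in three of seven transactions, always with Clothes. The lift formula used, confidence divided by the support of the right-hand side, is the standard definition and is an addition of this sketch, since the article names Lift without defining it.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the whole rule divided by the support of its left-hand side."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence relative to how common the right-hand side is on its own."""
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))


# Hypothetical baskets, one set of items per transaction.
transactions = [
    {"Chicken", "Potatoes", "Clothes"},
    {"Chicken", "Potatoes", "Clothes", "Beer"},
    {"Chicken", "Potatoes", "Clothes", "Milk"},
    {"Beer", "Chips"},
    {"Milk", "Clothes"},
    {"Chicken", "Beer"},
    {"Potatoes", "Milk"},
]

lhs, rhs = {"Chicken", "Potatoes"}, {"Clothes"}
rule_support = support(transactions, lhs | rhs)      # 3 of 7 transactions
rule_confidence = confidence(transactions, lhs, rhs)  # 3 of the 3 lhs baskets
rule_lift = lift(transactions, lhs, rhs)
print(rule_support, rule_confidence, rule_lift)
```

A lift above 1 means the right-hand side shows up more often with the left-hand side than it does in the dataset overall.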
If the option is checked, such as in the preceding screenshot, Rattle needs an identification variable and a target variable. After this example, we'll try another example without this option.

The second option is the minimum Support value; by default, it is set to 0.1. Rattle will not return rules with a lower Support value than the one you have set in this text box. If you choose a higher value, Rattle will only return rules that appear many times in your dataset. If you choose a lower value, Rattle will also return rules that appear in your dataset only a few times. Usually, if you set a high value for Support, the system will return only the obvious relationships. I suggest you start with a high Support value and execute the method many times, with a lower value in each execution. In this way, in each execution, new rules will appear that you can analyze.

The third parameter you have to set is Confidence. This parameter tells you how strong the rule is.

Finally, the length is the number of items that a rule contains. A rule like Beer → Chips has a length of two. The default option for Min Length is 2. If you set this variable to 2, Rattle will return all rules with two or more items.

After executing the model, you can see the rules created by Rattle by clicking on the Show Rules button, as illustrated here:

Rattle provides a very simple dataset to test the association rules in a file called dvdtrans.csv. Test the dataset to learn about association rules.

Further learning

In this article, we introduced supervised and unsupervised learning, the two main subgroups of machine learning algorithms; if you want to learn more about machine learning, I suggest you complete a MOOC called Machine Learning at Coursera: https://www.coursera.org/learn/machine-learning

The acronym MOOC stands for Massive Open Online Course; these are courses open to participation via the Internet. These courses are generally free.
Coursera is one of the leading platforms for MOOC courses. Machine Learning is a great course designed and taught by Andrew Ng, Associate Professor at Stanford University; Chief Scientist at Baidu; and Chairman and Co-founder at Coursera. This course is really interesting. A very interesting book is Machine Learning with R by Brett Lantz, Packt Publishing. Summary In this article, we were introduced to machine learning, and supervised and unsupervised methods. We focused on unsupervised methods and covered centroid-based clustering, hierarchical clustering, and association rules. We used a simple dataset, but we saw how a clustering algorithm can complement a 100 percent Qlik Sense approach by adding more information. Resources for Article: Further resources on this subject: Qlik Sense's Vision [article] Securing QlikView Documents [article] Conozca QlikView [article]

Packt
09 Jul 2015
7 min read

Essentials of VMware vSphere

In this article by Puthiyavan Udayakumar, author of the book VMware vSphere Design Essentials, we will cover the following topics: the essentials of designing VMware vSphere, the PPP framework, and the challenges and encounters faced on virtual infrastructure.

Let's get started with understanding the essentials of designing VMware vSphere. Designing means assembling and integrating VMware vSphere infrastructure components to form the baseline for a virtualized datacenter. It has the following benefits:

Reduces power consumption
Decreases the datacenter footprint and helps towards server consolidation
Faster server provisioning
On-demand QA lab environments
Decreases hardware vendor dependency
Aids the move to the cloud
Greater savings and affordability
Superior security and High Availability

Designing VMware vSphere

Architecture design principles are usually developed by the VMware architect in concurrence with the enterprise CIO, Infrastructure Architecture Board, and other key business stakeholders. From my experience, I would always urge you to have frequent meetings to capture as many functional requirements as possible. This will create a win-win situation for you and the requestor and show you how to get things done. Please follow your own approach, if it works. Architecture design principles should be derived from the overall IT principles specific to the customer's demands, if they exist. If not, they should be selected to ensure that IT strategies are positioned in line with business approaches.
In a nutshell, the architect should aim to form effective architecture principles that fulfill the infrastructure demands. The following high-level principles should be followed across any design: design mission and plans, design strategic initiatives, and external influencing factors.

When you release a design to the customer, keep in mind that the design must have the following qualities:

Understandable and robust
Complete and consistent
Stable and capable of accepting continuous requirement-based changes
Rational and controlled technical diversity

Without the preceding qualities, I wouldn't recommend that you release your design to anyone, even for peer review. For every design, irrespective of the product that you are about to design, try the following approach; it should work well, but if required I would recommend you make changes to the approach. The following approach is called PPP, which focuses on people's requirements, the product's capacity, and the process that helps to bridge the gap between the product's capacity and people's requirements:

The preceding diagram illustrates three entities that should be considered while designing VMware vSphere infrastructure. Please keep in mind that your design is just a product designed by a process that is based on people's needs. In the end, using this unified framework will help you to get rid of any known risks and their implications. Functional requirements should be meaningful; while designing, please make sure there is a meaning to your design. Selecting VMware vSphere over its competitors should not be a random pick; you should always list the benefits of VMware vSphere.
Some of them are as follows:

Server consolidation and easy hardware changes
Dynamic provisioning of resources to your compute node
Templates, snapshots, vMotion, DRS, DPM, High Availability, fault tolerance, auto monitoring, and solutions for warnings and alerts
Virtual Desktop Infrastructure (VDI), building a disaster recovery site, fast deployments, and decommissions

The PPP framework

Let's explore the components that integrate to form the PPP framework. Always keep in mind that the design should consist of people, processes, and products that meet the unified functional requirements and performance benchmark. Always expect the unexpected. Without these metrics, your design is incomplete; PPP always retains its own decision metrics. What does it do, who does it, and how is it done? We will see the answers in the following diagrams:

The PPP framework helps you to get started with requirements gathering, design vision, business architecture, infrastructure architecture, opportunities and solutions, migration planning, setting the tone for implementation, and design governance. The following table illustrates the essentials of the three-dimensional approach and the basic questions that need to be answered before you start designing or documenting a design, which will in turn help you to understand the real requirements for a specific design:

Phase: Product
Description: Results of what?
Key components:
In what hardware will the VM reside?
What kind of CPU is required?
What is the quantity of CPU, RAM, and storage per host/VM?
What kind of storage is required?
What kind of network is required?
What are the standard applications that need to be rolled out?
What kind of power and cooling are required?
How much rack and floor space is demanded?

Phase: People
Description: Results of who?
Key components:
Who is responsible for infrastructure provisioning?
Who manages the data center and supplies the power?
Who is responsible for implementation of the hardware and software patches?
Who is responsible for storage and backup?
Who is responsible for security and hardware support?

Phase: Process
Description: Results of how?
Key components:
How should we manage the virtual infrastructure?
How should we manage hosted VMs?
How should we provision VMs on demand?
How should a DR site be active during a primary site failure?
How should we provision storage and backup?
How should we take snapshots of VMs?
How should we monitor and perform periodic health checks?

Before we start to apply the PPP framework to VMware vSphere, we will discuss the list of challenges and encounters faced on the virtual infrastructure.

List of challenges and encounters faced on the virtual infrastructure

In this section, we will see a list of challenges and encounters faced with virtual infrastructure, which arise for the simple reason that we fail to capture the functional and non-functional demands of business users, or do not understand the fit-for-purpose concept:

Resource estimate misfire: If you underestimate the amount of memory required up-front, you might have to change the number of VMs you attempt to run on the VMware ESXi host hardware.
Resource unavailability: Without capacity management and configuration management, you cannot create dozens or hundreds of VMs on a single host. Some of the VMs could consume all resources, leaving nothing for the other VMs.
High utilization: An army of VMs can also throw workflows off-balance due to the complexities they can bring to provisioning and operational tasks.
Business continuity: Unlike a PC environment, VMs cannot be backed up to an actual hard drive. This is why 80 percent of IT professionals believe that virtualization backup is a great technological challenge.
Security: More than six out of ten IT professionals believe that data protection is a top technological challenge.
Backward compatibility: This is especially challenging for certain apps and systems that are dependent on legacy systems.
Monitoring performance: Unlike physical servers, you cannot monitor the performance of VMs using only common hardware resources such as CPU, memory, and storage.
Restriction of licensing: Before you install software on virtual machines, read the license agreements; they might not permit this, so by hosting the software on VMs you might violate the agreement.
Sizing the database and mailbox: Proper sizing of databases and mailboxes is critical to the organization's communication systems and applications.
Poor design of storage and network: A poor storage or networking design, resulting from a failure to properly involve the required teams within an organization, is a sure-fire way to ensure that the design isn't successful.

Summary

In this article, we gave a brief introduction to the essentials of designing VMware vSphere, focusing on the PPP framework. We also looked at the challenges and encounters faced on the virtual infrastructure.

Resources for Article: Further resources on this subject: Creating and Managing VMFS Datastores [article] Networking Performance Design [article] The Design Documentation [article]

Packt
09 Jul 2015
13 min read

Responsive Web Design with WordPress

Welcome to the world of Responsive Web Design! This article is written by Dejan Markovic, author of the book WordPress Responsive Theme Design, and it will introduce you to Responsive Web Design and its concepts and techniques. It will also present crisp notes from WordPress Responsive Theme Design.

Responsive web design (RWD) is a web design approach aimed at crafting sites to provide an optimal viewing experience, meaning easy reading and navigation with a minimum of resizing, panning, and scrolling, across a wide range of devices (from mobile phones to desktop computer monitors). Reference: http://en.wikipedia.org/wiki/Responsive_web_design. To put it simply, responsive web design means that a website should adapt to the screen size of the device it is being viewed on.

When I began my web development journey in 2002, we didn't have to consider as many factors as we do today. We just had to create the website for a 17-inch screen (which was the standard at that time), and that was it. Yes, we also had to consider 15, 19, and 21-inch monitors, but since the 17-inch screen was the standard, that was the target screen size for us. In pixels, these screens were usually 800 or 1024 wide. We also had to consider fewer browsers (Internet Explorer, Netscape, and Opera) and the styling for print, and that was it.

Since then, a lot of things have changed, and today, in 2015, for a website design, we have to consider multiple factors, such as:

A lot of different web browsers (Internet Explorer, Firefox, Opera, Chrome, and Safari)
A number of different operating systems (Windows (XP, 7, and 8), Mac OS X, Linux, Unix, iOS, Android, and Windows Phone)
Device screen sizes (desktop, mobile, and tablet)
Whether the content is accessible and readable with screen readers
How the content will look when it's printed

Today, creating a different design for each of these factors and devices would take years.
This is where responsive web design comes to the rescue.

The concepts of RWD

I have to point out that the mobile environment is becoming a more important factor than the desktop environment. Mobile browsing is overtaking desktop-based access, which makes the mobile environment a very important factor to consider when developing a website. Simply put, the main point of RWD is that the layout changes based on the size and capabilities of the device it's being viewed on. The concepts of RWD that we will learn next are Viewport, scaling, and screen density.

Controlling Viewport

On the desktop, Viewport is the size of the browser window. For example, when we resize the browser window, we are actually changing the Viewport size. On mobile devices, the Viewport size is independent of the device screen size. For example, Viewport is 850 px for mobile Opera and 980 px for mobile Safari, and the screen size for iPhone is 320 px. If we compare the Viewport size of 980 px and the screen size of an iPhone of 320 px, we can see that Viewport is bigger than the screen size. This is because mobile browsers function differently. They first load the page into Viewport, and then they resize it to the device's screen size. This is why we are able to see the whole page on the mobile device. If the mobile browsers had Viewport the same as the screen size (320 px), we would be able to see only a part of the page on the mobile device. In the following screenshot, we can see the table with the list of Viewport sizes for some iPhone models:

We can control Viewport with CSS:

    @viewport { width: device-width; }

Or, we can control it with the meta tag:

    <meta name="viewport" content="width=device-width">

In the preceding code, we are matching the Viewport width with the device width.
Because the Viewport meta tag approach is more widely adopted (it was first used on iOS, and the @viewport approach is not supported by some browsers), we will use the meta tag approach. We are setting the Viewport width in order to match our web content with our mobile content, as we want to make sure that our web content looks good on a mobile device as well. We can set Viewports in the code for each device separately, for example, 320 px for the iPhone. A better approach is to use content="width=device-width".

Scaling

Scaling is extremely important, as the initial scale controls the zoom level of the content when the page first loads. For example, if the initial scale is set to 3, the content will be loaded at three times the Viewport size, which means 3 times zoom. Here is the look of the screenshot for initial-scale=1 and initial-scale=3: As we can see from the preceding screenshots, on the initial scale 3 (three times zoom), the logo image takes up the bigger part of the screen. It is important to note that this is just the initial scale, which means that the user can zoom in and zoom out later, if they want to. Here is the example of the code with the initial scale:

    <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">

In this example, we have used the maximum-scale=1 option, which means that the user will not be able to zoom. We should avoid using the maximum-scale property because of accessibility issues. If we forbid zooming on our pages, users with visual impairments will not be able to see the content properly.

The screen density

As screen technology advances every year, or even faster than that, we have to consider the screen density aspect as well. Screen density is the number of pixels that are contained within a screen area. This means that if the screen density is higher, we can have more detail (in this case, more pixels) in the same area.
There are two measurements that are usually used for this, dots per inch (DPI) and pixels per inch (PPI). DPI means how many drops a printer can place in an inch of a space. PPI is the number of pixels we can have in one inch of the screen. If we go back to the preceding screenshot with the table where we are showing Viewports and densities and compare the values of iPhone 3G and iPhone 4S, we will see that the screen size stayed the same at 3.5 inch, Viewport stayed the same at 320 px, but the screen density has doubled, from 163 dpi to 326 dpi, which means that the screen resolution also has doubled from 320x480 to 640x960. The screen density is very relevant to RWD, as newer devices have bigger densities and we should do our best to cover as many densities as we can in order to provide a better experience for end users. Pixels' density matters more than the resolution or screen size, because more pixels is equal to sharper display: There are topics that need to be taken into consideration, such as hardware, reference pixels, and the device-pixel-ratio, too. Problems and solutions with the screen density Scalable vector graphics and CSS graphics will scale to the resolution. This is why I recommend using Font Awesome icons in your project. Font Awesome icons are available for download at: http://fortawesome.github.io/Font-Awesome/icons/. Font Icons is a font that is made up of symbols, icons, or pictograms (whatever you prefer to call them) that you can use in a webpage just like a font. They can be instantly customized with properties like: size, drop shadow, or anything you want can be done with the power of CSS. The real problem triggered by the change in the screen density is images, as for high-density screens, we should provide higher resolution images. 
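The density figures quoted in the iPhone comparison can be approximated from the resolution and the diagonal size. The following is a minimal Python sketch, not part of the original article; the function name pixels_per_inch is hypothetical, and it assumes square pixels and the nominal 3.5-inch diagonal, so the results land slightly above the published 163 and 326 values.

```python
import math


def pixels_per_inch(width_px, height_px, diagonal_in):
    """Approximate pixel density from the resolution and the diagonal size."""
    # Length of the screen diagonal measured in pixels.
    diagonal_px = math.hypot(width_px, height_px)
    return diagonal_px / diagonal_in


# Resolutions and the 3.5-inch size are the ones quoted in the text.
iphone_3g = pixels_per_inch(320, 480, 3.5)
iphone_4s = pixels_per_inch(640, 960, 3.5)

print(round(iphone_3g), round(iphone_4s))
# Doubling both dimensions at the same physical size doubles the density.
print(iphone_4s / iphone_3g)
```

The ratio between the two results is what matters for RWD: the same 320 px Viewport is drawn with twice as many physical pixels in each direction.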
There are several ways through which we can approach this problem:
By targeting high-density screens (providing high-resolution images to all screens)
By providing high-resolution images where appropriate (loading high-resolution images only on devices with high-resolution screens)
By not using high-resolution images
For beginner developers, I recommend the second approach: providing high-resolution images where appropriate.

Techniques in RWD

RWD consists of three coding techniques:
Media queries (adapt content to specific screen sizes)
Fluid grids (for flexible layouts)
Flexible images and media (that respond to changes to screen sizes)
More detailed information about RWD techniques by Ethan Marcotte, who coined the term Responsive Web Design, is available at http://alistapart.com/article/responsive-web-design.

Media queries

Media queries are CSS modules, or as some people like to say, just conditional statements, which tell the browser to use a specific set of styles depending on the size of the screen and other factors, such as print (specific styles for print). They have been around for a long time; I was already using different styles for print in 2002. If you wish to know more about media queries, refer to W3C Candidate Recommendation 8 July 2002 at http://www.w3.org/TR/2002/CR-css3-mediaqueries-20020708/. Here is an example of a media query declaration:

    @media only screen and (min-width: 500px) { body { font-family: sans-serif; } }

Let's explain the preceding code: The @media code means that it is a media type declaration. The screen and part of the query is an expression or condition (in this case, it means only screen and no print).
The following conditional statement means that every screen at least 500 px wide will have the font family of sans serif: (min-width:500px) { font-family: sans-serif; } Here is another example of a media query declaration:

    @media only screen and (min-width: 500px), screen and (orientation: portrait) { body { font-family: sans-serif; } }

In this case, we have two statements, and if either of them is true, the entire declaration is applied (the styles apply to any screen that is at least 500 px wide or is in the portrait orientation). The only keyword hides the styles from older browsers. As some older browsers don't support media queries, I recommend using a respond.js script, which will "patch" support for them. Polyfill (or polyfiller) is code that provides features that are not built or supported by some web browsers. For example, a number of HTML5 features are not supported by older versions of IE (older than 8 or 9), but these features can be used if a polyfill is installed on the web page. This means that if the developer wants to use these features, he/she can just include that polyfill library and these features will work in older browsers.

Breakpoints

A breakpoint is the point at which the layout switches from one layout to another when some condition is fulfilled, for example, when the screen has been resized. Almost all responsive designs cover the changes of the screen between desktops, tablets, and smartphones. Here is an example with comments inside:

    @media only screen and (max-width: 480px) {
      /* mobile styles: up to 480px */
    }

The media query in the preceding code will only be used if the width of the screen is 480 px or less.

    @media only screen and (min-width: 481px) and (max-width: 768px) {
      /* tablet styles: between 481px and 768px */
    }

The media query in the preceding code will only be used if the width of the screen is between 481 px and 768 px.
    @media only screen and (min-width: 769px) {
      /* desktop styles: from 769px and up */
    }

The media query in the preceding code will only be used when the width of the screen is 769 px or more. The minimum width value in desktop styles is 1 pixel over the maximum width value in tablet styles, and the same difference is there between values from tablet and mobile styles. We are doing this in order to avoid overlapping, as that could cause problems with our styles. There is also an approach to set the maximum width and minimum width with em values. Expressing the maximum width in em means that the breakpoint is set relative to the default font size. If the default font size is 16 px (which is the usual size), the maximum width for mobile styles would be 480/16=30. Why do we use em values? With pixel sizes, everything is fixed; for example, h1 is 19 px (or roughly 1.2 em of the default size of 16 px), and that's it. With em sizes, everything is relative, so if we change the default value in the browser from, for example, 16 px to 18 px, everything relative to that will change. Therefore, all h1 values will change from about 19 px to about 22 px and make our layout "zoomable". Here is the example with sizes changed to em (481/16=30.0625 and 769/16=48.0625, which keeps the ranges from overlapping):

    @media only screen and (max-width: 30em) {
      /* mobile styles: up to 480px */
    }

    @media only screen and (min-width: 30.0625em) and (max-width: 48em) {
      /* tablet styles: between 481px and 768px */
    }

    @media only screen and (min-width: 48.0625em) {
      /* desktop styles: from 769px and up */
    }

Fluid grids

The major point in RWD is that the content should adapt to any screen it's viewed on. One of the best solutions to do this is to use fluid layouts where our content can be resized on each breakpoint. In fluid grids, we define a maximum layout size for the design. The grid is divided into a specific number of columns to keep the layout clean and easy to handle. Then we design each element with proportional widths and heights instead of pixel based dimensions.
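The proportional sizing described here is commonly computed as target ÷ context: the element's designed pixel width divided by its container's designed pixel width. A minimal Python sketch of that arithmetic follows; the function name fluid_width and the 300 px and 960 px figures are illustrative assumptions, not values from the article.

```python
def fluid_width(target_px, context_px):
    """Return an element's width as a percentage of its container's width."""
    if context_px <= 0:
        raise ValueError("context width must be positive")
    return target_px / context_px * 100


# A hypothetical 300 px sidebar designed inside a 960 px layout.
sidebar = fluid_width(300, 960)
print(f"width: {sidebar:.2f}%")
```

The resulting percentage would replace the fixed pixel width in the stylesheet, so the element keeps its share of the container at any screen size.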
So whenever the device or screen size is changed, elements will adjust their widths and heights by the specified proportions to its parent container. Reference: http://www.1stwebdesigner.com/tutorials/fluid-grids-in-responsive-design/. To make the grid flexible (or elastic), we can use the % points, or we can use the em values, whichever suits us better. We can make our own fluid grids, or we can use grid frameworks. As there are so many frameworks available, I would recommend that you use the existing framework rather than building your own. Grid frameworks could use a single grid that covers various screen sizes, or we can have multiple grids for each of the break points or screen size categories, such as mobiles, tablets, and desktops. Some of the notable frameworks are Twitter's Bootstrap, Foundation, and SemanticUI. I prefer Twitter's Bootstrap, as it really helps me speed up the process and it is the most used framework currently. Flexible images and media Last but not the least important, are images and media (videos). The problem with them is that they are elements that come with fixed sizes. There are several approaches to fix this: Replacing dimensions with percentage values Using maximum widths Using background images only for some cases, as these are not good for accessibility Using some libraries, such as Scott Jehl's picturefill (https://github.com/scottjehl/picturefill) Taking out the width and height parameters from the image tag and dealing with dimensions in CSS Summary In this article, you learned about the RWD concepts such as: Viewport, scaling and the screen density. Also, we have covered the RWD techniques: media queries, fluid grids, and flexible media. Resources for Article: Further resources on this subject: Deployment Preparations [article] Why Meteor Rocks! [article] Clustering and Other Unsupervised Learning Methods [article]

Packt
09 Jul 2015
14 min read

The Essentials of Working with Python Collections

In this article by Steven F. Lott, the author of the book Python Essentials, we'll look at the break and continue statements; these modify a for or while loop to allow skipping items or exiting before the loop has processed all items. This is a fundamental change in the semantics of a collection-processing statement. (For more resources related to this topic, see here.) Processing collections with the for statement The for statement is an extremely versatile way to process every item in a collection. We do this by defining a target variable, a source of items, and a suite of statements. The for statement will iterate through the source of items, assigning each item to the target variable, and also execute the suite of statements. All of the collections in Python provide the necessary methods, which means that we can use anything as the source of items in a for statement. Here's some sample data that we'll work with. This is part of Mike Keith's poem, Near a Raven. We'll remove the punctuation to make the text easier to work with: >>> text = '''Poe, E. ...     Near a Raven ... ... Midnights so dreary, tired and weary.''' >>> text = text.replace(",","").replace(".","").lower() This will put the original text, with uppercase and lowercase and punctuation into the text variable. When we use text.split(), we get a sequence of individual words. The for loop can iterate through this sequence of words so that we can process each one. The syntax looks like this: >>> cadaeic= {} >>> for word in text.split(): ...     cadaeic[word]= len(word) We've created an empty dictionary, and assigned it to the cadaeic variable. The expression in the for loop, text.split(), will create a sequence of substrings. Each of these substrings will be assigned to the word variable. The for loop body—a single assignment statement—will be executed once for each value assigned to word. 
The resulting dictionary might look like this (irrespective of ordering):

{'raven': 5, 'midnights': 9, 'dreary': 6, 'e': 1, 'weary': 5, 'near': 4, 'a': 1, 'poe': 3, 'and': 3, 'so': 2, 'tired': 5}

There's no guaranteed order for mappings or sets. Your results may differ slightly. In addition to iterating over a sequence, we can also iterate over the keys in a dictionary.

>>> for word in sorted(cadaeic):
...     print(word, cadaeic[word])

When we use sorted() on a tuple or a list, an interim list is created with sorted items. When we apply sorted() to a mapping, the sorting applies to the keys of the mapping, creating a sequence of sorted keys. This loop will print a list in alphabetical order of the various pilish words used in this poem. Pilish is a subset of English where the word lengths are important: they're used as mnemonic aids. A for statement corresponds to the "for all" logical quantifier, $\forall$. At the end of a simple for loop we can assert that all items in the source collection have been processed. In order to build the "there exists" quantifier, $\exists$, we can either use the while statement, or the break statement inside the body of a for statement.

Using literal lists in a for statement

We can apply the for statement to a sequence of literal values. One of the most common ways to present literals is as a tuple. It might look like this:

for scheme in 'http', 'https', 'ftp':
    do_something(scheme)

This will assign three different values to the scheme variable. For each of those values, it will evaluate the do_something() function. From this, we can see that, strictly speaking, the () are not required to delimit a tuple object. If the sequence of values grows, however, and we need to span more than one physical line, we'll want to add (), making the tuple literal more explicit.

Using the range() and enumerate() functions

The range() object will provide a sequence of numbers, often used in a for loop. The range() object is iterable; it doesn't store its values the way a list does.
It produces its items only when required, rather than building them all in advance. If we use range() outside a for statement, we need to use a function like list(range(x)) or tuple(range(a,b)) to consume all of the generated values and create a new sequence object. The range() object has three commonly-used forms:
range(n) produces ascending numbers including 0 but not including n itself. This is a half-open interval. We could say that range(n) produces numbers, x, such that $0 \le x < n$. The expression list(range(5)) returns [0, 1, 2, 3, 4]. This produces n values including 0 and n - 1.
range(a,b) produces ascending numbers starting from a but not including b. The expression tuple(range(-1,3)) will return (-1, 0, 1, 2). This produces b - a values including a and b - 1.
range(x,y,z) produces ascending numbers in the sequence $x, x+z, x+2z, \ldots$, stopping before y. When z is positive and y > x, this produces (y-x+z-1)//z values.
We can use the range() object like this:

for n in range(1, 21):
    status = str(n)
    if n % 5 == 0: status += " fizz"
    if n % 7 == 0: status += " buzz"
    print(status)

In this example, we've used a range() object to produce values, n, such that $1 \le n < 21$. We use the range() object to generate the index values for all items in a list:

for n in range(len(some_list)):
    print(n, some_list[n])

We've used the range() function to generate values from 0 up to, but not including, the length of the sequence object named some_list. The for statement allows multiple target variables. The rules for multiple target variables are the same as for a multiple variable assignment statement: a sequence object will be decomposed and items assigned to each variable. Because of that, we can leverage the enumerate() function to iterate through a sequence and assign the index values at the same time. It looks like this:

for n, v in enumerate(some_list):
    print(n, v)

The enumerate() function iterates through the items in the source sequence and yields a series of two-tuples, each holding the index and the item.
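The range() forms and the enumerate() pairs described above can be checked with a few assertions. This is a small sketch rather than part of the original text; the list some_list holds made-up sample values.

```python
# The three commonly used forms of range().
assert list(range(5)) == [0, 1, 2, 3, 4]
assert tuple(range(-1, 3)) == (-1, 0, 1, 2)
assert list(range(1, 21, 5)) == [1, 6, 11, 16]

# len() reports how many values a range produces without building a list.
assert len(range(1, 21, 5)) == 4

some_list = ["poe", "near", "raven"]  # hypothetical sample data

# Index-based iteration and enumerate() visit the same (index, item) pairs.
by_index = [(n, some_list[n]) for n in range(len(some_list))]
by_enumerate = list(enumerate(some_list))
assert by_index == by_enumerate
print(by_enumerate)
```

The enumerate() version avoids indexing back into the list, which is why it is usually preferred when both the position and the item are needed.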
Since we've provided two variables, the two-tuple is decomposed and assigned to each variable. There are numerous use cases for this multiple-assignment for loop. We often have list-of-tuples data structures that can be handled very neatly with this multiple-assignment feature. Iterating with the while statement The while statement is a more general iteration than the for statement. We'll use a while loop in two situations. We'll use this in cases where we don't have a finite collection to impose an upper bound on the loop's iteration; we may suggest an upper bound in the while clause itself. We'll also use this when writing a "search" or "there exists" kind of loop; we aren't processing all items in a collection. A desktop application that accepts input from a user, for example, will often have a while loop. The application runs until the user decides to quit; there's no upper bound on the number of user interactions. For this, we generally use a while True: loop. Infinite iteration is recommended. If we want to write a character-mode user interface, we could do it like this: quit_received= False while not quit_received:    command= input("prompt> ")    quit_received= process(command) This will iterate until the quit_received variable is set to True. This will process indefinitely; there's no upper boundary on the number of iterations. This process() function might use some kind of command processing. This should include a statement like this: if command.lower().startswith("quit"): return True When the user enters "quit", the process() function will return True. This will be assigned to the quit_received variable. The while expression, not quit_received, will become False, and the loop ends. A "there exists" loop will iterate through a collection, stopping at the first item that meets certain criteria. This can look complex because we're forced to make two details of loop processing explicit. 
Here's an example of searching for the first value that meets a condition. This example assumes that we have a function, condition(), which will eventually be True for some number. Here's how we can use a while statement to locate the minimum for which this function is True: >>> n = 1 >>> while n != 101 and not condition(n): ...     n += 1 >>> assert n == 101 or condition(n) The while statement will terminate when n == 101 or the condition(n) is True. If this expression is False, we can advance the n variable to the next value in the sequence of values. Since we're iterating through the values in order from the smallest to the largest, we know that n will be the smallest value for which the condition() function is true. At the end of the while statement we have included a formal assertion that either n is 101 or the condition() function is True for the given value of n. Writing an assertion like this can help in design as well as debugging because it will often summarize the loop invariant condition. We can also write this kind of loop using the break statement in a for loop, something we'll look at in the next section. The continue and break statements The continue statement is helpful for skipping items without writing deeply-nested if statements. The effect of executing a continue statement is to skip the rest of the loop's suite. In a for loop, this means that the next item will be taken from the source iterable. In a while loop, this must be used carefully to avoid an otherwise infinite iteration. We might see file processing that looks like this: for line in some_file:    clean = line.strip()    if len(clean) == 0:        continue    data, _, _ = clean.partition("#")    data = data.rstrip()    if len(data) == 0:        continue    process(data) In this loop, we're relying on the way files act like sequences of individual lines. For each line in the file, we've stripped whitespace from the input line, and assigned the resulting string to the clean variable. 
If the length of this string is zero, the line was entirely whitespace, and we'll continue the loop with the next line. The continue statement skips the remaining statements in the body of the loop. We'll partition the line into three pieces: a portion in front of any "#", the "#" (if present), and the portion after any "#". We've assigned the "#" character and any text after the "#" character to the same easily-ignored variable, _, because we don't have any use for these two results of the partition() method. We can then strip any trailing whitespace from the string assigned to the data variable. If the resulting string has a length of zero, then the line is entirely filled with "#" and any trailing comment text. Since there's no useful data, we can continue the loop, ignoring this line of input. If the line passes the two if conditions, we can process the resulting data. By using the continue statement, we have avoided complex-looking, deeply-nested if statements. It's important to note that a continue statement must always be part of the suite inside an if statement, inside a for or while loop. The condition on that if statement becomes a filter condition that applies to the collection of data being processed. continue always applies to the innermost loop. Breaking early from a loop The break statement is a profound change in the semantics of the loop. An ordinary for statement can be summarized by "for all." We can comfortably say that "for all items in a collection, the suite of statements was processed." When we use a break statement, a loop is no longer summarized by "for all." We need to change our perspective to "there exists". A break statement asserts that at least one item in the collection matches the condition that leads to the execution of the break statement. 
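The shift from "for all" to "there exists" can be seen in a small search function. The sketch below is illustrative only: first_match and the sample words are hypothetical names and data, and the built-in any() and all() functions are used purely for comparison with the two quantifiers.

```python
def first_match(items, predicate):
    """Return the first item for which predicate is true, or None."""
    found = None
    for item in items:
        if predicate(item):
            found = item
            break  # "there exists": stop at the first matching item
    return found


words = ["near", "a", "raven", "midnights", "so", "dreary"]

# The loop stops early, so later words are never examined.
long_word = first_match(words, lambda w: len(w) > 5)
print(long_word)

# any() answers the "there exists" question; all() answers "for all".
assert any(len(w) > 5 for w in words)
assert all(len(w) >= 1 for w in words)
```

When nothing matches, the loop runs to completion and the function returns None, which is the case where the "for all" summary still holds.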
Here's a simple example of a break statement:

for n in range(1, 100):
    factors = []
    for x in range(1, n):
        if n % x == 0: factors.append(x)
    if sum(factors) == n:
        break

We've written a loop that is bound by $1 \le n < 100$. This loop includes a break statement, so it will not process all values of n. Instead, it will determine the smallest value of n for which n is equal to the sum of its factors. Since the loop doesn't examine all values, it shows that at least one such number exists within the given range. We've used a nested loop to determine the factors of the number n. This nested loop creates a sequence, factors, for all values of x in the range $1 \le x < n$, such that x is a factor of the number n. This inner loop doesn't have a break statement, so we are sure it examines all values in the given range. The least value for which this is true is the number six. It's important to note that a break statement must always be part of the suite inside an if statement inside a for or while loop. If the break isn't in an if suite, the loop will always terminate while processing the first item. The condition on that if statement becomes the "there exists" condition that summarizes the loop as a whole. Clearly, multiple if statements with multiple break statements mean that the overall loop can have a potentially confusing and difficult-to-summarize post-condition.

Using the else clause on a loop

Python's else clause can be used on a for or while statement as well as on an if statement. The else clause executes after the loop body if there was no break statement executed. To see this, here's a contrived example:

>>> for item in 1, 2, 3:
...     print(item)
...     if item == 2:
...         print("Found", item)
...         break
... else:
...     print("Found Nothing")

The for statement here will iterate over a short tuple of literal values. When a specific target value has been found, a message is printed. Then, the break statement will end the loop, avoiding the else clause.
When we run this, we'll see three lines of output, like this:

1
2
Found 2

The value of three isn't shown, nor is the "Found Nothing" message in the else clause. If we change the target value in the if statement from two to a value that won't be seen (for example, zero or four), then the output will change. If the break statement is not executed, then the else clause will be executed. The idea here is to allow us to write contrasting break and non-break suites of statements. An if statement suite that includes a break statement can do some processing in the suite before the break statement ends the loop. An else clause allows some processing at the end of the loop when none of the break-related suites of statements were executed.

Summary

In this article, we've looked at the for statement, which is the primary way we'll process the individual items in a collection. A simple for statement assures us that our processing has been done for all items in the collection. We've also looked at the general purpose while loop.

Packt
09 Jul 2015
15 min read

Architecting and coding high performance .NET applications

In this article by Antonio Esposito, author of Learning .NET High Performance Programming, we will learn about low-pass audio filtering implemented using .NET, and also learn about MVVM and XAML. Model-View-ViewModel and XAML The MVVM pattern is another descendant of the MVC pattern. Born from an extensive update to the MVP pattern, it is at the base of all eXtensible Application Markup Language (XAML) language-based frameworks, such as Windows presentation foundation (WPF), Silverlight, Windows Phone applications, and Store Apps (formerly known as Metro-style apps). MVVM is different from MVC, which is used by Microsoft in its main web development framework in that it is used for desktop or device class applications. The first and (still) the most powerful application framework using MVVM in Microsoft is WPF, a desktop class framework that can use the full .NET 4.5.3 environment. Future versions within Visual Studio 2015 will support built-in .NET 4.6. On the other hand, all other frameworks by Microsoft that use the XAML language supporting MVVM patterns are based on a smaller edition of .NET. This happens with Silverlight, Windows Store Apps, Universal Apps, or Windows Phone Apps. This is why Microsoft made the Portable Library project within Visual Studio, which allows us to create shared code bases compatible with all frameworks. While a Controller in MVC pattern is sort of a router for requests to catch any request and parsing input/output Models, the MVVM lies behind any View with a full two-way data binding that is always linked to a View's controls and together at Model's properties. Actually, multiple ViewModels may run the same View and many Views can use the same single/multiple instance of a given ViewModel. 
A simple MVC/MVVM design comparative We could assert that the experience offered by MVVM is like a film, while the experience offered by MVC is like photography, because while a Controller always makes one-shot elaborations regarding the application user requests in MVC, in MVVM, the ViewModel is definitely the view! Not only does a ViewModel lie behind a View, but we could also say that if a VM is a body, then a View is its dress. While the concrete View is the graphical representation, the ViewModel is the virtual view, the un-concrete view, but still the View. In MVC, the View contains the user state (the value of all items showed in the UI) until a GET/POST invocation is sent to the web server. Once sent, in the MVC framework, the View simply binds one-way reading data from a Model. In MVVM, behaviors, interaction logic, and user state actually live within the ViewModel. Moreover, it is again in the ViewModel that any access to the underlying Model, domain, and any persistence provider actually flows. Between a ViewModel and View, a data connection called data binding is established. This is a declarative association between a source and target property, such as Person.Name with TextBox.Text. Although it is possible to configure data binding by imperative coding (while declarative means decorating or setting the property association in XAML), in frameworks such as WPF and other XAML-based frameworks, this is usually avoided because of the more decoupled result made by the declarative choice. The most powerful technology feature provided by any XAML-based language is actually the data binding, other than the simpler one that was available in Windows Forms. XAML allows one-way binding (also reverted to the source) and two-way binding. Such data binding supports any source or target as a property from a Model or ViewModel or any other control's dependency property. 
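The idea of a binding between a source property and a target property, such as Person.Name with TextBox.Text, can be illustrated outside XAML. The following Python sketch is only an analogy: Observable, bind_one_way, Person, and TextBox are hypothetical stand-ins written for this example, not WPF or .NET APIs, and real data binding involves far more (dependency properties, converters, two-way updates).

```python
class Observable:
    """Minimal property-change notification for the sketch below."""

    def __init__(self):
        # Bypass our own __setattr__ so creating the list notifies nobody.
        object.__setattr__(self, "_handlers", [])

    def subscribe(self, handler):
        """Register a callable invoked as handler(property_name, new_value)."""
        self._handlers.append(handler)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        # Handlers run one after another, in subscription order.
        for handler in self._handlers:
            handler(name, value)


class Person(Observable):
    """Plays the role of the binding source (a Model or ViewModel)."""

    def __init__(self, name):
        super().__init__()
        self.name = name


class TextBox:
    """Plays the role of the binding target (a control in the View)."""

    def __init__(self):
        self.text = ""


def bind_one_way(source, source_prop, target, target_prop):
    """Copy source_prop to target_prop now and after every later change."""
    setattr(target, target_prop, getattr(source, source_prop))

    def on_change(name, value):
        if name == source_prop:
            setattr(target, target_prop, value)

    source.subscribe(on_change)


person = Person("Ada")
box = TextBox()
bind_one_way(person, "name", box, "text")

person.name = "Grace"  # the change flows to the target automatically
print(box.text)
```

The source never references the target directly; it only announces changes, which is the decoupling that declarative binding is meant to provide.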
This binding subsystem is so powerful in XAML-based languages that events are handled in specific objects named Command, and this can be data-bound to specific controls, such as buttons. In the .NET framework, an event is an implementation of the Observer pattern that lies within a delegate object, allowing a 1-N association between the only source of the event (the owner of the event) and more observers that can handle the event with some specific code. The only object that can raise the event is the owner itself. In XAML-based languages, a Command is an object that targets a specific event (in the meaning of something that can happen) that can be bound to different controls/classes, and all of those can register handlers or raise the signaling of all handlers. An MVVM performance map analysis Performance concerns Regarding performance, MVVM behaves very well in several scenarios in terms of data retrieval (latency-driven) and data entry (throughput- and scalability-driven). The ability to have an impressive abstraction of the view in the VM without having to rely on the pipelines of MVC (the actions) makes the programming very pleasurable and give the developer the choice to use different designs and optimization techniques. Data binding itself is done by implementing specific .NET interfaces that can be easily centralized. Talking about latency, it is slightly different from previous examples based on web request-response time, unavailable in MVVM. Theoretically speaking, in the design pattern of MVVM, there is no latency at all. In a concrete implementation within XAML-based languages, latency can refer to two different kinds of timings. During data binding, latency is the time between when a VM makes new data available and a View actually renders it. Instead, during a command execution, latency is the time between when a command is invoked and all relative handlers complete their execution. We usually use the first definition until differently specified. 
Although the nominal latency is near zero (a few milliseconds, because of the dictionary-based configuration of data binding), specific implementation concerns about latency do exist. In any Model or ViewModel, a data update is announced to the View through the INotifyPropertyChanged interface. This .NET interface causes the View to read the underlying data again. Because all notifications are raised through a single .NET event, this can easily become a bottleneck, since delegates and event handlers in the .NET world are invoked serially.

By contrast, when data flows from the View to the Model, the inverse binding is usually configured declaratively within the {Binding …} markup extension, which supports specifying the binding direction and the trigger timing (either the control's lost-focus CLR event or any change of the property value). The XAML data binding itself does not add any measurable time during its execution. However, as mentioned, such a binding may link multiple properties or the control's dependency properties together. Chaining this interaction logic can increase latency heavily, adding an annoying delay at the View level. The most notable case is the latency added by validation logic. It is even worse if the validation goes beyond formal checks, such as validating an ID or code against a database value.

Talking about scalability, the MVVM pattern does some work here, and we can make some concrete analysis concerning the XAML implementation. It is easy to say that scaling out is impossible because MVVM is a desktop-class layered architecture that cannot scale. Instead, we can say that in a multiuser scenario with multiple client systems connected in a 2-tier or 3-tier system architecture, simple MVVM and XAML-based frameworks will never act as bottlenecks.
The ability to use the full .NET stack in WPF gives us the chance to use all synchronization techniques available, in order to use a directly connected DBMS or middleware tier. Instead of scaling up by moving the application to an increased CPU clock system, the XAML-based application would benefit more from an increased CPU core count system. Obviously, to profit from many CPU cores, mastering parallel techniques is mandatory. About the resource usage, MVVM-powered architectures require only a simple POCO class as a Model and ViewModel. The only additional requirement is the implementation of the INotifyPropertyChanged interface that costs next to nothing. Talking about the pattern, unlike MVC, which has a specific elaboration workflow, MVVM does not offer this functionality. Multiple commands with multiple logic can process their respective logic (together with asynchronous invocation) with the local VM data or by going down to the persistence layer to grab missing information. We have all the choices here. Although MVVM does not cost anything in terms of graphical rendering, XAML-based frameworks make massive use of hardware-accelerated user controls. Talking about an extreme choice, Windows Forms with Graphics Device Interface (GDI)-based rendering require a lot less resources and can give a higher frame rate on highly updatable data. Thus, if a very high FPS is needed, the choice of still rendering a WPF area in GDI is available. For other XAML languages, the choice is not so easy to obtain. Obviously, this does not mean that XAML is slow in rendering with its DirectX based engine. Simply consider that WPF animations need a good Graphics Processing Unit (GPU), while a basic GDI animation will execute on any system, although it is obsolete. Talking about availability, MVVM-based architectures usually lead programmers to good programming. As MVC allows it, MVVM designs can be tested because of the great modularity. 
While a Controller uses a pipelined workflow to process any requests, a ViewModel is more flexible and can be tested with multiple initialization conditions. This makes it more powerful but also less predictable than a Controller, and hence is tricky to use. In terms of design, the Controller acts as a transaction script, while the ViewModel acts in a more realistic, object-oriented approach. Finally, yet importantly, throughput and efficiency are simply unaffected by MVVM-based architectures. However, because of the flexibility the solution gives to the developer, any interaction and business logic design may be used inside a ViewModel and their underlying Models. Therefore, any success or failure regarding those performance aspects are usually related to programmer work. In XAML frameworks, throughput is achieved by an intensive use of asynchronous and parallel programming assisted by a built-in thread synchronization subsystem, based on the Dispatcher class that deals with UI updates. Low-pass filtering for Audio Low-pass filtering has been available since 2008 in the native .NET code. NAudio is a powerful library helping any CLR programmer to create, manipulate, or analyze audio data in any format. Available through NuGet Package Manager, NAudio offers a simple and .NET-like programming framework, with specific classes and stream-reader for audio data files. Let's see how to apply the low-pass digital filter in a real audio uncompressed file in WAVE format. For this test, we will use the Windows start-up default sound file. 
The chart is still made in a legacy Windows Forms application with an empty Form1 file, as shown in the previous example:

private async void Form1_Load(object sender, EventArgs e)
{
    //stereo wave file channels
    var channels = await Task.Factory.StartNew(() =>
        {
            //the wave stream-like reader
            using (var reader = new WaveFileReader("startup.wav"))
            {
                var leftChannel = new List<float>();
                var rightChannel = new List<float>();

                //let's read all frames as normalized floats
                while (reader.Position < reader.Length)
                {
                    var frame = reader.ReadNextSampleFrame();
                    leftChannel.Add(frame[0]);
                    rightChannel.Add(frame[1]);
                }

                return new
                {
                    Left = leftChannel.ToArray(),
                    Right = rightChannel.ToArray(),
                };
            }
        });

    //make a low-pass digital filter on floating point data
    //at 200hz
    var leftLowpassTask = Task.Factory.StartNew(() => LowPass(channels.Left, 200).ToArray());
    var rightLowpassTask = Task.Factory.StartNew(() => LowPass(channels.Right, 200).ToArray());

    //this lets the two tasks work together in task-parallelism
    var leftChannelLP = await leftLowpassTask;
    var rightChannelLP = await rightLowpassTask;

    //create and databind a chart
    var chart1 = CreateChart();

    chart1.DataSource = Enumerable.Range(0, channels.Left.Length).Select(i => new
        {
            Index = i,
            Left = channels.Left[i],
            Right = channels.Right[i],
            LeftLP = leftChannelLP[i],
            RightLP = rightChannelLP[i],
        }).ToArray();

    chart1.DataBind();

    //add the chart to the form
    this.Controls.Add(chart1);
}

private static Chart CreateChart()
{
    //creates a chart
    //namespace System.Windows.Forms.DataVisualization.Charting
    var chart1 = new Chart();

    //shows chart in fullscreen
    chart1.Dock = DockStyle.Fill;

    //create a default area
    chart1.ChartAreas.Add(new ChartArea());

    //left and right channel series
    chart1.Series.Add(new Series
    {
        XValueMember = "Index",
        XValueType = ChartValueType.Auto,
        YValueMembers = "Left",
        ChartType = SeriesChartType.Line,
    });
    chart1.Series.Add(new Series
    {
        XValueMember = "Index",
        XValueType = ChartValueType.Auto,
        YValueMembers = "Right",
        ChartType = SeriesChartType.Line,
    });

    //left and right channel low-pass (bass) series
    chart1.Series.Add(new Series
    {
        XValueMember = "Index",
        XValueType = ChartValueType.Auto,
        YValueMembers = "LeftLP",
        ChartType = SeriesChartType.Line,
        BorderWidth = 2,
    });
    chart1.Series.Add(new Series
    {
        XValueMember = "Index",
        XValueType = ChartValueType.Auto,
        YValueMembers = "RightLP",
        ChartType = SeriesChartType.Line,
        BorderWidth = 2,
    });

    return chart1;
}

Let's see the graphical result: The Windows start-up sound waveform. In bold, the bass waveform with a low-pass filter at 200 Hz.

The usage of parallelism in elaborations such as this is mandatory. Audio elaboration is a canonical example of engineering data computation because it works on a huge dataset of floating-point values. A simple file, such as the preceding one, which contains less than 2 seconds of audio sampled at (only) 22,050 Hz, produces an array of more than 40,000 floating-point values per channel (stereo = 2 channels). To get an idea of how demanding audio processing is, note that an uncompressed CD-quality song of 4 minutes, sampled at 44,100 samples per second, yields 44,100 * 60 (seconds) * 4 (minutes) samples, that is, an array of more than 10 million floating-point items per channel. Because of the FFT intrinsic logic, any low-pass filtering run must run in a single thread.
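The same per-channel structure can be sketched in Python. This is not the article's FFT-based LowPass method, whose implementation is not shown in this excerpt; the sketch swaps in a simple one-pole recursive low-pass filter, and low_pass, rms, and the synthetic test tones are assumptions made for illustration. Each output sample depends on the previous one, so a single channel is filtered sequentially, while the two channels are handed to separate workers.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def low_pass(samples, cutoff_hz, sample_rate):
    """One-pole low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = []
    previous = 0.0
    for x in samples:
        previous = previous + alpha * (x - previous)
        out.append(previous)
    return out


def rms(samples):
    """Root mean square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


SAMPLE_RATE = 22050
n = SAMPLE_RATE  # one second of synthetic audio per channel

# Left: 50 Hz tone (below the cutoff); right: 5 kHz tone (above it).
left = [math.sin(2 * math.pi * 50 * i / SAMPLE_RATE) for i in range(n)]
right = [math.sin(2 * math.pi * 5000 * i / SAMPLE_RATE) for i in range(n)]

# One worker per channel; each channel is still filtered sequentially.
with ThreadPoolExecutor(max_workers=2) as pool:
    left_future = pool.submit(low_pass, left, 200, SAMPLE_RATE)
    right_future = pool.submit(low_pass, right, 200, SAMPLE_RATE)
    left_lp = left_future.result()
    right_lp = right_future.result()

# Sample count per channel for the 4-minute CD-quality example in the text.
cd_samples_per_channel = 44100 * 60 * 4
```

In CPython, threads will not actually speed up this pure-Python loop because of the global interpreter lock; a process pool would be the closer analogue of the .NET tasks, and the sketch only shows the structure. Whatever the runtime, the filter for a given channel remains a sequential computation.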
This means that the only optimization we can apply when running FFT-based low-pass filtering is parallelizing on a per-channel basis. In most cases, this choice can bring at most a 2X throughput improvement, regardless of the processor count of the underlying system.

Summary

In this article, we were introduced to applications of high-performance .NET programming. We learned how MVVM and XAML play their roles in .NET to create applications for various platforms, and we looked at their performance characteristics. Next, we saw how high-performance .NET applies to engineering through a practical example of low-pass audio filtering, which showed how versatile high-performance programming is in specific engineering applications.

Packt
08 Jul 2015
20 min read

Working with large data sources

In this article by Duncan M. McGreggor, author of the book Mastering matplotlib, we look at the use of NumPy in the world of matplotlib and big data, the problems with large data sources, and the possible solutions to these problems.

Most of the data that users feed into matplotlib when generating plots is from NumPy. NumPy is one of the fastest ways of processing numerical and array-based data in Python (if not the fastest), so this makes sense. However, by default, NumPy works on in-memory data. If the dataset that you want to plot is larger than the total RAM available on your system, performance is going to plummet.

In the following section, we're going to take a look at an example that illustrates this limitation. But first, let's get our notebook set up, as follows:

In [1]: import matplotlib
        matplotlib.use('nbagg')
        %matplotlib inline

Here are the modules that we are going to use:

In [2]: import glob, io, math, os
        import psutil
        import numpy as np
        import pandas as pd
        import tables as tb
        from scipy import interpolate
        from scipy.stats import burr, norm
        import matplotlib as mpl
        import matplotlib.pyplot as plt
        from IPython.display import Image

We'll use the custom style sheet that we created earlier, as follows:

In [3]: plt.style.use("../styles/superheroine-2.mplstyle")

An example problem

To keep things manageable for an in-memory example, we're going to limit our generated dataset to 100 million points by using one of SciPy's many statistical distributions, as follows:

In [4]: (c, d) = (10.8, 4.2)
        (mean, var, skew, kurt) = burr.stats(c, d, moments='mvsk')

The Burr distribution, also known as the Singh–Maddala distribution, is commonly used to model household income.
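Before generating the data, a quick back-of-the-envelope estimate shows what to expect in terms of size. This sketch assumes that each generated value is stored as a 64-bit float (8 bytes), which is NumPy's default floating-point type; the variable names are illustrative.

```python
n_points = 100_000_000
bytes_per_value = 8  # assumption: float64, NumPy's default float type

total_bytes = n_points * bytes_per_value
size_mib = total_bytes / 1024**2   # mebibytes, the unit `ls -h` reports as "M"
size_gib = total_bytes / 1024**3

print(f"{total_bytes:,} bytes = {size_mib:.0f} MiB = {size_gib:.2f} GiB")
```

The raw values alone therefore occupy roughly three-quarters of a gigabyte, before counting any temporary copies made while the array is being created.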
Next, we'll use the burr object's rvs method to generate a random population with our desired count, as follows:

In [5]: r = burr.rvs(c, d, size=100000000)

Creating 100 million data points in the last call took about 10 seconds on a moderately recent workstation, with the RAM usage peaking at about 2.25 GB (before the garbage collection kicked in). Let's make sure that it's the size we expect, as follows:

In [6]: len(r)
Out[6]: 100000000

If we save this to a file, it weighs in at about three-fourths of a gigabyte:

In [7]: r.tofile("../data/points.bin")

In [8]: ls -alh ../data/points.bin
        -rw-r--r-- 1 oubiwann staff 763M Mar 20 11:35 points.bin

This actually does fit in the memory on a machine with 8 GB of RAM, but generating much larger files tends to be problematic. We can reuse it multiple times though, to reach a size that is larger than what can fit in the system RAM. Before we do this, let's take a look at what we've got by generating a smooth curve for the probability distribution, as follows:

In [9]: x = np.linspace(burr.ppf(0.0001, c, d),
                        burr.ppf(0.9999, c, d), 100)
        y = burr.pdf(x, c, d)

In [10]: (figure, axes) = plt.subplots(figsize=(20, 10))
         axes.plot(x, y, linewidth=5, alpha=0.7)
         axes.hist(r, bins=100, density=True)
         plt.show()

The following plot is the result of the preceding code: Our plot of the Burr probability density function, along with the 100-bin histogram with a sample size of 100 million points, took about 7 seconds to render. This is due to the fact that NumPy handles most of the work, and we only displayed a limited number of visual elements. What would happen if we did try to plot all the 100 million points?
This can be checked by the following code: In [11]: (figure, axes) = plt.subplots()          axes.plot(r)          plt.show() formatters.py:239: FormatterWarning: Exception in image/png formatter: Allocated too many blocks After about 30 seconds of crunching, the preceding error was thrown—the Agg backend (a shared library) simply couldn't handle the number of artists required to render all the points. But for now, this case clarifies the point that we stated a while back—our first plot rendered relatively quickly because we were selective about the data we chose to present, given the large number of points with which we are working. However, let's say we have data from the files that are too large to fit into the memory. What do we do about this? Possible ways to address this include the following: Moving the data out of the memory and into the filesystem Moving the data off the filesystem and into the databases We will explore examples of these in the following section. Big data on the filesystem The first of the two proposed solutions for large datasets involves not burdening the system memory with data, but rather leaving it on the filesystem. There are several ways to accomplish this, but the following two methods in particular are the most common in the world of NumPy and matplotlib: NumPy's memmap function: This function creates memory-mapped files that are useful if you wish to access small segments of large files on the disk without having to read the whole file into the memory. PyTables: This is a package that is used to manage hierarchical datasets. It is built on the top of the HDF5 and NumPy libraries and is designed to efficiently and easily cope with extremely large amounts of data. We will examine each in turn. NumPy's memmap function Let's restart the IPython kernel by going to the IPython menu at the top of notebook page, selecting Kernel, and then clicking on Restart. When the dialog box pops up, click on Restart. 
Then, re-execute the first few lines of the notebook by importing the required libraries and getting our style sheet set up. Once the kernel is restarted, take a look at the RAM utilization on your system for a fresh Python process for the notebook: In [4]: Image("memory-before.png") Out[4]: The following screenshot shows the RAM utilization for a fresh Python process: Now, let's load the array data that we previously saved to disk and recheck the memory utilization, as follows: In [5]: data = np.fromfile("../data/points.bin")        data_shape = data.shape        data_len = len(data)        data_len Out[5]: 100000000 In [6]: Image("memory-after.png") Out[6]: The following screenshot shows the memory utilization after loading the array data: This took about five seconds to load, with the memory consumption equivalent to the file size of the data. This means that if we wanted to build some sample data that was too large to fit in the memory, we'd need about 11 of those files concatenated, as follows: In [7]: 8 * 1024 Out[7]: 8192 In [8]: filesize = 763        8192 / filesize Out[8]: 10.73656618610747 However, this is only if the entire memory was available. Let's see how much memory is available right now, as follows: In [9]: del data In [10]: psutil.virtual_memory().available / 1024**2 Out[10]: 2449.1796875 That's 2.5 GB. So, to overrun our RAM, we'll just need a fraction of the total. This is done in the following way: In [11]: 2449 / filesize Out[11]: 3.2096985583224114 The preceding output means that we only need four of our original files to create a file that won't fit in memory. However, in the following section, we will still use 11 files to ensure that data, if loaded into the memory, will be much larger than the memory. How do we create this large file for demonstration purposes (knowing that in a real-life situation, the data would already be created and potentially quite large)? 
We can try to use numpy.tile to create a file of the desired size (larger than memory), but this can make our system unusable for a significant period of time. Instead, let's use numpy.memmap, which will treat a file on the disk as an array, thus letting us work with data that is too large to fit into the memory. Let's load the data file again, but this time as a memory-mapped array, as follows:

In [12]: data = np.memmap(
             "../data/points.bin", mode="r", shape=data_shape)

The loading of the array to a memmap object was very quick (compared to the process of bringing the contents of the file into the memory), taking less than a second to complete. Now, let's create a new file to write the data to. The data in this file is meant to be larger than our total system memory if it were loaded into memory (on disk, it will be smaller):

In [13]: big_data_shape = (data_len * 11,)
         big_data = np.memmap(
             "../data/many-points.bin", dtype="uint8",
             mode="w+", shape=big_data_shape)

The preceding code creates a 1 GB file (1.1 billion one-byte uint8 elements), which is mapped to an array that has the shape we requested and just contains zeros:

In [14]: ls -alh ../data/many-points.bin
         -rw-r--r-- 1 oubiwann staff 1.0G Apr 2 11:35 many-points.bin

In [15]: big_data.shape
Out[15]: (1100000000,)

In [16]: big_data
Out[16]: memmap([0, 0, 0, ..., 0, 0, 0], dtype=uint8)

Now, let's fill the empty data structure with copies of the data we saved to the 763 MB file, as follows:

In [17]: for x in range(11):
             start = x * data_len
             end = (x * data_len) + data_len
             big_data[start:end] = data
         big_data
Out[17]: memmap([ 90, 71, 15, ..., 33, 244, 63], dtype=uint8)

If you check your system memory before and after, you will only see minimal changes, which confirms that we are not creating an 8 GB data structure in memory. Furthermore, the check itself only takes a few seconds.
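One detail is worth checking when adapting this technique: np.memmap interprets the bytes of the file according to its dtype argument, and its default dtype is uint8, so a file written from a float64 array would normally be mapped with dtype="float64" to get the original values back. The following small-scale sketch shows that variant. It uses a temporary directory, 1,000 values, and three copies instead of eleven; the paths and sizes are assumptions chosen to keep it quick.

```python
import os
import tempfile

import numpy as np

tmpdir = tempfile.mkdtemp()
small_path = os.path.join(tmpdir, "points.bin")
big_path = os.path.join(tmpdir, "many-points.bin")

# Write a small float64 array to disk as raw bytes, as ndarray.tofile does.
values = np.linspace(0.0, 1.0, 1000)
values.tofile(small_path)

# Map the file read-only with an explicit dtype; the shape is inferred
# from the file size when it is not given.
data = np.memmap(small_path, dtype="float64", mode="r")

# Create a larger file-backed array and fill it with copies, slice by slice.
copies = 3
big = np.memmap(big_path, dtype="float64", mode="w+",
                shape=(len(data) * copies,))
for i in range(copies):
    big[i * len(data):(i + 1) * len(data)] = data
big.flush()

big_file_bytes = os.path.getsize(big_path)
```

With float64 elements, the file on disk is eight times larger than a uint8 mapping of the same shape, which is what makes it realistic to exceed system RAM with the full-size data.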
Now, we can do some sanity checks on the resulting data and ensure that we have what we were trying to get, as follows:

In [18]: big_data_len = len(big_data)
         big_data_len
Out[18]: 1100000000

In [19]: data[100000000 - 1]
Out[19]: 63

In [20]: big_data[100000000 - 1]
Out[20]: 63

Attempting to get the next index from our original dataset will throw an error (as shown in the following code), since it doesn't have that index:

In [21]: data[100000000]
-----------------------------------------------------------
IndexError               Traceback (most recent call last)
...
IndexError: index 100000000 is out of bounds ...

But our new data does have that index, as shown in the following code:

In [22]: big_data[100000000]
Out[22]: 90

And then some:

In [23]: big_data[1100000000 - 1]
Out[23]: 63

We can also plot data from a memmapped array without a significant lag time. However, note that in the following code, we will create a histogram from 1.1 billion points of data, so the plotting won't be instantaneous:

In [24]: (figure, axes) = plt.subplots(figsize=(20, 10))
         axes.hist(big_data, bins=100)
         plt.show()

The following plot is the result of the preceding code: The plotting took about 40 seconds to generate. The odd shape of the histogram is due to the fact that, with our data file-hacking, we have radically changed the nature of our data since we've increased the sample size linearly without regard for the distribution. The purpose of this demonstration wasn't to preserve a sample distribution, but rather to show how one can work with large datasets. What we have seen is not too shabby. Thanks to NumPy, matplotlib can work with data that is too large for memory, even if it is a bit slow iterating over hundreds of millions of data points from the disk. Can matplotlib do better?

HDF5 and PyTables

A commonly used file format in the scientific computing community is Hierarchical Data Format (HDF).
HDF is a set of file formats (namely HDF4 and HDF5) that were originally developed at the National Center for Supercomputing Applications (NCSA), a unit of the University of Illinois at Urbana-Champaign, to store and organize large amounts of numerical data. The NCSA is a great source of technical innovation in the computing industry: a Telnet client, the first graphical web browser, a web server that evolved into the Apache HTTP server, and HDF, which is of particular interest to us, were all developed there. It is a little-known fact that NCSA's web browser code was the ancestor of both the Netscape web browser and a prototype of Internet Explorer that was provided to Microsoft by a third party.

HDF is supported by Python, R, Julia, Java, Octave, IDL, and MATLAB, to name a few. HDF5 offers significant improvements and useful simplifications over HDF4. It uses B-trees to index table objects and, as such, works well for write-once/read-many time series data. Common use cases span fields such as meteorological studies, biosciences, finance, and aviation. HDF5 files of multiterabyte sizes are common in these applications. They are typically constructed from the analyses of multiple HDF5 source files, thus providing a single (and often extensive) source of grouped data for a particular application.

The PyTables library is built on top of the HDF5 library and NumPy. As such, it not only provides access to one of the most widely used large data file formats in the scientific computing community, but also links data extracted from these files with the data types and objects provided by the fast Python numerical processing library. PyTables is also used in other projects. Pandas wraps PyTables, thus extending its convenient in-memory data structures, functions, and objects to large on-disk files. To use HDF data with Pandas, you'll want to create a pandas.HDFStore, read from the HDF data sources with pandas.read_hdf, or write to one with DataFrame.to_hdf.
Files that are too large to fit in the memory may be read and written by utilizing chunking techniques. Pandas does support disk-based DataFrame operations, but these are not very efficient due to the required assembly of columns of data upon reading back into the memory.

One project to keep an eye on under the PyData umbrella of projects is Blaze. It's an open wrapper and a utility framework that can be used when you wish to work with large datasets and generalize actions such as creation, access, updates, and migration. Blaze supports not only HDF, but also SQL, CSV, and JSON. The API usage between Pandas and Blaze is very similar, and it offers a nice tool for developers who need to support multiple backends.

In the following example, we will use PyTables directly to create an HDF5 file that is too large to fit in the memory (for a machine with 8 GB of RAM). We will follow these steps:

1. Create a series of CSV source data files that take up approximately 14 GB of disk space
2. Create an empty HDF5 file
3. Create a table in the HDF5 file and provide the schema metadata and compression options
4. Load the CSV source data into the HDF5 table
5. Query the new data source once the data has been migrated

Remember the temperature and precipitation data for St. Francis, in Kansas, USA, from a previous notebook? We are going to generate random data with similar columns for the purposes of the HDF5 example.
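To get a feel for the size of the dataset before generating it, the following sketch computes the row counts from the parameters used in the generation script that follows (200 countries, 1,000 towns, the years 1894 to 2013, and 12 months) and writes a tiny sample in the same column layout. It uses only the standard library: write_country_csv is a hypothetical helper, random.gauss stands in for scipy.stats.norm, and the row-by-row csv.writer approach is an alternative to building one large string in memory, not the author's method.

```python
import csv
import io
import random

countries, towns = 200, 1000
years = range(1894, 2014)   # 120 years
months = range(1, 13)       # 12 months

rows_per_country = towns * len(years) * len(months)
total_rows = countries * rows_per_country


def write_country_csv(out, country, town_count, years, months, rng):
    """Write one country's rows to a file-like object; return the row count."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["country", "town", "year", "month", "precip", "temp"])
    count = 0
    for town in range(town_count):
        for year in years:
            for month in months:
                # Two independent standard-normal draws per row.
                writer.writerow([country, town, year, month,
                                 rng.gauss(0, 1), rng.gauss(0, 1)])
                count += 1
    return count


# Tiny sample: 2 towns and 3 years, written to an in-memory buffer.
buffer = io.StringIO()
sample_rows = write_country_csv(buffer, 0, 2, range(1894, 1897), months,
                                random.Random(42))
sample_lines = buffer.getvalue().splitlines()
```

Each country file therefore holds 1,440,000 data rows, and the full set holds 288 million, which matches the row count that the HDF5 table reports later on.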
This data will be generated from a normal distribution, which will be used in the guise of the temperature and precipitation data for hundreds of thousands of fictitious towns across the globe for the last century, as follows:

In [25]: head = "country,town,year,month,precip,temp\n"
         row = "{},{},{},{},{},{}\n"
         filename = "../data/{}.csv"
         town_count = 1000
         (start_year, end_year) = (1894, 2014)
         (start_month, end_month) = (1, 13)
         sample_size = (1 + 2
                        * town_count * (end_year - start_year)
                        * (end_month - start_month))
         countries = range(200)
         towns = range(town_count)
         years = range(start_year, end_year)
         months = range(start_month, end_month)
         for country in countries:
             with open(filename.format(country), "w") as csvfile:
                 csvfile.write(head)
                 csvdata = ""
                 weather_data = norm.rvs(size=sample_size)
                 weather_index = 0
                 for town in towns:
                     for year in years:
                         for month in months:
                             csvdata += row.format(
                                 country, town, year, month,
                                 weather_data[weather_index],
                                 weather_data[weather_index + 1])
                             weather_index += 2
                 csvfile.write(csvdata)

Note that we generated a sample data population that was twice as large as the expected size in order to pull both the simulated temperature and precipitation data at the same time (from the same set). This will take about 30 minutes to run. When complete, you will see the following files:

In [26]: ls -rtm ../data/*.csv
         ../data/0.csv, ../data/1.csv, ../data/2.csv,
         ../data/3.csv, ../data/4.csv, ../data/5.csv,
         ...
         ../data/194.csv, ../data/195.csv, ../data/196.csv,
         ../data/197.csv, ../data/198.csv, ../data/199.csv

A quick look at just one of the files reveals the size of each, as follows:

In [27]: ls -lh ../data/0.csv
         -rw-r--r-- 1 oubiwann staff 72M Mar 21 19:02 ../data/0.csv

With each of the 200 files being 72 MB in size, we have data that takes up 14 GB of disk space, which exceeds the size of the RAM of the system in question. Furthermore, running queries against so much data in the .csv files isn't going to be very efficient. It's going to take a long time. So what are our options? Well, to read this data, HDF5 is a very good fit. In fact, it is designed for jobs like this. We will use PyTables to convert the .csv files to a single HDF5 file. We'll start by creating an empty table file, as follows:

In [28]: tb_name = "../data/weather.h5t"
         h5 = tb.open_file(tb_name, "w")
         h5
Out[28]: File(filename=../data/weather.h5t, title='', mode='w',
             root_uep='/', filters=Filters(
                 complevel=0, shuffle=False, fletcher32=False,
                 least_significant_digit=None))
         / (RootGroup) ''

Next, we'll provide some assistance to PyTables by indicating the data types of each column in our table, as follows:

In [29]: data_types = np.dtype(
             [("country", "<i8"),
              ("town", "<i8"),
              ("year", "<i8"),
              ("month", "<i8"),
              ("precip", "<f8"),
              ("temp", "<f8")])

Also, let's define a compression filter that can be used by PyTables when saving our data, as follows:

In [30]: filters = tb.Filters(complevel=5, complib='blosc')

Now, we can create a table inside our new HDF5 file, as follows:

In [31]: tab = h5.create_table(
             "/", "weather",
             description=data_types,
             filters=filters)

With that done, let's load each CSV file, read it in chunks so that we don't overload the memory, and then append it to our new HDF5 table, as follows:

In [32]: for filename in glob.glob("../data/*.csv"):
             it = pd.read_csv(filename, iterator=True, chunksize=10000)
             for chunk in it:
                 tab.append(chunk.to_records(index=False))
             tab.flush()

Depending on your machine, the entire process of loading the CSV files, reading them in chunks, and appending to a new HDF5 table can take anywhere from 5 to 10 minutes. However, what started out as a collection of .csv files that weigh in at 14 GB is now a single compressed 4.8 GB HDF5 file, as shown in the following code:

In [33]: h5.get_filesize()
Out[33]: 4758762819

Here's the metadata for the PyTables-wrapped HDF5 table after the data insertion:

In [34]: tab
Out[34]: /weather (Table(288000000,), shuffle, blosc(5)) ''
           description := {
           "country": Int64Col(shape=(), dflt=0, pos=0),
           "town": Int64Col(shape=(), dflt=0, pos=1),
           "year": Int64Col(shape=(), dflt=0, pos=2),
           "month": Int64Col(shape=(), dflt=0, pos=3),
           "precip": Float64Col(shape=(), dflt=0.0, pos=4),
           "temp": Float64Col(shape=(), dflt=0.0, pos=5)}
           byteorder := 'little'
           chunkshape := (1365,)

Now that we've created our file, let's use it.
Let's excerpt a few lines with an array slice, as follows: In [35]: tab[100000:100010] Out[35]: array([(0, 69, 1947, 5, -0.2328834718674, 0.06810312195695),          (0, 69, 1947, 6, 0.4724989007889, 1.9529216219569),          (0, 69, 1947, 7, -1.0757216683235, 1.0415374480545),          (0, 69, 1947, 8, -1.3700249968748, 3.0971874991576),          (0, 69, 1947, 9, 0.27279758311253, 0.8263207523831),          (0, 69, 1947, 10, -0.0475253104621, 1.4530808932953),          (0, 69, 1947, 11, -0.7555493935762, -1.2665440609117),          (0, 69, 1947, 12, 1.540049376928, 1.2338186532516),          (0, 69, 1948, 1, 0.829743501445, -0.1562732708511),          (0, 69, 1948, 2, 0.06924900463163, 1.187193711598)],          dtype=[('country', '<i8'), ('town', '<i8'),                ('year', '<i8'), ('month', '<i8'),                ('precip', '<f8'), ('temp', '<f8')]) In [36]: tab[100000:100010]["precip"] Out[36]: array([-0.23288347, 0.4724989 , -1.07572167,                -1.370025 , 0.27279758, -0.04752531,                -0.75554939, 1.54004938, 0.8297435 ,                0.069249 ]) When we're done with the file, we do the same thing that we would do with any other file-like object: In [37]: h5.close() If we want to work with it again, simply load it, as follows: In [38]: h5 = tb.open_file(tb_name, "r")          tab = h5.root.weather Let's try plotting the data from our HDF5 file: In [39]: (figure, axes) = plt.subplots(figsize=(20, 10))          axes.hist(tab[:1000000]["temp"], bins=100)          plt.show() Here's a plot for the first million data points: This histogram was rendered quickly, with a much better response time than what we've seen before. Hence, the process of accessing the HDF5 data is very fast. The next question might be "What about executing calculations against this data?" Unfortunately, running the following will consume an enormous amount of RAM: tab[:]["temp"].mean() We've just asked for all of the data—all of its 288 million rows. 
We are going to end up loading everything into RAM, grinding the average workstation to a halt. Ideally though, when you iterate through the source data and create the HDF5 file, you also crunch the numbers that you will need, adding supplemental columns or groups to the HDF5 file that can be used later by you and your peers. If we have data that we will mostly be selecting (extracting portions) and which has already been crunched and grouped as needed, HDF5 is a very good fit. This is why one of the most common use cases that you see for HDF5 is the sharing and distribution of processed data. However, if we have data that we need to process repeatedly, then we will either need a method that does not load all of the data into memory at once, or a tool that is a better match for our data processing needs. We saw in the previous section that the selection of data is very fast in HDF5. What about calculating the mean for a small section of data? If we've got a total of 288 million rows, let's select a divisor of that number that gives us several hundred thousand rows at a time: 281,250 rows, to be precise. Let's get the mean for the first slice, as follows: In [40]: tab[0:281250]["temp"].mean() Out[40]: 0.0030696632864265312 This took about 1 second to calculate. What about iterating through the records in a similar fashion? Let's break up the 288 million records into chunks of the same size; this will result in 1024 chunks.
We'll start by getting the ranges needed for an increment of 281,250 and then, we'll examine the first and the last row as a sanity check, as follows: In [41]: limit = 281250          ranges = [(x * limit, x * limit + limit)              for x in range(2 ** 10)]          (ranges[0], ranges[-1]) Out[41]: ((0, 281250), (287718750, 288000000)) Now, we can use these ranges to generate the mean for each chunk of 281,250 rows of temperature data and print the total number of means that we generated to make sure that we're getting our counts right, as follows: In [42]: means = [tab[x * limit:x * limit + limit]["temp"].mean()              for x in range(2 ** 10)]          len(means) Out[42]: 1024 Depending on your machine, that should take between 30 and 60 seconds. With this work done, it's now easy to calculate the mean for all of the 288 million points of temperature data: In [43]: sum(means) / len(means) Out[43]: -5.3051780413782918e-05 Through HDF5's efficient file format and implementation, combined with the splitting of our operations into tasks that would not copy the HDF5 data into memory, we were able to perform calculations across a significant fraction of a billion records in less than a minute. HDF5 even supports parallelization. So, this can be improved upon with a little more time and effort. However, there are many cases where HDF5 is not a practical choice. You may have some free-form data, and preprocessing it will be too expensive. Alternatively, the datasets may be actually too large to fit on a single machine. This is when you may consider using matplotlib with distributed data. Summary In this article, we covered the role of NumPy in the world of big data and matplotlib as well as the process and problems in working with large data sources. Also, we discussed the possible solutions to these problems using NumPy's memmap function and HDF5 and PyTables. 
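As a follow-up to the chunked mean calculated above, here is a minimal sketch of the same idea written as a reusable function. It is not code from the book: the function name chunked_mean and the synthetic NumPy array standing in for the PyTables column are assumptions made for illustration. Unlike averaging the per-chunk means, it accumulates a running sum and count, so it also works when the chunk size does not divide the number of rows evenly.

```python
import numpy as np


def chunked_mean(column, chunk_size):
    """Mean of a sliceable column, reading at most chunk_size rows at a time.

    `column` is anything that supports len() and slicing; here it is a
    plain NumPy array used as a stand-in for an on-disk table column.
    """
    total = 0.0
    count = 0
    for start in range(0, len(column), chunk_size):
        # Only this slice is materialized in memory at any one time.
        chunk = np.asarray(column[start:start + chunk_size])
        total += chunk.sum()
        count += chunk.size
    if count == 0:
        raise ValueError("cannot take the mean of an empty column")
    return total / count


# Synthetic data (an assumption for the demo, not the weather dataset).
rng = np.random.default_rng(0)
temps = rng.normal(size=100_000)

# 7,000 does not divide 100,000, so the last chunk is shorter.
print(chunked_mean(temps, 7_000))
```

Summing and counting, rather than averaging the chunk means, keeps the result equal to the overall mean even when the final chunk is smaller than the others.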
Packt
08 Jul 2015
5 min read

Creating an F# Project

In this article by Adnan Masood, author of the book Learning F# Functional Data Structures and Algorithms, we see how to set up the IDE and create our first F# project.

"Ah yes, Haskell. Where all the types are strong, all the men carry arrows, and all the children are above average." – marked trees (on the city of Haskell)

The perceived adversity of functional programming is exaggerated; the essence of this paradigm is to explicitly recognize and enforce referential transparency. We will see how to set up the tooling for Visual Studio 2013 and for F# 3.1, the currently available version of F# at the time of writing. We will review the F# 4.0 preview features by the end of this project. After we get the tooling sorted out, we will review some simple algorithms; starting with recursion with a typical Fibonacci sequence and Tower of Hanoi, we will perform lazy evaluation on a quick sort example. In this article, we will cover the following topics: setting up Visual Studio and the F# compiler to work together, and setting up the environment and running your F# programs.

Setting up the IDE

As developers, we love our IDEs (Integrated Development Environments) and Visual Studio.NET is probably the best IDE for .NET development; no offense to Eclipse bloatware Luna. From the open source perspective, there has been a recent major development in making the .NET framework available as open source and on Mac and Linux platforms. Microsoft announced a pre-release of F# 4.0 in Visual Studio 2015 Preview and it will be available as part of the full release. To install and run F#, there are various options available for all platforms, sizes, and budgets. For those with a fear of commitment, there is the online interactive version of TryFsharp at http://www.tryfsharp.org/ where you can code in the browser. Windows users have a few options.
Until VS.NET 2015 comes out, you can try out the freely available Visual Studio Community 2013 or a Visual Studio 2013 trial edition, with trial being the keyword. The trial editions include Ultimate, Premium, and Professional versions. The free community edition IDE can be downloaded from https://www.visualstudio.com/en-us/news/vs2013-community-vs.aspx and the trial editions can be downloaded from http://www.visualstudio.com/downloads/download-visual-studio-vs. Alternatively, there are express editions available at no cost. Visual Studio Express 2013 for Windows Desktop Web editions can be downloaded from http://www.visualstudio.com/downloads/download-visual-studio-vs#d-express-windows-desktop. F# support is built into Visual Studio; in addition, the F# team releases regular updates in the form of the Visual F# tools, which package the latest updates to the F# compiler, F# Interactive, the runtime, and the Visual Studio integration. The tools can be downloaded from www.microsoft.com/en-us/download/details.aspx?id=44011. The F# tools contain the F# command-line compiler (fsc.exe) and F# Interactive (fsi.exe), which are the easiest way to get started with F#. The fsi.exe executable can be found in C:\Program Files (x86)\Microsoft SDKs\F#\<version>\Framework\<version>. The <version> placeholders in the preceding path are substituted by the versions installed on your machine. If you just want to use the F# compiler and tools from the command line, you can download the .NET framework 4.5 or above from https://www.microsoft.com/en-in/download/details.aspx?id=30653. You will also need the Windows SDK for associated dependencies, which can be downloaded from http://msdn.microsoft.com/windows/desktop/bg162891. Alternatively, Tsunami is a free IDE for F# that you can download from http://tsunami.io/download.html and use to build applications. CloudSharper by IntelliFactory is in beta but shows promise as a web-based IDE.
For more information regarding CloudSharper, refer to http://cloudsharper.com/. In this article, we will be using Visual Studio 2013 Professional Edition and FSI (F# Interactive), but you can use the trial or Express edition, or the FSI command line, to run the examples and exercises.

Your first F# project

Going through installation screens and showing how to click Next would be discourteous to our reader's intelligence. Therefore, we will leave step-by-step installation to other, more verbose texts. Let's start with our first F# project in Visual Studio. In the preceding screenshot, you can see the F# Interactive window at the bottom. Here we have selected FILE | New | Project because we are creating a new project of the F# type. There are a few project types available, including console applications and F# libraries. For ease of explanation, let's begin with a Console Application as shown in the next screenshot: Alternatively, from within Visual Studio, we can use F# Interactive. F# Interactive (FSI) is an effective tool for testing out your code quickly. You can open the FSI window by selecting View | Other Windows | F# Interactive from the Visual Studio IDE as shown in the next screenshot: FSI lets you run code from a console that is similar to a shell. You can access the FSI executable directly at C:\Program Files (x86)\Microsoft SDKs\F#\<version>\Framework\<version>. FSI maintains session context, which means that the constructs created earlier in an FSI session are still available in later parts of the code. The FsiAnyCPU.exe executable file is the 64-bit counterpart of F# Interactive; Visual Studio determines which executable to use based on whether the machine's architecture is 32-bit or 64-bit. You can also change the F# Interactive parameters and settings from the Options dialog as shown in the following screenshot:

Summary

In this article, you learned how to set up an IDE for F# in Visual Studio 2013 and created a new F# project.
Packt
08 Jul 2015
26 min read

Understanding Mesos Internals

In this article, Dharmesh Kakadia, author of the book Apache Mesos Essentials, explains in detail how Mesos works internally. We will start off with cluster scheduling and fairness concepts and the Mesos architecture, and then move on to resource isolation and fault tolerance implementation in Mesos. In this article, we will cover the following topics: the Mesos architecture and resource allocation.

The Mesos architecture

Modern organizations have a lot of different kinds of applications for different business needs. Modern applications are distributed and they are deployed across commodity hardware. Organizations today run different applications in siloed environments, where separate clusters are created for different applications. This static partitioning of the cluster leads to low utilization, and every application duplicates the effort of dealing with distributed infrastructure. Not only is this wasted effort, but it also ignores the fact that distributed systems are hard to build and maintain. This is challenging for both developers and operators. For developers, it is a challenge to build applications that scale elastically and can handle the faults that are inevitable in a large-scale environment. Operators, on the other hand, have to manage and scale all of these applications individually in siloed environments. This situation is like trying to develop applications without an operating system, managing all the devices in a computer yourself. Mesos solves these problems by providing a data center kernel. Mesos provides a higher-level abstraction to develop applications that treat distributed infrastructure just like a large computer. Mesos abstracts the physical hardware infrastructure away from the applications. Mesos makes developers more productive by providing an SDK to easily write data center scale applications.
Now, developers can focus on their application logic and do not have to worry about the infrastructure that runs it. The Mesos SDK provides primitives to build large-scale distributed systems, such as resource allocation, deployment, monitoring, and isolation. Developers only need to know and implement what resources are needed, and not how they get the resources. Mesos allows you to treat the data center just as a computer. Mesos makes infrastructure operations easier by providing elastic infrastructure. Mesos aggregates all the resources in a single shared pool and avoids static partitioning. This makes the cluster easier to manage and increases utilization. The data center kernel has to provide resource allocation, isolation, and fault tolerance in a scalable, robust, and extensible way. We will discuss how Mesos fulfills these requirements, as well as some other important considerations of a modern data center kernel:
Scalability: The kernel should be scalable in terms of the number of machines and the number of applications. As the number of machines and applications increases, the response time of the scheduler should remain acceptable.
Flexibility: The kernel should support a wide range of applications. It should support the diverse frameworks currently running on the cluster as well as future frameworks. The kernel should also be able to cope with heterogeneity in the hardware, as most clusters are built over time and have a variety of hardware running.
Maintainability: The kernel will be one of the most important pieces of modern infrastructure. As the requirements evolve, the scheduler should be able to accommodate new requirements.
Utilization and dynamism: The kernel should adapt to changes in resource requirements and available hardware resources and utilize resources in an optimal manner.
Fairness: The kernel should be fair in allocating resources to the different users and/or frameworks.
We will see what it means to be fair in detail in the next section. The design philosophy behind Mesos was to define a minimal interface to enable efficient resource sharing across frameworks and defer the task scheduling and execution to the frameworks. This allows the frameworks to implement diverse approaches toward scheduling and fault tolerance. It also makes the Mesos core simple, and the frameworks and core can evolve independently. The preceding figure shows the overall architecture (http://mesos.apache.org/documentation/latest/mesos-architecture) of a Mesos cluster. It has the following entities: The Mesos masters The Mesos slaves Frameworks Communication Auxiliary services We will describe each of these entities and their role, followed by how Mesos implements different requirements of the data center kernel. The Mesos slave The Mesos slaves are responsible for executing tasks from frameworks using the resources they have. The slave has to provide proper isolation while running multiple tasks. The isolation mechanism should also make sure that the tasks get resources that they are promised, and not more or less. The resources on slaves that are managed by Mesos can be described using slave resources and slave attributes. Resources are elements of slaves that can be consumed by a task, while we use attributes to tag slaves with some information. Slave resources are managed by the Mesos master and are allocated to different frameworks. Attributes identify something about the node, such as the slave having a specific OS or software version, it's part of a particular network, or it has a particular hardware, and so on. The attributes are simple key-value pairs of strings that are passed along with the offers to frameworks. Since attributes cannot be consumed by a running task, they will always be offered for that slave. Mesos doesn't understand the slave attribute, and interpretation of the attributes is left to the frameworks. 
More information about resources and attributes in Mesos can be found at https://mesos.apache.org/documentation/attributes-resources. A Mesos resource or attribute can be described as one of the following types:
Scalar values are floating-point numbers
Range values are a range of scalar values; they are represented as [minValue-maxValue]
Set values are sets of arbitrary strings; they are represented as {a,b,c}
Text values are arbitrary strings; they are applicable only to attributes
Names of the resources can be an arbitrary string consisting of letters, numbers, "-", "/", and ".". The Mesos master handles the cpus, mem, disk, and ports resources in a special way. A slave without the cpus and mem resources will not be advertised to the frameworks. The mem and disk scalars are interpreted in MB. The ports resource is represented as ranges. The list of resources a slave has to offer to various frameworks can be specified using the --resources flag, and its attributes using the --attributes flag. Within each flag, the entries are separated by semicolons. For example: --resources='cpus:30;mem:122880;disk:921600;ports:[21000-29000];bugs:{a,b,c}' --attributes='rack:rack-2;datacenter:europe;os:ubuntuv14.4' This slave offers 30 cpus, 120 GB mem (122880 MB), 900 GB disk (921600 MB), ports from 21000 to 29000, and has bugs a, b, and c. The slave has three attributes: rack with value rack-2, datacenter with value europe, and os with value ubuntuv14.4. Mesos does not yet provide direct support for GPUs, but does support custom resource types. This means that if we specify gpu(*):8 as part of --resources, then it will be part of the resource offers made to frameworks. Frameworks can use it just like other resources. Once some of the GPU resources are in use by a task, only the remaining resources will be offered. Mesos does not yet have support for GPU isolation, but it can be extended by implementing a custom isolator. Alternatively, we can also specify which slaves have GPUs using attributes, such as --attributes="hasGpu:true".
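To make the flag syntax described above concrete, here is a small Python sketch that parses the example --resources and --attributes values into dictionaries. It is an illustration only, not the parser Mesos uses: the function names are hypothetical, role suffixes such as cpus(ads) are not handled, and attribute values are simply kept as text.

```python
def parse_value(text):
    """Parse one value: [a-b, c-d] is a range list, {x,y} a set, else a scalar."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        ranges = []
        for part in text[1:-1].split(","):
            low, _, high = part.strip().partition("-")
            ranges.append((int(low), int(high)))
        return ranges
    if text.startswith("{") and text.endswith("}"):
        return {item.strip() for item in text[1:-1].split(",")}
    return float(text)


def parse_resources(flag):
    """Split a --resources value on ';' and parse each name:value entry."""
    resources = {}
    for entry in flag.split(";"):
        name, _, value = entry.partition(":")
        resources[name.strip()] = parse_value(value)
    return resources


def parse_attributes(flag):
    """Split an --attributes value into key/value strings (kept as text)."""
    return dict(entry.split(":", 1) for entry in flag.split(";"))


resources = parse_resources(
    "cpus:30;mem:122880;disk:921600;ports:[21000-29000];bugs:{a,b,c}")
attributes = parse_attributes("rack:rack-2;datacenter:europe;os:ubuntuv14.4")

# mem and disk are given in MB, so divide by 1024 to report GB.
print(resources["cpus"], resources["mem"] / 1024, resources["disk"] / 1024)
print(resources["ports"], sorted(resources["bugs"]))
print(attributes)
```

Converting the mem and disk scalars from MB to GB is a quick way to sanity-check a slave's configuration before starting it.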
The Mesos master

The Mesos master is primarily responsible for allocating resources to different frameworks and managing the task life cycle for them. The Mesos master implements fine-grained resource sharing using resource offers. The Mesos master acts as a resource broker for frameworks using pluggable policies. Based on these policies, the master decides which cluster resources to offer to which frameworks in the form of resource offers. A resource offer represents a unit of allocation in the Mesos world. It's a vector of the resources available on a node. An offer represents some resources available on a slave being offered to a particular framework.

Frameworks

Distributed applications that run on top of Mesos are called frameworks. Frameworks implement the domain requirements using the general resource allocation API of Mesos. A typical framework wants to run a number of tasks. Tasks are the consumers of resources and they do not have to be the same. A framework in Mesos consists of two components: a framework scheduler and executors. Framework schedulers are responsible for coordinating the execution. An executor provides the ability to control the task execution. Executors can realize a task execution in many ways. A single executor can run multiple tasks, for example by spawning multiple threads, or each executor can run just one task. Apart from the life cycle and task management-related functions, the Mesos framework API also provides functions to communicate with framework schedulers and executors.

Communication

Mesos currently uses an HTTP-like wire protocol for communication between the Mesos components. Mesos uses the libprocess library, located in 3rdparty/libprocess, to implement the communication. The libprocess library provides asynchronous communication between processes. The communication primitives have actor-style message-passing semantics. The libprocess messages are immutable, which makes parallelizing the libprocess internals easier.
Mesos communication happens along with the following APIs: Scheduler API: This is used to communicate with the framework scheduler and master. The internal communication is intended to be used only by the SchedulerDriver API. Executor API: This is used to communicate with an executor and the Mesos slave. Internal API: This is used to communicate with the Mesos master and slave. Operator API: This is the API exposed by Mesos for operators and is used by web UI, among other things. Unlike most Mesos API, the operator API is a synchronous API. To send a message, the actor does an HTTP POST request. The path is composed by the name of the actor followed by the name of the message. The User-Agent field is set to "libprocess/…" to distinguish from the normal HTTP requests. The message data is passed as the body of the HTTP request. Mesos uses protocol buffers to serialize all the messages (defined in src/messages/messages.proto). The parsing and interpretation of the message is left to the receiving actor. Here is an example header of a message sent to master to register the framework by scheduler(1) running at 10.0.1.7:53523 address: POST /master/mesos.internal.RegisterFrameworkMessage HTTP/1.1 User-Agent: libprocess/scheduler(1)@10.0.1.7:53523 The reply message header from the master that acknowledges the framework registration might look like this: POST /scheduler(1)/mesos.internal.FrameworkRegisteredMessage HTTP/1.1 User-Agent: libprocess/master@10.0.1.7:5050 At the time of writing, there is a very early discussion about rewiring the Mesos Scheduler API and Executor API as a pure HTTP API (https://issues.apache.org/jira/browse/MESOS-2288). This will make the API standard and integration with Mesos for various tools much easier without the need to be dependent on native libmesos. Also, there is an ongoing effort to convert all the internal messages into a standardized JSON or protocol buffer format (https://issues.apache.org/jira/browse/MESOS-1127). 
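The message layout described above (an HTTP POST whose path is the receiving actor's name followed by the message name, with a libprocess User-Agent identifying the sender) can be sketched in a few lines of Python. This only formats the header text shown in the examples; the function name and its parameters are hypothetical, and the sketch does not serialize a protocol buffer body or talk to a real Mesos master.

```python
def libprocess_request_head(receiver, message, sender, sender_address):
    """Build the request line and User-Agent header of a libprocess-style message.

    receiver       -- name of the actor that receives the message, e.g. "master"
    message        -- fully qualified message name
    sender         -- name of the sending actor, e.g. "scheduler(1)"
    sender_address -- "ip:port" where the sending actor listens
    """
    return (
        f"POST /{receiver}/{message} HTTP/1.1\r\n"
        f"User-Agent: libprocess/{sender}@{sender_address}\r\n"
    )


# The scheduler registering with the master, as in the example above.
register = libprocess_request_head(
    "master", "mesos.internal.RegisterFrameworkMessage",
    "scheduler(1)", "10.0.1.7:53523")

# The master's acknowledgement back to the scheduler.
registered = libprocess_request_head(
    "scheduler(1)", "mesos.internal.FrameworkRegisteredMessage",
    "master", "10.0.1.7:5050")

print(register)
print(registered)
```

Note that the reply is not an HTTP response to the first request: it is a separate POST sent in the opposite direction, which is what gives the protocol its asynchronous, actor-like character.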
Auxiliary services Apart from the preceding main components, a Mesos cluster also needs some auxiliary services. These services are not part of Mesos itself, and are not strictly required, but they form a basis for operating the Mesos cluster in production environments. These services include, but are not limited to, the following: Shared filesystem: Mesos provides a view of the data center as a single computer and allows developers to develop for the data center scale application. With this unified view of resources, clusters need a shared filesystem to truly make the data center a computer. HDFS, NFS (Network File System), and Cloud-based storage options, such as S3, are popular among various Mesos deployments. Consensus service: Mesos uses a consensus service to be resilient in face of failure. Consensus services, such as ZooKeeper or etcd, provide a reliable leader election in a distributed environment. Service fabric: Mesos enables users to run a number of frameworks on unified computing resources. With a large number of applications and services running, it's important for users to be able to connect to them in a seamless manner. For example, how do users connect to Hive running on Mesos? How does the Ruby on Rails application discover and connect to the MongoDB database instances when one or both of them are running on Mesos? How is the website traffic routed to web servers running on Mesos? Answering these questions mainly requires service discovery and load balancing mechanisms, but also things such as IP/port management and security infrastructure. We are collectively referring to these services that connect frameworks to other frameworks and users as service fabric. Operational services: Operational services are essential in managing operational aspects of Mesos. 
Mesos deployments and upgrades, monitoring cluster health and alerting when human intervention is required, logging, and security are all part of the operational services that play a very important role in a Mesos cluster.

Resource allocation

As a data center kernel, Mesos serves a large variety of workloads and no single scheduler will be able to satisfy the needs of all different frameworks. For example, the way in which a real-time processing framework schedules its tasks will be very different from how a long-running service schedules its tasks, which, in turn, will be very different from how a batch processing framework would like to use its resources. This observation leads to a very important design decision in Mesos: separation of resource allocation and task scheduling. Resource allocation is all about deciding who gets what resources, and it is the responsibility of the Mesos master. Task scheduling, on the other hand, is all about how to use the resources. This is decided by the various framework schedulers according to their own needs. Another way to understand this is that Mesos handles coarse-grained resource allocation across frameworks, and then each framework does fine-grained job scheduling via appropriate job ordering to meet its needs. The Mesos master gets information on the available resources from the Mesos slaves, and based on resource policies, the Mesos master offers these resources to different frameworks. Each framework can choose to accept or reject an offer. If the framework accepts a resource offer, Mesos allocates the corresponding resources to the framework, and then the framework is free to use them to launch tasks. The following image shows the high-level flow of Mesos resource allocation: Mesos two-level scheduler Here is the typical flow of events for one framework in Mesos: The framework scheduler registers itself with the Mesos master. The Mesos master receives the resource offers from slaves.
It invokes the allocation module and decides which frameworks should receive the resource offers. The framework scheduler receives the resource offers from the Mesos master. On receiving the resource offers, the framework scheduler inspects each offer to decide whether it's suitable. If it finds the offer satisfactory, the framework scheduler accepts it and replies to the master with the list of executors that should be run on the slave, utilizing the accepted resources. Alternatively, the framework can reject the offer and wait for a better one. The slave allocates the requested resources and launches the task executors. The executor is launched on the slave node and runs the framework's tasks. It is up to the framework scheduler to accept or reject the resource offers. Here is an example of events that can happen when allocating resources. The framework scheduler gets notified about the task's completion or failure. The framework scheduler will continue receiving resource offers and task reports and launch tasks as it sees fit. The framework unregisters with the Mesos master and will not receive any further resource offers. Note that this last step is optional; long-running services and meta-frameworks will not unregister during normal operation. Because of this design, Mesos is also known as a two-level scheduler. The two-level design makes Mesos simpler and more scalable, as the resource allocation process does not need to know how scheduling happens. This makes the Mesos core more stable. Frameworks and Mesos are not tied to each other and each can iterate independently. This also makes porting frameworks easier. The choice of a two-level scheduler means that the scheduler does not have global knowledge about resource utilization, so resource allocation decisions can be nonoptimal. One potential concern could be about the preferences that the frameworks have about the kind of resources needed for execution.
Data locality, special hardware, and security constraints can be a few of the constraints on which tasks can run. In the Mesos realm, these preferences are not explicitly specified by a framework to the Mesos master, instead the framework rejects all the offers that do not meet its constraints. The Mesos scheduler Mesos was the first cluster scheduler to allow the sharing of resources to multiple frameworks. Mesos resource allocation is based on online Dominant Resource Fairness (DRF) called HierarchicalDRF. In a world of single resource static partitioning, fairness is easy to define. DRF extends this concept of fairness to multi-resource settings without the need for static partitioning. Resource utilization and fairness are equally important, and often conflicting, goals for a cluster scheduler. The fairness of resource allocation is important in a shared environment, such as data centers, to ensure that all the users/processes of the cluster get nearly an equal amount of resources. Min-max fairness provides a well-known mechanism to share a single resource among multiple users. Min-max fairness algorithm maximizes the minimum resources allocated to a user. In its simplest form, it allocates 1/Nth of the resource to each of the users. The weighted min-max fairness algorithm can also support priorities and reservations. Min-max resource fairness has been a basis for many well-known schedulers in operating systems and distributed frameworks, such as Hadoop's fair scheduler (http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html), capacity scheduler (https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html), Quincy scheduler (http://dl.acm.org/citation.cfm?id=1629601), and so on. However, it falls short when the cluster has multiple types of resources, such as the CPU, memory, disk, and network. 
When jobs in a distributed environment use different combinations of these resources to achieve their outcome, fairness has to be redefined. For example, suppose the two requests <1 CPU, 3 GB> and <3 CPU, 1 GB> come to the scheduler. How do they compare and what is the fair allocation? DRF generalizes the min-max algorithm to multiple resources. A user's dominant resource is the resource of which the user has the biggest share. For example, if the total resources are <8 CPU, 5 GB>, then for a user allocation of <2 CPU, 1 GB>, the shares are 2/8 of the CPUs and 1/5 of the memory; the larger of the two is 2/8, so the user's dominant resource is CPU. A user's dominant share is the fraction of the dominant resource that's allocated to the user. In our example, it would be 25 percent (2/8). DRF applies the min-max algorithm to the dominant share of each user. It has many provable properties:
Strategy proofness: A user cannot gain any advantage by lying about their demands.
Sharing incentive: DRF has a minimum allocation guarantee for each user, and no user will prefer an exclusive partitioned cluster of size 1/N over the DRF allocation.
Single resource fairness: In the case of only one resource, DRF is equivalent to the min-max algorithm.
Envy free: Every user prefers their own allocation over the allocation of any other user. This also means that users with the same requests get equivalent allocations.
Bottleneck fairness: When one resource becomes a bottleneck, and each user has a dominant demand for it, DRF is equivalent to min-max.
Monotonicity: Adding resources or removing users can only increase the allocation of the remaining users.
Pareto efficiency: The allocation achieved by DRF will be Pareto efficient, so it is impossible to make a better allocation for any user without making the allocation for some other user worse.
We will not discuss DRF further, but we encourage you to refer to the DRF paper for more details at http://static.usenix.org/event/nsdi11/tech/full_papers/Ghodsi.pdf.
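The dominant share calculation and the "give the next task to the user with the smallest dominant share" rule can be sketched in Python as follows. This is a simplified illustration rather than the Mesos HierarchicalDRF allocator: the function names are hypothetical, every user is assumed to run identical tasks with a fixed, strictly positive demand, and the loop skips users whose next task no longer fits instead of stopping at the first one that does not.

```python
def dominant_share(allocation, total):
    """Return (resource, share) for the resource with the largest share."""
    shares = {name: allocation[name] / total[name] for name in total}
    resource = max(shares, key=shares.get)
    return resource, shares[resource]


def drf_allocate(total, demands):
    """Hand out tasks one at a time to the user with the lowest dominant share.

    total   -- cluster capacity per resource, e.g. {"cpu": 9, "mem": 18}
    demands -- per-task demand of each user (assumed strictly positive)
    Returns the number of tasks launched per user.
    """
    used = {name: 0.0 for name in total}
    allocated = {user: {name: 0.0 for name in total} for user in demands}
    tasks = {user: 0 for user in demands}
    while True:
        # Users whose next task still fits into the remaining capacity.
        fits = [user for user, demand in demands.items()
                if all(used[n] + demand[n] <= total[n] for n in total)]
        if not fits:
            return tasks
        user = min(fits, key=lambda u: dominant_share(allocated[u], total)[1])
        for name in total:
            used[name] += demands[user][name]
            allocated[user][name] += demands[user][name]
        tasks[user] += 1


# The example from the text: <2 CPU, 1 GB> out of <8 CPU, 5 GB>.
print(dominant_share({"cpu": 2, "mem": 1}, {"cpu": 8, "mem": 5}))

# A two-user cluster: A's tasks are memory-heavy, B's are CPU-heavy.
print(drf_allocate({"cpu": 9, "mem": 18},
                   {"A": {"cpu": 1, "mem": 4}, "B": {"cpu": 3, "mem": 1}}))
```

In the second call, user A ends up with three tasks (12 of the 18 GB) and user B with two tasks (6 of the 9 CPUs), so both users finish with the same dominant share of two thirds.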
Mesos uses the role specified in FrameworkInfo for resource allocation decisions. A role can be per user or per framework, or can be shared by multiple users and frameworks. If it's not set, Mesos will set it to the current user that runs the framework scheduler. One optimization is to let a framework decline resource offers from particular slaves for a specified time period. Mesos can revoke a task's allocation by killing the task. Before killing a task, Mesos gives the framework a grace period to clean up. Mesos asks the executor to kill the task, but if the executor does not honor the request, Mesos will kill the executor and all of its tasks.

Weighted DRF

DRF calculates each role's dominant share and allocates the available resources to the role with the smallest dominant share. In practice, an organization rarely wants to assign resources in a completely fair manner. Most organizations want to allocate resources in a weighted manner, such as 50 percent of the resources to the ads team, 30 percent to QA, and 20 percent to the R&D team. To satisfy this common requirement, Mesos implements weighted DRF, where masters can be configured with weights for different roles. When weights are specified, a client's DRF share will be divided by the weight. For example, a role that has a weight of two will be offered twice as many resources as a role with a weight of one. Mesos can be configured to use weighted DRF using the --weights and --roles flags on the master startup. The --weights flag expects a comma-separated list of role/weight pairs in the form role1=weight1,role2=weight2. Weights do not need to be integers. We must provide a weight for each role that appears in --roles on the master startup.

Reservation

Another frequently requested feature is the ability to reserve resources. For example, persistent or stateful services, such as memcache or a database running on Mesos, need a reservation mechanism to avoid being negatively affected on restart.
Without reservation, memcache is not guaranteed to get a resource offer from the slave that holds all of its data, and the service would incur significant initialization time and downtime. Reservation can also be used to limit the resources per role. Reservation provides guaranteed resources for roles, but improper usage might lead to resource fragmentation and lower utilization of resources. Note that all reservation requests go through the Mesos authorization mechanism to ensure that the operator or framework requesting the operation has the proper privileges. Reservation privileges are specified to the Mesos master through ACLs, along with the rest of the ACL configuration. Mesos supports the following two kinds of reservation:

Static reservation
Dynamic reservation

Static reservation

In static reservation, resources are reserved for a particular role. Changing a static reservation requires restarting the slave after removing its checkpointed state. Static reservation is thus typically managed by operators using the --resources flag on the slave. The flag expects a list of name(role):value entries for the different resources. If a resource is assigned to role A, then only frameworks with role A are eligible to get an offer for that resource. Any resources that do not include a role, or resources that are not included in the --resources flag, are assigned to the default role (*). For example, --resources="cpus:4;mem:2048;cpus(ads):8;mem(ads):4096" specifies that the slave has 8 CPUs and 4096 MB of memory reserved for the "ads" role and has 4 CPUs and 2048 MB of memory unreserved. Nonuniform static reservation across slaves can quickly become difficult to manage.

Dynamic reservation

Dynamic reservation allows operators and frameworks to manage reservations more dynamically. Frameworks can use dynamic reservations to reserve offered resources, so that those resources are only reoffered to the same framework.
At the time of writing, dynamic reservation is still being actively developed and is targeted toward the next release of Mesos (https://issues.apache.org/jira/browse/MESOS-2018). When asked for a reservation, Mesos will try to convert unreserved resources to reserved resources. Conversely, during the unreserve operation, the previously reserved resources are returned to the unreserved pool of resources. To support dynamic reservation, Mesos allows a sequence of Offer::Operation messages to be performed as a response when accepting resource offers. A framework manages reservations by sending Offer::Operation::Reserve and Offer::Operation::Unreserve as part of these operations when it receives resource offers. For example, consider a framework that receives the following resource offer with 32 CPUs and 65536 MB of memory:

    {
      "id" : <offer_id>,
      "framework_id" : <framework_id>,
      "slave_id" : <slave_id>,
      "hostname" : <hostname>,
      "resources" : [
        {
          "name" : "cpus",
          "type" : "SCALAR",
          "scalar" : { "value" : 32 },
          "role" : "*"
        },
        {
          "name" : "mem",
          "type" : "SCALAR",
          "scalar" : { "value" : 65536 },
          "role" : "*"
        }
      ]
    }

The framework can decide to reserve 8 CPUs and 4096 MB of memory by sending the Offer::Operation::Reserve message, with the resources field set to the desired state of the resources:

    [
      {
        "type" : Offer::Operation::RESERVE,
        "resources" : [
          {
            "name" : "cpus",
            "type" : "SCALAR",
            "scalar" : { "value" : 8 },
            "role" : <framework_role>,
            "reservation" : {
              "framework_id" : <framework_id>,
              "principal" : <framework_principal>
            }
          },
          {
            "name" : "mem",
            "type" : "SCALAR",
            "scalar" : { "value" : 4096 },
            "role" : <framework_role>,
            "reservation" : {
              "framework_id" : <framework_id>,
              "principal" : <framework_principal>
            }
          }
        ]
      }
    ]

After a successful execution, the framework will receive resource offers that include the reservation. The next offer from the slave might look as follows:

    {
      "id" : <offer_id>,
      "framework_id" : <framework_id>,
      "slave_id" : <slave_id>,
      "hostname" : <hostname>,
      "resources" : [
        {
          "name" : "cpus",
          "type" : "SCALAR",
          "scalar" : { "value" : 8 },
          "role" : <framework_role>,
          "reservation" : {
            "framework_id" : <framework_id>,
            "principal" : <framework_principal>
          }
        },
        {
          "name" : "mem",
          "type" : "SCALAR",
          "scalar" : { "value" : 4096 },
          "role" : <framework_role>,
          "reservation" : {
            "framework_id" : <framework_id>,
            "principal" : <framework_principal>
          }
        },
        {
          "name" : "cpus",
          "type" : "SCALAR",
          "scalar" : { "value" : 24 },
          "role" : "*"
        },
        {
          "name" : "mem",
          "type" : "SCALAR",
          "scalar" : { "value" : 61440 },
          "role" : "*"
        }
      ]
    }

As shown, the resource offer contains 8 CPUs and 4096 MB of memory as reserved resources and 24 CPUs and 61440 MB of memory as unreserved resources. The unreserve operation is similar. On receiving the resource offer, the framework can send the unreserve operation message, and subsequent offers will not have reserved resources. Operators can use the /reserve and /unreserve HTTP endpoints of the operator API to manage reservations. The operator API allows operators to change the reservation specified when the slave starts.
For example, the following command will reserve 4 CPUs and 4096 MB of memory on slave1 for role1 with the operator authentication principal ops:

    ubuntu@master:~ $ curl -d slaveId=slave1 -d resources='[
          {
            "name" : "cpus",
            "type" : "SCALAR",
            "scalar" : { "value" : 4 },
            "role" : "role1",
            "reservation" : {
              "principal" : "ops"
            }
          },
          {
            "name" : "mem",
            "type" : "SCALAR",
            "scalar" : { "value" : 4096 },
            "role" : "role1",
            "reservation" : {
              "principal" : "ops"
            }
          }
        ]' -X POST http://master:5050/master/reserve

Before we end this discussion on resource allocation, it is important to note that the Mesos community continues to innovate on the resource allocation front by incorporating interesting ideas, such as oversubscription (https://issues.apache.org/jira/browse/MESOS-354), from academic literature and other systems.

Summary

In this article, we looked at the Mesos architecture in detail and learned how Mesos deals with resource allocation, resource isolation, and fault tolerance. We also saw the various ways in which we can extend Mesos.

Resources for Article: Further resources on this subject: Recommender systems dissected Tuning Solr JVM and Container [article] Transformation [article] Getting Started [article]

Packt
08 Jul 2015
10 min read

Editors and IDEs

In this article by Daniel Blair, the author of the book Learning Banana Pi, you are going to learn about some editors and the programming languages that are available on the Pi and Linux. These tools will help you write the code that will interact with the hardware through GPIO and run on the Pi as a server.

Choosing your editor

There are many different integrated development environments (generally abbreviated as IDEs) to choose from on Linux. When working on the Banana Pi, you're limited to the software that will run on an ARM-based CPU. Hence, options such as Sublime Text are not available. Some options that you may be familiar with are available for general-purpose code editing. Some tools are available for the command line, while others are GUI tools. So, depending on whether you have a monitor or not, you will want to choose an appropriate tool.

[Screenshot: JavaScript being edited via nano on the command line]

Command-line editors

The command line is a powerful tool. If you master it, you will rarely need to leave it. There are several editors available for the command line. There has been an ongoing war between the users of two editors: GNU Emacs and Vim. There are many other editors, such as nano (which is my preference), but the war tends to be between the two aforementioned editors.

The Emacs editor

This is my least favorite command-line editor (just my preference). Emacs is a GNU-flavored editor for the command line. It is often installed by default, but if it is missing, you can easily install it by running a quick command, as follows:

    sudo apt-get install emacs

Now, you can edit a file via the CLI by using the following command:

    emacs <command-line arguments> <your file>

The preceding command will open the file in Emacs for you to edit. You can also use this to create new files.
You can save and close the editor with a couple of key combinations:

    Ctrl + X and then Ctrl + S (saves the document)
    Ctrl + X and then Ctrl + C (closes the editor)

Thus, your document will be saved and closed.

The Vim editor

Vim is actually an extension of Vi, and it is functionally the same thing. Vim is a fine editor, and few people go out of their way to avoid it. However, people do find it a bit difficult to remember all the commands. If you do get good at it, though, you can code very quickly. You can install Vim from the command line:

    sudo apt-get install vim

Also, there is a GUI version available that allows interaction with the mouse; this is functionally the same program as the command-line Vim, so you don't have to be confined to the terminal window. You can install it with a similar command:

    sudo apt-get install vim-gnome

You can edit files easily via the command line with both Vim and Vim-Gnome, as follows:

    vim <your file>
    gvim <your file>

The Gnome version will open the file in a window. There is a handy tutorial that you can use to learn the commands of Vim. You can run the tutorial with the help of the following command:

    vimtutor

This tutorial will teach you how to use this editor, which is awesome because the commands can be a bit complicated at first.

[Screenshot: Vim editing the file that we used earlier]

The nano editor

The nano editor is my favorite editor for the command line. This is probably because it was the first editor that I was exposed to when I started to learn Linux and experiment with servers and, eventually, the Raspberry Pi and Banana Pi. The nano editor is generally considered the easiest to use and is installed by default on the Banana Pi images. If, for some reason, you need to install it, you can get it quickly with the help of the following command:

    sudo apt-get install nano

The editor is easy to use. It comes with several commands that you will use frequently.
To save and close the editor, use the following key combinations:

    Ctrl + O (writes the file out)
    Ctrl + X (exits the editor)

You can get help at any time by pressing Ctrl + G.

Graphic editors

With the exception of gVim, all the editors we just talked about live on the command line. If you are more accustomed to graphical tools, you may be more comfortable with a full-featured IDE. There are a couple of choices in this regard that you may be familiar with. These tools are a little heavier than the command-line tools because you will need to not only run the software, but also render the window. This is less of an issue on the Banana Pi than it is on the Raspberry Pi, because we have more RAM to play with. However, if you have a lot of programs running already, it might cause some performance issues.

Eclipse

Eclipse is a very popular IDE that is available for all major platforms. You can use it to develop all kinds of systems and use all kinds of programming languages. This is a tool that can be used to do professional development. There are a lot of plugins available for this IDE. It is also used to develop apps for Android (although Android Studio is also available now). Eclipse is written in Java. Hence, in order to make it work, you will require a Java Runtime Environment. The Banana Pi should come equipped with the Java development and runtime environments. If this is not the case, they are not difficult to install. In order to grab the proper version of Eclipse and avoid browsing all the specific versions on the website, you can just install it via the command line by entering the following command:

    sudo apt-get install eclipse

Once Eclipse is installed, you will find it in the application menu under programming tools.

[Screenshot: the Eclipse IDE running on the Banana Pi]

The Geany IDE

Geany is a lighter-weight IDE than Eclipse, although it is not quite as fully featured. It has a clean UI that can be customized and can be used to write code in a lot of different programming languages.
Geany was one of the first IDEs I ever used when I first explored Linux as a kid. Geany does not come preinstalled on the Banana Pi images, but it is easy to get via the command line:

    sudo apt-get install geany

Depending on what you plan to do code-wise on the Banana Pi, Geany may be your best bet. It is GUI-based and offers quite a bit of functionality, yet it is a lot faster to load than Eclipse. It may seem familiar to Windows users, and they might find it easier to operate since it resembles Windows software.

[Screenshot: Geany on Linux]

Neither of these editors, Geany or Eclipse, is specific to a particular programming language, but each is slightly better for certain languages. Geany tends to be better for web languages such as HTML, PHP, JavaScript, and CSS, while Eclipse tends to be better for compiled languages such as C++, Go, and Java, as well as PHP and Ruby with plugins. If you plan to write scripts in languages that are intended to be run from the command line, such as Bash, Ruby, or Python, you may want to stick to the command line and use an editor such as Vim or nano. It is worth your time to play around with the editors and find your preferences.

Web IDEs

In addition to the command-line and GUI editors, there are a couple of web-based IDEs. These essentially turn your Pi into a code server, which allows you to write and even execute certain types of code in an IDE written in web languages. These IDEs are great for learning to code, but they are not really replacements for the solutions that were listed previously.

Google Coder

Google Coder is an educational web IDE that was released as an open source project by Google for the Raspberry Pi. Although there is a readily available image for the Raspberry Pi, we can manually install it on the Banana Pi.

[Screenshot: the Google Coder interface]

The setup is fairly straightforward. We will clone the Git repo and install it with Node.js.
If you don't have Git and Node.js installed, you can install them with a quick command in the terminal, as follows:

    sudo apt-get install nodejs npm git

Once they are installed, we can clone the Coder repo by using the following command:

    git clone https://github.com/googlecreativelab/coder

After it is cloned, we will move into the directory and install it with the help of the following commands:

    cd ~/coder/coder-base/
    npm install

It may take several minutes to install, even on the Banana Pi. Next, we will edit the config.js file, which is used to configure the ports and IP addresses:

    nano config.js

The preceding command will open the file and show its contents. Change the top values to match the following:

    exports.listenIP = '127.0.0.1';
    exports.listenPort = '8081';
    exports.httpListenPort = '8080';
    exports.cacheApps = true;
    exports.httpVisiblePort = '8080';
    exports.httpsVisiblePort = '8081';

After you change the settings you need, run the server by using Node.js:

    nodejs server.js

You should now be able to connect to the Pi in a browser, either on the Pi itself or on another computer, and use Coder. Coder is an educational tool with a lot of different built-in tutorials. You can use Coder to learn JavaScript, CSS, HTML, and jQuery.

Adafruit WebIDE

Adafruit has developed its own Web IDE, which is designed to run on the Raspberry Pi and BeagleBone. Since we are using the Banana Pi, it will only run better. This IDE is designed to work with Ruby, Python, and JavaScript, to name a few. It includes a terminal via which you can send commands to the Pi from the browser. It is an interesting tool if you wish to learn how to code.

[Screenshot: the interface of the WebIDE]

The installation of WebIDE is very simple compared to that of Google Coder, which took several steps. We will just run one command:

    curl https://raw.githubusercontent.com/adafruit/Adafruit-WebIDE/alpha/scripts/install.sh | sudo sh

After a few minutes, you will see output that indicates that the server is starting.
You will be able to access the IDE just as you access Google Coder, through a browser on another computer or on the Pi itself. It should be noted that you will be required to create a free Bitbucket account to use this software.

Summary

In this article, we explored several different programming languages, command-line tools, graphical editors, and even some web IDEs. These tools are valuable for all kinds of projects that you may be working on.

Resources for Article: Further resources on this subject: Prototyping Arduino Projects using Python [article] Raspberry Pi and 1-Wire [article] The Raspberry Pi and Raspbian [article]

Packt
08 Jul 2015
21 min read

To Be or Not to Be – Optionals

In this article by Andrew J Wagner, author of the book Learning Swift, we will cover:

What is an optional?
How to unwrap an optional
Optional chaining
Implicitly unwrapped optionals
How to debug optionals
The underlying implementation of an optional

Introducing optionals

So, we know that the purpose of optionals in Swift is to allow the representation of an absent value, but what does that look like and how does it work? An optional is a special type that can wrap any other type. This means that you can make an optional String, an optional Array, and so on. You can do this by adding a question mark (?) to the type name:

    var possibleString: String?
    var possibleArray: [Int]?

Note that this code does not specify any initial values. This is because all optionals, by default, are set to no value at all. If we want to provide an initial value, we can do so as with any other variable:

    var possibleInt: Int? = 10

Also note that if we leave out the type specification (: Int?), possibleInt would be inferred to be of the Int type instead of an optional Int. It is pretty verbose to say that a variable lacks a value. Instead, if an optional lacks a value, we say that it is nil. So, both possibleString and possibleArray are nil, while possibleInt is 10. However, possibleInt is not truly 10. It is still wrapped in an optional. You can see all the forms a variable can take by putting the following code into a playground:

    var actualInt = 10
    var possibleInt: Int? = 10
    var nilInt: Int?
    println(actualInt) // "10"
    println(possibleInt) // "Optional(10)"
    println(nilInt) // "nil"

As you can see, actualInt prints out as we expect it to, but possibleInt prints out as an optional that contains the value 10 instead of just 10. This is a very important distinction because an optional cannot be used as if it were the value it wraps. The nilInt optional just reports that it is nil.
At any point, you can update the value within an optional, including giving it a value for the first time, by using the assignment operator (=):

    nilInt = 2
    println(nilInt) // "Optional(2)"

You can even remove the value within an optional by assigning it to nil:

    nilInt = nil
    println(nilInt) // "nil"

So, we have this wrapped form of a variable that may or may not contain a value. What do we do if we need to access the value within an optional? The answer is that we must unwrap it.

Unwrapping an optional

There are multiple ways to unwrap an optional. All of them essentially assert that there is truly a value within the optional. This is a wonderful safety feature of Swift. The compiler forces you to consider the possibility that an optional lacks any value at all. In other languages, this is a very commonly overlooked scenario that can cause obscure bugs.

Optional binding

The safest way to unwrap an optional is using something called optional binding. With this technique, you can assign a temporary constant or variable to the value contained within the optional. This process is contained within an if statement, so that you can use an else statement for when there is no value. An optional binding looks like this:

    if let string = possibleString {
        println("possibleString has a value: \(string)")
    } else {
        println("possibleString has no value")
    }

An optional binding is distinguished from an if statement primarily by the if let syntax. Semantically, this code says "if you can let the constant string be equal to the value within possibleString, print out its value; otherwise, print that it has no value." The primary purpose of an optional binding is to create a temporary constant that is the normal (nonoptional) version of the optional.
It is also possible to use a temporary variable in an optional binding:

    possibleInt = 10
    if var int = possibleInt {
        int *= 2
    }
    println(possibleInt) // Optional(10)

Note that an asterisk (*) is used for multiplication in Swift. You should also note something important about this code: if you put it into a playground, even though we multiplied int by 2, the value within possibleInt does not change. When we print out possibleInt later, the value still remains Optional(10). This is because even though we made int a variable (otherwise known as mutable), it is simply a temporary copy of the value within possibleInt. No matter what we do with int, nothing will be changed about the value within possibleInt. If we need to update the actual value stored within possibleInt, we simply need to assign int back to possibleInt after we are done modifying it:

    possibleInt = 10
    if var int = possibleInt {
        int *= 2
        possibleInt = int
    }
    println(possibleInt) // Optional(20)

Now the value wrapped inside possibleInt has actually been updated. A common scenario that you will probably come across is the need to unwrap multiple optional values. One way of doing this is by simply nesting the optional bindings:

    if let actualString = possibleString {
        if let actualArray = possibleArray {
            if let actualInt = possibleInt {
                println(actualString)
                println(actualArray)
                println(actualInt)
            }
        }
    }

However, this can be a pain, as the indentation level increases with each binding. Instead, you can list multiple optional bindings in a single statement, separated by commas:

    if let actualString = possibleString,
        let actualArray = possibleArray,
        let actualInt = possibleInt {
        println(actualString)
        println(actualArray)
        println(actualInt)
    }

This generally produces more readable code.
This way of unwrapping is great, but saying that optional binding is the safe way to access the value within an optional implies that there is an unsafe way to unwrap an optional. This way is called forced unwrapping.

Forced unwrapping

The shortest way to unwrap an optional is by forced unwrapping. This is done using an exclamation mark (!) after the variable name when it is used:

    possibleInt = 10
    possibleInt! *= 2
    println(possibleInt) // "Optional(20)"

However, the reason it is considered unsafe is that your entire program crashes if you try to unwrap an optional that is currently nil:

    nilInt! *= 2 // fatal error

The full error you get is "unexpectedly found nil while unwrapping an Optional value". This is because forced unwrapping is essentially your personal guarantee that the optional truly holds a value. This is why it is called forced. Therefore, forced unwrapping should be used in limited circumstances. It should never be used just to shorten the code. Instead, it should only be used when you can guarantee, from the structure of the code, that the optional cannot be nil, even though it is defined as an optional. Even in this case, you should check whether it is possible to use a nonoptional variable instead. The only other place you may use it is when your program truly cannot recover if an optional is nil. In these circumstances, you should at least consider presenting an error to the user, which is always better than simply having your program crash. An example of a scenario where forced unwrapping may be used effectively is with lazily calculated values. A lazily calculated value is a value that is not created until the first time it is accessed. To illustrate this, let's consider a hypothetical class that represents a filesystem directory. It would have a property that lists its contents, which are lazily calculated.
The code would look something like this:

    class FileSystemItem {}
    class File: FileSystemItem {}
    class Directory: FileSystemItem {
        private var realContents: [FileSystemItem]?
        var contents: [FileSystemItem] {
            if self.realContents == nil {
                self.realContents = self.loadContents()
            }
            return self.realContents!
        }

        private func loadContents() -> [FileSystemItem] {
            // Do some loading
            return []
        }
    }

Here, we defined a superclass called FileSystemItem that both File and Directory inherit from. The contents of a directory are a list of any kind of FileSystemItem. We define contents as a calculated variable and store the real value within the realContents property. The calculated property checks whether a value has been loaded for realContents yet; if not, it loads the contents and puts them into the realContents property. Based on this logic, we know with 100 percent certainty that there will be a value within realContents by the time we get to the return statement, so it is perfectly safe to use forced unwrapping.

Nil coalescing

In addition to optional binding and forced unwrapping, Swift also provides an operator called the nil coalescing operator to unwrap an optional. This is represented by a double question mark (??). Basically, this operator lets us provide a default value for a variable or operation result in case it is nil. This is a safe way to turn an optional value into a nonoptional value, and it looks something like this:

    var possibleString: String? = "An actual string"
    println(possibleString ?? "Default String") // "An actual string"

Here, we ask the program to print out possibleString unless it is nil, in which case it will just print Default String. Since we did give it a value, it printed out that value, and it is important to note that it printed out as a regular value, not as an optional. This is because, one way or another, an actual value will be printed.
This is a great tool for concisely and safely unwrapping an optional when a default value makes sense.

Optional chaining

A common scenario in Swift is to have an optional that you must calculate something from. If the optional has a value, you want to store the result of the calculation; if it is nil, the result should just be set to nil:

    var invitee: String? = "Sarah"
    var uppercaseInvitee: String?
    if let actualInvitee = invitee {
        uppercaseInvitee = actualInvitee.uppercaseString
    }

This is pretty verbose. To shorten this up in an unsafe way, we could use forced unwrapping:

    uppercaseInvitee = invitee!.uppercaseString

However, optional chaining will allow us to do this safely. Essentially, it allows optional operations on an optional. When the operation is called, if the optional is nil, it immediately returns nil; otherwise, it returns the result of performing the operation on the value within the optional:

    uppercaseInvitee = invitee?.uppercaseString

So in this call, invitee is an optional. Instead of unwrapping it, we use optional chaining by placing a question mark (?) after it, followed by the optional operation. In this case, we asked for the uppercaseString property on it. If invitee is nil, uppercaseInvitee is immediately set to nil without even trying to access uppercaseString. If it actually does contain a value, uppercaseInvitee gets set to the uppercaseString property of the contained value. Note that all optional chains return an optional result. You can chain as many calls, both optional and nonoptional, as you want in this way:

    var myNumber: String? = "27"
    myNumber?.toInt()?.advancedBy(10).description

This code attempts to add 10 to myNumber, which is represented as a String. First, the code uses an optional chain in case myNumber is nil. Then, the call to toInt uses an additional optional chain because that method returns an optional Int type.
We then call advancedBy, which does not return an optional, allowing us to access the description of the result without using another optional chain. If at any point any of the optionals are nil, the result will be nil. This can happen for two different reasons:

This can happen because myNumber is nil
This can also happen because toInt returns nil as it cannot convert the String to the Int type

If the chain makes it all the way to advancedBy, there is no longer a failure path and it will definitely return an actual value. You will notice that there are exactly two question marks used in this chain and there are two possible failure reasons. At first, it can be hard to understand when you should and should not use a question mark to create a chain of calls. The rule is that you should always use a question mark if the previous element in the chain returns an optional. However, so that you are prepared, let's look at what happens if you use an optional chain improperly:

    myNumber.toInt() // Value of optional type 'String?' not unwrapped

In this case, we try to call a method directly on an optional without a chain, so we get an error. We also have the case where we try to inappropriately use an optional chain:

    var otherNumber = "10"
    otherNumber?.toInt() // Operand of postfix '?' should have optional type

Here, we get an error that says a question mark can only be used on an optional type. It is good to become familiar with the errors you will see when you make mistakes, so that you can correct them quickly; we all make silly mistakes from time to time. Another great feature of optional chaining is that it can be used for method calls on an optional that do not actually return a value:

    var invitees: [String]? = []
    invitees?.removeAll(keepCapacity: false)

In this case, we only want to call removeAll if there is truly a value within the optional array.
So, with this code, if there is a value, all the elements are removed from it; otherwise, it remains nil. In the end, optional chaining is a great choice for writing concise code that still remains expressive and understandable.

Implicitly unwrapped optionals

There is a second type of optional called an implicitly unwrapped optional. There are two ways to look at what an implicitly unwrapped optional is. One way is to say that it is a normal variable that can also be nil. The other way is to say that it is an optional that you don't have to unwrap to use. The important thing to understand about them is that, like optionals, they can be nil, but like a normal variable, you do not have to unwrap them. You can define an implicitly unwrapped optional with an exclamation mark (!) instead of a question mark (?) after the type name:

    var name: String!

Just like regular optionals, implicitly unwrapped optionals do not need to be given an initial value because they are nil by default. At first, this may sound like the best of both worlds, but in reality, it is more like the worst of both worlds. Even though an implicitly unwrapped optional does not have to be unwrapped, it will crash your entire program if it is nil when used:

    name.uppercaseString // Crash

A great way to think about them is that every time an implicitly unwrapped optional is used, it implicitly performs a forced unwrapping. The exclamation mark is placed in its type declaration instead of at every use. This is particularly bad because it appears the same as any other variable except for how it is declared. This means that it is very unsafe to use, unlike a normal optional. So, if implicitly unwrapped optionals are the worst of both worlds and are so unsafe, why do they even exist? The reality is that in rare circumstances, they are necessary. They are used in circumstances where a variable is not truly optional, but you also cannot give an initial value to it.
This is almost always true in the case of custom types that have a member variable that is nonoptional, but cannot be set during initialization. A rare example of this is a view in iOS. UIKit, as we discussed earlier, is the framework that Apple provides for iOS development. In it, Apple has a class called UIView that is used for displaying content on the screen. Apple also provides a tool in Xcode called Interface Builder that lets you design these views in a visual editor instead of in code. Many views designed in this way need references to other views that can be accessed programmatically later. When one of these views is loaded, it is initialized without anything connected and then all the connections are made. Once all the connections are made, a function called awakeFromNib is called on the view. This means that these connections are not available for use during initialization, but are available once awakeFromNib is called. This order of operations also ensures that awakeFromNib is always called before anything actually uses the view. This is a circumstance where it is necessary to use an implicitly unwrapped optional. A member variable cannot be set until the view has been initialized and is completely loaded:

    import UIKit

    class MyView: UIView {
        @IBOutlet var button: UIButton!
        var buttonOriginalWidth: CGFloat!

        override func awakeFromNib() {
            self.buttonOriginalWidth = self.button.frame.size.width
        }
    }

Note that we have actually declared two implicitly unwrapped optionals. The first is a connection to button. We know this is a connection because it is preceded by @IBOutlet. This is declared as an implicitly unwrapped optional because the connections are not set up until after initialization, but they are still guaranteed to be set up before any other methods are called on the view.
This also then leads us to make our second variable, buttonOriginalWidth, implicitly unwrapped because we need to wait until the connection is made before we can determine the width of button. After awakeFromNib is called, it is safe to treat both button and buttonOriginalWidth as nonoptional. You may have noticed that we had to dive pretty deep into app development in order to find a valid use case for implicitly unwrapped optionals, and this is arguably only because UIKit is implemented in Objective-C.

Debugging optionals

We already saw a couple of compiler errors that we commonly see because of optionals. If we try to call a method on an optional that we intended to call on the wrapped value, we will get an error. If we try to unwrap a value that is not actually optional, we will get an error that the variable or constant is not optional. We also need to be prepared for runtime errors that optionals can cause. As discussed, optionals cause runtime errors if you try to forcefully unwrap an optional that is nil. This can happen with both explicit and implicit forced unwrapping. If you have followed my advice so far in this article, this should be a rare occurrence. However, we all end up working with third-party code, and maybe its authors were lazy or maybe they used forced unwrapping to enforce their expectations about how their code should be used. Also, we all suffer from laziness from time to time. It can be exhausting or discouraging to worry about all the edge cases when you are excited about programming the main functionality of your app. We may use forced unwrapping temporarily while we worry about that main functionality and plan to come back to handle it later. After all, during development, it is better to have a forced unwrapping crash the development version of your app than it is for it to fail silently if you have not yet handled that edge case.
We may even decide that an edge case is not worth the development effort of handling because everything about developing an app is a trade-off. Either way, we need to recognize a crash from forced unwrapping quickly, so that we don't waste extra time trying to figure out what went wrong. When an app tries to unwrap a nil value, if you are currently debugging the app, Xcode shows you the line that tries to do the unwrapping. The line reports that there was EXC_BAD_INSTRUCTION and you will also get a message in the console saying fatal error: unexpectedly found nil while unwrapping an Optional value. You will also sometimes have to look at which code currently calls the code that failed. To do that, you can use the call stack in Xcode. When your program crashes, Xcode automatically displays the call stack, but you can also manually show it by going to View | Navigators | Show Debug Navigator. Here, you can click on different levels of code to see the state of things. This becomes even more important if the program crashes within one of Apple's frameworks, where you do not have access to the code. In that case, you should move up the call stack to the point where your code is called in the framework. You may also be able to look at the names of the functions to help you figure out what may have gone wrong. Anywhere on the call stack, you can look at the state of the variables in the debugger. If you do not see this variables view, you can display it by clicking on the button at the bottom-left corner that is second from the right and will be grayed out. Here, you can see that invitee is indeed nil, which is what caused the crash. As powerful as the debugger is, if you find that it isn't helping you find the problem, you can always put println statements in important parts of the code.
It is always safe to print out an optional as long as you don't forcefully unwrap it like in the preceding example. As we saw earlier, when an optional is printed, it will print nil if it doesn't have a value or it will print Optional(<value>) if it does have a value. Debugging is an extremely important part of becoming a productive developer because we all make mistakes and create bugs. Being a great developer means that you can identify problems quickly and understand how to fix them soon after that. This will largely come from practice, but it will also come when you have a firm grasp of what really happens with your code instead of simply adapting some code you find online to fit your needs through trial and error.

The underlying implementation

At this point, you should have a pretty strong grasp of what an optional is and how to use and debug it, but it is valuable to look deeper at optionals and see how they actually work. In reality, the question mark syntax for optionals is just a special shorthand. Writing String? is equivalent to writing Optional<String>. Writing String! is equivalent to writing ImplicitlyUnwrappedOptional<String>. The Swift compiler has shorthand versions because they are so commonly used. This allows the code to be more concise and readable. If you declare an optional using the long form, you can see Swift's implementation by holding command and clicking on the word Optional. Here, you can see that Optional is implemented as an enumeration. If we simplify the code a little, we have:

    enum Optional<T> {
        case None
        case Some(T)
    }

So, we can see that Optional really has two cases: None and Some. None stands for the nil case, while the Some case has an associated value, which is the value wrapped inside Optional. Unwrapping is then the process of retrieving the associated value out of the Some case. One part of this that you have not seen yet is the angled bracket syntax (<T>).
This is a generic and essentially allows the enumeration to have an associated value of any type. Realizing that optionals are simply enumerations will help you to understand how to use them. It also gives you some insight into how concepts are built on top of other concepts. Optionals seem really complex until you realize that they are just two-case enumerations. Once you understand enumerations, you can pretty easily understand optionals as well. Summary We only covered a single concept, optionals, in this article, but we saw that this is a pretty dense topic. We saw that at the surface level, optionals are pretty straightforward. They offer a way to represent a variable that has no value. However, there are multiple ways to get access to the value wrapped within an optional, which have very specific use cases. Optional binding is always preferred as it is the safest method, but we can also use forced unwrapping if we are confident that an optional is not nil. We also have a type called implicitly unwrapped optional to delay the assigning of a variable that is not intended to be optional, but we should use it sparingly because there is almost always a better alternative. Resources for Article: Further resources on this subject: Network Development with Swift [article] Flappy Swift [article] Playing with Swift [article]
Packt
08 Jul 2015
15 min read

Building BizTalk Server 2013 Applications

Creativity is the power to connect the seemingly unconnected. – William Plomer

Let's begin our journey by investigating what BizTalk Server actually is, why one would use it, and how to craft a running application. This article will be a refresher on BizTalk Server for those of you who have some familiarity with the product. In this article by Mark Brimble, coauthor of the book SOA Patterns with BizTalk Server 2013 and Microsoft Azure - Second Edition, you will learn how to articulate what BizTalk Server is, when to use it, and how it works; how to outline the role of BizTalk schemas, maps, and orchestrations; and BizTalk messaging configurations.

What is BizTalk Server?

So what exactly is BizTalk Server, and why should you care about it? In a nutshell, Microsoft BizTalk Server 2013 uses adapter technology to connect disparate entities and enable the integration of data, events, processes, and services. An entity may be an application, department, or a different organization altogether that you need to be able to share information with. A software adapter is typically used when we need to establish communication between two components that do not natively collaborate. BizTalk Server adapters are built with a common framework, which results in system integration implemented via configuration, not coding. Traditionally, BizTalk Server has solved problems in the following three areas: Enterprise Application Integration, Business-to-Business, and Business Process Automation. First, BizTalk Server acts as an Enterprise Application Integration (EAI) server that connects applications that are natively incapable of talking to each other. The applications may have incompatible platforms, data structure formats, or security models. For example, when a new employee is hired, the employee data from the human resources application needs to be sent to the payroll application so that the new employee receives his/her paycheck on time.
Nothing prevents you from writing the code necessary to connect these disparate applications with a point-to-point solution. However, using such a strategy often leads to an application landscape that looks like this: Many organizations choose to insert a communication broker between these applications, as shown in the following figure: Some of the benefits that you would realize from such an architectural choice include: loose coupling of applications where one does not have a physical dependency on the other; durable infrastructure that can guarantee delivery and queue messages during destination system downtime; centralized management of system integration endpoints; message flow control such as in-order delivery; encouragement for the reuse of core components; and insight into cross-functional business processes through business activity monitoring. BizTalk Server solves a second problem by filling the role of a Business-to-Business (B2B) broker that facilitates communication across different organizations. BizTalk supports B2B scenarios by offering Internet-friendly adapters, industry standard EDI message schemas, and robust support for both channel- and message-based security. The third broad area that BizTalk Server excels in is Business Process Automation (BPA). BPA is all about taking historically manual workflow procedures and turning them into executable processes. For example, consider an organization that typically receives a new order via e-mail, and the sales agent manually checks inventory levels prior to inserting the order into the fulfillment system. If the inventory is too low, then the sales agent has to initiate an order with their supplier and watch out for the response so that the inventory system can be updated.
The inevitable problems of this scenario are as follows: poor scalability when the number of orders increases; lack of visibility into the status of orders and supplier requests; multiple instances of redundant data entry, ripe for mistakes in one system and not the other; and unreliable resources when a sales agent is sick. By deciding to automate this scenario, the company can reduce human error while streamlining communications between applications and organizations. The beginning of the second decade of the 21st century saw the disruption of the traditional ways in which EAI and B2B problems were solved because of the rise of Software as a Service (SaaS). SaaS is software that is hosted external to your business and is paid for on a subscription basis; its best known example is Salesforce.com. Many organizations have chosen to modify their EAI and B2B solutions with BizTalk Server to access SaaS applications using hybrid solutions, as shown in the following figure: Four new adapters, the WCF-BasicHTTPRelay, WCF-WebHTTP, WCF-NetTCPRelay, and SB-Messaging adapter, have been added to BizTalk 2013 to support hybrid solutions, and are nicknamed the "cloud adapters". New chapters on RESTful services and the Azure Service Bus have been added to this edition of the book to describe how these cloud adapters enhance the BizTalk Server story. Microsoft Azure BizTalk Service (MABS) has been created as a SaaS offering that can abstract B2B problems to Azure. We have added a chapter that shows how to use BizTalk Server 2013 with this new SaaS model. Examples of how to use all these new components to add new SOA capabilities to BizTalk Server have been added to this book. Azure App Services is Microsoft's next generation SaaS offering that will supersede MABS. While the platform is still very fresh, we have outlined the underlying concepts for you in the final chapter of this book to help you get a head start on usage of this platform.
What's the one thing that all of these BizTalk Server cases have in common? They all depend on the real-time interchange and processing of discrete messages in an event-driven fashion. This partially explains why BizTalk Server is such a strong tool within a service-oriented architecture. We'll investigate many of BizTalk's service-oriented capabilities in later chapters, but it's important to note that the functionality that exists to support the three top-level scenarios mentioned earlier (EAI, B2B, and BPA) fits well into a service-oriented mindset. Concepts such as contract-first design, loose coupling, and reusability are soaked into the fabric of BizTalk Server. BizTalk Server should be targeted for solutions that exchange real-time messages as opposed to Extract Transform Load (ETL) products that excel at bulky, batch-oriented exchanges between data stores. BizTalk Server 2013 is the eighth release of the product, the first release being BizTalk Server 2000. Back in those days, developers had access to four native adapters (filesystem, MSMQ, HTTP, and SMTP). Development was done using a series of different tools, and the underlying engine had some fairly tight coupling between components. Since then, the entire product has been rebuilt and reengineered for .NET and a myriad of new services and features have become part of the BizTalk Server suite. The application continues to evolve and take greater advantage of the features of the Microsoft product stack, while still being the most interoperable and platform-neutral offering that Microsoft has ever produced.

BizTalk architecture

So how does BizTalk Server actually work? At its core, BizTalk Server is an event-processing engine based on a conventional publish-subscribe pattern. Wikipedia defines the publish-subscribe pattern as: "An asynchronous messaging paradigm where senders (publishers) of messages are not programmed to send their messages to specific receivers (subscribers).
Rather, published messages are characterized into classes, without knowledge of what (if any) subscribers there may be. Subscribers express interest in one or more classes, and only receive messages that are of interest, without knowledge of what (if any) publishers there are." This pattern enforces a natural loose coupling and provides more scalability than an engine that requires a tight connection between receivers and senders. In the first release of BizTalk Server, the product did have tightly coupled messaging components, but thankfully, the engine was completely redesigned for BizTalk Server 2004. Once a message is received by a BizTalk adapter, it runs through any necessary preprocessing (such as decoding and validations) in BizTalk pipelines before being subjected to data transformation via BizTalk maps, and finally being published to a central database called the MessageBox. Then, the parties that have a corresponding subscription for that message can consume it as they see fit. While introducing a bit of unavoidable latency, the MessageBox database makes up for that by providing us with durability, reliability, and scalability. For instance, if one of our subscriber systems is offline for maintenance, outbound messages are not lost, but rather the MessageBox ensures that the messages are queued until the subscriber is ready to receive them. Worried about a large flood of inbound messages that steal processing threads away from other BizTalk activities? No problem! The MessageBox ensures that each and every message finds its way to its targeted subscriber, even if it must wait until the flood of inbound messages subsides. There are really two ways to look at the way BizTalk is structured. The first is the traditional EAI view, which sees BizTalk receiving messages and routing them to the next system for consumption.
The flow is very linear and BizTalk is seen as a broker between two applications, shown as follows:   However, the other way to consider BizTalk, and the focus of this book, is as a Service Bus, with numerous input/output channels that process messages in a very dynamic way. That is, instead of visualizing the data flow as a straight path through BizTalk to a destination system, consider BizTalk exposing services as on-ramps to a variety of destinations. Messages published to BizTalk Server may fan out to dozens of subscribers, who have no interest in what the publishing application actually was. Instead of thinking about BizTalk as a simple connector of systems, think of it as a message bus that coordinates a symphony of events between endpoints. This concept is an exciting way to exploit BizTalk's engine in this modern world of service orientation. In the following figure, I've shown how the central BizTalk bus has receiver services hanging from it, and has a multitude of distinct subscriber services that are activated by relevant messages reaching the bus:   If the on-ramp concept is a bit abstract to understand, consider a simple analogy. In designing the transportation for a city, it would be foolish to create distinct roads between each and every destination. The design and maintenance of such a project would be lunacy. It would be smart to design a shared highway with on and off ramps, which enable people to use a common route to get to the numerous locations around town. As new destinations in the city emerge, the entire highway (or road system) doesn't need to undergo changes, but rather, only a new entrance/exit point needs to be appended to the existing shared infrastructure. What exactly is a message anyway? A message is data processed through BizTalk Server's messaging engine, whether that data is transported as an XML document, a delimited flat file, or a Microsoft Word document. 
The message content may contain a command (for example, InsertCustomer), a document (for example, Invoice), or an event (for example, VendorAdded). A message has a set of properties associated with it. First and foremost, a message may have a type associated with it, which uniquely defines it within the messaging bus. The type is typically comprised of the XML namespace and the root node name (for example, http://CompanyA.Purchasing#PurchaseOrder). The message type is much like the class object in an object-oriented programming language; it uniquely identifies entities by their properties. The other critical attribute of a message in BizTalk Server is the property bag called the message context, as shown in the following screenshot:   The message context is a set of name/value properties that stay attached to the message as long as it remains within BizTalk Server. These context values include metadata about the transport used to publish the message and attributes of the message itself. Properties in the message context that are visible to the BizTalk engine, and therefore available for routing decisions, are called promoted properties. How does a message actually get into BizTalk Server? A receive location is configured for the actual endpoint that receives messages. The receive location uses a particular adapter that knows how to absorb the inbound message. For instance, a receive location may be configured to use the FILE adapter, which polls a particular directory for XML messages. The receive location stores the file path to monitor, while the adapter provides transport connectivity. Upon receipt of a message, the adapter stamps a set of values into the message context. For the FILE adapter, values such as ReceivedFileName are added to that message's context property bag. Note that BizTalk has both application adapters, such as SQL Server, Oracle, and SAP, as well as transport-level adapters, such as HTTP, MSMQ, and FILE. 
The key point is that the adapter configuration user experience is virtually identical regardless of the type of adapter chosen. Some of the adapters available are shown in the following figure: Receive locations have a particular receive pipeline associated with them. A pipeline is a sequential set of optional operations that is performed on the message in preparation for being parsed and sent to the MessageBox database by the BizTalk adapter. For instance, I would need a pipeline in order to decrypt, unzip, or validate the XML structure of my inbound message. One of the most critical roles of the pipeline is to identify the type of the inbound message and put the type into the message context as a promoted property. Custom pipelines can serve as preprocessing stages to make the message useful for processing. As discussed earlier, a message type is the unique characterization of a message. Think of a receive pipeline as performing all the preprocessing steps necessary for putting the message into its most usable format. A receive port contains one or more receive locations. Receive ports have XSLT maps associated with them that are applied to messages prior to publishing them to the MessageBox database. What value does a receive port offer? It acts as a grouping of receive locations where capabilities such as mapping and data tracking can be applied to all of the associated receive locations. It may also act as a container that allows us to publish a single entity to BizTalk Server regardless of how it came in, or what it looked like upon receipt. Let's say that my receive port contains three receive locations, which all receive slightly different "invoice" messages from three different external vendors. At the receive port level, I have three maps that each take one of these vendor-specific messages and map it to a single, common format before publishing it to BizTalk.
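The three-vendor example can be pictured with a small conceptual sketch. This is plain JavaScript rather than BizTalk code: real maps are XSLT and the engine does the publishing, so every name here (the vendor formats, the field names, the Invoice message type, the receive function) is a hypothetical stand-in chosen only to show the shape of the idea.

```javascript
// Conceptual sketch only: plain JavaScript, not BizTalk code or any BizTalk API.
// The vendor formats, field names, and the Invoice message type are hypothetical.

// One map per receive location; each converts a vendor-specific invoice
// into the same canonical shape.
var maps = {
  vendorA: function (msg) {
    return { invoiceId: msg.InvNo, total: msg.Amount };
  },
  vendorB: function (msg) {
    var total = msg.lines.reduce(function (sum, line) {
      return sum + line.price;
    }, 0);
    return { invoiceId: msg.id, total: total };
  },
  vendorC: function (msg) {
    return { invoiceId: msg.number, total: Number(msg.totalText) };
  }
};

var messageBox = []; // stands in for the MessageBox database

// The "receive port": applies the map for the location, attaches context
// properties, and publishes the result.
function receive(location, body, receivedFileName) {
  messageBox.push({
    body: maps[location](body),
    context: {
      MessageType: 'http://CompanyA.Purchasing#Invoice', // promoted property
      ReceivedFileName: receivedFileName                  // stamped on receipt
    }
  });
}

receive('vendorA', { InvNo: 'A-1', Amount: 100 }, 'a1.xml');
receive('vendorB', { id: 'B-7', lines: [{ price: 40 }, { price: 60 }] }, 'b7.xml');
receive('vendorC', { number: 'C-3', totalText: '100' }, 'c3.xml');

console.log(messageBox.map(function (m) { return m.body; }));
```

Whatever the inbound format, everything downstream sees one message type and one body shape, which is the point of placing the maps at the receive port.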
Now that we have a message cleaned up (by the pipeline) and in the final structure (via an XSLT map), it's published to the BizTalk Server MessageBox where message routing can begin. For our purposes, there are two types of subscribers that we care about. The first type of subscriber is a send port. A send port is conceptually the inverse of the receive location and is responsible for transporting messages out of the BizTalk "bus". It has not only the adapter reference, adapter configuration settings, and pipeline (much like the receive location), but also the ability to apply XSLT maps to outbound messages. If a send port subscribes to a message, it first applies any XSLT map to the message, then processes it through a send pipeline, and finally uses the adapter to transmit the message out of BizTalk. The other type of subscriber for a published message is a BizTalk orchestration. An orchestration is an executable business process that uses messages to complete operations in a workflow. We'll spend plenty of time working with orchestration subscribers throughout this book.

Summary

In this article, we looked at what BizTalk is, its core use cases, and how it works. In my experience, one of the biggest competitors to BizTalk Server is not another product, but custom-built solutions. Many organizations engage in a "build versus buy" debate prior to committing to a commercial product. In this article, I highlighted just a few aspects of BizTalk that make it a compelling choice for usage. With BizTalk Server, you get a well-designed scalable messaging engine with a durable persistence tier, which guarantees that your mission-critical messages are not lost in transit. The engine also provides native support for message tracking, recoverability, and straightforward scalability. BizTalk provides you with more than 20 native application adapters that save weeks of custom development time and testing.
We also got a glimpse of BizTalk's integrated workflow toolset, which enables us to quickly build executable business processes that run in a load-balanced environment. These features alone often tip the scales in BizTalk Server's favor, not to mention the multitude of features that we are yet to discuss, such as Enterprise Single Sign On, the Business Rules Engine, Business Activity Monitoring, and so on. I hope that this article also planted some seeds in your mind with regards to thinking about BizTalk solutions in a service-oriented fashion. There are best practices for designing reusable, maintainable solutions that we will investigate throughout the rest of this book. In the next chapter, we'll explore one of the most important technologies for building robust service interfaces in BizTalk Server, which is Windows Communication Foundation. Resources for Article: Further resources on this subject: Implementation Case Study [article] Oracle B2B Overview [article] SOA Application Design [article]
Packt
08 Jul 2015
23 min read

Deployment Preparations

In this article, Jurie-Jan Botha, author of the book Grunt Cookbook, covers the following recipes: Minifying HTML, Minifying CSS, Optimizing images, Linting JavaScript code, Uglifying JavaScript code, and Setting up RequireJS.

Once our web application is built and its stability ensured, we can start preparing it for deployment to its intended market. This will mainly involve the optimization of the assets that make up the application. Optimization in this context mostly refers to compression of one kind or another, some of which might lead to performance increases too. The focus on compression is primarily due to the fact that the smaller the asset, the faster it can be transferred from where it is hosted to a user's web browser. This leads to a much better user experience, and can sometimes be essential to the functioning of an application.

Minifying HTML

In this recipe, we make use of the contrib-htmlmin (0.3.0) plugin to decrease the size of some HTML documents by minifying them.

Getting ready

In this example, we'll work with a basic project structure.

How to do it...

The following steps take us through creating a sample HTML document and configuring a task that minifies it:

We'll start by installing the package that contains the contrib-htmlmin plugin.

Next, we'll create a simple HTML document called index.html in the src directory, which we'd like to minify, and add the following content in it:

    <html>
    <head>
       <title>Test Page</title>
    </head>
    <body>
       <!-- This is a comment! -->
       <h1>This is a test page.</h1>
    </body>
    </html>

Now, we'll add the following htmlmin task to our configuration, which indicates that we'd like to have the white space and comments removed from the src/index.html file, and that we'd like the result to be saved in the dist/index.html file:

    htmlmin: {
      dist: {
        src: 'src/index.html',
        dest: 'dist/index.html',
        options: {
          removeComments: true,
          collapseWhitespace: true
        }
      }
    }

The removeComments and collapseWhitespace options are used as examples here, as using the default htmlmin task will have no effect. Other minification options can be found at the following URL: https://github.com/kangax/html-minifier#options-quick-reference

We can now run the task using the grunt htmlmin command, which should produce output similar to the following:

    Running "htmlmin:dist" (htmlmin) task
    Minified dist/index.html 147 B → 92 B

If we now take a look at the dist/index.html file, we will see that all white space and comments have been removed:

    <html><head><title>Test Page</title></head><body><h1>This is a test page.</h1></body></html>

Minifying CSS

In this recipe, we'll make use of the contrib-cssmin (0.10.0) plugin to decrease the size of some CSS documents by minifying them.

Getting ready

In this example, we'll work with a basic project structure.

How to do it...

The following steps take us through creating a sample CSS document and configuring a task that minifies it.

We'll start by installing the package that contains the contrib-cssmin plugin.

Then, we'll create a simple CSS document called style.css in the src directory, which we'd like to minify, and provide it with the following contents:

    body {
      /* Average body style */
      background-color: #ffffff;
      color: #000000; /*! Black (Special) */
    }

Now, we'll add the following cssmin task to our configuration, which indicates that we'd like to have the src/style.css file compressed, and have the result saved to the dist/style.min.css file:

    cssmin: {
      dist: {
        src: 'src/style.css',
        dest: 'dist/style.min.css'
      }
    }

We can now run the task using the grunt cssmin command, which should produce the following output:

    Running "cssmin:dist" (cssmin) task
    File dist/style.min.css created: 55 B → 38 B

If we take a look at the dist/style.min.css file that was produced, we will see that it has the compressed contents of the original src/style.css file:

    body{background-color:#fff;color:#000;/*! Black (Special) */}

There's more...

The cssmin task provides us with several useful options that can be used in conjunction with its basic compression feature. We'll look at prefixing a banner, removing special comments, and reporting gzipped results.

Prefixing a banner

In the case that we'd like to automatically include some information about the compressed result in the resulting CSS file, we can do so in a banner. A banner can be prepended to the result by supplying the desired banner content to the banner option, as shown in the following example:

    cssmin: {
      dist: {
        src: 'src/style.css',
        dest: 'dist/style.min.css',
        options: {
          banner: '/* Minified version of style.css */'
        }
      }
    }

Removing special comments

Comments that should not be removed by the minification process are called special comments and can be indicated using the "/*! comment */" markers. By default, the cssmin task will leave all special comments untouched, but we can alter this behavior by making use of the keepSpecialComments option. The keepSpecialComments option can be set to *, 1, or 0. The * value is the default and indicates that all special comments should be kept, 1 indicates that only the first special comment that is found should be kept, and 0 indicates that none of them should be kept.
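To make the three settings concrete, here is a minimal sketch in plain JavaScript that mimics their effect on special comments. It is only an illustration of the behavior described above, not how the cssmin plugin is implemented; the stripSpecialComments helper and the sample CSS string are hypothetical.

```javascript
// Hypothetical helper that mimics the three keepSpecialComments settings
// on already-minified CSS. Illustration only; this is not the plugin's code.
function stripSpecialComments(css, keep) {
  var seen = 0;
  // Match every special comment, that is, one that starts with "/*!".
  return css.replace(/\/\*![\s\S]*?\*\//g, function (comment) {
    seen += 1;
    if (keep === '*') { return comment; }             // keep every special comment
    if (keep === 1 && seen === 1) { return comment; } // keep only the first one
    return '';                                        // keep is 0, or a later comment
  });
}

var css = '/*! License */body{color:#000}/*! Build info */';

console.log(stripSpecialComments(css, '*'));
console.log(stripSpecialComments(css, 1));
console.log(stripSpecialComments(css, 0));
```

With the sample string, the * setting leaves both comments in place, 1 keeps only the license comment, and 0 leaves just the rule.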
The following configuration will ensure that all comments are removed from our minified result:

    cssmin: {
      dist: {
        src: 'src/style.css',
        dest: 'dist/style.min.css',
        options: {
          keepSpecialComments: 0
        }
      }
    }

Reporting on gzipped results

Reporting is useful to see exactly how well the cssmin task has compressed our CSS files. By default, the size of the targeted file and minified result will be displayed, but if we'd also like to see the gzipped size of the result, we can set the report option to gzip, as shown in the following example:

    cssmin: {
      dist: {
        src: 'src/main.css',
        dest: 'dist/main.css',
        options: {
          report: 'gzip'
        }
      }
    }

Optimizing images

In this recipe, we'll make use of the contrib-imagemin (0.9.4) plugin to decrease the size of images by compressing them as much as possible without compromising on their quality. This plugin also provides a plugin framework of its own, which is discussed at the end of this recipe.

Getting ready

In this example, we'll work with the basic project structure.

How to do it...

The following steps take us through configuring a task that will compress an image for our project.

We'll start by installing the package that contains the contrib-imagemin plugin.

Next, we can ensure that we have an image called image.jpg in the src directory on which we'd like to perform optimizations.

Now, we'll add the following imagemin task to our configuration and indicate that we'd like to have the src/image.jpg file optimized, and have the result saved to the dist/image.jpg file:

    imagemin: {
      dist: {
        src: 'src/image.jpg',
        dest: 'dist/image.jpg'
      }
    }

We can then run the task using the grunt imagemin command, which should produce the following output:

    Running "imagemin:dist" (imagemin) task
    Minified 1 image (saved 13.36 kB)

If we now take a look at the dist/image.jpg file, we will see that its size has decreased without any impact on the quality.

There's more...
The imagemin task provides us with several options that allow us to tweak its optimization features. We'll look at how to adjust the PNG compression level, disable the progressive JPEG generation, disable the interlaced GIF generation, specify SVGO plugins to be used, and use the imagemin plugin framework.

Adjusting the PNG compression level

The compression of a PNG image can be increased by running the compression algorithm on it multiple times. By default, the compression algorithm is run 16 times. This number can be changed by providing a number from 0 to 7 to the optimizationLevel option. The 0 value means that the compression is effectively disabled and 7 indicates that the algorithm should run 240 times. In the following configuration we set the compression level to its maximum:

    imagemin: {
      dist: {
        src: 'src/image.png',
        dest: 'dist/image.png',
        options: {
          optimizationLevel: 7
        }
      }
    }

Disabling the progressive JPEG generation

Progressive JPEGs are compressed in multiple passes, which allows a low-quality version of them to quickly become visible and increase in quality as the rest of the image is received. This is especially helpful when displaying images over a slower connection.

By default, the imagemin plugin will generate JPEG images in the progressive format, but this behavior can be disabled by setting the progressive option to false, as shown in the following example:

    imagemin: {
      dist: {
        src: 'src/image.jpg',
        dest: 'dist/image.jpg',
        options: {
          progressive: false
        }
      }
    }

Disabling the interlaced GIF generation

An interlaced GIF is the equivalent of a progressive JPEG in that it allows the contained image to be displayed at a lower resolution before it has been fully downloaded, and increases in quality as the rest of the image is received.
By default, the imagemin plugin will generate GIF images in the interlaced format, but this behavior can be disabled by setting the interlaced option to false, as shown in the following example:

    imagemin: {
      dist: {
        src: 'src/image.gif',
        dest: 'dist/image.gif',
        options: {
          interlaced: false
        }
      }
    }

Specifying SVGO plugins to be used

When optimizing SVG images, the SVGO library is used by default. This allows us to specify the use of various plugins provided by the SVGO library, each of which performs a specific function on the targeted files.

Refer to the following URL for more detailed instructions on how to use the svgo plugins options and the SVGO library: https://github.com/sindresorhus/grunt-svgmin#available-optionsplugins

Most of the plugins in the library are enabled by default, but if we'd like to specifically indicate which of these should be used, we can do so using the svgoPlugins option. Here, we can provide an array of objects, where each object contains a property with the name of the plugin to be affected, followed by a true or false value to indicate whether it should be activated.

The following configuration disables three of the default plugins:

    imagemin: {
      dist: {
        src: 'src/image.svg',
        dest: 'dist/image.svg',
        options: {
          svgoPlugins: [
            {removeViewBox:false},
            {removeUselessStrokeAndFill:false},
            {removeEmptyAttrs:false}
          ]
        }
      }
    }

Using the 'imagemin' plugin framework

In order to provide support for the various image optimization projects, the imagemin plugin has a plugin framework of its own that allows developers to easily create an extension that makes use of the tool they require.

You can get a list of the available plugin modules for the imagemin plugin's framework at the following URL: https://www.npmjs.com/browse/keyword/imageminplugin

The following steps will take us through installing and making use of the mozjpeg plugin to compress an image in our project.
These steps start where the main recipe leaves off.

We'll start by installing the imagemin-mozjpeg package using the npm install imagemin-mozjpeg command, which should produce the following output:

    imagemin-mozjpeg@4.0.0 node_modules/imagemin-mozjpeg

With the package installed, we need to import it into our configuration file, so that we can make use of it in our task configuration. We do this by adding the following line at the top of our Gruntfile.js file:

    var mozjpeg = require('imagemin-mozjpeg');

With the plugin installed and imported, we can now change the configuration of our imagemin task by adding the use option and providing it with the initialized plugin:

    imagemin: {
      dist: {
        src: 'src/image.jpg',
        dest: 'dist/image.jpg',
        options: {
          use: [mozjpeg()]
        }
      }
    }

Finally, we can test our setup by running the task using the grunt imagemin command. This should produce an output similar to the following:

    Running "imagemin:dist" (imagemin) task
    Minified 1 image (saved 9.88 kB)

Linting JavaScript code

In this recipe, we'll make use of the contrib-jshint (0.11.1) plugin to detect errors and potential problems in our JavaScript code. It is also commonly used to enforce code conventions within a team or project. As can be derived from its name, it's basically a Grunt adaptation for the JSHint tool.

Getting ready

In this example, we'll work with the basic project structure.

How to do it...

The following steps take us through creating a sample JavaScript file and configuring a task that will scan and analyze it using the JSHint tool.

We'll start by installing the package that contains the contrib-jshint plugin.

Next, we'll create a sample JavaScript file called main.js in the src directory, and add the following content in it:

    sample = 'abc';
    console.log(sample);

With our sample file ready, we can now add the following jshint task to our configuration.
We'll configure this task to target the sample file and also add a basic option that we require for this example:

    jshint: {
      main: {
        options: {
          undef: true
        },
        src: ['src/main.js']
      }
    }

The undef option is a standard JSHint option used specifically for this example and is not required for this plugin to function. Specifying this option indicates that we'd like to have errors raised for variables that are used without being explicitly defined.

We can now run the task using the grunt jshint command, which should produce output informing us of the problems found in our sample file:

    Running "jshint:main" (jshint) task

      src/main.js
        1 |sample = 'abc';
           ^ 'sample' is not defined.
        2 |console.log(sample);
           ^ 'console' is not defined.
        2 |console.log(sample);
                       ^ 'sample' is not defined.

    >> 3 errors in 1 file

There's more...

The jshint task provides us with several options that allow us to change its general behavior, in addition to how it analyzes the targeted code. We'll look at how to specify standard JSHint options, specify globally defined variables, send reported output to a file, and prevent task failure on JSHint errors.

Specifying standard JSHint options

The contrib-jshint plugin provides a simple way to pass all the standard JSHint options from the task's options object to the underlying JSHint tool.

A list of all the options provided by the JSHint tool can be found at the following URL: http://jshint.com/docs/options/

The following example adds the curly option to the task we created in our main recipe to enforce the use of curly braces wherever they are appropriate:

    jshint: {
      main: {
        options: {
          undef: true,
          curly: true
        },
        src: ['src/main.js']
      }
    }

Specifying globally defined variables

Making use of globally defined variables is quite common when working with JavaScript, which is where the globals option comes in handy.
Using this option, we can define a set of global values that we'll use in the targeted code, so that errors aren't raised when JSHint encounters them.

In the following example, we indicate that the console variable should be treated as a global, and not raise errors when encountered:

    jshint: {
      main: {
        options: {
          undef: true,
          globals: {
            console: true
          }
        },
        src: ['src/main.js']
      }
    }

Sending reported output to a file

If we'd like to store the resulting output from our JSHint analysis, we can do so by specifying a path to a file that should receive it using the reporterOutput option, as shown in the following example:

    jshint: {
      main: {
        options: {
          undef: true,
          reporterOutput: 'report.dat'
        },
        src: ['src/main.js']
      }
    }

Preventing task failure on JSHint errors

The default behavior for the jshint task is to exit the running Grunt process once a JSHint error is encountered in any of the targeted files. This behavior becomes especially undesirable if you'd like to keep watching files for changes, even when an error has been raised.

In the following example, we indicate that we'd like to keep the process running when errors are encountered by giving the force option a true value:

    jshint: {
      main: {
        options: {
          undef: true,
          force: true
        },
        src: ['src/main.js']
      }
    }

Uglifying JavaScript Code

In this recipe, we'll make use of the contrib-uglify (0.8.0) plugin to compress and mangle some files containing JavaScript code.

For the most part, the process of uglifying just removes all the unnecessary characters and shortens variable names in a source code file. This has the potential to dramatically reduce the size of the file, slightly increase performance, and make the inner workings of your publicly available code a little more obscure.

Getting ready

In this example, we'll work with the basic project structure.

How to do it...
The following steps take us through creating a sample JavaScript file and configuring a task that will uglify it.

We'll start by installing the package that contains the contrib-uglify plugin.

Then, we can create a sample JavaScript file called main.js in the src directory, which we'd like to uglify, and provide it with the following contents:

    var main = function () {
      var one = 'Hello' + ' ';
      var two = 'World';
      var result = one + two;
      console.log(result);
    };

With our sample file ready, we can now add the following uglify task to our configuration, indicating the sample file as the target and providing a destination output file:

    uglify: {
      main: {
        src: 'src/main.js',
        dest: 'dist/main.js'
      }
    }

We can now run the task using the grunt uglify command, which should produce output similar to the following:

    Running "uglify:main" (uglify) task
    >> 1 file created.

If we now take a look at the resulting dist/main.js file, we should see that it contains the uglified contents of the original src/main.js file.

There's more...

The uglify task provides us with several options that allow us to change its general behavior and see how it uglifies the targeted code. We'll look at specifying standard UglifyJS options, generating source maps, and wrapping generated code in an enclosure.

Specifying standard UglifyJS options

The underlying UglifyJS tool can provide a set of options for each of its separate functional parts. These parts are the mangler, compressor, and beautifier. The contrib-uglify plugin allows passing options to each of these parts using the mangle, compress, and beautify options.
The available options for each of the mangler, compressor, and beautifier parts can be found at the following URLs (listed in the order mentioned):

    https://github.com/mishoo/UglifyJS2#mangler-options
    https://github.com/mishoo/UglifyJS2#compressor-options
    https://github.com/mishoo/UglifyJS2#beautifier-options

The following example alters the configuration of the main recipe to provide a single option to each of these parts:

    uglify: {
      main: {
        src: 'src/main.js',
        dest: 'dist/main.js',
        options: {
          mangle: {
            toplevel: true
          },
          compress: {
            evaluate: false
          },
          beautify: {
            semicolons: false
          }
        }
      }
    }

Generating source maps

As code gets mangled and compressed, it becomes effectively unreadable to humans, and therefore, nearly impossible to debug. For this reason, we are provided with the option of generating a source map when uglifying our code.

The following example makes use of the sourceMap option to indicate that we'd like to have a source map generated along with our uglified code:

    uglify: {
      main: {
        src: 'src/main.js',
        dest: 'dist/main.js',
        options: {
          sourceMap: true
        }
      }
    }

Running the altered task will now, in addition to the dist/main.js file with our uglified source, generate a source map file called main.js.map in the same directory as the uglified file.

Wrapping generated code in an enclosure

When building your own JavaScript code modules, it's usually a good idea to have them wrapped in a wrapper function to ensure that you don't pollute the global scope with variables that you won't be using outside of the module itself.
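To illustrate the idea of a wrapper function, here is a minimal hand-written sketch. It is not the code that the uglify task generates; the myModule and counter names are made up for this example.

```javascript
// A wrapper function (an immediately invoked function expression) keeps
// `counter` private; only the returned object is visible to outside code.
// `myModule` and `counter` are hypothetical names used for illustration.
var myModule = (function () {
  var counter = 0; // not reachable from outside the wrapper

  return {
    next: function () {
      counter += 1;
      return counter;
    }
  };
}());

console.log(myModule.next());
console.log(typeof counter); // the private variable is not defined out here
```

Only the single myModule name is exposed to the surrounding scope, while everything declared inside the wrapper stays local to it.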
For this purpose, we can use the wrap option to indicate that we'd like to have the resulting uglified code wrapped in a wrapper function, as shown in the following example:

    uglify: {
      main: {
        src: 'src/main.js',
        dest: 'dist/main.js',
        options: {
          wrap: true
        }
      }
    }

If we now take a look at the resulting dist/main.js file, we should see that all the uglified contents of the original file are now contained within a wrapper function.

Setting up RequireJS

In this recipe, we'll make use of the contrib-requirejs (0.4.4) plugin to package the modularized source code of our web application into a single file.

For the most part, this plugin just provides a wrapper for the RequireJS tool. RequireJS provides a framework to modularize JavaScript source code and consume those modules in an orderly fashion. It also allows packaging an entire application into one file and importing only the modules that are required while keeping the module structure intact.

Getting ready

In this example, we'll work with the basic project structure.

How to do it...

The following steps take us through creating some files for a sample application and setting up a task that bundles them into one file.

We'll start by installing the package that contains the contrib-requirejs plugin.

First, we'll need a file that will contain our RequireJS configuration. Let's create a file called config.js in the src directory and add the following content in it:

    require.config({
      baseUrl: 'app'
    });

Secondly, we'll create a sample module that we'd like to use in our application. Let's create a file called sample.js in the src/app directory and add the following content in it:

    define(function (require) {
      return function () {
        console.log('Sample Module');
      }
    });

Lastly, we'll need a file that will contain the main entry point for our application, and also makes use of our sample module.
Let's create a file called main.js in the src/app directory and add the following content in it:

    require(['sample'], function (sample) {
      sample();
    });

Now that we've got all the necessary files required for our sample application, we can set up a requirejs task that will bundle it all into one file:

    requirejs: {
      app: {
        options: {
          mainConfigFile: 'src/config.js',
          name: 'main',
          out: 'www/js/app.js'
        }
      }
    }

The mainConfigFile option points to the configuration file that will determine the behavior of RequireJS.

The name option indicates the name of the module that contains the application entry point. In the case of this example, our application entry point is contained in the app/main.js file, and app is the base directory of our application in the src/config.js file. This translates the app/main.js filename into the main module name.

The out option is used to indicate the file that should receive the result of the bundled application.

We can now run the task using the grunt requirejs command, which should produce output similar to the following:

    Running "requirejs:app" (requirejs) task

We should now have a file named app.js in the www/js directory that contains our entire sample application.

There's more...

The requirejs task provides us with all the underlying options provided by the RequireJS tool. We'll look at how to use these exposed options and generate a source map.

Using RequireJS optimizer options

The RequireJS optimizer is quite an intricate tool, and therefore, provides a large number of options to tweak its behavior. The contrib-requirejs plugin allows us to easily set any of these options by just specifying them as options of the plugin itself.
A list of all the available configuration options for the RequireJS build system can be found in the example configuration file at the following URL: https://github.com/jrburke/r.js/blob/master/build/example.build.js

The following example indicates that the UglifyJS2 optimizer should be used instead of the default UglifyJS optimizer by using the optimize option:

    requirejs: {
      app: {
        options: {
          mainConfigFile: 'src/config.js',
          name: 'main',
          out: 'www/js/app.js',
          optimize: 'uglify2'
        }
      }
    }

Generating a source map

When the source code is bundled into one file, it becomes somewhat harder to debug, as you now have to trawl through miles of code to get to the point you're actually interested in.

A source map can help us with this issue by relating the resulting bundled file to the modularized structure it is derived from. Simply put, with a source map, our debugger will display the separate files we had before, even though we're actually using the bundled file.

The following example makes use of the generateSourceMaps option to indicate that we'd like to generate a source map along with the resulting file:

    requirejs: {
      app: {
        options: {
          mainConfigFile: 'src/config.js',
          name: 'main',
          out: 'www/js/app.js',
          optimize: 'uglify2',
          preserveLicenseComments: false,
          generateSourceMaps: true
        }
      }
    }

In order to use the generateSourceMaps option, we have to indicate that UglifyJS2 is to be used for optimization, by setting the optimize option to uglify2, and that license comments should not be preserved, by setting the preserveLicenseComments option to false.

Summary

This article covers the optimization of images, minifying of CSS, ensuring the quality of our JavaScript code, compressing it, and packaging it all together into one source file.

Packt
08 Jul 2015
10 min read

Developing a JavaFX Application for iOS

In this article by Mohamed Taman, author of the book JavaFX Essentials, we will learn how to develop a JavaFX application for iOS.

Apple has a great market share in the mobile and PC/laptop world, with many different devices, from mobile phones such as the iPhone to music players such as the iPod and tablets such as the iPad. It has a rapidly growing application market, called the App Store, serving its community, where the number of available apps increases daily. Mobile application developers should be ready for such a market.

Mobile application developers targeting both iOS and Android face many challenges. By just comparing the native development environments of these two platforms, you will find that they differ substantially. iOS development, according to Apple, is based on the Xcode IDE (https://developer.apple.com/xcode/) and its programming languages. Traditionally, it was Objective-C and, in June 2014, Apple introduced Swift (https://developer.apple.com/swift/); on the other hand, Android development, as defined by Google, is based on the IntelliJ IDEA IDE and the Java programming language.

Not many developers are proficient in both environments. In addition, these differences rule out any code reuse between the platforms. JavaFX 8 is filling the gap for reusable code between the platforms, as we will see in this article, by sharing the same application in both platforms.

Here are some skills that you will have gained by the end of this article:

  • Installing and configuring iOS environment tools and software
  • Creating iOS JavaFX 8 applications
  • Simulating and debugging JavaFX mobile applications
  • Packaging and deploying applications on iOS mobile devices

Using RoboVM to run JavaFX on iOS

RoboVM is the bridge from Java to Objective-C.
Using this, it becomes easy to develop JavaFX 8 applications that are to be run on iOS-based devices, as the ultimate goal of the RoboVM project is to solve this problem without compromising on developer experience or app user experience.

As we saw in the article about Android, using JavaFXPorts to generate APKs was a relatively easy task due to the fact that Android is based on Java and the Dalvik VM. In contrast, iOS doesn't have a VM for Java, and it doesn't allow dynamic loading of native libraries. Another approach is required. The RoboVM open source project tries to close the gap for Java developers by creating a bridge between Java and Objective-C using an ahead-of-time compiler that translates Java bytecode into native ARM or x86 machine code.

Features

Let's go through the RoboVM features:

  • Brings Java and other JVM languages, such as Scala, Clojure, and Groovy, to iOS-based devices
  • Translates Java bytecode into machine code ahead of time for fast execution directly on the CPU without any overhead
  • The main target is iOS and the ARM processor (32- and 64-bit), but there is also support for Mac OS X and Linux running on x86 CPUs (both 32- and 64-bit)
  • Does not impose any restrictions on the Java platform features accessible to the developer, such as reflection or file I/O
  • Supports standard JAR files that let the developer reuse the vast ecosystem of third-party Java libraries
  • Provides access to the full native iOS APIs through a Java-to-Objective-C bridge, enabling the development of apps with truly native UIs and with full hardware access
  • Integrates with the most popular tools such as NetBeans, Eclipse, IntelliJ IDEA, Maven, and Gradle
  • App Store ready, with hundreds of apps already in the store

Limitations

Mainly due to the restrictions of the iOS platform, there are a few limitations when using RoboVM:

  • Loading custom bytecode at runtime is not supported. All class files comprising the app have to be available at compile time on the developer machine.
  • The Java Native Interface technology as used on the desktop or on servers usually loads native code from dynamic libraries, but Apple does not permit custom dynamic libraries to be shipped with an iOS app. RoboVM supports a variant of JNI based on static libraries.

Another big limitation is that RoboVM is an alpha-state project under development and not yet recommended for production usage.

RoboVM has full support for reflection.

How it works

Since February 2015 there has been an agreement between the companies behind RoboVM and JavaFXPorts, and now a single plugin called jfxmobile-plugin allows us to build applications for three platforms (desktop, Android, and iOS) from the same codebase.

The JavaFXMobile plugin adds a number of tasks to your Java application that allow you to create .ipa packages that can be submitted to the App Store.

Android mostly uses Java as the main development language, so it is easy to merge your JavaFX 8 code with it. On iOS, the situation is internally totally different, but the Gradle commands are similar.

The plugin will download and install the RoboVM compiler, and it will use RoboVM compiler commands to create an iOS application in build/javafxports/ios.

Getting started

In this section, you will learn how to install the RoboVM compiler using the JavaFXMobile plugin, and make sure the tool chain works correctly by reusing the same application, Phone Dial version 1.0.

Prerequisites

In order to use the RoboVM compiler to build iOS apps, the following tools are required:

  • Gradle 2.4 or higher is required to build applications with the jfxmobile plugin.
  • A Mac running Mac OS X 10.9 or later.
  • Xcode 6.x from the Mac App Store (https://itunes.apple.com/us/app/xcode/id497799835?mt=12).

The first time you install Xcode, and every time you update to a new version, you have to open it once to agree to the Xcode terms.
Preparing a project for iOS

We will reuse the project we developed before, for the Android platform, since there is no difference in code, project structure, or Gradle build script when targeting iOS.

They share the same properties and features, but with different Gradle commands that serve iOS development, and a minor change in the Gradle build script for the RoboVM compiler.

Therefore, we will see the power of Write Once, Run Anywhere (WORA) with the same application.

Project structure

Based on the same project structure as the Android project, the project structure for our iOS app should be as shown in the following figure:

The application

We are going to reuse the same application from the Phone DialPad version 2.0 JavaFX 8 application:

As you can see, reusing the same codebase is a very powerful and useful feature, especially when you are developing to target many mobile platforms such as iOS and Android at the same time.

Interoperability with low-level iOS APIs

To have the same functionality of natively calling the default iOS phone dialer from our application as we did with Android, we have to provide the native solution for iOS with the following IosPlatform implementation:

    import org.robovm.apple.foundation.NSURL;
    import org.robovm.apple.uikit.UIApplication;
    import packt.taman.jfx8.ch4.Platform;

    public class IosPlatform implements Platform {

      @Override
      public void callNumber(String number) {
        if (!number.equals("")) {
          NSURL nsURL = new NSURL("telprompt://" + number);
          UIApplication.getSharedApplication().openURL(nsURL);
        }
      }
    }

Gradle build files

We will use the same Gradle build script file, but with a minor change: adding the following lines to the end of the script:

    jfxmobile {
      ios {
        forceLinkClasses = [ 'packt.taman.jfx8.ch4.**.*' ]
      }
      android {
        manifest = 'lib/android/AndroidManifest.xml'
      }
    }

All the work involved in installing and using the RoboVM compiler is done by the jfxmobile plugin.
The purpose of those lines is to give the RoboVM compiler the location of the main application class that has to be loaded at runtime, as it is not visible by default to the compiler.

The forceLinkClasses property ensures that those classes are linked in during RoboVM compilation.

Building the application

After we have added the necessary configuration to the build script for iOS, it's time to build the application in order to deploy it to different iOS target devices. To do so, we have to run the following command:

    $ gradle build

We should have the following output:

    BUILD SUCCESSFUL

    Total time: 44.74 secs

We have built our application successfully; next, we need to generate the .ipa and, in the case of production, you have to test it by deploying it to as many iOS versions as you can.

Generating the iOS .ipa package file

In order to generate the final .ipa iOS package for our JavaFX 8 application, which is necessary for the final distribution to any device or the App Store, you have to run the following gradle command:

    $ gradle ios

This will generate the .ipa file in the directory build/javafxports/ios.

Deploying the application

During development, we need to check our application GUI and final application prototype on iOS simulators and measure the application performance and functionality on different devices. These procedures are very useful, especially for testers.

Let's see how easy it is to run our application on either simulators or real devices.
Deploying to a simulator

On a simulator, you can simply run the following command to check if your application is running:

    $ gradle launchIPhoneSimulator

This command will package and launch the application in an iPhone simulator as shown in the following screenshot:

DialPad2 JavaFX 8 application running on the iOS 8.3/iPhone 4s simulator

This command will launch the application in an iPad simulator:

    $ gradle launchIPadSimulator

Deploying to an Apple device

In order to package a JavaFX 8 application and deploy it to an Apple device, simply run the following command:

    $ gradle launchIOSDevice

This command will launch the JavaFX 8 application on the device that is connected to your desktop/laptop.

Then, once the application is launched on your device, type in any number and then tap Call.

The iPhone will ask for permission to dial using the default mobile dialer; tap on OK. The default mobile dialer will be launched and will dial the number as shown in the following figure:

To be able to test and deploy your apps on your devices, you will need an active subscription with the Apple Developer Program. Visit the Apple Developer Portal, https://developer.apple.com/register/index.action, to sign up. You will also need to provision your device for development. You can find information on device provisioning in the Apple Developer Portal, or follow this guide: http://www.bignerdranch.com/we-teach/how-to-prepare/ios-device-provisioning/.

Summary

This article gave us a very good understanding of how JavaFX-based applications can be developed and customized using RoboVM for iOS to make it possible to run your applications on Apple platforms.

You learned about RoboVM features and limitations, and how it works; you also gained skills that you can use for developing.

You then learned how to install the required software and tools for iOS development and how to enable Xcode along with the RoboVM compiler, to package and install the Phone Dial JavaFX-8-based application on iOS simulators.
Finally, we provided tips on how to run and deploy your application on real devices.
Packt
08 Jul 2015
8 min read

Zabbix and I – Almost Heroes

In this article, Luciano Alves, author of the book Zabbix Performance Tuning, explains that ever since he started working with IT infrastructure, he has noticed that almost every company, when it starts thinking about a monitoring tool, thinks of trying to know in some way when a system or service will go down before it actually happens. Yet they expect the monitoring tool to create some kind of alert when something is broken. With this approach, the system administrator will know about an error or system outage only after the error occurs (and maybe, at the same time, users are trying to use those systems).

We need a monitoring solution to help us predict system outages and any other situation that can affect our services. Our approach with monitoring tools should cover not only our system monitoring but also our business monitoring. Nowadays, any company (small, medium, or large) has some dependency on technologies, from servers and network assets to IP equipment with a lower environmental impact. Maybe you need security cameras, thermometers, UPS, access control devices, or any other IP device from which you can gather some useful data. What about applications and services? What about data integration or transactions? What about user experience? What about a supplier website or system that you depend on? We should realize that monitoring things is not restricted to IT infrastructure, and it can be extended to other areas and business levels as well.

After starting Zabbix – the initial steps

Suppose you already have your Zabbix server up and running. In a few weeks, Zabbix has helped you save a lot of time while restoring systems. It has also helped you notice some hidden things in your environment, such as a flapping port in a network switch, or lack of CPU in a router. In a few months, Zabbix and you (of course) are like superstars. During lunch, people are talking about you.
Some are happy because you've dealt with a recurring error. Maybe a manager asks you to find a way to monitor a printer because it's very important to their team, another manager asks you to monitor an application, and so on. The other teams and areas also need some kind of monitoring. They have other things to monitor, not only IT things. But are these people familiar with technical things? Technical words, expressions, flows, and lines of thought are not easy for people with nontechnical backgrounds to understand.

Of course, in small and medium enterprises (SMEs), things will go ahead faster and paths will be shorter, but the scenario is not too different in most cases. You can work alone or in a huge team, but now you have another important partner: Zabbix.

An immutable fact is that monitoring more things brings more responsibility and a greater need for reliability. At this point, we have some new issues to solve:

How do we create and authenticate a user? When Zabbix's visibility starts growing in your environment, you will need to think about how to manage and handle these users. Do you have an LDAP or Microsoft Active Directory that you can use for centralized authentication? Of course, the more users you have, the more requests you will get. Will you permit any user to access the Zabbix interface? Only a few? And which ones?

Is it necessary to create a custom monitor? We know that Zabbix has a lot of built-in keys for gathering data. These keys are available for a good number of operating systems. We also have built-in functions to gather data using the Intelligent Platform Management Interface (IPMI), Simple Network Management Protocol (SNMP), Open Database Connectivity (ODBC), Java Management Extensions (JMX), user parameters in the Zabbix agent, and so on. However, we need to think about the wider scenario where we need to gather data from somewhere Zabbix hasn't reached yet.
Our experience shows us that most of the time, it is necessary to create custom monitors (not one, but a lot of them). Zabbix is a very flexible and easy-to-customize platform. It is possible to make Zabbix do almost anything you want. However, for every new function or monitor you add to Zabbix, you'll need to think about what kind of extension you'll use. More functions mean more data, more load, and more TCP connections! This means that when other teams or areas start paying attention to Zabbix, you will need to think about the number of new functions or monitors you will need. Then, which language should you choose to develop these new things? Maybe you know the C language and you are thinking of using Zabbix modules. Will you use bulk operations to reduce network traffic?

The natural growth

In most scenarios, natural growth will occur without control. I mean, people are not used to planning this growth. It is very important to keep it under control. When people start their Zabbix deployment, they probably do not intend to cater to all company teams, areas, or businesses. They think about their own needs and their own team only. So, they don't think a lot about user rights, mainly because they are technicians and know mostly about hosts, items, triggers, maps, graphs, screens, and so on.

What about users who are not technicians? Will they understand the Zabbix interface easily? Do you know that in Zabbix, we have a lot of paths that reach the same point? The Zabbix interface isn't object-based, which means that users need a lot of clicks to reach (read or write) the information related to an object (hosts, items, graphs, triggers, events, and so on).

If you need to see the most recent data gathered from a specific item, you'll need to use the Monitoring menu, then use the Latest data menu, choose the group that the host belongs to, choose your host, and finally search for your item in the table.
If you need to see a specific custom graph, use the Graphs menu, which is under Monitoring. Choose the group that the host belongs to, choose your host, and then search for your graph in a combobox.

If you need to know about an active trigger on your host, you'll need to use the Triggers menu, which is under Monitoring. Choose the group that your host belongs to and choose your host. Then, you can see the triggers from that specific host.

If you want to include a new item in an existing custom graph, you'll need to access the Hosts menu, which is under Configuration. Choose the group that the host belongs to, search for your host, and click on the Graphs link. Then you can choose which graph you want to change.

A lot of clicks are required to do simple things. Of course, the steps you just saw are familiar to those who have deployed Zabbix, but is this true for other teams too? Maybe you are thinking right now that it doesn't matter to them. But actually, it matters, and it's directly related to Zabbix's growth in your environment.

Okay, I think the next two questions will be: are you sure it matters? And why? Let's agree that the current Zabbix interface isn't very user friendly for nontechnical people. But following the path of natural growth, you have started gathering data from a lot of things that are not just IT related. Also, you can develop custom charts and extract any data from Zabbix via API functions. Now you'll have a lot of nontechnical people trying to use Zabbix data. I'm sure that it will be necessary to create some maps and screens to help these users get the required information quickly and smoothly. The following screenshots show how we can transform the viewing layer of Zabbix into something more attractive:

Tactical dashboard

Here is what a strategic dashboard may look like:

Strategic dashboard

The point here is whether your Zabbix deployment is prepared to cater to these types of requirements.
Summary

We've noted how Zabbix has evolved with each version in terms of performance, and why it is important to be aware of its new features. Another significant point is that the importance of Zabbix grows as other teams and areas of the company become aware of the potential of this tool. This movement will take Zabbix to all the corners of a company, which often requires a more open approach as far as monitoring tasks are concerned. Monitoring only servers and network assets will not suffice.

Resources for Article:

Further resources on this subject:
Going beyond Zabbix agents [article]
Understanding Self-tuning Thresholds [article]
Query Performance Tuning [article]

Packt
08 Jul 2015
41 min read

Integrating Google Play Services

In this article by Raul Portales, author of the book Mastering Android Game Development, we will cover the tools that Google Play Services offers for game developers. We'll see the integration of achievements and leaderboards in detail and take an overview of events and quests, saved games, and turn-based and real-time multiplayer.

Google provides Google Play Services as a way to use special features in apps, the game services subset being the one that interests us the most. Note that Google Play Services is updated as an app that is independent of the operating system. This allows us to assume that most players will have the latest version of Google Play Services installed. More and more features are being moved from the Android SDK to Play Services because of this.

Play Services offers much more than just services for games, but there is a whole section dedicated exclusively to games: Google Play Game Services (GPGS). These features include achievements, leaderboards, quests, saved games, gifts, and even multiplayer support.

GPGS also comes with a standalone app called "Play Games" that shows users the games they have been playing, their latest achievements, and the games their friends play. It is a very interesting way to get exposure for your game.

Even as standalone features, achievements and leaderboards are two concepts that most games use nowadays, so why make your own custom ones when you can rely on the ones made by Google?

GPGS can be used on many platforms: Android, iOS, and the web, among others. It is most widely used on Android, since it is included as a part of Google apps.

There is extensive step-by-step documentation online, but the details are scattered over different places. We will put them together here and link you to the official documentation for more detailed information.
For this article, you are expected to have a developer account and access to the Google Play Developer Console. It is also advisable to know the process of signing and releasing an app. If you are not familiar with it, there is very detailed official documentation at http://developer.android.com/distribute/googleplay/start.html.

There are two sides of GPGS: the developer console and the code. We will alternate between the two while talking about the different features.

Setting up the developer console

Now that we are approaching the release state, we have to start working with the developer console. The first thing we need to do is go to the Game services section of the console to create and configure a new game. In the left menu, there is an option labeled Game services. This is where you have to click. Once in the Game services section, click on Add new game.

This brings us to the setup dialog. If you are using other Google services such as Google Maps or Google Cloud Messaging (GCM) in your game, you should select the second option and move forward. Otherwise, you can just fill in the fields for I don't use any Google APIs on my game yet and continue. If you don't know whether you are already using them, you probably aren't.

Now, it is time to link a game to it. I recommend you publish your game beforehand as an alpha release. This will let you select it from the list when you start typing the package name.

Publishing the game to the alpha channel before adding it to Game services makes it much easier to configure.

If you are not familiar with signing and releasing your app, check out the official documentation at http://developer.android.com/tools/publishing/app-signing.html.

Finally, there are only two steps that we have to take when we link the first app. We need to authorize it and provide branding information.
The authorization will generate an OAuth key (which we don't need to use, since it is only required for other platforms) and also a game ID. This ID is shared by all the linked apps and we will need it to log in. There is no need to write it down now, though; it can be found easily in the console at any time.

Authorizing the app will generate the game ID, which is shared by all linked apps.

Note that the app we have added is configured with the release key. If you continue and try the login integration, you will get an error telling you that the app was signed with the wrong certificate.

You have two ways to work with this limitation:

Always make a release build to test GPGS integration
Add your debug-signed game as a linked app

I recommend that you add the debug-signed app as a linked app. To do this, we just need to link another app and configure it with the SHA1 fingerprint of the debug key. To obtain it, we have to open a terminal and run the keytool utility:

    keytool -exportcert -alias androiddebugkey -keystore <path-to-debug-keystore> -list -v

Note that on Windows, the debug keystore can be found at C:\Users\<USERNAME>\.android\debug.keystore. On Mac and Linux, the debug keystore is typically located at ~/.android/debug.keystore.

Dialog to link the debug application on the Game Services console

Now, we have the game configured. We could continue creating achievements and leaderboards in the console, but we will put that aside and first make sure that we can sign in and connect with GPGS.

The only users who can sign in to GPGS while a game is not published are the testers. You can make the alpha and/or beta testers of a linked app become testers of the game services, and you can also add e-mail addresses by hand. You can modify this in the Testing tab.

Only test accounts can access a game that is not published.

The e-mail of the owner of the developer console is prefilled as a tester. If you have problems logging in, double-check the list of testers.
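Going back to the debug key for a moment: the SHA1 value that keytool prints is the SHA-1 digest of the encoded certificate. As a rough illustration only, the same fingerprint could be computed in plain Java (version 9 or later). The class and method names below are invented for this sketch, and the keystore password android mentioned in the usage note is the customary default for debug keystores rather than something stated in this article.

```java
import java.io.File;
import java.security.KeyStore;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.cert.Certificate;

// Hypothetical helper, not part of the Android tooling.
final class DebugFingerprint {
    private DebugFingerprint() { }

    // Formats bytes as colon-separated upper-case hex, the layout keytool prints.
    static String toColonHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(':');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    // SHA-1 digest of arbitrary bytes, formatted as a fingerprint.
    static String sha1Fingerprint(byte[] data) {
        try {
            return toColonHex(MessageDigest.getInstance("SHA-1").digest(data));
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Loads the keystore and fingerprints the certificate stored under the alias.
    static String fingerprintOf(File keystore, String alias, char[] password) throws Exception {
        // KeyStore.getInstance(File, char[]) exists since Java 9 and detects the store type.
        KeyStore ks = KeyStore.getInstance(keystore, password);
        Certificate cert = ks.getCertificate(alias);
        if (cert == null) {
            throw new IllegalArgumentException("No certificate under alias " + alias);
        }
        return sha1Fingerprint(cert.getEncoded());
    }
}
```

For the debug key, this might be called as DebugFingerprint.fingerprintOf(new File(System.getProperty("user.home"), ".android/debug.keystore"), "androiddebugkey", "android".toCharArray()). In practice, running keytool as shown earlier is simpler; the sketch only shows what the tool computes.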
A game service that is not published will not appear in the feed of the Play Services app, but it will be possible to test and modify it. This is why it is a good idea to keep it in draft mode until the game itself is ready and publish both the game and the game services at the same time.

Setting up the code

The first thing we need to do is to add the Google Play Services library to our project. This should already have been done by the wizard when we created the project, but I recommend you double-check it now.

The library needs to be added to the build.gradle file of the main module. Note that Android Studio projects contain a top-level build.gradle and a module-level build.gradle for each module. We will modify the one that is under the mobile module. Make sure that the Play Services library is listed under dependencies:

    apply plugin: 'com.android.application'

    dependencies {
        compile 'com.android.support:appcompat-v7:22.1.1'
        compile 'com.google.android.gms:play-services:7.3.0'
    }

At the time of writing, the latest version is 7.3.0. The basic features have not changed much and they are unlikely to change. You could force Gradle to use a specific version of the library, but in general I recommend you use the latest version. Once you have it, save the changes and click on Sync Project with Gradle Files.

To be able to connect with GPGS, we need to let the game know what the game ID is. This is done through the <meta-data> tag in AndroidManifest.xml. You could hardcode the value here, but it is highly recommended that you set it as a resource in your Android project. We are going to create a new file for this under res/values, which we will name play_services.xml. In this file we will put the game ID, and later we will also add the achievement and leaderboard IDs to it.
Using a separate file for these values is recommended because they are constants that do not need to be translated:

    <application>
        <meta-data android:name="com.google.android.gms.games.APP_ID"
            android:value="@string/app_id" />
        <meta-data android:name="com.google.android.gms.version"
            android:value="@integer/google_play_services_version" />
        [...]
    </application>

Adding this metadata is extremely important. If you forget to update AndroidManifest.xml, the app will crash when you try to sign in to Google Play services. Note that the integer for the gms version is defined in the library and we do not need to add it to our file.

If you forget to add the game ID to the strings, the app will crash.

Now, it is time to proceed to sign in. The process is quite tedious and requires many checks, so Google has released an open source project named BaseGameUtils, which makes it easier. Unfortunately, this project is not a part of the Play Services library and it is not even available as a library. So, we have to get it from GitHub (either check it out or download the source as a ZIP file).

BaseGameUtils abstracts away the complexity of handling the connection with Play Services.

Even more cumbersome, BaseGameUtils is not available as a standalone download and has to be downloaded together with another project. The fact that this significant piece of code is not a part of the official library makes it quite tedious to set up. Why it has been done like this is something that I do not comprehend myself.

The project that contains BaseGameUtils is called android-basic-samples and it can be downloaded from https://github.com/playgameservices/android-basic-samples.

Adding BaseGameUtils is not as straightforward as we would like it to be.

Once android-basic-samples is downloaded, open your game project in Android Studio. Click on File > Import Module and navigate to the directory where you downloaded android-basic-samples.
Select the BaseGameUtils module in the BasicSamples/libraries directory and click on OK. Finally, update the dependencies in the build.gradle file for the mobile module and sync Gradle again:

    dependencies {
        compile project(':BaseGameUtils')
        [...]
    }

After all these steps to set up the project, we are finally ready to begin the sign in.

We will make our main Activity extend from BaseGameActivity, which takes care of all the handling of the connections and the sign in with Google Play Services.

One more detail: until now, we were using Activity and not FragmentActivity as the base class for YassActivity (BaseGameActivity extends from FragmentActivity), and this change will mess with the behavior of our dialogs while calling navigateBack. We can change the base class of BaseGameActivity or modify navigateBack to perform a pop on the fragment navigation hierarchy. I recommend the second approach:

    public void navigateBack() {
        // Do a pop on the navigation history
        getFragmentManager().popBackStack();
    }

This util class has been designed to work with single-activity games. It can be used in multiple activities, but it is not straightforward. This is another good reason to keep the game in a single activity.

BaseGameUtils is designed to be used in single-activity games.

The default behavior of BaseGameActivity is to try to log in each time the Activity is started. If the user agrees to sign in, the sign in will happen automatically. But if the user declines, he or she will be asked again several times.

I personally find this intrusive and annoying, and I recommend that you only prompt to log in to Google Play services once (and again, if the user logs out). We can always provide a login entry point in the app.

This is very easy to change. The default number of attempts is set to 3 and it is a part of the code of GameHelper:

    // Should we start the flow to sign the user in automatically on startup? If
    // so, up to how many times in the life of the application?
    static final int DEFAULT_MAX_SIGN_IN_ATTEMPTS = 3;
    int mMaxAutoSignInAttempts = DEFAULT_MAX_SIGN_IN_ATTEMPTS;

So, we just have to configure it for our activity by adding one line of code in onCreate that replaces the default behavior with the one we want, which is to try just once:

    getGameHelper().setMaxAutoSignInAttempts(1);

Finally, there are two methods that we can override to act when the user successfully logs in and when there is a problem: onSignInSucceeded and onSignInFailed. We will use them when we update the main menu at the end of the article.

Further use of GPGS is made via the GameHelper and/or the GoogleApiClient, which is a part of the GameHelper. We can obtain a reference to the GameHelper using the getGameHelper method of BaseGameActivity.

Now that the user can sign in to Google Play services, we can continue with achievements and leaderboards. Let's go back to the developer console.

Achievements

We will first define a few achievements in the developer console and then see how to unlock them in the game. Note that to publish any game with GPGS, you need to define at least five achievements. No other feature is mandatory, but achievements are.

We need to define at least five achievements to publish a game with Google Play Game services.

If you want to use GPGS with a game that has no achievements, I recommend that you add five dummy secret achievements and let them be.

To add an achievement, we just need to navigate to the Achievements tab on the left and click on Add achievement.

The menu to add a new achievement has a few fields that are mostly self-explanatory. They are as follows:

Name: the name that will be shown (can be localized to different languages).
Description: the description of the achievement to be shown (can also be localized to different languages).
Icon: the icon of the achievement as a 512x512 px PNG image.
This will be used to show the achievement in the list and also to generate the locked image and the in-game popup when it is unlocked.
Incremental achievements: if the achievement requires a set of steps to be completed, it is called an incremental achievement and can be shown with a progress bar. We will have an incremental achievement to illustrate this.
Initial state: Revealed/Hidden, depending on whether we want the achievement to be shown or not. When an achievement is shown, the name and description are visible and players know what they have to do to unlock it. A hidden achievement, on the other hand, is a secret and can be a funny surprise when unlocked. We will have two secret achievements.
Points: GPGS allows each game to have 1,000 points to give for unlocking achievements. This gets converted to XP in the player profile on Google Play Games. This can be used to highlight that some achievements are harder than others, and therefore grant a bigger reward. You cannot change these once they are published, so if you plan to have more achievements in the future, plan ahead with the points.
List order: the order in which the achievements are shown. It is not followed all the time, since in the Play Games app the unlocked ones are shown before the locked ones. It is still handy to rearrange them.

Dialog to add an achievement on the developer console

As we already decided, we will have five achievements in our game and they will be as follows:

Big Score: score over 100,000 points in one game. This is to be granted while playing.
Asteroid killer: destroy 100 asteroids. This will count them across different games and is an incremental achievement.
Survivor: survive for 60 seconds.
Target acquired: a hidden achievement. Hit 20 asteroids in a row without missing a hit. This is meant to reward players that only shoot when they should.
Target lost: this is supposed to be a funny achievement, granted when you miss with 10 bullets in a row.
It is also hidden, because otherwise it would be too easy to unlock.

So, we created some images for them and added them to the console.

The developer console with all the configured achievements

Each achievement has a string ID. We will need these IDs to unlock the achievements in our game, but Google has made it easy for us. There is a link at the bottom named Get resources that pops up a dialog with the string resources we need. We can just copy them from there and paste them into the play_services.xml file we have already created in our project.

Architecture

For our game, given that we only have five achievements, we are going to add the code for achievements directly into the ScoreGameObject. This will make for less code to read, so we can focus on how it is done. However, for real production code I recommend that you define a dedicated architecture for achievements.

The recommended architecture is to have an AchievementsManager class that loads all the achievements when the game starts and stores them in three lists:

All achievements
Locked achievements
Unlocked achievements

Then, we have an Achievement base class with an abstract check method that we implement for each one of them:

    public abstract boolean check(GameEngine gameEngine, GameEvent gameEvent);

This base class takes care of loading the achievement state from local storage (I recommend using SharedPreferences for this) and modifies it as per the result of check.

The achievements check is done at the AchievementsManager level using a checkLockedAchievements method that iterates over the list of achievements that can be unlocked. This method should be called as a part of onEventReceived of GameEngine.

This architecture allows you to check only the achievements that are yet to be unlocked, and it keeps all the achievements included in the game in one dedicated place.

In our case, since we are keeping the score inside the ScoreGameObject, we are going to add all the achievements code there.
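The dedicated architecture described above could be sketched in plain Java roughly as follows. This is only an illustration under stated assumptions: GameEvent, GameState, and ScoreAchievement are hypothetical stand-ins (the real check method would receive the game's own GameEngine), and loading and saving the state with SharedPreferences is left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for the game's own types (assumptions, not the book's classes).
enum GameEvent { ASTEROID_HIT, BULLET_MISSED, SPACESHIP_HIT }

class GameState {
    int points;
}

// Base class: each concrete achievement implements its own check.
abstract class Achievement {
    final String id;
    private boolean unlocked;

    Achievement(String id) {
        this.id = id;
    }

    boolean isUnlocked() {
        return unlocked;
    }

    void markUnlocked() {
        unlocked = true;
    }

    // Returns true when the achievement should be unlocked for this event.
    abstract boolean check(GameState state, GameEvent event);
}

// Example achievement: unlocked once the score passes a threshold.
class ScoreAchievement extends Achievement {
    private final int threshold;

    ScoreAchievement(String id, int threshold) {
        super(id);
        this.threshold = threshold;
    }

    @Override
    boolean check(GameState state, GameEvent event) {
        return state.points > threshold;
    }
}

// Keeps the three lists and only re-checks achievements that are still locked.
class AchievementsManager {
    final List<Achievement> all = new ArrayList<>();
    final List<Achievement> locked = new ArrayList<>();
    final List<Achievement> unlocked = new ArrayList<>();

    void add(Achievement achievement) {
        all.add(achievement);
        (achievement.isUnlocked() ? unlocked : locked).add(achievement);
    }

    // Intended to be called from the engine's event handler.
    void checkLockedAchievements(GameState state, GameEvent event) {
        Iterator<Achievement> it = locked.iterator();
        while (it.hasNext()) {
            Achievement achievement = it.next();
            if (achievement.check(state, event)) {
                achievement.markUnlocked();
                it.remove(); // safe removal while iterating
                unlocked.add(achievement);
            }
        }
    }
}
```

Because unlocked achievements leave the locked list, each event only costs one check per achievement that is still pending, which is the main benefit this design is meant to provide.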
Note that making the GameEngine take care of the score and exposing it as a variable that other objects can read is also a recommended design pattern, but it was simpler to do this as a part of ScoreGameObject.

Unlocking achievements

To handle achievements, we need to have access to an object of the class GoogleApiClient. We can get a reference to it in the constructor of ScoreGameObject:

    private final GoogleApiClient mApiClient;

    public ScoreGameObject(YassBaseFragment parent, View view, int viewResId) {
        […]
        mApiClient = parent.getYassActivity().getGameHelper().getApiClient();
    }

The parent Fragment has a reference to the Activity, which has a reference to the GameHelper, which has a reference to the GoogleApiClient.

Unlocking an achievement requires just a single line of code, but we also need to check whether the user is connected to Google Play services before trying to unlock an achievement. This is necessary because if the user has not signed in, an exception is thrown and the game crashes.

Unlocking an achievement requires just a single line of code.

But this check is not enough. In the edge case where the user logs out manually from Google Play services (which can be done in the achievements screen), the connection will not be closed and there is no way to know whether he or she has logged out.

We are going to create a utility method to unlock the achievements that does all the checks, wraps the unlock call in a try/catch block, and makes the API client disconnect if an exception is raised:

    private void unlockSafe(int resId) {
        if (mApiClient.isConnecting() || mApiClient.isConnected()) {
            try {
                Games.Achievements.unlock(mApiClient, getString(resId));
            } catch (Exception e) {
                mApiClient.disconnect();
            }
        }
    }

Even with all the checks, the code is still very simple.

Let's work on the particular achievements we have defined for the game.
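The check-then-try pattern in unlockSafe is generic enough to be factored into one helper that any guarded call could reuse. The following is a minimal sketch and not part of BaseGameUtils: the Connection interface is a hypothetical stand-in for the three client methods used above (isConnecting, isConnected, disconnect), and SafeCalls and FakeConnection are invented names.

```java
// Hypothetical abstraction over the client calls used in unlockSafe;
// this is not a Play Services type.
interface Connection {
    boolean isConnecting();
    boolean isConnected();
    void disconnect();
}

final class SafeCalls {
    private SafeCalls() { }

    // Runs the action only if the client is connected or connecting.
    // If the action throws, the client is disconnected, mirroring unlockSafe.
    // Returns true only when the action ran without throwing.
    static boolean runIfConnected(Connection client, Runnable action) {
        if (!client.isConnecting() && !client.isConnected()) {
            return false;
        }
        try {
            action.run();
            return true;
        } catch (RuntimeException e) {
            // A Runnable can only throw unchecked exceptions.
            client.disconnect();
            return false;
        }
    }
}

// Simple in-memory implementation, used only to exercise the helper.
class FakeConnection implements Connection {
    private boolean connected;

    FakeConnection(boolean connected) {
        this.connected = connected;
    }

    @Override
    public boolean isConnecting() {
        return false;
    }

    @Override
    public boolean isConnected() {
        return connected;
    }

    @Override
    public void disconnect() {
        connected = false;
    }
}
```

With a small adapter from GoogleApiClient to Connection (also an assumption), the body of unlockSafe would reduce to passing the single unlock call as the action. Whether the extra indirection is worth it depends on how many guarded calls the game ends up making.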
Even though these achievements are very specific, the methodology of tracking game events and variables and then checking for achievements to unlock is in itself generic, and serves as a real-life example of how to deal with achievements.

The achievements we have designed require us to count some game events and also the running time.

For the last two achievements, we need to make a new GameEvent for the case when a bullet misses, which we have not created until now. The code in the Bullet object to trigger this new GameEvent is as follows:

    @Override
    public void onUpdate(long elapsedMillis, GameEngine gameEngine) {
        mY += mSpeedFactor * elapsedMillis;
        if (mY < -mHeight) {
            removeFromGameEngine(gameEngine);
            gameEngine.onGameEvent(GameEvent.BulletMissed);
        }
    }

Now, let's work inside ScoreGameObject. We are going to have a method that checks achievements each time an asteroid is hit. There are three achievements that can be unlocked when that event happens:

Big score, because hitting an asteroid gives us points
Target acquired, because it requires consecutive asteroid hits
Asteroid killer, because it counts the total number of asteroids that have been destroyed

The code is like this:

    private void checkAsteroidHitRelatedAchievements() {
        if (mPoints > 100000) {
            // Unlock achievement
            unlockSafe(R.string.achievement_big_score);
        }
        if (mConsecutiveHits >= 20) {
            unlockSafe(R.string.achievement_target_acquired);
        }
        // Increment achievement of asteroids hit
        if (mApiClient.isConnecting() || mApiClient.isConnected()) {
            try {
                Games.Achievements.increment(mApiClient,
                    getString(R.string.achievement_asteroid_killer), 1);
            } catch (Exception e) {
                mApiClient.disconnect();
            }
        }
    }

We check the total points and the number of consecutive hits to unlock the corresponding achievements.

The "Asteroid killer" achievement is a bit of a different case, because it is an incremental achievement. This type of achievement does not have an unlock method, but rather an increment method.
Each time we increment the value, progress on the achievement is updated. Once the progress is 100 percent, it is unlocked automatically.

Incremental achievements are unlocked automatically; we just have to increment their value.

This makes incremental achievements much easier to use than tracking the progress locally. But we still need to do all the checks as we did for unlockSafe.

We are using a variable named mConsecutiveHits, which we have not initialized yet. This is done inside onGameEvent, which is also the place where the other hidden achievement, "Target lost", is checked. Some initialization for the "Survivor" achievement is also done here:

    public void onGameEvent(GameEvent gameEvent) {
        if (gameEvent == GameEvent.AsteroidHit) {
            mPoints += POINTS_GAINED_PER_ASTEROID_HIT;
            mPointsHaveChanged = true;
            mConsecutiveMisses = 0;
            mConsecutiveHits++;
            checkAsteroidHitRelatedAchievements();
        } else if (gameEvent == GameEvent.BulletMissed) {
            mConsecutiveMisses++;
            mConsecutiveHits = 0;
            if (mConsecutiveMisses >= 20) {
                unlockSafe(R.string.achievement_target_lost);
            }
        } else if (gameEvent == GameEvent.SpaceshipHit) {
            mTimeWithoutDie = 0;
        }
        […]
    }

Each time we hit an asteroid, we increment the number of consecutive asteroid hits and reset the number of consecutive misses. Similarly, each time a bullet misses, we increment the number of consecutive misses and reset the number of consecutive hits.

As a side note, each time the spaceship is destroyed we reset the time without dying, which is used for "Survivor", but this is not the only time when the time without dying should be updated.
We have to reset it when the game starts, and update it inside onUpdate by adding the milliseconds that have elapsed:

    @Override
    public void startGame(GameEngine gameEngine) {
        mTimeWithoutDie = 0;
        […]
    }

    @Override
    public void onUpdate(long elapsedMillis, GameEngine gameEngine) {
        mTimeWithoutDie += elapsedMillis;
        if (mTimeWithoutDie > 60000) {
            unlockSafe(R.string.achievement_survivor);
        }
    }

So, once the game has been running for 60,000 milliseconds since it started or since a spaceship was destroyed, we unlock the "Survivor" achievement.

With this, we have all the code we need to unlock the achievements we have created for the game.

Let's finish this section with some comments on the system and the developer console:

As a rule of thumb, you can edit most of the details of an achievement until you publish it to production.
Once your achievement has been published, it cannot be deleted. You can only delete an achievement in its prepublished state. There is a button labeled Delete at the bottom of the achievement screen for this.
You can also reset the progress for achievements while they are in draft. This reset happens for all players at once. There is a button labeled Reset achievement progress at the bottom of the achievement screen for this.

Also note that BaseGameActivity does a lot of logging. So, if your device is connected to your computer and you run a debug build, you may see that it lags sometimes. This does not happen in a release build, from which the logging is removed.

Leaderboards

Since YASS has only one game mode and one score in the game, it makes sense to have only one leaderboard on Google Play Game Services. Leaderboards are managed from their own tab inside the Game services area of the developer console.

Unlike achievements, it is not mandatory to have any leaderboard to be able to publish your game.

If your game has different levels of difficulty, you can have a leaderboard for each of them.
This also applies if the game has several values that measure player progress: you can have a leaderboard for each of them.

Managing leaderboards on Play Games console

Leaderboards can be created and managed in the Leaderboards tab. When we click on Add leaderboard, we are presented with a form that has several fields to be filled. They are as follows:

Name: the display name of the leaderboard, which can be localized. We will simply call it High Scores.
Score formatting: this can be Numeric, Currency, or Time. We will use Numeric for YASS.
Icon: a 512x512 px icon to identify the leaderboard.
Ordering: Larger is better / Smaller is better. We are going to use Larger is better, but other score types may be Smaller is better, as in a racing game.
Enable tamper protection: this automatically filters out suspicious scores. You should keep this on.
Limits: if you want to limit the score range that is shown on the leaderboard, you can do it here. We are not going to use this.
List order: the order of the leaderboards. Since we only have one, it is not really important for us.

Setting up a leaderboard on the Play Games console

Now that we have defined the leaderboard, it is time to use it in the game. As with achievements, there is a link where we can get all the resources for the game in XML. So, we proceed to get the ID of the leaderboard and add it to the strings defined in the play_services.xml file.

We have to submit the scores at the end of the game (that is, on a GameOver event), but also when the user exits a game via the pause button. To unify this, we will create a new GameEvent called GameFinished that is triggered after a GameOver event and after the user exits the game.
We will update the stopGame method of GameEngine, which is called in both cases to trigger the event:

    public void stopGame() {
      if (mUpdateThread != null) {
        synchronized (mLayers) {
          onGameEvent(GameEvent.GameFinished);
        }
        mUpdateThread.stopGame();
        mUpdateThread = null;
      }
      […]
    }

We have to set mUpdateThread to null after sending the event, to prevent this code from being run twice. Otherwise, we could send each score more than once.

As happens for achievements, submitting a score is very simple, just a single line of code. But we also need to check that the GoogleApiClient is connected, and we still have the same edge case in which an Exception is thrown. So, we need to wrap it in a try/catch block.

To keep everything in the same place, we will put this code inside ScoreGameObject:

    @Override
    public void onGameEvent(GameEvent gameEvent) {
      […]
      else if (gameEvent == GameEvent.GameFinished) {
        // Submit the score
        if (mApiClient.isConnecting() || mApiClient.isConnected()) {
          try {
            Games.Leaderboards.submitScore(mApiClient,
              getLeaderboardId(), mPoints);
          }
          catch (Exception e) {
            mApiClient.disconnect();
          }
        }
      }
    }

    private String getLeaderboardId() {
      return mParent.getString(R.string.leaderboard_high_scores);
    }

This is really straightforward. GPGS is now receiving our scores, and it uses the timestamp of each score to create daily, weekly, and all-time leaderboards. It also uses your Google+ circles to show the social score of your friends. All this is done automatically for you.

The final missing piece is to let the player open the leaderboards and achievements UI from the main menu, as well as trigger a sign-in if they are signed out.

Opening the Play Games UI

To complete the integration of achievements and leaderboards, we are going to add buttons to our main menu that open the native UI provided by GPGS.
For this, we are going to place two buttons in the bottom-left corner of the screen, opposite the music and sound buttons. We will also check whether we are connected or not; if not, we will show a single sign-in button.

For these buttons we will use the official images of GPGS, which are available for developers to use. Note that you must follow the brand guidelines while using the icons: they must be displayed as they are and not modified. This also provides a consistent look and feel across all the games that support Play Games.

Since we have seen a lot of layouts already, we are not going to include another one that is almost the same as something we already have.

The main menu with the buttons to view achievements and leaderboards.

To handle these new buttons we will, as usual, set the MainMenuFragment as OnClickListener for the views. We do this in the same place as the other buttons, that is, inside onViewCreated:

    @Override
    public void onViewCreated(View view, Bundle savedInstanceState) {
      super.onViewCreated(view, savedInstanceState);
      [...]
      view.findViewById(
        R.id.btn_achievements).setOnClickListener(this);
      view.findViewById(
        R.id.btn_leaderboards).setOnClickListener(this);
      view.findViewById(R.id.btn_sign_in).setOnClickListener(this);
    }

As happened with achievements and leaderboards, the work is done using static methods that receive a GoogleApiClient object. We can get this object from the GameHelper that is a part of the BaseGameActivity, like this:

    GoogleApiClient apiClient =
      getYassActivity().getGameHelper().getApiClient();

To open the native UI, we have to obtain an Intent and then start an Activity with it. It is important that you use startActivityForResult, since some data is passed back and forth.

To open the achievements UI, the code is like this:

    Intent achievementsIntent =
      Games.Achievements.getAchievementsIntent(apiClient);
    startActivityForResult(achievementsIntent, REQUEST_ACHIEVEMENTS);

This works out of the box.
It automatically grays out the icons of the achievements that are still locked, adds a counter and progress bar to the ones that are in progress, and a padlock to the hidden ones.

Similarly, to open the leaderboards UI we obtain an intent from the Games.Leaderboards class instead:

    Intent leaderboardsIntent = Games.Leaderboards.getLeaderboardIntent(
      apiClient, getString(R.string.leaderboard_high_scores));
    startActivityForResult(leaderboardsIntent, REQUEST_LEADERBOARDS);

In this case, we are asking for a specific leaderboard, since we only have one. We could use getAllLeaderboardsIntent instead, which will open the Play Games UI for the list of all the leaderboards.

We can have an intent to open the list of leaderboards or a specific one.

What remains to be done is to replace the buttons with the sign-in one when the user is not connected. For this, we will create a method that reads the state and shows and hides the views accordingly:

    private void updatePlayButtons() {
      GameHelper gameHelper = getYassActivity().getGameHelper();
      if (gameHelper.isConnecting() || gameHelper.isSignedIn()) {
        getView().findViewById(
          R.id.btn_achievements).setVisibility(View.VISIBLE);
        getView().findViewById(
          R.id.btn_leaderboards).setVisibility(View.VISIBLE);
        getView().findViewById(
          R.id.btn_sign_in).setVisibility(View.GONE);
      }
      else {
        getView().findViewById(
          R.id.btn_achievements).setVisibility(View.GONE);
        getView().findViewById(
          R.id.btn_leaderboards).setVisibility(View.GONE);
        getView().findViewById(
          R.id.btn_sign_in).setVisibility(View.VISIBLE);
      }
    }

This method decides whether to hide or show the views based on the state. We will call it inside the important state-changing methods:

onLayoutCompleted: the first time we open the game to initialize the UI.
onSignInSucceeded: when the user successfully signs in to GPGS.
onSignInFailed: this can be triggered when we auto sign in and there is no connection. It is important to handle it.
onActivityResult: when we come back from the Play Games UI, in case the user has logged out.

But nothing is as easy as it looks. In fact, when the user signs out and does not exit the game, GoogleApiClient keeps the connection open. Therefore, isSignedIn from GameHelper still returns true. This is the edge case we have been talking about all through the article.

As a result of this edge case, there is an inconsistency in the UI, which shows the achievements and leaderboards buttons when it should show the sign-in one.

When the user logs out from Play Games, GoogleApiClient keeps the connection open. This can lead to confusion.

Unfortunately, this has been marked as working as intended by Google. The reason is that the connection is still active and it is our responsibility to parse the result in the onActivityResult method to determine the new state. But this is not very convenient.

Since it is a rare case, we will just go for the easiest solution, which is to wrap the call in a try/catch block and make the user sign in if he or she taps on leaderboards or achievements while not logged in. This is the code we have to handle the click on the achievements button; the one for leaderboards is equivalent:

    else if (v.getId() == R.id.btn_achievements) {
      try {
        GoogleApiClient apiClient =
          getYassActivity().getGameHelper().getApiClient();
        Intent achievementsIntent =
          Games.Achievements.getAchievementsIntent(apiClient);
        startActivityForResult(achievementsIntent,
          REQUEST_ACHIEVEMENTS);
      }
      catch (Exception e) {
        GameHelper gameHelper = getYassActivity().getGameHelper();
        gameHelper.disconnect();
        gameHelper.beginUserInitiatedSignIn();
      }
    }

Basically, we have the old code to open the achievements activity, but we wrap it in a try/catch block. If an exception is raised, we disconnect the game helper and begin a new login using the beginUserInitiatedSignIn method.

It is very important to disconnect the gameHelper before we try to log in again.
Otherwise, the login will not work.

We must disconnect from GPGS before we can log in using the method from the GameHelper.

Finally, there is the case when the user clicks on the login button, which just triggers the login using the beginUserInitiatedSignIn method from the GameHelper:

    if (v.getId() == R.id.btn_sign_in) {
      getYassActivity().getGameHelper().beginUserInitiatedSignIn();
    }

Once you have published your game and the game services, achievements and leaderboards will not appear in the game description on Google Play straight away. It is required that "a fair amount of users" have used them. You have done nothing wrong; you just have to wait.

Other features of Google Play services

Google Play Game Services provides more features for game developers than achievements and leaderboards. None of them really fit the game we are building, but it is useful to know they exist just in case your game needs them. You can save yourself lots of time and effort by using them and not reinventing the wheel.

The other features of Google Play Game Services are:

Events and quests: these allow you to monitor game usage and progression. Also, they add the possibility of creating time-limited events with rewards for the players.
Gifts: as simple as it sounds, you can send a gift to other players or request one to be sent to you. Yes, this is seen in the very mechanical Facebook games popularized a while ago.
Saved games: the standard concept of a saved game. If your game has progression or can unlock content based on user actions, you may want to use this feature. Since it is saved in the cloud, saved games can be accessed across multiple devices.
Turn-based and real-time multiplayer: Google Play Game Services provides an API to implement turn-based and real-time multiplayer features without you needing to write any server code.
If your game is multiplayer and has an online economy, it may be worth making your own server and granting virtual currency only on the server to prevent cheating. Otherwise, it is fairly easy to crack the gifts/reward system, and a single person can ruin the complete game economy.

However, if there is no online game economy, the benefits of gifts and quests may be more important than the fact that someone can hack them.

Let's take a look at each of these features.

Events

The events API provides us with a way to define and collect gameplay metrics and upload them to Google Play Game Services.

This is very similar to the GameEvents we are already using in our game. Events should be a subset of the game events of our game. Many of the game events we have are used internally as a signal between objects or as a synchronization mechanism. These events are not really relevant outside the engine, but others could be. Those are the events we should send to GPGS.

To be able to send an event from the game to GPGS, we have to create it in the developer console first.

To create an event, we have to go to the Events tab in the developer console, click on Add new event, and fill in the following fields:

Name: a short name of the event. The name can be up to 100 characters. This value can be localized.
Description: a longer description of the event. The description can be up to 500 characters. This value can also be localized.
Icon: the icon for the event of the standard 512x512 px size.
Visibility: as for achievements, this can be revealed or hidden.
Format: as for leaderboards, this can be Numeric, Currency, or Time.
Event type: this is used to mark events that create or spend premium currency. This can be Premium currency sink, Premium currency source, or None.

While in the game, events work pretty much as incremental achievements.
You can increment the event counter using the following line of code:

    Games.Events.increment(mGoogleApiClient, myEventId, 1);

You can delete events that are in the draft state or that have been published, as long as the event is not in use by a quest. You can also reset the player progress data for the testers of your events, as you can do for achievements.

While the events can be used as an analytics system, their real usefulness appears when they are combined with quests.

Quests

A quest is a challenge that asks players to complete an event a number of times during a specific time frame to receive a reward.

Because a quest is linked to an event, to use quests you need to have created at least one event.

You can create a quest from the Quests tab in the developer console. A quest has the following fields to be filled:

Name: the short name of the quest. This can be up to 100 characters and can be localized.
Description: a longer description of the quest. Your quest description should let players know what they need to do to complete the quest. The description can be up to 500 characters. The first 150 characters will be visible to players on cards such as those shown in the Google Play Games app.
Icon: a square icon that will be associated with the quest.
Banner: a rectangular image that will be used to promote the quest.
Completion Criteria: this is the configuration of the quest itself. It consists of an event and the number of times the event must occur.
Schedule: the start and end date and time for the quest. GPGS uses your local time zone, but stores the values as UTC. Players will see these values appear in their local time zone. You can mark a checkbox to notify users when the quest is about to end.
Reward Data: this is specific to each game. It can be a JSON object, specifying the reward. This is sent to the client when the quest is completed.
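To make the Completion Criteria and Schedule fields more concrete, the following is a minimal plain-Java model of how they relate. It is only a sketch and not the GPGS API: the class QuestSketch, its fields, and the event ID are hypothetical, and treating the end time as exclusive is an assumption. In the real service, this evaluation is done by Google's servers.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical model of a quest definition; NOT part of the GPGS API.
class QuestSketch {
    final String eventId;      // the event the quest is linked to
    final long requiredCount;  // how many times the event must occur
    final Instant start;       // schedule, kept as UTC instants
    final Instant end;

    QuestSketch(String eventId, long requiredCount, Instant start, Instant end) {
        this.eventId = eventId;
        this.requiredCount = requiredCount;
        this.start = start;
        this.end = end;
    }

    // True while 'now' is inside the schedule (end treated as exclusive: an assumption).
    boolean isOpen(Instant now) {
        return !now.isBefore(start) && now.isBefore(end);
    }

    // True when the event occurred often enough during the schedule.
    boolean isCompleted(long eventCountDuringSchedule) {
        return eventCountDuringSchedule >= requiredCount;
    }

    // Converts the stored UTC end time to a player's time zone for display.
    ZonedDateTime endInZone(ZoneId zone) {
        return end.atZone(zone);
    }

    public static void main(String[] args) {
        QuestSketch quest = new QuestSketch(
            "enemies_destroyed",                    // hypothetical event ID
            50,
            Instant.parse("2015-07-01T00:00:00Z"),
            Instant.parse("2015-07-08T00:00:00Z"));

        System.out.println("Open now? " + quest.isOpen(Instant.now()));
        System.out.println("Completed with 50 events? " + quest.isCompleted(50));
        System.out.println("Ends (Madrid time): "
            + quest.endInZone(ZoneId.of("Europe/Madrid")));
    }
}
```

Keeping the schedule as UTC instants and converting only for display mirrors the behavior described for the Schedule field, where values are stored as UTC and shown to players in their local time zone.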
Once configured in the developer console, you can do two things with the quests:

Display the list of quests
Process a quest completion

To get the list of quests, we start an activity with an intent that is provided to us via a static method as usual:

    Intent questsIntent = Games.Quests.getQuestsIntent(mGoogleApiClient,
      Quests.SELECT_ALL_QUESTS);
    startActivityForResult(questsIntent, QUESTS_INTENT);

To be notified when a quest is completed, all we have to do is register a listener:

    Games.Quests.registerQuestUpdateListener(mGoogleApiClient, this);

Once we have set the listener, the onQuestCompleted method will be called once the quest is completed. After completing the processing of the reward, the game should call claim to inform Play Game services that the player has claimed the reward.

The following code snippet shows how you might override the onQuestCompleted callback:

    @Override
    public void onQuestCompleted(Quest quest) {
      // Claim the quest reward.
      Games.Quests.claim(mGoogleApiClient, quest.getQuestId(),
        quest.getCurrentMilestone().getMilestoneId());
      // Process the RewardData to provision a specific reward.
      String reward = new
        String(quest.getCurrentMilestone().getCompletionRewardData(),
        Charset.forName("UTF-8"));
    }

The rewards themselves are defined by the client. As we mentioned before, this makes it quite easy to crack the game and obtain rewards. But usually, avoiding the hassle of writing your own server is worth it.

Gifts

The gifts feature of GPGS allows us to send gifts to other players and to request them to send us one as well. This is intended to make the gameplay more collaborative and to improve the social aspect of the game.

As for other GPGS features, we have a built-in UI provided by the library that can be used. In this case, it lets players send and request in-game items and resources to and from friends in their Google+ circles. The request system can make use of notifications.
There are two types of requests that players can send using the game gifts feature in Google Play Game Services:

A wish request to ask for in-game items or some other form of assistance from their friends
A gift request to send in-game items or some other form of assistance to their friends

A player can specify one or more target request recipients from the default request-sending UI. A gift or wish can be consumed (accepted) or dismissed by a recipient.

To see the gifts API in detail, you can visit https://developers.google.com/games/services/android/giftRequests.

Again, as for quest rewards, this is done entirely by the client, which makes the game susceptible to cheating.

Saved games

The saved games service offers cloud game saving slots. Your game can retrieve the saved game data to allow returning players to continue a game at their last save point from any device.

This service makes it possible to synchronize a player's game data across multiple devices. For example, if you have a game that runs on Android, you can use the saved games service to allow a player to start a game on their Android phone and then continue playing the game on a tablet without losing any of their progress. This service can also be used to ensure that a player's game play continues from where it was left off even if their device is lost, destroyed, or traded in for a newer model, or if the game was reinstalled.

The saved games service does not know about the game internals, so it provides a field that is an unstructured binary blob where you can read and write the game data. A game can write an arbitrary number of saved games for a single player, subject to the user's quota, so there is no hard requirement to restrict players to a single save file.

Saved games are done in an unstructured binary blob.
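Because the service only sees an opaque byte array, the game decides the layout of that blob. The following is a minimal sketch of one possible encoding using only the Java standard library. The class SaveStateCodec, the chosen fields (level, score, lives), and the version header are assumptions made for illustration; they are not part of GPGS, and the code that uploads the bytes to the saved games service is deliberately left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical codec for a save blob; the field set and layout are assumptions.
class SaveStateCodec {
    static final int FORMAT_VERSION = 1;

    // Packs the state as: version (int), level (int), score (long), lives (int).
    static byte[] encode(int level, long score, int lives) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(FORMAT_VERSION);
            out.writeInt(level);
            out.writeLong(score);
            out.writeInt(lives);
        } catch (IOException e) {
            // In-memory streams are not expected to fail; rethrow unchecked just in case.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Unpacks a blob written by encode and returns {level, score, lives}.
    static long[] decode(byte[] blob) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(blob))) {
            int version = in.readInt();
            if (version != FORMAT_VERSION) {
                throw new IllegalArgumentException("Unsupported save format: " + version);
            }
            long level = in.readInt();
            long score = in.readLong();
            long lives = in.readInt();
            return new long[] {level, score, lives};
        } catch (IOException e) {
            // Reached when the blob is shorter than the expected layout.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] blob = encode(3, 12500L, 2);   // 4 + 4 + 8 + 4 = 20 bytes
        long[] state = decode(blob);
        System.out.println(blob.length + " bytes, level " + state[0]
            + ", score " + state[1] + ", lives " + state[2]);
    }
}
```

Writing a version number first is a design choice rather than a requirement: it lets a later release of the game recognize blobs written by an older one, which matters once saves are shared across devices running different builds.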
The API for saved games also receives some metadata that is used by Google Play Games to populate the UI and to present useful information in the Google Play Games app (for example, last updated timestamp).

Saved games has several entry points and actions, including how to deal with conflicts in the saved games. To know more about these, check out the official documentation at https://developers.google.com/games/services/android/savedgames.

Multiplayer games

If you are going to implement multiplayer, GPGS can save you a lot of work. You may or may not use it for the final product, but it will remove the need to think about the server side until the game concept is validated.

You can use GPGS for turn-based and real-time multiplayer games. Although each one is completely different and uses a different API, there is always an initial step where the game is set up and the opponents are selected or invited.

In a turn-based multiplayer game, a single shared state is passed among the players and only the player that owns the turn has permission to modify it. Players take turns asynchronously according to an order of play determined by the game.

A turn is finished explicitly by the player using an API call. Then the game state is passed to the other players, together with the turn.

There are many cases: selecting opponents, creating a match, leaving a match, canceling, and so on. The official documentation at https://developers.google.com/games/services/android/turnbasedMultiplayer is quite exhaustive and you should read through it if you plan to use this feature.

In a real-time multiplayer game, there is no concept of a turn. Instead, the server uses the concept of a room: a virtual construct that enables network communication between multiple players in the same game session and lets players send data directly to one another, a common concept for game servers.

Real-time multiplayer service is based on the concept of Room.
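The room concept can be pictured with a small in-memory model in which participants join a room and a message from one of them is delivered to all the others. This is only a sketch of the idea, not the GPGS real-time API: RoomSketch and its methods are hypothetical names, and there is no networking involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a real-time room; NOT the GPGS API.
class RoomSketch {
    // Participant ID mapped to the messages that participant has received.
    private final Map<String, List<String>> inboxes = new LinkedHashMap<>();

    // Adds a participant to the room; joining twice keeps the existing inbox.
    void join(String participantId) {
        inboxes.putIfAbsent(participantId, new ArrayList<>());
    }

    // Removes a participant, discarding their inbox.
    void leave(String participantId) {
        inboxes.remove(participantId);
    }

    // Delivers a message from the sender to every other participant in the room.
    void broadcast(String senderId, String message) {
        if (!inboxes.containsKey(senderId)) {
            throw new IllegalStateException(senderId + " is not in the room");
        }
        for (Map.Entry<String, List<String>> entry : inboxes.entrySet()) {
            if (!entry.getKey().equals(senderId)) {
                entry.getValue().add(senderId + ": " + message);
            }
        }
    }

    // Returns the messages received so far, or an empty list for non-participants.
    List<String> inboxOf(String participantId) {
        return inboxes.getOrDefault(participantId, new ArrayList<>());
    }

    public static void main(String[] args) {
        RoomSketch room = new RoomSketch();
        room.join("alice");
        room.join("bob");
        room.broadcast("alice", "fire");
        System.out.println("bob received: " + room.inboxOf("bob"));
        System.out.println("alice received: " + room.inboxOf("alice"));
    }
}
```

In the actual service, the room also tracks connection state and invitations, and messages travel over the network; the model above only captures the membership and delivery rules.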
The API of real-time multiplayer allows us to easily:

Manage network connections to create and maintain a real-time multiplayer room
Provide a player-selection user interface to invite players to join a room, look for random players for auto-matching, or a combination of both
Store participant and room-state information on the Play Game services' servers while the game is running
Send room invitations and updates to players

To check the complete documentation for real-time games, please visit the official documentation at https://developers.google.com/games/services/android/realtimeMultiplayer.

Summary

We have added Google Play services to YASS, including setting up the game in the developer console and adding the required libraries to the project.

Then, we defined a set of achievements and added the code to unlock them. We have used normal, incremental, and hidden achievement types to showcase the different options available.

We have also configured a leaderboard and submitted the scores, both when the game is finished and when it is exited via the pause dialog.

Finally, we have added links to the native UI for leaderboards and achievements to the main menu.

We have also introduced the concepts of events, quests, and gifts and the features of saved games and multiplayer that Google Play Game services offers.

The game is ready to publish now.