Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Events
Videos
Audiobooks
Packt Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7018 Articles
article-image-cassandra-architecture
Packt
26 Mar 2015
35 min read
Save for later

Cassandra Architecture

Packt
26 Mar 2015
35 min read
In this article by Nishant Neeraj, the author of the book Mastering Apache Cassandra - Second Edition, aims to set you into a perspective where you will be able to see the evolution of the NoSQL paradigm. It will start with a discussion of common problems that an average developer faces when the application starts to scale up and software components cannot keep up with it. Then, we'll see what can be assumed as a thumb rule in the NoSQL world: the CAP theorem that says to choose any two out of consistency, availability, and partition-tolerance. As we discuss this further, we will realize how much more important it is to serve the customers (availability), than to be correct (consistency) all the time. However, we cannot afford to be wrong (inconsistent) for a long time. The customers wouldn't like to see that the items are in stock, but that the checkout is failing. Cassandra comes into picture with its tunable consistency. (For more resources related to this topic, see here.) Problems in the RDBMS world RDBMS is a great approach. It keeps data consistent, it's good for OLTP (http://en.wikipedia.org/wiki/Online_transaction_processing), it provides access to good grammar, and manipulates data supported by all the popular programming languages. It has been tremendously successful in the last 40 years (the relational data model was proposed in its first incarnation by Codd, E.F. (1970) in his research paper A Relational Model of Data for Large Shared Data Banks). However, in early 2000s, big companies such as Google and Amazon, which have a gigantic load on their databases to serve, started to feel bottlenecked with RDBMS, even with helper services such as Memcache on top of them. As a response to this, Google came up with BigTable (http://research.google.com/archive/bigtable.html), and Amazon with Dynamo (http://www.cs.ucsb.edu/~agrawal/fall2009/dynamo.pdf). If you have ever used RDBMS for a complicated web application, you must have faced problems such as slow queries due to complex joins, expensive vertical scaling, and problems in horizontal scaling. Due to these problems, indexing takes a long time. At some point, you may have chosen to replicate the database, but there was still some locking, and this hurts the availability of the system. This means that under a heavy load, locking will cause the user's experience to deteriorate. Although replication gives some relief, a busy slave may not catch up with the master (or there may be a connectivity glitch between the master and the slave). Consistency of such systems cannot be guaranteed. Consistency, the property of a database to remain in a consistent state before and after a transaction, is one of the promises made by relational databases. It seems that one may need to make compromises on consistency in a relational database for the sake of scalability. With the growth of the application, the demand to scale the backend becomes more pressing, and the developer teams may decide to add a caching layer (such as Memcached) at the top of the database. This will alleviate some load off the database, but now the developers will need to maintain the object states in two places: the database, and the caching layer. Although some Object Relational Mappers (ORMs) provide a built-in caching mechanism, they have their own issues, such as larger memory requirement, and often mapping code pollutes application code. In order to achieve more from RDBMS, we will need to start to denormalize the database to avoid joins, and keep the aggregates in the columns to avoid statistical queries. Sharding or horizontal scaling is another way to distribute the load. Sharding in itself is a good idea, but it adds too much manual work, plus the knowledge of sharding creeps into the application code. Sharded databases make the operational tasks (backup, schema alteration, and adding index) difficult. To find out more about the hardships of sharding, visit http://www.mysqlperformanceblog.com/2009/08/06/why-you-dont-want-to-shard/. There are ways to loosen up consistency by providing various isolation levels, but concurrency is just one part of the problem. Maintaining relational integrity, difficulties in managing data that cannot be accommodated on one machine, and difficult recovery, were all making the traditional database systems hard to be accepted in the rapidly growing big data world. Companies needed a tool that could support hundreds of terabytes of data on the ever-failing commodity hardware reliably. This led to the advent of modern databases like Cassandra, Redis, MongoDB, Riak, HBase, and many more. These modern databases promised to support very large datasets that were hard to maintain in SQL databases, with relaxed constrains on consistency and relation integrity. Enter NoSQL NoSQL is a blanket term for the databases that solve the scalability issues which are common among relational databases. This term, in its modern meaning, was first coined by Eric Evans. It should not be confused with the database named NoSQL (http://www.strozzi.it/cgi-bin/CSA/tw7/I/en_US/nosql/Home%20Page). NoSQL solutions provide scalability and high availability, but may not guarantee ACID: atomicity, consistency, isolation, and durability in transactions. Many of the NoSQL solutions, including Cassandra, sit on the other extreme of ACID, named BASE, which stands for basically available, soft-state, eventual consistency. Wondering about where the name, NoSQL, came from? Read Eric Evans' blog at http://blog.sym-link.com/2009/10/30/nosql_whats_in_a_name.html. The CAP theorem In 2000, Eric Brewer (http://en.wikipedia.org/wiki/Eric_Brewer_%28scientist%29), in his keynote speech at the ACM Symposium, said, "A distributed system requiring always-on, highly-available operations cannot guarantee the illusion of coherent, consistent single-system operation in the presence of network partitions, which cut communication between active servers". This was his conjecture based on his experience with distributed systems. This conjecture was later formally proved by Nancy Lynch and Seth Gilbert in 2002 (Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services, published in ACMSIGACT News, Volume 33 Issue 2 (2002), page 51 to 59 available at http://webpages.cs.luc.edu/~pld/353/gilbert_lynch_brewer_proof.pdf). Let's try to understand this. Let's say we have a distributed system where data is replicated at two distinct locations and two conflicting requests arrive, one at each location, at the time of communication link failure between the two servers. If the system (the cluster) has obligations to be highly available (a mandatory response, even when some components of the system are failing), one of the two responses will be inconsistent with what a system with no replication (no partitioning, single copy) would have returned. To understand it better, let's take an example to learn the terminologies. Let's say you are planning to read George Orwell's book Nineteen Eighty-Four over the Christmas vacation. A day before the holidays start, you logged into your favorite online bookstore to find out there is only one copy left. You add it to your cart, but then you realize that you need to buy something else to be eligible for free shipping. You start to browse the website for any other item that you might buy. To make the situation interesting, let's say there is another customer who is trying to buy Nineteen Eighty-Four at the same time. Consistency A consistent system is defined as one that responds with the same output for the same request at the same time, across all the replicas. Loosely, one can say a consistent system is one where each server returns the right response to each request. In our example, we have only one copy of Nineteen Eighty-Four. So only one of the two customers is going to get the book delivered from this store. In a consistent system, only one can check out the book from the payment page. As soon as one customer makes the payment, the number of Nineteen Eighty-Four books in stock will get decremented by one, and one quantity of Nineteen Eighty-Four will be added to the order of that customer. When the second customer tries to check out, the system says that the book is not available any more. Relational databases are good for this task because they comply with the ACID properties. If both the customers make requests at the same time, one customer will have to wait till the other customer is done with the processing, and the database is made consistent. This may add a few milliseconds of wait to the customer who came later. An eventual consistent database system (where consistency of data across the distributed servers may not be guaranteed immediately) may have shown availability of the book at the time of check out to both the customers. This will lead to a back order, and one of the customers will be paid back. This may or may not be a good policy. A large number of back orders may affect the shop's reputation and there may also be financial repercussions. Availability Availability, in simple terms, is responsiveness; a system that's always available to serve. The funny thing about availability is that sometimes a system becomes unavailable exactly when you need it the most. In our example, one day before Christmas, everyone is buying gifts. Millions of people are searching, adding items to their carts, buying, and applying for discount coupons. If one server goes down due to overload, the rest of the servers will get even more loaded now, because the request from the dead server will be redirected to the rest of the machines, possibly killing the service due to overload. As the dominoes start to fall, eventually the site will go down. The peril does not end here. When the website comes online again, it will face a storm of requests from all the people who are worried that the offer end time is even closer, or those who will act quickly before the site goes down again. Availability is the key component for extremely loaded services. Bad availability leads to bad user experience, dissatisfied customers, and financial losses. Partition-tolerance Network partitioning is defined as the inability to communicate between two or more subsystems in a distributed system. This can be due to someone walking carelessly in a data center and snapping the cable that connects the machine to the cluster, or may be network outage between two data centers, dropped packages, or wrong configuration. Partition-tolerance is a system that can operate during the network partition. In a distributed system, a network partition is a phenomenon where, due to network failure or any other reason, one part of the system cannot communicate with the other part(s) of the system. An example of network partition is a system that has some nodes in a subnet A and some in subnet B, and due to a faulty switch between these two subnets, the machines in subnet A will not be able to send and receive messages from the machines in subnet B. The network will be allowed to lose many messages arbitrarily sent from one node to another. This means that even if the cable between the two nodes is chopped, the system will still respond to the requests. The following figure shows the database classification based on the CAP theorem: An example of a partition-tolerant system is a system with real-time data replication with no centralized master(s). So, for example, in a system where data is replicated across two data centers, the availability will not be affected, even if a data center goes down. The significance of the CAP theorem Once you decide to scale up, the first thing that comes to mind is vertical scaling, which means using beefier servers with a bigger RAM, more powerful processor(s), and bigger disks. For further scaling, you need to go horizontal. This means adding more servers. Once your system becomes distributed, the CAP theorem starts to play, which means, in a distributed system, you can choose only two out of consistency, availability, and partition-tolerance. So, let's see how choosing two out of the three options affects the system behavior as follows: CA system: In this system, you drop partition-tolerance for consistency and availability. This happens when you put everything related to a transaction on one machine or a system that fails like an atomic unit, like a rack. This system will have serious problems in scaling. CP system: The opposite of a CA system is a CP system. In a CP system, availability is sacrificed for consistency and partition-tolerance. What does this mean? If the system is available to serve the requests, data will be consistent. In an event of a node failure, some data will not be available. A sharded database is an example of such a system. AP system: An available and partition-tolerant system is like an always-on system that is at risk of producing conflicting results in an event of network partition. This is good for user experience, your application stays available, and inconsistency in rare events may be alright for some use cases. In our example, it may not be such a bad idea to back order a few unfortunate customers due to inconsistency of the system than having a lot of users return without making any purchases because of the system's poor availability. Eventual consistent (also known as BASE system): The AP system makes more sense when viewed from an uptime perspective—it's simple and provides a good user experience. But, an inconsistent system is not good for anything, certainly not good for business. It may be acceptable that one customer for the book Nineteen Eighty-Four gets a back order. But if it happens more often, the users would be reluctant to use the service. It will be great if the system could fix itself (read: repair) as soon as the first inconsistency is observed; or, maybe there are processes dedicated to fixing the inconsistency of a system when a partition failure is fixed or a dead node comes back to life. Such systems are called eventual consistent systems. The following figure shows the life of an eventual consistent system: Quoting Wikipedia, "[In a distributed system] given a sufficiently long time over which no changes [in system state] are sent, all updates can be expected to propagate eventually through the system and the replicas will be consistent". (The page on eventual consistency is available at http://en.wikipedia.org/wiki/Eventual_consistency.) Eventual consistent systems are also called BASE, a made-up term to represent that these systems are on one end of the spectrum, which has traditional databases with ACID properties on the opposite end. Cassandra is one such system that provides high availability and partition-tolerance at the cost of consistency, which is tunable. The preceding figure shows a partition-tolerant eventual consistent system. Cassandra Cassandra is a distributed, decentralized, fault tolerant, eventually consistent, linearly scalable, and column-oriented data store. This means that Cassandra is made to easily deploy over a cluster of machines located at geographically different places. There is no central master server, so no single point of failure, no bottleneck, data is replicated, and a faulty node can be replaced without any downtime. It's eventually consistent. It is linearly scalable, which means that with more nodes, the requests served per second per node will not go down. Also, the total throughput of the system will increase with each node being added. And finally, it's column oriented, much like a map (or better, a map of sorted maps) or a table with flexible columns where each column is essentially a key-value pair. So, you can add columns as you go, and each row can have a different set of columns (key-value pairs). It does not provide any relational integrity. It is up to the application developer to perform relation management. So, if Cassandra is so good at everything, why doesn't everyone drop whatever database they are using and jump start with Cassandra? This is a natural question. Some applications require strong ACID compliance, such as a booking system. If you are a person who goes by statistics, you'd ask how Cassandra fares with other existing data stores. TilmannRabl and others in their paper, Solving Big Data Challenges for Enterprise Application Performance Management (http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf), said that, "In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments with a linear increasing throughput from one to 12 nodes. This comes at the price of a high write and read latency. Cassandra's performance is best for high insertion rates". If you go through the paper, Cassandra wins in almost all the criteria. Equipped with proven concepts of distributed computing, made to reliably serve from commodity servers, and simple and easy maintenance, Cassandra is one of the most scalable, fastest, and very robust NoSQL database. So, the next natural question is, "What makes Cassandra so blazing fast?". Let's dive deeper into the Cassandra architecture. Understanding the architecture of Cassandra Cassandra is a relative latecomer in the distributed data-store war. It takes advantage of two proven and closely similar data-store mechanisms, namely Bigtable: A Distributed Storage System for Structured Data, 2006 (http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/bigtable-osdi06.pdf) and Amazon Dynamo: Amazon's Highly Available Key-value Store, 2007 (http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf). The following diagram displays the read throughputs that show linear scaling of Cassandra: Like BigTable, it has a tabular data presentation. It is not tabular in the strictest sense. It is rather a dictionary-like structure where each entry holds another sorted dictionary/map. This model is more powerful than the usual key-value store and it is named a table, formerly known as a column family. The properties such as eventual consistency and decentralization are taken from Dynamo. Now, assume a column family is a giant spreadsheet, such as MS Excel. But unlike spreadsheets, each row is identified by a row key with a number (token), and unlike spreadsheets, each cell may have its own unique name within the row. In Cassandra, the columns in the rows are sorted by this unique column name. Also, since the number of partitions is allowed to be very large (1.7*1038), it distributes the rows almost uniformly across all the available machines by dividing the rows in equal token groups. Tables or column families are contained within a logical container or name space called keyspace. A keyspace can be assumed to be more or less similar to database in RDBMS. A word on max number of cells, rows, and partitions A cell in a partition can be assumed as a key-value pair. The maximum number of cells per partition is limited by the Java integer's max value, which is about 2 billion. So, one partition can hold a maximum of 2 billion cells. A row, in CQL terms, is a bunch of cells with predefined names. When you define a table with a primary key that has just one column, the primary key also serves as the partition key. But when you define a composite primary key, the first column in the definition of the primary key works as the partition key. So, all the rows (bunch of cells) that belong to one partition key go into one partition. This means that every partition can have a maximum of X rows, where X = (2*10­9/number_of_columns_in_a_row). Essentially, rows * columns cannot exceed 2 billion per partition. Finally, how many partitions can Cassandra hold for each table or column family? As we know, column families are essentially distributed hashmaps. The keys or row keys or partition keys are generated by taking a consistent hash of the string that you pass. So, the number of partitioned keys is bounded by the number of hashes these functions generate. This means that if you are using the default Murmur3 partitioner (range -263 to +263), the maximum number of partitions that you can have is 1.85*1019. If you use the Random partitioner, the number of partitions that you can have is 1.7*1038. Ring representation A Cassandra cluster is called a ring. The terminology is taken from Amazon Dynamo. Cassandra 1.1 and earlier versions used to have a token assigned to each node. Let's call this value the initial token. Each node is responsible for storing all the rows with token values (a token is basically a hash value of a row key) ranging from the previous node's initial token (exclusive) to the node's initial token (inclusive). This way, the first node, the one with the smallest initial token, will have a range from the token value of the last node (the node with the largest initial token) to the first token value. So, if you jump from node to node, you will make a circle, and this is why a Cassandra cluster is called a ring. Let's take an example. Assume that there is a hashing algorithm (partitioner) that generates tokens from 0 to 127 and you have four Cassandra machines to create a cluster. To allocate equal load, we need to assign each of the four nodes to bear an equal number of tokens. So, the first machine will be responsible for tokens one to 32, the second will hold 33 to 64, third 65 to 96, and fourth 97 to 127 and 0. If you mark each node with the maximum token number that it can hold, the cluster looks like a ring, as shown in the following figure: Token ownership and distribution in a balanced Cassandra ring Virtual nodes In Cassandra 1.1 and previous versions, when you create a cluster or add a node, you manually assign its initial token. This is extra work that the database should handle internally. Apart from this, adding and removing nodes requires manual resetting token ownership for some or all nodes. This is called rebalancing. Yet another problem was replacing a node. In the event of replacing a node with a new one, the data (rows that the to-be-replaced node owns) is required to be copied to the new machine from a replica of the old machine. For a large database, this could take a while because we are streaming from one machine. To solve all these problems, Cassandra 1.2 introduced virtual nodes (vnodes). The following figure shows 16 vnodes distributed over four servers: In the preceding figure, each node is responsible for a single continuous range. In the case of a replication factor of 2 or more, the data is also stored on other machines than the one responsible for the range. (Replication factor (RF) represents the number of copies of a table that exist in the system. So, RF=2, means there are two copies of each record for the table.) In this case, one can say one machine, one range. With vnodes, each machine can have multiple smaller ranges and these ranges are automatically assigned by Cassandra. How does this solve those issues? Let's see. If you have a 30 ring cluster and a node with 256 vnodes had to be replaced. If nodes are well-distributed randomly across the cluster, each physical node in remaining 29 nodes will have 8 or 9 vnodes (256/29) that are replicas of vnodes on the dead node. In older versions, with a replication factor of 3, the data had to be streamed from three replicas (10 percent utilization). In the case of vnodes, all the nodes can participate in helping the new node get up. The other benefit of using vnodes is that you can have a heterogeneous ring where some machines are more powerful than others, and change the vnodes ' settings such that the stronger machines will take proportionally more data than others. This was still possible without vnodes but it needed some tricky calculation and rebalancing. So, let's say you have a cluster of machines with similar hardware specifications and you have decided to add a new server that is twice as powerful as any machine in the cluster. Ideally, you would want it to work twice as harder as any of the old machines. With vnodes, you can achieve this by setting twice as many num_tokens as on the old machine in the new machine's cassandra.yaml file. Now, it will be allotted double the load when compared to the old machines. Yet another benefit of vnodes is faster repair. Node repair requires the creation of a Merkle tree for each range of data that a node holds. The data gets compared with the data on the replica nodes, and if needed, data re-sync is done. Creation of a Merkle tree involves iterating through all the data in the range followed by streaming it. For a large range, the creation of a Merkle tree can be very time consuming while the data transfer might be much faster. With vnodes, the ranges are smaller, which means faster data validation (by comparing with other nodes). Since the Merkle tree creation process is broken into many smaller steps (as there are many small nodes that exist in a physical node), the data transmission does not have to wait till the whole big range finishes. Also, the validation uses all other machines instead of just a couple of replica nodes. As of Cassandra 2.0.9, the default setting for vnodes is "on" with default vnodes per machine as 256. If for some reason you do not want to use vnodes and want to disable this feature, comment out the num_tokens variable and uncomment and set the initial_token variable in cassandra.yaml. If you are starting with a new cluster or migrating an old cluster to the latest version of Cassandra, vnodes are highly recommended. The number of vnodes that you specify on a Cassandra node represents the number of vnodes on that machine. So, the total vnodes on a cluster is the sum total of all the vnodes across all the nodes. One can always imagine a Cassandra cluster as a ring of lots of vnodes. How Cassandra works Diving into various components of Cassandra without having any context is a frustrating experience. It does not make sense why you are studying SSTable, MemTable, and log structured merge (LSM) trees without being able to see how they fit into the functionality and performance guarantees that Cassandra gives. So first we will see Cassandra's write and read mechanism. It is possible that some of the terms that we encounter during this discussion may not be immediately understandable. A rough overview of the Cassandra components is as shown in the following figure: Main components of the Cassandra service The main class of Storage Layer is StorageProxy. It handles all the requests. The messaging layer is responsible for inter-node communications, such as gossip. Apart from this, process-level structures keep a rough idea about the actual data containers and where they live. There are four data buckets that you need to know. MemTable is a hash table-like structure that stays in memory. It contains actual cell data. SSTable is the disk version of MemTables. When MemTables are full, they are persisted to hard disk as SSTable. Commit log is an append only log of all the mutations that are sent to the Cassandra cluster. Mutations can be thought of as update commands. So, insert, update, and delete operations are mutations, since they mutate the data. Commit log lives on the disk and helps to replay uncommitted changes. These three are basically core data. Then there are bloom filters and index. The bloom filter is a probabilistic data structure that lives in the memory. They both live in memory and contain information about the location of data in the SSTable. Each SSTable has one bloom filter and one index associated with it. The bloom filter helps Cassandra to quickly detect which SSTable does not have the requested data, while the index helps to find the exact location of the data in the SSTable file. With this primer, we can start looking into how write and read works in Cassandra. We will see more explanation later. Write in action To write, clients need to connect to any of the Cassandra nodes and send a write request. This node is called the coordinator node. When a node in a Cassandra cluster receives a write request, it delegates the write request to a service called StorageProxy. This node may or may not be the right place to write the data. StorageProxy's job is to get the nodes (all the replicas) that are responsible for holding the data that is going to be written. It utilizes a replication strategy to do this. Once the replica nodes are identified, it sends the RowMutation message to them, the node waits for replies from these nodes, but it does not wait for all the replies to come. It only waits for as many responses as are enough to satisfy the client's minimum number of successful writes defined by ConsistencyLevel. ConsistencyLevel is basically a fancy way of saying how reliable a read or write you want to be. Cassandra has tunable consistency, which means you can define how much reliability is wanted. Obviously, everyone wants a hundred percent reliability, but it comes with latency as the cost. For instance, in a thrice-replicated cluster (replication factor = 3), a write time consistency level TWO, means the write will become successful only if it is written to at least two replica nodes. This request will be faster than the one with the consistency level THREE or ALL, but slower than the consistency level ONE or ANY. The following figure is a simplistic representation of the write mechanism. The operations on node N2 at the bottom represent the node-local activities on receipt of the write request: The following steps show everything that can happen during a write mechanism: If the failure detector detects that there aren't enough live nodes to satisfy ConsistencyLevel, the request fails. If the failure detector gives a green signal, but writes time-out after the request is sent due to infrastructure problems or due to extreme load, StorageProxy writes a local hint to replay when the failed nodes come back to life. This is called hinted hand off. One might think that hinted handoff may be responsible for Cassandra's eventual consistency. But it's not entirely true. If the coordinator node gets shut down or dies due to hardware failure and hints on this machine cannot be forwarded, eventual consistency will not occur. The anti-entropy mechanism is responsible for consistency, rather than hinted hand-off. Anti-entropy makes sure that all replicas are in sync. If the replica nodes are distributed across data centers, it will be a bad idea to send individual messages to all the replicas in other data centers. Rather, it sends the message to one replica in each data center with a header, instructing it to forward the request to other replica nodes in that data center. Now the data is received by the node which should actually store that data. The data first gets appended to the commit log, and pushed to a MemTable appropriate column family in the memory. When the MemTable becomes full, it gets flushed to the disk in a sorted structure named SSTable. With lots of flushes, the disk gets plenty of SSTables. To manage SSTables, a compaction process runs. This process merges data from smaller SSTables to one big sorted file. Read in action Similar to a write case, when StorageProxy of the node that a client is connected to gets the request, it gets a list of nodes containing this key based on the replication strategy. The node's StorageProxy then sorts the nodes based on their proximity to itself. The proximity is determined by the snitch function that is set up for this cluster. Basically, the following types of snitches exist: SimpleSnitch: A closer node is the one that comes first when moving clockwise in the ring. (A ring is when all the machines in the cluster are placed in a circular fashion with each machine having a token number. When you walk clockwise, the token value increases. At the end, it snaps back to the first node.) PropertyFileSnitch: This snitch allows you to specify how you want your machines' location to be interpreted by Cassandra. You do this by assigning a data center name and rack name for all the machines in the cluster in the $CASSANDRA_HOME/conf/cassandra-topology.properties file. Each node has a copy of this file and you need to alter this file each time you add or remove a node. This is what the file looks like: # Cassandra Node IP=Data Center:Rack 192.168.1.100=DC1:RAC1 192.168.2.200=DC2:RAC2 10.0.0.10=DC1:RAC1 10.0.0.11=DC1:RAC1 10.0.0.12=DC1:RAC2 10.20.114.10=DC2:RAC1 10.20.114.11=DC2:RAC1 GossipingPropertyFileSnitch: The PropertyFileSnitch is kind of a pain, even when you think about it. Each node has the locations of all nodes manually written and updated every time a new node joins or an old node retires. And then, we need to copy it on all the servers. Wouldn't it be better if we just specify each node's data center and rack on just that one machine, and then have Cassandra somehow collect this information to understand the topology? This is exactly what GossipingPropertyFileSnitch does. Similar to PropertyFileSnitch, you have a file called $CASSANDRA_HOME/conf/cassandra-rackdc.properties, and in this file you specify the data center and the rack name for that machine. The gossip protocol makes sure that this information gets spread to all the nodes in the cluster (and you do not have to edit properties of files on all the nodes when a new node joins or leaves). Here is what a cassandra-rackdc.properties file looks like: # indicate the rack and dc for this node dc=DC13 rack=RAC42 RackInferringSnitch: This snitch infers the location of a node based on its IP address. It uses the third octet to infer rack name, and the second octet to assign data center. If you have four nodes 10.110.6.30, 10.110.6.4, 10.110.7.42, and 10.111.3.1, this snitch will think the first two live on the same rack as they have the same second octet (110) and the same third octet (6), while the third lives in the same data center but on a different rack as it has the same second octet but the third octet differs. Fourth, however, is assumed to live in a separate data center as it has a different second octet than the three. EC2Snitch: This is meant for Cassandra deployments on Amazon EC2 service. EC2 has regions and within regions, there are availability zones. For example, us-east-1e is an availability zone in the us-east region with availability zone named 1e. This snitch infers the region name (us-east, in this case) as the data center and availability zone (1e) as the rack. EC2MultiRegionSnitch: The multi-region snitch is just an extension of EC2Snitch where data centers and racks are inferred the same way. But you need to make sure that broadcast_address is set to the public IP provided by EC2 and seed nodes must be specified using their public IPs so that inter-data center communication can be done. DynamicSnitch: This Snitch determines closeness based on a recent performance delivered by a node. So, a quick responding node is perceived as being closer than a slower one, irrespective of its location closeness, or closeness in the ring. This is done to avoid overloading a slow performing node. DynamicSnitch is used by all the other snitches by default. You can disable it, but it is not advisable. Now, with knowledge about snitches, we know the list of the fastest nodes that have the desired row keys, it's time to pull data from them. The coordinator node (the one that the client is connected to) sends a command to the closest node to perform a read (we'll discuss local reads in a minute) and return the data. Now, based on ConsistencyLevel, other nodes will send a command to perform a read operation and send just the digest of the result. If we have read repairs (discussed later) enabled, the remaining replica nodes will be sent a message to compute the digest of the command response. Let's take an example. Let's say you have five nodes containing a row key K (that is, RF equals five), your read ConsistencyLevel is three; then the closest of the five nodes will be asked for the data and the second and third closest nodes will be asked to return the digest. If there is a difference in the digests, full data is pulled from the conflicting node and the latest of the three will be sent. These replicas will be updated to have the latest data. We still have two nodes left to be queried. If read repairs are not enabled, they will not be touched for this request. Otherwise, these two will be asked to compute the digest. Depending on the read_repair_chance setting, the request to the last two nodes is done in the background, after returning the result. This updates all the nodes with the most recent value, making all replicas consistent. Let's see what goes on within a node. Take a simple case of a read request looking for a single column within a single row. First, the attempt is made to read from MemTable, which is rapid fast, and since there exists only one copy of data, this is the fastest retrieval. If all required data is not found there, Cassandra looks into SSTable. Now, remember from our earlier discussion that we flush MemTables to disk as SSTables and later when the compaction mechanism wakes up, it merges those SSTables. So, our data can be in multiple SSTables. The following figure represents a simplified representation of the read mechanism. The bottom of the figure shows processing on the read node. The numbers in circles show the order of the event. BF stands for bloom filter. Each SSTable is associated with its bloom filter built on the row keys in the SSTable. Bloom filters are kept in the memory and used to detect if an SSTable may contain (false positive) the row data. Now, we have the SSTables that may contain the row key. The SSTables get sorted in reverse chronological order (latest first). Apart from the bloom filter for row keys, there exists one bloom filter for each row in the SSTable. This secondary bloom filter is created to detect whether the requested column names exist in the SSTable. Now, Cassandra will take SSTables one by one from younger to older, and use the index file to locate the offset for each column value for that row key and the bloom filter associated with the row (built on the column name). On the bloom filter being positive for the requested column, it looks into the SSTable file to read the column value. Note that we may have a column value in other yet-to-be-read SSTables, but that does not matter, because we are reading the most recent SSTables first, and any value that was written earlier to it does not matter. So, the value gets returned as soon as the first column in the most recent SSTable is allocated. Summary By now, you should be familiar with all the nuts and bolts of Cassandra. We have discussed how the pressure to make data stores to web scale inspired a rather not-so-common database mechanism to become mainstream, and how the CAP theorem governs the behavior of such databases. We have seen that Cassandra shines out among its peers. Then, we dipped our toes into the big picture of Cassandra read and write mechanisms. This left us with lots of fancy terms. It is understandable that it may be a lot to take in for someone new to NoSQL systems. It is okay if you do not have complete clarity at this point. As you start working with Cassandra, tweaking it, experimenting with it, and going through the Cassandra mailing list discussions or talks, you will start to come across stuff that you have read in this article and it will start to make sense, and perhaps you may want to come back and refer to this article to improve your clarity. It is not required that you understand this article fully to be able to write queries, set up clusters, maintain clusters, or do anything else related to Cassandra. A general sense of this article will take you far enough to work extremely well with Cassandra-based projects. How does this knowledge help us in building an application? Isn't it just about learning Thrift or CQL API and getting going? You might be wondering why you need to know about the compaction and storage mechanism, when all you need to do is to deliver an application that has a fast backend. It may not be obvious at this point why you are learning this, but as we move ahead with developing an application, we will come to realize that knowledge about underlying storage mechanism helps. Later, if you will deploy a cluster, performance tuning, maintenance, and integrating with other tools such as Apache Hadoop, you may find this article useful. Resources for Article: Further resources on this subject: About Cassandra [article] Replication [article] Getting Up and Running with Cassandra [article]
Read more
  • 0
  • 0
  • 5947

article-image-subscribing-report
Packt
26 Mar 2015
6 min read
Save for later

Subscribing to a report

Packt
26 Mar 2015
6 min read
 In this article by Johan Yu, the author of Salesforce Reporting and Dashboards, we get acquainted to the components used when working with reports on the Salesforce platform. Subscribing to a report is a new feature in Salesforce introduced in the Spring 2015 release. When you subscribe to a report, you will get a notification on weekdays, daily, or weekly, when the reports meet the criteria defined. You just need to subscribe to the report that you most care about. (For more resources related to this topic, see here.) Subscribing to a report is not the same as the report's Schedule Future Run option, where scheduling a report for a future run will keep e-mailing you the report content at a specified frequency defined, without specifying any conditions. But when you subscribe to a report, you will receive notifications when the report output meets the criteria you have defined. Subscribing to a report will not send you the e-mail content, but just an alert that the report you subscribed to meets the conditions specified. To subscribe to a report, you do not need additional permission as our administrator is able to control to enable or disable this feature for the entire organization. By default, this feature will be turned on for customers using the Salesforce Spring 2015 release. If you are an administrator for the organization, you can check out this feature by navigating to Setup | Customize | Reports & Dashboards | Report Notification | Enable report notification subscriptions for all users. Besides receiving notifications via e-mail, you also can opt for Salesforce1 notifications and posts to Chatter feeds, and execute a custom action. Report Subscription To subscribe to a report, you need to define a set of conditions to trigger the notifications. Here is what you need to understand before you subscribe to a report: When: Everytime conditions are met or only the first time conditions are met. Conditions: An aggregate can be a record count or a summarize field. Then define the operator and value you want the aggregate to be compared to. The summarize field means a field that you use in that report to summarize its data as average, smallest, largest, or sum. You can add multiple conditions, but at this moment, you only have the AND condition. Schedule frequency: Schedule weekday, daily, weekly, and the time the report will be run. Actions: E-mail notifications: You will get e-mail alerts when conditions are met. Posts to Chatter feeds: Alerts will be posted to your Chatter feed. Salesforce1 notifications: Alerts in your Salesforce1 app. Execute a custom action: This will trigger a call to the apex class. You will need a developer to write apex code for this. Active: This is a checkbox used to activate or disable subscription. You may just need to disable it when you need to unsubscribe temporarily; otherwise, deleting will remove all the settings defined. The following screenshot shows the conditions set in order to subscribe to a report: Monitoring a report subscription How can you know whether you have subscribed to a report? When you open the report and see the Subscribe button, it means you are not subscribed to that report:   Once you configure the report to subscribe, the button label will turn to Edit Subscription. But, do not get it wrong that not all reports with Edit Subscription, you will get alerts when the report meets the criteria, because the setting may just not be active, remember step above when you subscribe a report. To know all the reports you subscribe to at a glance, as long as you have View Setup and Configuration permissions, navigate to Setup | Jobs | Scheduled Jobs, and look for Type as Reporting Notification, as shown in this screenshot:   Hands-on – subscribing to a report Here is our next use case: you would like to get a notification in your Salesforce1 app—an e-mail notification—and also posts on your Chatter feed once the Closed Won opportunity for the month has reached $50,000. Salesforce should check the report daily, but instead of getting this notification daily, you want to get it only once a week or month; otherwise, it will be disturbing. Creating reports Make sure you set the report with the correct filter, set Close Date as This Month, and summarize the Amount field, as shown in the following screenshot:   Subscribing Click on the Subscribe button and fill in the following details: Type as Only the first time conditions are met Conditions: Aggregate as Sum of Amount Operator as Greater Than or Equal Value as 50000 Schedule: Frequency as Every Weekday Time as 7AM In Actions, select: Send Salesforce1 Notification Post to Chatter Feed Send Email Notification In Active, select the checkbox Testing and saving The good thing of this feature is the ability to test without waiting until the scheduled date or time. Click on the Save & Run Now button. Here is the result: Salesforce1 notifications Open your Salesforce1 mobile app, look for the notification icon, and notice a new alert from the report you subscribed to, as shown in this screenshot: If you click on the notification, it will take you to the report that is shown in the following screenshot:   Chatter feed Since you selected the Post to Chatter Feed action, the same alert will go to your Chatter feed as well. Clicking on the link in the Chatter feed will open the same report in your Salesforce1 mobile app or from the web browser, as shown in this screenshot: E-mail notification The last action we've selected for this exercise is to send an e-mail notification. The following screenshot shows how the e-mail notification would look:   Limitations The following limitations are observed while subscribing to a report: You can set up to five conditions per report, and no OR logic conditions are possible You can subscribe for up to five reports, so use it wisely Summary In this article, you became familiar with components when working with reports on the Salesforce platform. We saw different report formats and the uniqueness of each format. We continued discussions on adding various types of charts to the report with point-and-click effort and no code; all of this can be done within minutes. We saw how to add filters to reports to customize our reports further, including using Filter Logic, Cross Filter, and Row Limit for tabular reports. We walked through managing and customizing custom report types, including how to hide unused report types and report type adoption analysis. In the last part of this article, we saw how easy it is to subscribe to a report and define criteria. Resources for Article: Further resources on this subject: Salesforce CRM – The Definitive Admin Handbook - Third Edition [article] Salesforce.com Customization Handbook [article] Developing Applications with Salesforce Chatter [article]
Read more
  • 0
  • 0
  • 2584

Packt
25 Mar 2015
35 min read
Save for later

Fun with Sprites – Sky Defense

Packt
25 Mar 2015
35 min read
This article is written by Roger Engelbert, the author of Cocos2d-x by Example: Beginner's Guide - Second Edition. Time to build our second game! This time, you will get acquainted with the power of actions in Cocos2d-x. I'll show you how an entire game could be built just by running the various action commands contained in Cocos2d-x to make your sprites move, rotate, scale, fade, blink, and so on. And you can also use actions to animate your sprites using multiple images, like in a movie. So let's get started. In this article, you will learn: How to optimize the development of your game with sprite sheets How to use bitmap fonts in your game How easy it is to implement and run actions How to scale, rotate, swing, move, and fade out a sprite How to load multiple .png files and use them to animate a sprite How to create a universal game with Cocos2d-x (For more resources related to this topic, see here.) The game – sky defense Meet our stressed-out city of...your name of choice here. It's a beautiful day when suddenly the sky begins to fall. There are meteors rushing toward the city and it is your job to keep it safe. The player in this game can tap the screen to start growing a bomb. When the bomb is big enough to be activated, the player taps the screen again to detonate it. Any nearby meteor will explode into a million pieces. The bigger the bomb, the bigger the detonation, and the more meteors can be taken out by it. But the bigger the bomb, the longer it takes to grow it. But it's not just bad news coming down. There are also health packs dropping from the sky and if you allow them to reach the ground, you'll recover some of your energy. The game settings This is a universal game. It is designed for the iPad retina screen and it will be scaled down to fit all the other screens. The game will be played in landscape mode, and it will not need to support multitouch. The start project The command line I used was: cocos new SkyDefense -p com.rengelbert.SkyDefense -l cpp -d /Users/rengelbert/Desktop/SkyDefense In Xcode you must set the Devices field in Deployment Info to Universal, and the Device Family field is set to Universal. And in RootViewController.mm, the supported interface orientation is set to Landscape. The game we are going to build requires only one class, GameLayer.cpp, and you will find that the interface for this class already contains all the information it needs. Also, some of the more trivial or old-news logic is already in place in the implementation file as well. But I'll go over this as we work on the game. Adding screen support for a universal app Now things get a bit more complicated as we add support for smaller screens in our universal game, as well as some of the most common Android screen sizes. So open AppDelegate.cpp. Inside the applicationDidFinishLaunching method, we now have the following code: auto screenSize = glview->getFrameSize(); auto designSize = Size(2048, 1536); glview->setDesignResolutionSize(designSize.width, designSize.height, ResolutionPolicy::EXACT_FIT); std::vector<std::string> searchPaths; if (screenSize.height > 768) {    searchPaths.push_back("ipadhd");    director->setContentScaleFactor(1536/designSize.height); } else if (screenSize.height > 320) {    searchPaths.push_back("ipad");    director->setContentScaleFactor(768/designSize.height); } else {    searchPaths.push_back("iphone");    director->setContentScaleFactor(380/designSize.height); } auto fileUtils = FileUtils::getInstance(); fileUtils->setSearchPaths(searchPaths); Once again, we tell our GLView object (our OpenGL view) that we designed the game for a certain screen size (the iPad retina screen) and once again, we want our game screen to resize to match the screen on the device (ResolutionPolicy::EXACT_FIT). Then we determine where to load our images from, based on the device's screen size. We have art for iPad retina, then for regular iPad which is shared by iPhone retina, and for the regular iPhone. We end by setting the scale factor based on the designed target. Adding background music Still inside AppDelegate.cpp, we load the sound files we'll use in the game, including a background.mp3 (courtesy of Kevin MacLeod from incompetech.com), which we load through the command: auto audioEngine = SimpleAudioEngine::getInstance(); audioEngine->preloadBackgroundMusic(fileUtils->fullPathForFilename("background.mp3").c_str()); We end by setting the effects' volume down a tad: //lower playback volume for effects audioEngine->setEffectsVolume(0.4f); For background music volume, you must use setBackgroundMusicVolume. If you create some sort of volume control in your game, these are the calls you would make to adjust the volume based on the user's preference. Initializing the game Now back to GameLayer.cpp. If you take a look inside our init method, you will see that the game initializes by calling three methods: createGameScreen, createPools, and createActions. We'll create all our screen elements inside the first method, and then create object pools so we don't instantiate any sprite inside the main loop; and we'll create all the main actions used in our game inside the createActions method. And as soon as the game initializes, we start playing the background music, with its should loop parameter set to true: SimpleAudioEngine::getInstance()-   >playBackgroundMusic("background.mp3", true); We once again store the screen size for future reference, and we'll use a _running Boolean for game states. If you run the game now, you should only see the background image: Using sprite sheets in Cocos2d-x A sprite sheet is a way to group multiple images together in one image file. In order to texture a sprite with one of these images, you must have the information of where in the sprite sheet that particular image is found (its rectangle). Sprite sheets are often organized in two files: the image one and a data file that describes where in the image you can find the individual textures. I used TexturePacker to create these files for the game. You can find them inside the ipad, ipadhd, and iphone folders inside Resources. There is a sprite_sheet.png file for the image and a sprite_sheet.plist file that describes the individual frames inside the image. This is what the sprite_sheet.png file looks like: Batch drawing sprites In Cocos2d-x, sprite sheets can be used in conjunction with a specialized node, called SpriteBatchNode. This node can be used whenever you wish to use multiple sprites that share the same source image inside the same node. So you could have multiple instances of a Sprite class that uses a bullet.png texture for instance. And if the source image is a sprite sheet, you can have multiple instances of sprites displaying as many different textures as you could pack inside your sprite sheet. With SpriteBatchNode, you can substantially reduce the number of calls during the rendering stage of your game, which will help when targeting less powerful systems, though not noticeably in more modern devices. Let me show you how to create a SpriteBatchNode. Time for action – creating SpriteBatchNode Let's begin implementing the createGameScreen method in GameLayer.cpp. Just below the lines that add the bg sprite, we instantiate our batch node: void GameLayer::createGameScreen() {   //add bg auto bg = Sprite::create("bg.png"); ...   SpriteFrameCache::getInstance()-> addSpriteFramesWithFile("sprite_sheet.plist"); _gameBatchNode = SpriteBatchNode::create("sprite_sheet.png"); this->addChild(_gameBatchNode); In order to create the batch node from a sprite sheet, we first load all the frame information described by the sprite_sheet.plist file into SpriteFrameCache. And then we create the batch node with the sprite_sheet.png file, which is the source texture shared by all sprites added to this batch node. (The background image is not part of the sprite sheet, so it's added separately before we add _gameBatchNode to GameLayer.) Now we can start putting stuff inside _gameBatchNode. First, the city: for (int i = 0; i < 2; i++) { auto sprite = Sprite::createWithSpriteFrameName   ("city_dark.png");    sprite->setAnchorPoint(Vec2(0.5,0)); sprite->setPosition(_screenSize.width * (0.25f + i * 0.5f),0)); _gameBatchNode->addChild(sprite, kMiddleground); sprite = Sprite::createWithSpriteFrameName ("city_light.png"); sprite->setAnchorPoint(Vec2(0.5,0)); sprite->setPosition(Vec2(_screenSize.width * (0.25f + i * 0.5f), _screenSize.height * 0.1f)); _gameBatchNode->addChild(sprite, kBackground); } Then the trees: //add trees for (int i = 0; i < 3; i++) { auto sprite = Sprite::createWithSpriteFrameName("trees.png"); sprite->setAnchorPoint(Vec2(0.5f, 0.0f)); sprite->setPosition(Vec2(_screenSize.width * (0.2f + i * 0.3f),0)); _gameBatchNode->addChild(sprite, kForeground);   } Notice that here we create sprites by passing it a sprite frame name. The IDs for these frame names were loaded into SpriteFrameCache through our sprite_sheet.plist file. The screen so far is made up of two instances of city_dark.png tiling at the bottom of the screen, and two instances of city_light.png also tiling. One needs to appear on top of the other and for that we use the enumerated values declared in GameLayer.h: enum { kBackground, kMiddleground, kForeground }; We use the addChild( Node, zOrder) method to layer our sprites on top of each other, using different values for their z order. So for example, when we later add three sprites showing the trees.png sprite frame, we add them on top of all previous sprites using the highest value for z that we find in the enumerated list, which is kForeground. Why go through the trouble of tiling the images and not using one large image instead, or combining some of them with the background image? Because I wanted to include the greatest number of images possible inside the one sprite sheet, and have that sprite sheet to be as small as possible, to illustrate all the clever ways you can use and optimize sprite sheets. This is not necessary in this particular game. What just happened? We began creating the initial screen for our game. We are using a SpriteBatchNode to contain all the sprites that use images from our sprite sheet. So SpriteBatchNode behaves as any node does—as a container. And we can layer individual sprites inside the batch node by manipulating their z order. Bitmap fonts in Cocos2d-x The Cocos2d-x Label class has a static create method that uses bitmap images for the characters. The bitmap image we are using here was created with the program GlyphDesigner, and in essence, it works just as a sprite sheet does. As a matter of fact, Label extends SpriteBatchNode, so it behaves just like a batch node. You have images for all individual characters you'll need packed inside a PNG file (font.png), and then a data file (font.fnt) describing where each character is. The following screenshot shows how the font sprite sheet looks like for our game: The difference between Label and a regular SpriteBatchNode class is that the data file also feeds the Label object information on how to write with this font. In other words, how to space out the characters and lines correctly. The Label objects we are using in the game are instantiated with the name of the data file and their initial string value: _scoreDisplay = Label::createWithBMFont("font.fnt", "0"); And the value for the label is changed through the setString method: _scoreDisplay->setString("1000"); Just as with every other image in the game, we also have different versions of font.fnt and font.png in our Resources folders, one for each screen definition. FileUtils will once again do the heavy lifting of finding the correct file for the correct screen. So now let's create the labels for our game. Time for action – creating bitmap font labels Creating a bitmap font is somewhat similar to creating a batch node. Continuing with our createGameScreen method, add the following lines to the score label: _scoreDisplay = Label::createWithBMFont("font.fnt", "0"); _scoreDisplay->setAnchorPoint(Vec2(1,0.5)); _scoreDisplay->setPosition(Vec2   (_screenSize.width * 0.8f, _screenSize.height * 0.94f)); this->addChild(_scoreDisplay); And then add a label to display the energy level, and set its horizontal alignment to Right: _energyDisplay = Label::createWithBMFont("font.fnt", "100%", TextHAlignment::RIGHT); _energyDisplay->setPosition(Vec2   (_screenSize.width * 0.3f, _screenSize.height * 0.94f)); this->addChild(_energyDisplay); Add the following line for an icon that appears next to the _energyDisplay label: auto icon = Sprite::createWithSpriteFrameName ("health_icon.png"); icon->setPosition( Vec2(_screenSize.   width * 0.15f, _screenSize.height * 0.94f) ); _gameBatchNode->addChild(icon, kBackground); What just happened? We just created our first bitmap font object in Cocos2d-x. Now let's finish creating our game's sprites. Time for action – adding the final screen sprites The last sprites we need to create are the clouds, the bomb and shockwave, and our game state messages. Back to the createGameScreen method, add the clouds to the screen: for (int i = 0; i < 4; i++) { float cloud_y = i % 2 == 0 ? _screenSize.height * 0.4f : _screenSize.height * 0.5f; auto cloud = Sprite::createWithSpriteFrameName("cloud.png"); cloud->setPosition(Vec2 (_screenSize.width * 0.1f + i * _screenSize.width * 0.3f, cloud_y)); _gameBatchNode->addChild(cloud, kBackground); _clouds.pushBack(cloud); } Create the _bomb sprite; players will grow when tapping the screen: _bomb = Sprite::createWithSpriteFrameName("bomb.png"); _bomb->getTexture()->generateMipmap(); _bomb->setVisible(false);   auto size = _bomb->getContentSize();   //add sparkle inside bomb sprite auto sparkle = Sprite::createWithSpriteFrameName("sparkle.png"); sparkle->setPosition(Vec2(size.width * 0.72f, size.height *   0.72f)); _bomb->addChild(sparkle, kMiddleground, kSpriteSparkle);   //add halo inside bomb sprite auto halo = Sprite::createWithSpriteFrameName   ("halo.png"); halo->setPosition(Vec2(size.width * 0.4f, size.height *   0.4f)); _bomb->addChild(halo, kMiddleground, kSpriteHalo); _gameBatchNode->addChild(_bomb, kForeground); Then create the _shockwave sprite that appears after the _bomb goes off: _shockWave = Sprite::createWithSpriteFrameName ("shockwave.png"); _shockWave->getTexture()->generateMipmap(); _shockWave->setVisible(false); _gameBatchNode->addChild(_shockWave); Finally, add the two messages that appear on the screen, one for our intro state and one for our gameover state: _introMessage = Sprite::createWithSpriteFrameName ("logo.png"); _introMessage->setPosition(Vec2   (_screenSize.width * 0.5f, _screenSize.height * 0.6f)); _introMessage->setVisible(true); this->addChild(_introMessage, kForeground);   _gameOverMessage = Sprite::createWithSpriteFrameName   ("gameover.png"); _gameOverMessage->setPosition(Vec2   (_screenSize.width * 0.5f, _screenSize.height * 0.65f)); _gameOverMessage->setVisible(false); this->addChild(_gameOverMessage, kForeground); What just happened? There is a lot of new information regarding sprites in the previous code. So let's go over it carefully: We started by adding the clouds. We put the sprites inside a vector so we can move the clouds later. Notice that they are also part of our batch node. Next comes the bomb sprite and our first new call: _bomb->getTexture()->generateMipmap(); With this we are telling the framework to create antialiased copies of this texture in diminishing sizes (mipmaps), since we are going to scale it down later. This is optional of course; sprites can be resized without first generating mipmaps, but if you notice loss of quality in your scaled sprites, you can fix that by creating mipmaps for their texture. The texture must have size values in so-called POT (power of 2: 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, and so on). Textures in OpenGL must always be sized this way; when they are not, Cocos2d-x will do one of two things: it will either resize the texture in memory, adding transparent pixels until the image reaches a POT size, or stop the execution on an assert. With textures used for mipmaps, the framework will stop execution for non-POT textures. I add the sparkle and the halo sprites as children to the _bomb sprite. This will use the container characteristic of nodes to our advantage. When I grow the bomb, all its children will grow with it. Notice too that I use a third parameter to addChild for halo and sparkle: bomb->addChild(halo, kMiddleground, kSpriteHalo); This third parameter is an integer tag from yet another enumerated list declared in GameLayer.h. I can use this tag to retrieve a particular child from a sprite as follows: auto halo = (Sprite *)   bomb->getChildByTag(kSpriteHalo); We now have our game screen in place: Next come object pools. Time for action – creating our object pools The pools are just vectors of objects. And here are the steps to create them: Inside the createPools method, we first create a pool for meteors: void GameLayer::createPools() { int i; _meteorPoolIndex = 0; for (i = 0; i < 50; i++) { auto sprite = Sprite::createWithSpriteFrameName("meteor.png"); sprite->setVisible(false); _gameBatchNode->addChild(sprite, kMiddleground, kSpriteMeteor); _meteorPool.pushBack(sprite); } Then we create an object pool for health packs: _healthPoolIndex = 0; for (i = 0; i < 20; i++) { auto sprite = Sprite::createWithSpriteFrameName("health.png"); sprite->setVisible(false); sprite->setAnchorPoint(Vec2(0.5f, 0.8f)); _gameBatchNode->addChild(sprite, kMiddleground, kSpriteHealth); _healthPool.pushBack(sprite); } We'll use the corresponding pool index to retrieve objects from the vectors as the game progresses. What just happened? We now have a vector of invisible meteor sprites and a vector of invisible health sprites. We'll use their respective pool indices to retrieve these from the vector as needed as you'll see in a moment. But first we need to take care of actions and animations. With object pools, we reduce the number of instantiations during the main loop, and it allows us to never destroy anything that can be reused. But if you need to remove a child from a node, use ->removeChild or ->removeChildByTag if a tag is present. Actions in a nutshell If you remember, a node will store information about position, scale, rotation, visibility, and opacity of a node. And in Cocos2d-x, there is an Action class to change each one of these values over time, in effect animating these transformations. Actions are usually created with a static method create. The majority of these actions are time-based, so usually the first parameter you need to pass an action is the time length for the action. So for instance: auto fadeout = FadeOut::create(1.0f); This creates a fadeout action that will take one second to complete. You can run it on a sprite, or node, as follows: mySprite->runAction(fadeout); Cocos2d-x has an incredibly flexible system that allows us to create any combination of actions and transformations to achieve any effect we desire. You may, for instance, choose to create an action sequence (Sequence) that contains more than one action; or you can apply easing effects (EaseIn, EaseOut, and so on) to your actions. You can choose to repeat an action a certain number of times (Repeat) or forever (RepeatForever); and you can add callbacks to functions you want called once an action is completed (usually inside a Sequence action). Time for action – creating actions with Cocos2d-x Creating actions with Cocos2d-x is a very simple process: Inside our createActions method, we will instantiate the actions we can use repeatedly in our game. Let's create our first actions: void GameLayer::createActions() { //swing action for health drops auto easeSwing = Sequence::create( EaseInOut::create(RotateTo::create(1.2f, -10), 2), EaseInOut::create(RotateTo::create(1.2f, 10), 2), nullptr);//mark the end of a sequence with a nullptr _swingHealth = RepeatForever::create( (ActionInterval *) easeSwing ); _swingHealth->retain(); Actions can be combined in many different forms. Here, the retained _swingHealth action is a RepeatForever action of Sequence that will rotate the health sprite first one way, then the other, with EaseInOut wrapping the RotateTo action. RotateTo takes 1.2 seconds to rotate the sprite first to -10 degrees and then to 10. And the easing has a value of 2, which I suggest you experiment with to get a sense of what it means visually. Next we add three more actions: //action sequence for shockwave: fade out, callback when //done _shockwaveSequence = Sequence::create( FadeOut::create(1.0f), CallFunc::create(std::bind(&GameLayer::shockwaveDone, this)), nullptr); _shockwaveSequence->retain();   //action to grow bomb _growBomb = ScaleTo::create(6.0f, 1.0); _growBomb->retain();   //action to rotate sprites auto rotate = RotateBy::create(0.5f , -90); _rotateSprite = RepeatForever::create( rotate ); _rotateSprite->retain(); First, another Sequence. This will fade out the sprite and call the shockwaveDone function, which is already implemented in the class and turns the _shockwave sprite invisible when called. The last one is a RepeatForever action of a RotateBy action. In half a second, the sprite running this action will rotate -90 degrees and will do that again and again. What just happened? You just got your first glimpse of how to create actions in Cocos2d-x and how the framework allows for all sorts of combinations to accomplish any effect. It may be hard at first to read through a Sequence action and understand what's happening, but the logic is easy to follow once you break it down into its individual parts. But we are not done with the createActions method yet. Next come sprite animations. Animating a sprite in Cocos2d-x The key thing to remember is that an animation is just another type of action, one that changes the texture used by a sprite over a period of time. In order to create an animation action, you need to first create an Animation object. This object will store all the information regarding the different sprite frames you wish to use in the animation, the length of the animation in seconds, and whether it loops or not. With this Animation object, you then create a Animate action. Let's take a look. Time for action – creating animations Animations are a specialized type of action that require a few extra steps: Inside the same createActions method, add the lines for the two animations we have in the game. First, we start with the animation that shows an explosion when a meteor reaches the city. We begin by loading the frames into an Animation object: auto animation = Animation::create(); int i; for(i = 1; i <= 10; i++) { auto name = String::createWithFormat("boom%i.png", i); auto frame = SpriteFrameCache::getInstance()->getSpriteFrameByName(name->getCString()); animation->addSpriteFrame(frame); } Then we use the Animation object inside a Animate action: animation->setDelayPerUnit(1 / 10.0f); animation->setRestoreOriginalFrame(true); _groundHit = Sequence::create(    MoveBy::create(0, Vec2(0,_screenSize.height * 0.12f)),    Animate::create(animation),    CallFuncN::create(CC_CALLBACK_1(GameLayer::animationDone, this)), nullptr); _groundHit->retain(); The same steps are repeated to create the other explosion animation used when the player hits a meteor or a health pack. animation = Animation::create(); for(int i = 1; i <= 7; i++) { auto name = String::createWithFormat("explosion_small%i.png", i); auto frame = SpriteFrameCache::getInstance()->getSpriteFrameByName(name->getCString()); animation->addSpriteFrame(frame); }   animation->setDelayPerUnit(0.5 / 7.0f); animation->setRestoreOriginalFrame(true); _explosion = Sequence::create(      Animate::create(animation),    CallFuncN::create(CC_CALLBACK_1(GameLayer::animationDone, this)), nullptr); _explosion->retain(); What just happened? We created two instances of a very special kind of action in Cocos2d-x: Animate. Here is what we did: First, we created an Animation object. This object holds the references to all the textures used in the animation. The frames were named in such a way that they could easily be concatenated inside a loop (boom1, boom2, boom3, and so on). There are 10 frames for the first animation and seven for the second. The textures (or frames) are SpriteFrame objects we grab from SpriteFrameCache, which as you remember, contains all the information from the sprite_sheet.plist data file. So the frames are in our sprite sheet. Then when all frames are in place, we determine the delay of each frame by dividing the total amount of seconds we want the animation to last by the total number of frames. The setRestoreOriginalFrame method is important here. If we set setRestoreOriginalFrame to true, then the sprite will revert to its original appearance once the animation is over. For example, if I have an explosion animation that will run on a meteor sprite, then by the end of the explosion animation, the sprite will revert to displaying the meteor texture. Time for the actual action. Animate receives the Animation object as its parameter. (In the first animation, we shift the position of the sprite just before the explosion appears, so there is an extra MoveBy method.) And in both instances, I make a call to an animationDone callback already implemented in the class. It makes the calling sprite invisible: void GameLayer::animationDone (Node* pSender) { pSender->setVisible(false); } We could have used the same method for both callbacks (animationDone and shockwaveDone) as they accomplish the same thing. But I wanted to show you a callback that receives as an argument, the node that made the call and one that did not. Respectively, these are CallFuncN and CallFunc, and were used inside the action sequences we just created. Time to make our game tick! Okay, we have our main elements in place and are ready to add the final bit of logic to run the game. But how will everything work? We will use a system of countdowns to add new meteors and new health packs, as well as a countdown that will incrementally make the game harder to play. On touch, the player will start the game if the game is not running, and also add bombs and explode them during gameplay. An explosion creates a shockwave. On update, we will check against collision between our _shockwave sprite (if visible) and all our falling objects. And that's it. Cocos2d-x will take care of all the rest through our created actions and callbacks! So let's implement our touch events first. Time for action – handling touches Time to bring the player to our party: Time to implement our onTouchBegan method. We'll begin by handling the two game states, intro and game over: bool GameLayer::onTouchBegan (Touch * touch, Event * event){   //if game not running, we are seeing either intro or //gameover if (!_running) {    //if intro, hide intro message    if (_introMessage->isVisible()) {      _introMessage->setVisible(false);        //if game over, hide game over message    } else if (_gameOverMessage->isVisible()) {      SimpleAudioEngine::getInstance()->stopAllEffects();      _gameOverMessage->setVisible(false);         }       this->resetGame();    return true; } Here we check to see if the game is not running. If not, we check to see if any of our messages are visible. If _introMessage is visible, we hide it. If _gameOverMessage is visible, we stop all current sound effects and hide the message as well. Then we call a method called resetGame, which will reset all the game data (energy, score, and countdowns) to their initial values, and set _running to true. Next we handle the touches. But we only need to handle one each time so we use ->anyObject() on Set: auto touch = (Touch *)pTouches->anyObject();   if (touch) { //if bomb already growing... if (_bomb->isVisible()) {    //stop all actions on bomb, halo and sparkle    _bomb->stopAllActions();    auto child = (Sprite *) _bomb->getChildByTag(kSpriteHalo);    child->stopAllActions();    child = (Sprite *) _bomb->getChildByTag(kSpriteSparkle);    child->stopAllActions();       //if bomb is the right size, then create shockwave    if (_bomb->getScale() > 0.3f) {      _shockWave->setScale(0.1f);      _shockWave->setPosition(_bomb->getPosition());      _shockWave->setVisible(true);      _shockWave->runAction(ScaleTo::create(0.5f, _bomb->getScale() * 2.0f));      _shockWave->runAction(_shockwaveSequence->clone());      SimpleAudioEngine::getInstance()->playEffect("bombRelease.wav");      } else {      SimpleAudioEngine::getInstance()->playEffect("bombFail.wav");    }    _bomb->setVisible(false);    //reset hits with shockwave, so we can count combo hits    _shockwaveHits = 0; //if no bomb currently on screen, create one } else {    Point tap = touch->getLocation();    _bomb->stopAllActions();    _bomb->setScale(0.1f);    _bomb->setPosition(tap);    _bomb->setVisible(true);    _bomb->setOpacity(50);    _bomb->runAction(_growBomb->clone());         auto child = (Sprite *) _bomb->getChildByTag(kSpriteHalo);      child->runAction(_rotateSprite->clone());      child = (Sprite *) _bomb->getChildByTag(kSpriteSparkle);      child->runAction(_rotateSprite->clone()); } } If _bomb is visible, it means it's already growing on the screen. So on touch, we use the stopAllActions() method on the bomb and we use the stopAllActions() method on its children that we retrieve through our tags: child = (Sprite *) _bomb->getChildByTag(kSpriteHalo); child->stopAllActions(); child = (Sprite *) _bomb->getChildByTag(kSpriteSparkle); child->stopAllActions(); If _bomb is the right size, we start our _shockwave. If it isn't, we play a bomb failure sound effect; there is no explosion and _shockwave is not made visible. If we have an explosion, then the _shockwave sprite is set to 10 percent of the scale. It's placed at the same spot as the bomb, and we run a couple of actions on it: we grow the _shockwave sprite to twice the scale the bomb was when it went off and we run a copy of _shockwaveSequence that we created earlier. Finally, if no _bomb is currently visible on screen, we create one. And we run clones of previously created actions on the _bomb sprite and its children. When _bomb grows, its children grow. But when the children rotate, the bomb does not: a parent changes its children, but the children do not change their parent. What just happened? We just added part of the core logic of the game. It is with touches that the player creates and explodes bombs to stop meteors from reaching the city. Now we need to create our falling objects. But first, let's set up our countdowns and our game data. Time for action – starting and restarting the game Let's add the logic to start and restart the game. Let's write the implementation for resetGame: void GameLayer::resetGame(void) {    _score = 0;    _energy = 100;       //reset timers and "speeds"    _meteorInterval = 2.5;    _meteorTimer = _meteorInterval * 0.99f;    _meteorSpeed = 10;//in seconds to reach ground    _healthInterval = 20;    _healthTimer = 0;    _healthSpeed = 15;//in seconds to reach ground       _difficultyInterval = 60;    _difficultyTimer = 0;       _running = true;       //reset labels    _energyDisplay->setString(std::to_string((int) _energy) + "%");    _scoreDisplay->setString(std::to_string((int) _score)); } Next, add the implementation of stopGame: void GameLayer::stopGame() {       _running = false;       //stop all actions currently running    int i;    int count = (int) _fallingObjects.size();       for (i = count-1; i >= 0; i--) {        auto sprite = _fallingObjects.at(i);        sprite->stopAllActions();        sprite->setVisible(false);        _fallingObjects.erase(i);    }    if (_bomb->isVisible()) {        _bomb->stopAllActions();        _bomb->setVisible(false);        auto child = _bomb->getChildByTag(kSpriteHalo);        child->stopAllActions();        child = _bomb->getChildByTag(kSpriteSparkle);        child->stopAllActions();    }    if (_shockWave->isVisible()) {        _shockWave->stopAllActions();        _shockWave->setVisible(false);    }    if (_ufo->isVisible()) {        _ufo->stopAllActions();        _ufo->setVisible(false);        auto ray = _ufo->getChildByTag(kSpriteRay);       ray->stopAllActions();        ray->setVisible(false);    } } What just happened? With these methods we control gameplay. We start the game with default values through resetGame(), and we stop all actions with stopGame(). Already implemented in the class is the method that makes the game more difficult as time progresses. If you take a look at the method (increaseDifficulty) you will see that it reduces the interval between meteors and reduces the time it takes for meteors to reach the ground. All we need now is the update method to run the countdowns and check for collisions. Time for action – updating the game We already have the code that updates the countdowns inside the update. If it's time to add a meteor or a health pack we do it. If it's time to make the game more difficult to play, we do that too. It is possible to use an action for these timers: a Sequence action with a Delay action object and a callback. But there are advantages to using these countdowns. It's easier to reset them and to change them, and we can take them right into our main loop. So it's time to add our main loop: What we need to do is check for collisions. So add the following code: if (_shockWave->isVisible()) { count = (int) _fallingObjects.size(); for (i = count-1; i >= 0; i--) {    auto sprite = _fallingObjects.at(i);    diffx = _shockWave->getPositionX() - sprite->getPositionX();    diffy = _shockWave->getPositionY() - sprite->getPositionY();    if (pow(diffx, 2) + pow(diffy, 2) <= pow(_shockWave->getBoundingBox().size.width * 0.5f, 2)) {    sprite->stopAllActions();    sprite->runAction( _explosion->clone());    SimpleAudioEngine::getInstance()->playEffect("boom.wav");    if (sprite->getTag() == kSpriteMeteor) {      _shockwaveHits++;      _score += _shockwaveHits * 13 + _shockwaveHits * 2;    }    //play sound    _fallingObjects.erase(i); } } _scoreDisplay->setString(std::to_string(_score)); } If _shockwave is visible, we check the distance between it and each sprite in _fallingObjects vector. If we hit any meteors, we increase the value of the _shockwaveHits property so we can award the player for multiple hits. Next we move the clouds: //move clouds for (auto sprite : _clouds) { sprite->setPositionX(sprite->getPositionX() + dt * 20); if (sprite->getPositionX() > _screenSize.width + sprite->getBoundingBox().size.width * 0.5f)    sprite->setPositionX(-sprite->getBoundingBox().size.width * 0.5f); } I chose not to use a MoveTo action for the clouds to show you the amount of code that can be replaced by a simple action. If not for Cocos2d-x actions, we would have to implement logic to move, rotate, swing, scale, and explode all our sprites! And finally: if (_bomb->isVisible()) {    if (_bomb->getScale() > 0.3f) {      if (_bomb->getOpacity() != 255)        _bomb->setOpacity(255);    } } We give the player an extra visual cue to when a bomb is ready to explode by changing its opacity. What just happened? The main loop is pretty straightforward when you don't have to worry about updating individual sprites, as our actions take care of that for us. We pretty much only need to run collision checks between our sprites, and to determine when it's time to throw something new at the player. So now the only thing left to do is grab the meteors and health packs from the pools when their timers are up. So let's get right to it. Time for action – retrieving objects from the pool We just need to use the correct index to retrieve the objects from their respective vector: To retrieve meteor sprites, we'll use the resetMeteor method: void GameLayer::resetMeteor(void) {    //if too many objects on screen, return    if (_fallingObjects.size() > 30) return;       auto meteor = _meteorPool.at(_meteorPoolIndex);      _meteorPoolIndex++;    if (_meteorPoolIndex == _meteorPool.size())      _meteorPoolIndex = 0;      int meteor_x = rand() % (int) (_screenSize.width * 0.8f) + _screenSize.width * 0.1f;    int meteor_target_x = rand() % (int) (_screenSize.width * 0.8f) + _screenSize.width * 0.1f;       meteor->stopAllActions();    meteor->setPosition(Vec2(meteor_x, _screenSize.height + meteor->getBoundingBox().size.height * 0.5));    //create action    auto rotate = RotateBy::create(0.5f , -90);    auto repeatRotate = RepeatForever::create( rotate );    auto sequence = Sequence::create (                MoveTo::create(_meteorSpeed, Vec2(meteor_target_x, _screenSize.height * 0.15f)),                CallFunc::create(std::bind(&GameLayer::fallingObjectDone, this, meteor) ), nullptr);   meteor->setVisible ( true ); meteor->runAction(repeatRotate); meteor->runAction(sequence); _fallingObjects.pushBack(meteor); } We grab the next available meteor from the pool, then we pick a random start and end x value for its MoveTo action. The meteor starts at the top of the screen and will move to the bottom towards the city, but the x value is randomly picked each time. We rotate the meteor inside a RepeatForever action, and we use Sequence to move the sprite to its target position and then call back fallingObjectDone when the meteor has reached its target. We finish by adding the new meteor we retrieved from the pool to the _fallingObjects vector so we can check collisions with it. The method to retrieve the health (resetHealth) sprites is pretty much the same, except that swingHealth action is used instead of rotate. You'll find that method already implemented in GameLayer.cpp. What just happened? So in resetGame we set the timers, and we update them in the update method. We use these timers to add meteors and health packs to the screen by grabbing the next available one from their respective pool, and then we proceed to run collisions between an exploding bomb and these falling objects. Notice that in both resetMeteor and resetHealth we don't add new sprites if too many are on screen already: if (_fallingObjects->size() > 30) return; This way the game does not get ridiculously hard, and we never run out of unused objects in our pools. And the very last bit of logic in our game is our fallingObjectDone callback, called when either a meteor or a health pack has reached the ground, at which point it awards or punishes the player for letting sprites through. When you take a look at that method inside GameLayer.cpp, you will notice how we use ->getTag() to quickly ascertain which type of sprite we are dealing with (the one calling the method): if (pSender->getTag() == kSpriteMeteor) { If it's a meteor, we decrease energy from the player, play a sound effect, and run the explosion animation; an autorelease copy of the _groundHit action we retained earlier, so we don't need to repeat all that logic every time we need to run this action. If the item is a health pack, we increase the energy or give the player some points, play a nice sound effect, and hide the sprite. Play the game! We've been coding like mad, and it's finally time to run the game. But first, don't forget to release all the items we retained. In GameLayer.cpp, add our destructor method: GameLayer::~GameLayer () {       //release all retained actions    CC_SAFE_RELEASE(_growBomb);    CC_SAFE_RELEASE(_rotateSprite);    CC_SAFE_RELEASE(_shockwaveSequence);    CC_SAFE_RELEASE(_swingHealth);    CC_SAFE_RELEASE(_groundHit);    CC_SAFE_RELEASE(_explosion);    CC_SAFE_RELEASE(_ufoAnimation);    CC_SAFE_RELEASE(_blinkRay);       _clouds.clear();    _meteorPool.clear();    _healthPool.clear();    _fallingObjects.clear(); } The actual game screen will now look something like this: Now, let's take this to Android. Time for action – running the game in Android Follow these steps to deploy the game to Android: This time, there is no need to alter the manifest because the default settings are the ones we want. So, navigate to proj.android and then to the jni folder and open the Android.mk file in a text editor. Edit the lines in LOCAL_SRC_FILES to read as follows: LOCAL_SRC_FILES := hellocpp/main.cpp \                    ../../Classes/AppDelegate.cpp \                    ../../Classes/GameLayer.cpp Follow the instructions from the HelloWorld and AirHockey examples to import the game into Eclipse. Save it and run your application. This time, you can try out different size screens if you have the devices. What just happened? You just ran a universal app in Android. And nothing could have been simpler. Summary In my opinion, after nodes and all their derived objects, actions are the second best thing about Cocos2d-x. They are time savers and can quickly spice things up in any project with professional-looking animations. And I hope with the examples found in this article, you will be able to create any action you need with Cocos2d-x. Resources for Article: Further resources on this subject: Animations in Cocos2d-x [article] Moving the Space Pod Using Touch [article] Cocos2d-x: Installation [article]
Read more
  • 0
  • 0
  • 6889

article-image-geolocating-photos-map
Packt
25 Mar 2015
7 min read
Save for later

Geolocating photos on the map

Packt
25 Mar 2015
7 min read
In this article by Joel Lawhead, author of the book, QGIS Python Programming Cookbook uses the tags to create locations on a map for some photos and provide links to open them. (For more resources related to this topic, see here.) Getting ready You will need to download some sample geotagged photos from https://github.com/GeospatialPython/qgis/blob/gh-pages/photos.zip?raw=true and place them in a directory named photos in your qgis_data directory. How to do it... QGIS requires the Python Imaging Library (PIL), which should already be included with your installation. PIL can parse EXIF tags. We will gather the filenames of the photos, parse the location information, convert it to decimal degrees, create the point vector layer, add the photo locations, and add an action link to the attributes. To do this, we need to perform the following steps: In the QGIS Python Console, import the libraries that we'll need, including k for parsing image data and the glob module for doing wildcard file searches: import globimport Imagefrom ExifTags import TAGS Next, we'll create a function that can parse the header data: def exif(img):   exif_data = {}   try:         i = Image.open(img)       tags = i._getexif()       for tag, value in tags.items():         decoded = TAGS.get(tag, tag)           exif_data[decoded] = value   except:       passreturn exif_data Now, we'll create a function that can convert degrees-minute-seconds to decimal degrees, which is how coordinates are stored in JPEG images: def dms2dd(d, m, s, i):   sec = float((m * 60) + s)   dec = float(sec / 3600)   deg = float(d + dec)   if i.upper() == 'W':       deg = deg * -1   elif i.upper() == 'S':       deg = deg * -1   return float(deg) Next, we'll define a function to parse the location data from the header data: def gps(exif):   lat = None   lon = None   if exif['GPSInfo']:             # Lat       coords = exif['GPSInfo']       i = coords[1]       d = coords[2][0][0]       m = coords[2][1][0]       s = coords[2][2][0]       lat = dms2dd(d, m ,s, i)       # Lon       i = coords[3]       d = coords[4][0][0]       m = coords[4][1][0]       s = coords[4][2][0]       lon = dms2dd(d, m ,s, i)return lat, lon Next, we'll loop through the photos directory, get the filenames, parse the location information, and build a simple dictionary to store the information, as follows: photos = {}photo_dir = "/Users/joellawhead/qgis_data/photos/"files = glob.glob(photo_dir + "*.jpg")for f in files:   e = exif(f)   lat, lon = gps(e) photos[f] = [lon, lat] Now, we'll set up the vector layer for editing: lyr_info = "Point?crs=epsg:4326&field=photo:string(75)"vectorLyr = QgsVectorLayer(lyr_info, "Geotagged Photos" , "memory")vpr = vectorLyr.dataProvider() We'll add the photo details to the vector layer: features = []for pth, p in photos.items():   lon, lat = p   pnt = QgsGeometry.fromPoint(QgsPoint(lon,lat))   f = QgsFeature()   f.setGeometry(pnt)   f.setAttributes([pth])   features.append(f)vpr.addFeatures(features)vectorLyr.updateExtents() Now, we can add the layer to the map and make the active layer: QgsMapLayerRegistry.instance().addMapLayer(vectorLyr)iface.setActiveLayer(vectorLyr)activeLyr = iface.activeLayer() Finally, we'll add an action that allows you to click on it and open the photo: actions = activeLyr.actions()actions.addAction(QgsAction.OpenUrl, "Photos", '[% "photo" %]') How it works... Using the included PIL EXIF parser, getting location information and adding it to a vector layer is relatively straightforward. This action is a default option for opening a URL. However, you can also use Python expressions as actions to perform a variety of tasks. The following screenshot shows an example of the data visualization and photo popup: There's more... Another plugin called Photo2Shape is available, but it requires you to install an external EXIF tag parser. Image change detection Change detection allows you to automatically highlight the differences between two images in the same area if they are properly orthorectified. We'll do a simple difference change detection on two images, which are several years apart, to see the differences in urban development and the natural environment. Getting ready You can download the two images from https://github.com/GeospatialPython/qgis/blob/gh-pages/change-detection.zip?raw=true and put them in a directory named change-detection in the rasters directory of your qgis_data directory. Note that the file is 55 megabytes, so it may take several minutes to download. How to do it... We'll use the QGIS raster calculator to subtract the images in order to get the difference, which will highlight significant changes. We'll also add a color ramp shader to the output in order to visualize the changes. To do this, we need to perform the following steps: First, we need to import the libraries that we need in to the QGIS console: from PyQt4.QtGui import *from PyQt4.QtCore import *from qgis.analysis import * Now, we'll set up the path names and raster names for our images: before = "/Users/joellawhead/qgis_data/rasters/change-detection/before.tif"|after = "/Users/joellawhead/qgis_data/rasters/change-detection/after.tif"beforeName = "Before"afterName = "After" Next, we'll establish our images as raster layers: beforeRaster = QgsRasterLayer(before, beforeName)afterRaster = QgsRasterLayer(after, afterName) Then, we can build the calculator entries: beforeEntry = QgsRasterCalculatorEntry()afterEntry = QgsRasterCalculatorEntry()beforeEntry.raster = beforeRasterafterEntry.raster = afterRasterbeforeEntry.bandNumber = 1afterEntry.bandNumber = 2beforeEntry.ref = beforeName + "@1"afterEntry.ref = afterName + "@2"entries = [afterEntry, beforeEntry] Now, we'll set up the simple expression that does the math for remote sensing: exp = "%s - %s" % (afterEntry.ref, beforeEntry.ref) Then, we can set up the output file path, the raster extent, and pixel width and height: output = "/Users/joellawhead/qgis_data/rasters/change-detection/change.tif"e = beforeRaster.extent()w = beforeRaster.width()h = beforeRaster.height() Now, we perform the calculation: change = QgsRasterCalculator(exp, output, "GTiff", e, w, h, entries)change.processCalculation() Finally, we'll load the output as a layer, create the color ramp shader, apply it to the layer, and add it to the map, as shown here: lyr = QgsRasterLayer(output, "Change")algorithm = QgsContrastEnhancement.StretchToMinimumMaximumlimits = QgsRaster.ContrastEnhancementMinMaxlyr.setContrastEnhancement(algorithm, limits)s = QgsRasterShader()c = QgsColorRampShader()c.setColorRampType(QgsColorRampShader.INTERPOLATED)i = []qri = QgsColorRampShader.ColorRampItemi.append(qri(0, QColor(0,0,0,0), 'NODATA'))i.append(qri(-101, QColor(123,50,148,255), 'Significant Itensity Decrease'))i.append(qri(-42.2395, QColor(194,165,207,255), 'Minor Itensity Decrease'))i.append(qri(16.649, QColor(247,247,247,0), 'No Change'))i.append(qri(75.5375, QColor(166,219,160,255), 'Minor Itensity Increase'))i.append(qri(135, QColor(0,136,55,255), 'Significant Itensity Increase'))c.setColorRampItemList(i)s.setRasterShaderFunction(c)ps = QgsSingleBandPseudoColorRenderer(lyr.dataProvider(), 1, s)lyr.setRenderer(ps)QgsMapLayerRegistry.instance().addMapLayer(lyr) How it works... If a building is added in the new image, it will be brighter than its surroundings. If a building is removed, the new image will be darker in that area. The same holds true for vegetation, to some extent. Summary The concept is simple. We subtract the older image data from the new image data. Concentrating on urban areas tends to be highly reflective and results in higher image pixel values. Resources for Article: Further resources on this subject: Prototyping Arduino Projects using Python [article] Python functions – Avoid repeating code [article] Pentesting Using Python [article]
Read more
  • 0
  • 0
  • 3254

article-image-wlan-encryption-flaws
Packt
25 Mar 2015
21 min read
Save for later

WLAN Encryption Flaws

Packt
25 Mar 2015
21 min read
In this article by Cameron Buchanan, author of the book Kali Linux Wireless Penetration Testing Beginner's Guide. (For more resources related to this topic, see here.) "640K is more memory than anyone will ever need."                                                                             Bill Gates, Founder, Microsoft Even with the best of intentions, the future is always unpredictable. The WLAN committee designed WEP and then WPA to be foolproof encryption mechanisms but, over time, both these mechanisms had flaws that have been widely publicized and exploited in the real world. WLAN encryption mechanisms have had a long history of being vulnerable to cryptographic attacks. It started with WEP in early 2000, which eventually was completely broken. In recent times, attacks are slowly targeting WPA. Even though there is no public attack available currently to break WPA in all general conditions, there are attacks that are feasible under special circumstances. In this section, we will take a look at the following topics: Different encryption schemas in WLANs Cracking WEP encryption Cracking WPA encryption   WLAN encryption WLANs transmit data over the air and thus there is an inherent need to protect data confidentiality. This is best done using encryption. The WLAN committee (IEEE 802.11) formulated the following protocols for data encryption: Wired Equivalent Privacy (WEP) Wi-Fi Protected Access (WPA) Wi-Fi Protection Access v2 (WPAv2) In this section, we will take a look at each of these encryption protocols and demonstrate various attacks against them. WEP encryption The WEP protocol was known to be flawed as early as 2000 but, surprisingly, it is still continuing to be used and access points still ship with WEP enabled capabilities. There are many cryptographic weaknesses in WEP and they were discovered by Walker, Arbaugh, Fluhrer, Martin, Shamir, KoreK, and many others. Evaluation of WEP from a cryptographic standpoint is beyond the scope, as it involves understanding complex math. In this section, we will take a look at how to break WEP encryption using readily available tools on the BackTrack platform. This includes the entire aircrack-ng suite of tools—airmon-ng, aireplay-ng, airodump-ng, aircrack-ng, and others. The fundamental weakness in WEP is its use of RC4 and a short IV value that is recycled every 224 frames. While this is a large number in itself, there is a 50 percent chance of four reuses every 5,000 packets. To use this to our advantage, we generate a large amount of traffic so that we can increase the likelihood of IVs that have been reused and thus compare two cipher texts encrypted with the same IV and key. Let's now first set up WEP in our test lab and see how we can break it. Time for action – cracking WEP Follow the given instructions to get started: Let's first connect to our access point Wireless Lab and go to the settings area that deals with wireless encryption mechanisms: On my access point, this can be done by setting the Security Mode to WEP. We will also need to set the WEP key length. As shown in the following screenshot, I have set WEP to use 128bit keys. I have set the default key to WEP Key 1 and the value in hex to abcdefabcdefabcdefabcdef12 as the 128-bit WEP key. You can set this to whatever you choose: Once the settings are applied, the access point should now be offering WEP as the encryption mechanism of choice. Let's now set up the attacker machine. Let's bring up Wlan0 by issuing the following command: ifconfig wlan0 up Then, we will run the following command: airmon-ng start wlan0 This is done so as to create mon0, the monitor mode interface, as shown in the following screenshot. Verify that the mon0 interface has been created using the iwconfig command: Let's run airodump-ng to locate our lab access point using the following command: airodump-ng mon0 As you can see in the following screenshot, we are able to see the Wireless Lab access point running WEP: For this exercise, we are only interested in the Wireless Lab, so let's enter the following command to only see packets for this network: airodump-ng –bssid 00:21:91:D2:8E:25 --channel 11 --write WEPCrackingDemo mon0 The preceding command line is shown in the following screenshot: We will request airodump-ng to save the packets into a pcap file using the --write directive: Now let's connect our wireless client to the access point and use the WEP key as abcdefabcdefabcdefabcdef12. Once the client has successfully connected, airodump-ng should report it on the screen. If you do an ls in the same directory, you will be able to see files prefixed with WEPCrackingDemo-*, as shown in the following screenshot. These are traffic dump files created by airodump-ng: If you notice the airodump-ng screen, the number of data packets listed under the #Data column is very few in number (only 68). In WEP cracking, we need a large number of data packets, encrypted with the same key to exploit weaknesses in the protocol. So, we will have to force the network to produce more data packets. To do this, we will use the aireplay-ng tool: We will capture ARP packets on the wireless network using Aireplay-ng and inject them back into the network to simulate ARP responses. We will be starting Aireplay-ng in a separate window, as shown in the next screenshot. Replaying these packets a few thousand times, we will generate a lot of data traffic on the network. Even though Aireplay-ng does not know the WEP key, it is able to identify the ARP packets by looking at the size of the packets. ARP is a fixed header protocol; thus, the size of the ARP packets can be easily determined and can be used to identify them even within encrypted traffic. We will run aireplay-ng with the options that are discussed next. The -3 option is for ARP replay, -b specifies the BSSID of our network, and -h specifies the client MAC address that we are spoofing. We need to do this, as replay attacks will only work for authenticated and associated client MAC addresses: Very soon you should see that aireplay-ng was able to sniff ARP packets and started replaying them into the network. If you encounter channel-related errors as I did, append –ignore-negative-one to your command, as shown in the following screenshot: At this point, airodump-ng will also start registering a lot of data packets. All these sniffed packets are being stored in the WEPCrackingDemo-* files that we saw previously: Now let's start with the actual cracking part! We fire up aircrack-ng with the option WEPCRackingDemo-0*.cap in a new window. This will start the aircrack-ng software and it will begin working on cracking the WEP key using the data packets in the file. Note that it is a good idea to have Airodump-ng collect the WEP packets, aireplay-ng do the replay attack, and aircrack-ng attempt to crack the WEP key based on the captured packets, all at the same time. In this experiment, all of them are open in separate windows. Your screen should look like the following screenshot when aircrack-ng is working on the packets to crack the WEP key: The number of data packets required to crack the key is nondeterministic, but generally in the order of a hundred thousand or more. On a fast network (or using aireplay-ng), this should take 5-10 minutes at most. If the number of data packets currently in the file is not sufficient, then aircrack-ng will pause, as shown in the following screenshot, and wait for more packets to be captured; it will then restart the cracking process: Once enough data packets have been captured and processed, aircrack-ng should be able to break the key. Once it does, it proudly displays it in the terminal and exits, as shown in the following screenshot: It is important to note that WEP is totally flawed and any WEP key (no matter how complex) will be cracked by Aircrack-ng. The only requirement is that a large enough number of data packets, encrypted with this key, are made available to aircrack-ng. What just happened? We set up WEP in our lab and successfully cracked the WEP key. In order to do this, we first waited for a legitimate client of the network to connect to the access point. After this, we used the aireplay-ng tool to replay ARP packets into the network. This caused the network to send ARP replay packets, thus greatly increasing the number of data packets sent over the air. We then used the aircrack-ng tool to crack the WEP key by analyzing cryptographic weaknesses in these data packets. Note that we can also fake an authentication to the access point using the Shared Key Authentication bypass technique. This can come in handy if the legitimate client leaves the network. This will ensure that we can spoof an authentication and association and continue to send our replayed packets into the network. Have a go hero – fake authentication with WEP cracking In the previous exercise, if the legitimate client had suddenly logged off the network, we would not have been able to replay the packets as the access point will refuse to accept packets from un-associated clients. While WEP cracking is going on. Log off the legitimate client from the network and verify that you are still able to inject packets into the network and whether the access point accepts and responds to them. WPA/WPA2 WPA( or WPA v1 as it is referred to sometimes) primarily uses the TKIP encryption algorithm. TKIP was aimed at improving WEP, without requiring completely new hardware to run it. WPA2 in contrast mandatorily uses the AES-CCMP algorithm for encryption, which is much more powerful and robust than TKIP. Both WPA and WPA2 allow either EAP-based authentication, using RADIUS servers (Enterprise) or a Pre-Shared key (PSK) (personal)-based authentication schema. WPA/WPA2 PSK is vulnerable to a dictionary attack. The inputs required for this attack are the four-way WPA handshake between client and access point, and a wordlist that contains common passphrases. Then, using tools such as Aircrack-ng, we can try to crack the WPA/WPA2 PSK passphrase. An illustration of the four-way handshake is shown in the following screenshot: The way WPA/WPA2 PSK works is that it derives the per-session key, called the Pairwise Transient Key (PTK), using the Pre-Shared Key and five other parameters—SSID of Network, Authenticator Nounce (ANounce), Supplicant Nounce (SNounce), Authenticator MAC address (Access Point MAC), and Suppliant MAC address (Wi-Fi Client MAC). This key is then used to encrypt all data between the access point and client. An attacker who is eavesdropping on this entire conversation by sniffing the air can get all five parameters mentioned in the previous paragraph. The only thing he does not have is the Pre-Shared Key. So, how is the Pre-Shared Key created? It is derived by using the WPA-PSK passphrase supplied by the user, along with the SSID. The combination of both of these is sent through the Password-Based Key Derivation Function (PBKDF2), which outputs the 256-bit shared key. In a typical WPA/WPA2 PSK dictionary attack, the attacker would use a large dictionary of possible passphrases with the attack tool. The tool would derive the 256-bit Pre-Shared key from each of the passphrases and use it with the other parameters, described earlier, to create the PTK. The PTK will be used to verify the Message Integrity Check (MIC) in one of the handshake packets. If it matches, then the guessed passphrase from the dictionary was correct; if not, it was incorrect. Eventually, if the authorized network passphrase exists in the dictionary, it will be identified. This is exactly how WPA/WPA2 PSK cracking works! The following figure illustrates the steps involved: In the next exercise, we will take a look at how to crack a WPA PSK wireless network. The exact same steps will be involved in cracking a WPA2-PSK network using CCMP(AES) as well. Time for action – cracking WPA-PSK weak passphrases Follow the given instructions to get started: Let's first connect to our access point Wireless Lab and set the access point to use WPA-PSK. We will set the WPA-PSK passphrase to abcdefgh so that it is vulnerable to a dictionary attack: We start airodump-ng with the following command so that it starts capturing and storing all packets for our network: airodump-ng –bssid 00:21:91:D2:8E:25 –channel 11 –write WPACrackingDemo mon0" The following screenshot shows the output: Now we can wait for a new client to connect to the access point so that we can capture the four-way WPA handshake, or we can send a broadcast deauthentication packet to force clients to reconnect. We do the latter to speed things up. The same thing can happen again with the unknown channel error. Again, use –-ignore-negative-one. This can also require more than one attempt: As soon as we capture a WPA handshake, the airodump-ng tool will indicate it in the top-right corner of the screen with a WPA handshake followed by the access point's BSSID. If you are using –ignore-negative-one, the tool may replace the WPA handshake with a fixed channel message. Just keep an eye out for a quick flash of a WPA handshake. We can stop the airodump-ng utility now. Let's open up the cap file in Wireshark and view the four-way handshake. Your Wireshark terminal should look like the following screenshot. I have selected the first packet of the four-way handshake in the trace file in the screenshot. The handshake packets are the one whose protocol is EAPOL: Now we will start the actual key cracking exercise! For this, we need a dictionary of common words. Kali ships with many dictionary files in the metasploit folder located as shown in the following screenshot. It is important to note that, in WPA cracking, you are just as good as your dictionary. BackTrack ships with some dictionaries, but these may be insufficient. Passwords that people choose depend on a lot of things. This includes things such as which country users live in, common names and phrases in that region the, security awareness of the users, and a host of other things. It may be a good idea to aggregate country- and region-specific word lists, when undertaking a penetration test: We will now invoke the aircrack-ng utility with the pcap file as the input and a link to the dictionary file, as shown in the following screenshot. I have used nmap.lst , as shown in the terminal: aircrack-ng uses the dictionary file to try various combinations of passphrases and tries to crack the key. If the passphrase is present in the dictionary file, it will eventually crack it and your screen will look similar to the one in the screenshot: Please note that, as this is a dictionary attack, the prerequisite is that the passphrase must be present in the dictionary file you are supplying to aircrack-ng. If the passphrase is not present in the dictionary, the attack will fail! What just happened? We set up WPA-PSK on our access point with a common passphrase: abcdefgh. We then use a deauthentication attack to have legitimate clients reconnect to the access point. When we reconnect, we capture the four-way WPA handshake between the access point and the client. As WPA-PSK is vulnerable to a dictionary attack, we feed the capture file that contains the WPA four-way handshake and a list of common passphrases (in the form of a wordlist) to Aircrack-ng. As the passphrase abcdefgh is present in the wordlist, Aircrack-ng is able to crack the WPA-PSK shared passphrase. It is very important to note again that, in WPA dictionary-based cracking, you are just as good as the dictionary you have. Thus, it is important to compile a large and elaborate dictionary before you begin. Though BackTrack ships with its own dictionary, it may be insufficient at times and might need more words, especially taking into account the localization factor. Have a go hero – trying WPA-PSK cracking with Cowpatty Cowpatty is a tool that can also crack a WPA-PSK passphrase using a dictionary attack. This tool is included with BackTrack. I leave it as an exercise for you to use Cowpatty to crack the WPA-PSK passphrase. Also, set an uncommon passphrase that is not present in the dictionary and try the attack again. You will now be unsuccessful in cracking the passphrase with both Aircrack-ng and Cowpatty. It is important to note that the same attack applies even to a WPA2 PSK network. I encourage you to verify this independently. Speeding up WPA/WPA2 PSK cracking As we have already seen in the previous section, if we have the correct passphrase in our dictionary, cracking WPA-Personal will work every time like a charm. So, why don't we just create a large elaborate dictionary of millions of common passwords and phrases people use? This would help us a lot and most of the time, we would end up cracking the passphrase. It all sounds great but we are missing one key component here— the time taken. One of the more CPU and time-consuming calculations is that of the Pre-Shared key using the PSK passphrase and the SSID through the PBKDF2. This function hashes the combination of both over 4,096 times before outputting the 256-bit Pre-Shared key. The next step in cracking involves using this key along with parameters in the four-way handshake and verifying against the MIC in the handshake. This step is computationally inexpensive. Also, the parameters will vary in the handshake every time and hence, this step cannot be precomputed. Thus, to speed up the cracking process, we need to make the calculation of the Pre-Shared key from the passphrase as fast as possible. We can speed this up by precalculating the Pre-Shared Key, also called the Pairwise Master Key (PMK) in 802.11 standard parlance. It is important to note that, as the SSID is also used to calculate the PMK, with the same passphrase and with a different SSID, we will end up with a different PMK. Thus, the PMK depends on both the passphrase and the SSID. In the next exercise, we will take a look at how to precalculate the PMK and use it for WPA/WPA2 PSK cracking. Time for action – speeding up the cracking process We can proceed with the following steps: We can precalculate the PMK for a given SSID and wordlist using the genpmk tool with the following command: genpmk –f <chosen wordlist>–d PMK-Wireless-Lab –s "Wireless Lab This creates the PMK-Wireless-Lab file containing the pregenerated PMK: We now create a WPA-PSK network with the passphrase abcdefgh (present in the dictionary we used) and capture a WPA-handshake for that network. We now use Cowpatty to crack the WPA passphrase, as shown in the following screenshot: It takes approximately 7.18 seconds for Cowpatty to crack the key, using the precalculated PMKs. We now use aircrack-ng with the same dictionary file, and the cracking process takes over 22 minutes. This shows how much we are gaining because of the precalculation. In order to use these PMKs with aircrack-ng, we need to use a tool called airolib-ng. We will give it the options airolib-ng, PMK-Aircrack --import,and cowpatty PMK-Wireless-Lab, where PMK-Aircrack is the aircrack-ng compatible database to be created and PMK-Wireless-Lab is the genpmk compliant PMK database that we created previously. We now feed this database to aircrack-ng and the cracking process speeds up remarkably. We use the following command: aircrack-ng –r PMK-Aircrack WPACrackingDemo2-01.cap There are additional tools available on BackTrack such as Pyrit that can leverage multi CPU systems to speed up cracking. We give the pcap filename with the -r option and the genpmk compliant PMK file with the -i option. Even on the same system used with the previous tools, Pyrit takes around 3 seconds to crack the key, using the same PMK file created using genpmk. What just happened? We looked at various different tools and techniques to speed up WPA/WPA2-PSK cracking. The whole idea is to pre-calculate the PMK for a given SSID and a list of passphrases in our dictionary. Decrypting WEP and WPA packets In all the exercises we have done till now, we cracked the WEP and WPA keys using various techniques. What do we do with this information? The first step is to decrypt data packets we have captured using these keys. In the next exercise, we will decrypt the WEP and WPA packets in the same trace file that we captured over the air, using the keys we cracked. Time for action – decrypting WEP and WPA packets We can proceed with the following steps: We will decrypt packets from the WEP capture file we created earlier: WEPCrackingDemo-01.cap. For this, we will use another tool in the Aircrack-ng suite called airdecap-ng. We will run the following command, as shown in the following screenshot, using the WEP key we cracked previously: airdecap-ng -w abcdefabcdefabcdefabcdef12 WEPCrackingDemo-02.cap The decrypted files are stored in a file named WEPCrackingDemo-02-dec.cap. We use the tshark utility to view the first ten packets in the file. Please note that you may see something different based on what you captured: WPA/WPA2 PSK will work in exactly the same way as with WEP, using the airdecap-ng utility, as shown in the following screenshot, with the following command: airdecap-ng –p abdefg WPACrackingDemo-02.cap –e "Wireless Lab" What just happened? We just saw how we can decrypt WEP and WPA/WPA2-PSK encrypted packets using Airdecap-ng. It is interesting to note that we can do the same using Wireshark. We would encourage you to explore how this can be done by consulting the Wireshark documentation. Connecting to WEP and WPA networks We can also connect to the authorized network after we have cracked the network key. This can come in handy during penetration testing. Logging onto the authorized network with the cracked key is the ultimate proof you can provide to your client that his network is insecure. Time for action – connecting to a WEP network We can proceed with the following steps: Use the iwconfig utility to connect to a WEP network, once you have the key. In a past exercise, we broke the WEP key—abcdefabcdefabcdefabcdef12: What just happened? We saw how to connect to a WEP network. Time for action – connecting to a WPA network We can proceed with the following steps: In the case of WPA, the matter is a bit more complicated. The iwconfig utility cannot be used with WPA/WPA2 Personal and Enterprise, as it does not support it. We will use a new tool called WPA_supplicant for this lab. To use WPA_supplicant for a network, we will need to create a configuration file, as shown in the following screenshot. We will name this file wpa-supp.conf: We will then invoke the WPA_supplicant utility with the following options: -D wext -i wlan0 –c wpa-supp.conf to connect to the WPA network we just cracked. Once the connection is successful, WPA_supplicant will give you the message: Connection to XXXX completed. For both the WEP and WPA networks, once you are connected, you can use dhcpclient to grab a DHCP address from the network by typing dhclient3 wlan0. What just happened? The default Wi-Fi utility iwconfig cannot be used to connect to WPA/WPA2 networks. The de-facto tool for this is WPA_Supplicant. In this lab, we saw how we can use it to connect to a WPA network. Summary In this section, we learnt about WLAN encryption. WEP is flawed and no matter what the WEP key is, with enough data packet samples: it is always possible to crack WEP. WPA/WPA2 is cryptographically un-crackable currently; however, under special circumstances, such as when a weak passphrase is chosen in WPA/WPA2-PSK, it is possible to retrieve the passphrase using dictionary attacks. Resources for Article: Further resources on this subject: Veil-Evasion [article] Wireless and Mobile Hacks [article] Social Engineering Attacks [article]
Read more
  • 0
  • 0
  • 11009

article-image-cross-browser-tests-using-selenium-webdriver
Packt
25 Mar 2015
18 min read
Save for later

Cross-browser Tests using Selenium WebDriver

Packt
25 Mar 2015
18 min read
In this article by Prashanth Sams, author of the book Selenium Essentials, helps you to perform efficient compatibility tests. Here, we will also learn about how to run tests on cloud. You will cover the following topics in the article: Selenium WebDriver compatibility tests Selenium cross-browser tests on cloud Selenium headless browser testing (For more resources related to this topic, see here.) Selenium WebDriver compatibility tests Selenium WebDriver handles browser compatibility tests on almost every popular browser, including Chrome, Firefox, Internet Explorer, Safari, and Opera. In general, every browser's JavaScript engine differs from the others, and each browser interprets the HTML tags differently. The WebDriver API drives the web browser as the real user would drive it. By default, FirefoxDriver comes with the selenium-server-standalone.jar library added; however, for Chrome, IE, Safari, and Opera, there are libraries that need to be added or instantiated externally. Let's see how we can instantiate each of the following browsers through its own driver: Mozilla Firefox: The selenium-server-standalone library is bundled with FirefoxDriver to initialize and run tests in a Firefox browser. FirefoxDriver is added to the Firefox profile as a file extension on starting a new instance of FirefoxDriver. Please check the Firefox versions and its suitable drivers at http://selenium.googlecode.com/git/java/CHANGELOG. The following is the code snippet to kick start Mozilla Firefox: WebDriver driver = new FirefoxDriver(); Google Chrome: Unlike FirefoxDriver, the ChromeDriver is an external library file that makes use of WebDriver's wire protocol to run Selenium tests in a Google Chrome web browser. The following is the code snippet to kick start Google Chrome: System.setProperty("webdriver.chrome.driver","C:\chromedriver.exe"); WebDriver driver = new ChromeDriver(); To download ChromeDriver, refer to http://chromedriver.storage.googleapis.com/index.html. Internet Explorer: IEDriverServer is an executable file that uses the WebDriver wire protocol to control the IE browser in Windows. Currently, IEDriverServer supports the IE versions 6, 7, 8, 9, and 10. The following code snippet helps you to instantiate IEDriverServer: System.setProperty("webdriver.ie.driver","C:\IEDriverServer.exe"); DesiredCapabilities dc = DesiredCapabilities.internetExplorer(); dc.setCapability(InternetExplorerDriver.INTRODUCE_FLAKINESS_BY_IGNORING_SECURITY_DOMAINS, true); WebDriver driver = new InternetExplorerDriver(dc); To download IEDriverServer, refer to http://selenium-release.storage.googleapis.com/index.html. Apple Safari: Similar to FirefoxDriver, SafariDriver is internally bound with the latest Selenium servers, which starts the Apple Safari browser without any external library. SafariDriver supports the Safari browser versions 5.1.x and runs only on MAC. For more details, refer to http://elementalselenium.com/tips/69-safari. The following code snippet helps you to instantiate SafariDriver: WebDriver driver = new SafariDriver(); Opera: OperaPrestoDriver (formerly called OperaDriver) is available only for Presto-based Opera browsers. Currently, it does not support Opera versions 12.x and above. However, the recent releases (Opera 15.x and above) of Blink-based Opera browsers are handled using OperaChromiumDriver. For more details, refer to https://github.com/operasoftware/operachromiumdriver. The following code snippet helps you to instantiate OperaChrumiumDriver: DesiredCapabilities capabilities = new DesiredCapabilities(); capabilities.setCapability("opera.binary", "C://Program Files (x86)//Opera//opera.exe"); capabilities.setCapability("opera.log.level", "CONFIG"); WebDriver driver = new OperaDriver(capabilities); To download OperaChromiumDriver, refer to https://github.com/operasoftware/operachromiumdriver/releases. TestNG TestNG (Next Generation) is one of the most widely used unit-testing frameworks implemented for Java. It runs Selenium-based browser compatibility tests with the most popular browsers. The Eclipse IDE users must ensure that the TestNG plugin is integrated with the IDE manually. However, the TestNG plugin is bundled with IntelliJ IDEA as default. The testng.xml file is a TestNG build file to control test execution; the XML file can run through Maven tests using POM.xmlwith the help of the following code snippet: <plugin>    <groupId>org.apache.maven.plugins</groupId>    <artifactId>maven-surefire-plugin</artifactId>    <version>2.12.2</version>    <configuration>      <suiteXmlFiles>      <suiteXmlFile>testng.xml</suiteXmlFile>      </suiteXmlFiles>    </configuration> </plugin> To create a testng.xml file, right-click on the project folder in the Eclipse IDE, navigate to TestNG | Convert to TestNG, and click on Convert to TestNG, as shown in the following screenshot: The testng.xml file manages the entire tests; it acts as a mini data source by passing the parameters directly into the test methods. The location of the testng.xml file is hsown in the following screenshot: As an example, create a Selenium project (for example, Selenium Essentials) along with the testng.xml file, as shown in the previous screenshot. Modify the testng.xml file with the following tags: <?xml version="1.0" encoding="UTF-8" ?> <!DOCTYPE suite SYSTEM "http://testng.org/testng-1.0.dtd"> <suite name="Suite" verbose="3" parallel="tests" thread-count="5">   <test name="Test on Firefox">    <parameter name="browser" value="Firefox" />    <classes>      <class name="package.classname" />      </classes> </test>   <test name="Test on Chrome">    <parameter name="browser" value="Chrome" />    <classes>    <class name="package.classname" />    </classes> </test>   <test name="Test on InternetExplorer">    <parameter name="browser" value="InternetExplorer" />    <classes>       <class name="package.classname" />    </classes> </test>   <test name="Test on Safari">    <parameter name="browser" value="Safari" />    <classes>      <class name="package.classname" />    </classes> </test>   <test name="Test on Opera">    <parameter name="browser" value="Opera" />    <classes>      <class name="package.classname" />    </classes> </test> </suite> <!-- Suite --> Download all the external drivers except FirefoxDriver and SafariDriver, extract the zipped folders, and locate the external drivers in the test script as mentioned in the preceding snippets for each browser. The following Java snippet will explain to you how you can get parameters directly from the testng.xml file and how you can run cross-browser tests as a whole: @BeforeTest @Parameters({"browser"}) public void setUp(String browser) throws MalformedURLException { if (browser.equalsIgnoreCase("Firefox")) {    System.out.println("Running Firefox");    driver = new FirefoxDriver(); } else if (browser.equalsIgnoreCase("chrome")) {    System.out.println("Running Chrome"); System.setProperty("webdriver.chrome.driver", "C:\chromedriver.exe");    driver = new ChromeDriver(); } else if (browser.equalsIgnoreCase("InternetExplorer")) {    System.out.println("Running Internet Explorer"); System.setProperty("webdriver.ie.driver", "C:\IEDriverServer.exe"); DesiredCapabilities dc = DesiredCapabilities.internetExplorer();    dc.setCapability (InternetExplorerDriver.INTRODUCE_FLAKINESS_BY_IGNORING_SECURITY_DOMAINS, true); //If IE fail to work, please remove this line and remove enable protected mode for all the 4 zones from Internet options    driver = new InternetExplorerDriver(dc); } else if (browser.equalsIgnoreCase("safari")) {    System.out.println("Running Safari");    driver = new SafariDriver(); } else if (browser.equalsIgnoreCase("opera")) {    System.out.println("Running Opera"); // driver = new OperaDriver();       --Use this if the location is set properly--    DesiredCapabilities capabilities = new DesiredCapabilities(); capabilities.setCapability("opera.binary", "C://Program Files (x86)//Opera//opera.exe");    capabilities.setCapability("opera.log.level", "CONFIG");    driver = new OperaDriver(capabilities); } } SafariDriver is not yet stable. A few of the major issues in SafariDriver are as follows: SafariDriver won't work properly in Windows SafariDriver does not support modal dialog box interaction You cannot navigate forward or backwards in browser history through SafariDriver Selenium cross-browser tests on the cloud The ability to automate Selenium tests on the cloud is quite interesting, with instant access to real devices. Sauce Labs, BrowserStack, and TestingBot are the leading web-based tools used for cross-browser compatibility checking. These tools contain unique test automation features, such as diagnosing failures through screenshots and video, executing parallel tests, running Appium mobile automation tests, executing tests on internal local servers, and so on. SauceLabs SauceLabs is the standard Selenium test automation web app to do cross-browser compatibility tests on the cloud. It lets you automate tests on your favorite programming languages using test frameworks such as JUnit, TestNG, Rspec, and many more. SauceLabs cloud tests can also be executed from the Selenium Builder IDE interface. Check for the available SauceLabs devices, OS, and platforms at https://saucelabs.com/platforms. Access the websitefrom your web browser, log in, and obtain the Sauce username and Access Key. Make use of the obtained credentials to drive tests over the SauceLabs cloud. SauceLabs creates a new instance of the virtual machine while launching the tests. Parallel automation tests are also possible using SauceLabs. The following is a Java program to run tests over the SauceLabs cloud: package packagename;   import java.net.URL; import org.openqa.selenium.remote.DesiredCapabilities; import org.openqa.selenium.remote.RemoteWebDriver; import java.lang.reflect.*;   public class saucelabs {   private WebDriver driver;   @Parameters({"username", "key", "browser", "browserVersion"}) @BeforeMethod public void setUp(@Optional("yourusername") String username,                     @Optional("youraccesskey") String key,                      @Optional("iphone") String browser,                      @Optional("5.0") String browserVersion,                      Method method) throws Exception {   // Choose the browser, version, and platform to test DesiredCapabilities capabilities = new DesiredCapabilities(); capabilities.setBrowserName(browser); capabilities.setCapability("version", browserVersion); capabilities.setCapability("platform", Platform.MAC); capabilities.setCapability("name", method.getName()); // Create the connection to SauceLabs to run the tests this.driver = new RemoteWebDriver( new URL("http://" + username + ":" + key + "@ondemand.saucelabs.com:80/wd/hub"), capabilities); }   @Test public void Selenium_Essentials() throws Exception {    // Make the browser get the page and check its title driver.get("http://www.google.com"); System.out.println("Page title is: " + driver.getTitle()); Assert.assertEquals("Google", driver.getTitle()); WebElement element = driver.findElement(By.name("q")); element.sendKeys("Selenium Essentials"); element.submit(); } @AfterMethod public void tearDown() throws Exception { driver.quit(); } } SauceLabs has a setup similar to BrowserStack on test execution and generates detailed logs. The breakpoints feature allows the user to manually take control over the virtual machine and pause tests, which helps the user to investigate and debug problems. By capturing JavaScript's console log, the JS errors and network requests are displayed for quick diagnosis while running tests against Google Chrome browser. BrowserStack BrowserStack is a cloud-testing web app to access virtual machines instantly. It allows users to perform multi-browser testing of their applications on different platforms. It provides a setup similar to SauceLabs for cloud-based automation using Selenium. Access the site https://www.browserstack.com from your web browser, log in, and obtain the BrowserStack username and Access Key. Make use of the obtained credentials to drive tests over the BrowserStack cloud. For example, the following generic Java program with TestNG provides a detailed overview of the process that runs on the BrowserStack cloud. Customize the browser name, version, platform, and so on, using capabilities. Let's see the Java program we just talked about: package packagename;   import org.openqa.selenium.remote.DesiredCapabilities; import org.openqa.selenium.remote.RemoteWebDriver;   public class browserstack {   public static final String USERNAME = "yourusername"; public static final String ACCESS_KEY = "youraccesskey"; public static final String URL = "http://" + USERNAME + ":" + ACCESS_KEY + "@hub.browserstack.com/wd/hub";   private WebDriver driver;   @BeforeClass public void setUp() throws Exception {    DesiredCapabilities caps = new DesiredCapabilities();    caps.setCapability("browser", "Firefox");    caps.setCapability("browser_version", "23.0");    caps.setCapability("os", "Windows");    caps.setCapability("os_version", "XP");    caps.setCapability("browserstack.debug", "true"); //This enable Visual Logs      driver = new RemoteWebDriver(new URL(URL), caps); }   @Test public void testOnCloud() throws Exception {    driver.get("http://www.google.com");    System.out.println("Page title is: " + driver.getTitle());    Assert.assertEquals("Google", driver.getTitle());    WebElement element = driver.findElement(By.name("q"));    element.sendKeys("seleniumworks");    element.submit(); }   @AfterClass public void tearDown() throws Exception {    driver.quit(); } } The app generates and stores test logs for the user to access anytime. The generated logs provide a detailed analysis with step-by-step explanations. To enhance the test speed, run parallel Selenium tests on the BrowserStack cloud; however, the automation plan has to be upgraded to increase the number of parallel test runs. TestingBot TestingBot also provides a setup similar to BrowserStack and SauceLabs for cloud-based cross-browser test automation using Selenium. It records a video of the running tests to analyze problems and debug. Additionally, it provides support to capture the screenshots on test failure. To run local Selenium tests, it provides an SSH tunnel tool that lets you run tests against local servers or other web servers. TestingBot uses Amazon's cloud infrastructure to run Selenium scripts in various browsers. Access the site https://testingbot.com/, log in, and obtain Client Key and Client Secret from your TestingBot account. Make use of the obtained credentials to drive tests over the TestingBot cloud. Let's see an example Java test program with TestNG using the Eclipse IDE that runs on the TestingBot cloud: package packagename;   import java.net.URL;   import org.openqa.selenium.remote.DesiredCapabilities; import org.openqa.selenium.remote.RemoteWebDriver;   public class testingbot { private WebDriver driver;   @BeforeClass public void setUp() throws Exception { DesiredCapabilitiescapabillities = DesiredCapabilities.firefox();    capabillities.setCapability("version", "24");    capabillities.setCapability("platform", Platform.WINDOWS);    capabillities.setCapability("name", "testOnCloud");    capabillities.setCapability("screenshot", true);    capabillities.setCapability("screenrecorder", true);    driver = new RemoteWebDriver( new URL ("http://ClientKey:ClientSecret@hub.testingbot.com:4444/wd/hub"), capabillities); }   @Test public void testOnCloud() throws Exception {    driver.get      ("http://www.google.co.in/?gws_rd=cr&ei=zS_mUryqJoeMrQf-yICYCA");    driver.findElement(By.id("gbqfq")).clear();    WebElement element = driver.findElement(By.id("gbqfq"));    element.sendKeys("selenium"); Assert.assertEquals("selenium - Google Search", driver.getTitle()); }   @AfterClass public void tearDown() throws Exception {    driver.quit(); } } Click on the Tests tab to check the log results. The logs are well organized with test steps, screenshots, videos, and a summary. Screenshots are captured on each and every step to make the tests more precise, as follows: capabillities.setCapability("screenshot", true); // screenshot capabillities.setCapability("screenrecorder", true); // video capture TestingBot provides a unique feature by scheduling and running tests directly from the site. The tests can be prescheduled to repeat tests any number of times on a daily or weekly basis. It's even more accurate on scheduling the test start time. You will be apprised of test failures with an alert through e-mail, an API call, an SMS, or a Prowl notification. This feature enables error handling to rerun failed tests automatically as per the user settings. Launch Selenium IDE, record tests, and save the test case or test suite in default format (HTML). Access the https://testingbot.com/ URL from your web browser and click on the Test Lab tab. Now, try to upload the already-saved Selenium test case, select the OS platform and browser name and version. Finally, save the settings and execute tests. The test results are recorded and displayed under Tests. Selenium headless browser testing A headless browser is a web browser without Graphical User Interface (GUI). It accesses and renders web pages but doesn't show them to any human being. A headless browser should be able to parse JavaScript. Currently, most of the systems encourage tests against headless browsers due to its efficiency and time-saving properties. PhantomJS and HTMLUnit are the most commonly used headless browsers. Capybara-webkit is another efficient headless WebKit for rails-based applications. PhantomJS PhantomJS is a headless WebKit scriptable with JavaScript API. It is generally used for headless testing of web applications that comes with built-in GhostDriver. Tests on PhantomJs are obviously fast since it has fast and native support for various web standards, such as DOM handling, CSS selector, JSON, canvas, and SVG. In general, WebKit is a layout engine that allows the web browsers to render web pages. Some of the browsers, such as Safari and Chrome, use WebKit. Apparently, PhantomJS is not a test framework; it is a headless browser that is used only to launch tests via a suitable test runner called GhostDriver. GhostDriver is a JS implementation of WebDriver Wire Protocol for PhantomJS; WebDriver Wire Protocol is a standard API that communicates with the browser. By default, the GhostDriver is embedded with PhantomJS. To download PhantomJS, refer to http://phantomjs.org/download.html. Download PhantomJS, extract the zipped file (for example, phantomjs-1.x.x-windows.zip for Windows) and locate the phantomjs.exe folder. Add the following imports to your test code: import org.openqa.selenium.phantomjs.PhantomJSDriver; import org.openqa.selenium.phantomjs.PhantomJSDriverService; import org.openqa.selenium.remote.DesiredCapabilities; Introduce PhantomJSDriver using capabilities to enable or disable JavaScript or to locate the phantomjs executable file path: DesiredCapabilities caps = new DesiredCapabilities(); caps.setCapability("takesScreenshot", true); caps.setJavascriptEnabled(true); // not really needed; JS is enabled by default caps.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY, "C:/phantomjs.exe"); WebDriver driver = new PhantomJSDriver(caps); Alternatively, PhantomJSDriver can also be initialized as follows: System.setProperty("phantomjs.binary.path", "/phantomjs.exe"); WebDriver driver = new PhantomJSDriver(); PhantomJS supports screen capture as well. Since PhantomJS is a WebKit and a real layout and rendering engine, it is feasible to capture a web page as a screenshot. It can be set as follows: caps.setCapability("takesScreenshot", true); The following is the test snippet to capture a screenshot on test run: File scrFile = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE); FileUtils.copyFile(scrFile, new File("c:\sample.jpeg"),true); For example, check the following test program for more details: package packagename;   import java.io.File; import java.util.concurrent.TimeUnit; import org.apache.commons.io.FileUtils; import org.openqa.selenium.*; import org.openqa.selenium.phantomjs.PhantomJSDriver;   public class phantomjs { private WebDriver driver; private String baseUrl;   @BeforeTest public void setUp() throws Exception { System.setProperty("phantomjs.binary.path", "/phantomjs.exe");    driver = new PhantomJSDriver();    baseUrl = "https://www.google.co.in"; driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS); }   @Test public void headlesstest() throws Exception { driver.get(baseUrl + "/"); driver.findElement(By.name("q")).sendKeys("selenium essentials"); File scrFile = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE); FileUtils.copyFile(scrFile, new File("c:\screen_shot.jpeg"), true); }   @AfterTest public void tearDown() throws Exception {    driver.quit(); } } HTMLUnitDriver HTMLUnit is a headless (GUI-less) browser written in Java and is typically used for testing. HTMLUnitDriver, which is based on HTMLUnit, is the fastest and most lightweight implementation of WebDriver. It runs tests using a plain HTTP request, which is quicker than launching a browser and executes tests way faster than other drivers. The HTMLUnitDriver is added to the latest Selenium servers (2.35 or above). The JavaScript engine used by HTMLUnit (Rhino) is unique and different from any other popular browsers available on the market. HTMLUnitDriver supports JavaScript and is platform independent. By default, the JavaScript support for HTMLUnitDriver is disabled. Enabling JavaScript in HTMLUnitDriver slows down the test execution; however, it is advised to enable JavaScript support because most of the modern sites are Ajax-based web apps. By enabling JavaScript, it also throws a number of JavaScript warning messages in the console during test execution. The following snippet lets you enable JavaScript for HTMLUnitDriver: HtmlUnitDriver driver = new HtmlUnitDriver(); driver.setJavascriptEnabled(true); // enable JavaScript The following line of code is an alternate way to enable JavaScript: HtmlUnitDriver driver = new HtmlUnitDriver(true); The following piece of code lets you handle a transparent proxy using HTMLUnitDriver: HtmlUnitDriver driver = new HtmlUnitDriver(); driver.setProxy("xxx.xxx.xxx.xxx", port); // set proxy for handling Transparent Proxy driver.setJavascriptEnabled(true); // enable JavaScript [this emulate IE's js by default] HTMLUnitDriver can emulate the popular browser's JavaScript in a better way. By default, HTMLUnitDriver emulates IE's JavaScript. For example, to handle the Firefox web browser with version 17, use the following snippet: HtmlUnitDriver driver = new HtmlUnitDriver(BrowserVersion.FIREFOX_17); driver.setJavascriptEnabled(true); Here is the snippet to emulate a specific browser's JavaScript using capabilities: DesiredCapabilities capabilities = DesiredCapabilities.htmlUnit(); driver = new HtmlUnitDriver(capabilities); DesiredCapabilities capabilities = DesiredCapabilities.firefox(); capabilities.setBrowserName("Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"); capabilities.setVersion("24.0"); driver = new HtmlUnitDriver(capabilities); Summary In this article, you learned to perform efficient compatibility tests and also learned about how to run tests on cloud. Resources for Article: Further resources on this subject: Selenium Testing Tools [article] First Steps with Selenium RC [article] Quick Start into Selenium Tests [article]
Read more
  • 0
  • 0
  • 6603
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at €18.99/month. Cancel anytime
article-image-prerequisites
Packt
25 Mar 2015
6 min read
Save for later

Prerequisites

Packt
25 Mar 2015
6 min read
In this article by Deepak Vohra, author of the book, Advanced Java® EE Development with WildFly® you will see how to create a Java EE project and its pre-requisites. (For more resources related to this topic, see here.) The objective of the EJB 3.x specification is to simplify its development by improving the EJB architecture. This simplification is achieved by providing metadata annotations to replace XML configuration. It also provides default configuration values by making entity and session beans POJOs (Plain Old Java Objects) and by making component and home interfaces redundant. The EJB 2.x entity beans is replaced with EJB 3.x entities. EJB 3.0 also introduced the Java Persistence API (JPA) for object-relational mapping of Java objects. WildFly 8.x supports EJB 3.2 and the JPA 2.1 specifications from Java EE 7. The sample application is based on Java EE 6 and EJB 3.1. The configuration of EJB 3.x with Java EE 7 is also discussed and the sample application can be used or modified to run on a Java EE 7 project. We have used a Hibernate 4.3 persistence provider. Unlike some of the other persistence providers, the Hibernate persistence provider supports automatic generation of relational database tables including the joining of tables. In this article, we will create an EJB 3.x project. This article has the following topics: Setting up the environment Creating a WildFly runtime Creating a Java EE project Setting up the environment We need to download and install the following software: WildFly 8.1.0.Final: Download wildfly-8.1.0.Final.zip from http://wildfly.org/downloads/. MySQL 5.6 Database-Community Edition: Download this edition from http://dev.mysql.com/downloads/mysql/. When installing MySQL, also install Connector/J. Eclipse IDE for Java EE Developers: Download Eclipse Luna from https://www.eclipse.org/downloads/packages/release/Luna/SR1. JBoss Tools (Luna) 4.2.0.Final: Install this as a plug-in to Eclipse from the Eclipse Marketplace (http://tools.jboss.org/downloads/installation.html). The latest version from Eclipse Marketplace is likely to be different than 4.2.0. Apache Maven: Download version 3.05 or higher from http://maven.apache.org/download.cgi. Java 7: Download Java 7 from http://www.oracle.com/technetwork/java/javase/downloads/index.html?ssSourceSiteId=ocomcn. Set the environment variables: JAVA_HOME, JBOSS_HOME, MAVEN_HOME, and MYSQL_HOME. Add %JAVA_HOME%/bin, %MAVEN_HOME%/bin, %JBOSS_HOME%/bin, and %MYSQL_HOME%/bin to the PATH environment variable. The environment settings used are C:wildfly-8.1.0.Final for JBOSS_HOME, C:Program FilesMySQLMySQL Server 5.6.21 for MYSQL_HOME, C:mavenapache-maven-3.0.5 for MAVEN_HOME, and C:Program FilesJavajdk1.7.0_51 for JAVA_HOME. Run the add-user.bat script from the %JBOSS_HOME%/bin directory to create a user for the WildFly administrator console. When prompted What type of user do you wish to add?, select a) Management User. The other option is b) Application User. Management User is used to log in to Administration Console, and Application User is used to access applications. Subsequently, specify the Username and Password for the new user. When prompted with the question, Is this user going to be used for one AS process to connect to another AS..?, enter the answer as no. When installing and configuring the MySQL database, specify a password for the root user (the password mysql is used in the sample application). Creating a WildFly runtime As the application is run on WildFly 8.1, we need to create a runtime environment for WildFly 8.1 in Eclipse. Select Window | Preferences in Eclipse. In Preferences, select Server | Runtime Environment. Click on the Add button to add a new runtime environment, as shown in the following screenshot: In New Server Runtime Environment, select JBoss Community | WildFly 8.x Runtime. Click on Next: In WildFly Application Server 8.x, which appears below New Server Runtime Environment, specify a Name for the new runtime or choose the default name, which is WildFly 8.x Runtime. Select the Home Directory for the WildFly 8.x server using the Browse button. The Home Directory is the directory where WildFly 8.1 is installed. The default path is C:wildfly-8.1.0.Final. Select the Runtime JRE as JavaSE-1.7. If the JDK location is not added to the runtime list, first add it from the JRE preferences screen in Eclipse. In Configuration base directory, select standalone as the default setting. In Configuration file, select standalone.xml as the default setting. Click on Finish: A new server runtime environment for WildFly 8.x Runtime gets created, as shown in the following screenshot. Click on OK: Creating a Server Runtime Environment for WildFly 8.x is a prerequisite for creating a Java EE project in Eclipse. In the next topic, we will create a new Java EE project for an EJB 3.x application. Creating a Java EE project JBoss Tools provides project templates for different types of JBoss projects. In this topic, we will create a Java EE project for an EJB 3.x application. Select File | New | Other in Eclipse IDE. In the New wizard, select the JBoss Central | Java EE EAR Project wizard. Click on the Next button: The Java EE EAR Project wizard gets started. By default, a Java EE 6 project is created. A Java EE EAR Project is a Maven project. The New Project Example window lists the requirements and runs a test for the requirements. The JBoss AS runtime is required and some plugins (including the JBoss Maven Tools plugin) are required for a Java EE project. Select Target Runtime as WildFly 8.x Runtime, which was created in the preceding topic. Then, check the Create a blank project checkbox. Click on the Next button: Specify Project name as jboss-ejb3, Package as org.jboss.ejb3, and tick the Use default Workspace location box. Click on the Next button: Specify Group Id as org.jboss.ejb3, Artifact Id as jboss-ejb3, Version as 1.0.0, and Package as org.jboss.ejb3.model. Click on Finish: A Java EE project gets created, as shown in the following Project Explorer window. The jboss-ejb3 project consists of three subprojects: jboss-ejb3-ear, jboss-ejb3-ejb, and jboss-ejb3-web. Each subproject consists of a pom.xml file for Maven. The jboss-ejb3-ejb subproject consists of a META-INF/persistence.xml file within the src/main/resources source folder for the JPA database persistence configuration. Summary In this article, we learned how to create a Java EE project and its prerequisites. Resources for Article: Further resources on this subject: Common performance issues [article] Running our first web application [article] Various subsystem configurations [article]
Read more
  • 0
  • 0
  • 5560

article-image-handling-image-and-video-files
Packt
24 Mar 2015
18 min read
Save for later

Handling Image and Video Files

Packt
24 Mar 2015
18 min read
This article written by Gloria Bueno García, Oscar Deniz Suarez, José Luis Espinosa Aranda, Jesus Salido Tercero, Ismael Serrano Gracia, and Noelia Vállez Enanois, the authors of Learning Image Processing with OpenCV, is intended as a first contact with OpenCV, its installation, and first basic programs. We will cover the following topics: A brief introduction to OpenCV for the novice, followed by an easy step-by-step guide to the installation of the library A quick tour of OpenCV's structure after the installation in the user's local disk Quick recipes to create projects using the library with some common programming frameworks How to use the functions to read and write images and videos (For more resources related to this topic, see here.) An introduction to OpenCV Initially developed by Intel, OpenCV (Open Source Computer Vision) is a free cross-platform library for real-time image processing that has become a de facto standard tool for all things related to Computer Vision. The first version was released in 2000 under BSD license and since then, its functionality has been very much enriched by the scientific community. In 2012, the nonprofit foundation OpenCV.org took on the task of maintaining a support site for developers and users. At the time of writing this article, a new major version of OpenCV (Version 3.0) is available, still on beta status. Throughout the article, we will present the most relevant changes brought with this new version. OpenCV is available for the most popular operating systems, such as GNU/Linux, OS X, Windows, Android, iOS, and some more. The first implementation was in the C programming language; however, its popularity grew with its C++ implementation as of Version 2.0. New functions are programmed with C++. However, nowadays, the library has a full interface for other programming languages, such as Java, Python, and MATLAB/Octave. Also, wrappers for other languages (such as C#, Ruby, and Perl) have been developed to encourage adoption by programmers. In an attempt to maximize the performance of computing intensive vision tasks, OpenCV includes support for the following: Multithreading on multicore computers using Threading Building Blocks (TBB)—a template library developed by Intel. A subset of Integrated Performance Primitives (IPP) on Intel processors to boost performance. Thanks to Intel, these primitives are freely available as of Version 3.0 beta. Interfaces for processing on Graphic Processing Unit (GPU) using Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL). The applications for OpenCV cover areas such as segmentation and recognition, 2D and 3D feature toolkits, object identification, facial recognition, motion tracking, gesture recognition, image stitching, high dynamic range (HDR) imaging, augmented reality, and so on. Moreover, to support some of the previous application areas, a module with statistical machine learning functions is included. Downloading and installing OpenCV OpenCV is freely available for download at http://opencv.org. This site provides the last version for distribution (currently, 3.0 beta) and older versions. Special care should be taken with possible errors when the downloaded version is a nonstable release, for example, the current 3.0 beta version. On http://opencv.org/releases.html, suitable versions of OpenCV for each platform can be found. The code and information of the library can be obtained from different repositories depending on the final purpose: The main repository (at http://sourceforge.net/projects/opencvlibrary), devoted to final users. It contains binary versions of the library and ready-to-compile sources for the target platform. The test data repository (at https://github.com/itseez/opencv_extra) with sets of data to test purposes of some library modules. The contributions repository (at http://github.com/itseez/opencv_contrib) with the source code corresponding to extra and cutting-edge features supplied by contributors. This code is more error-prone and less tested than the main trunk. With the last version, OpenCV 3.0 beta, the extra contributed modules are not included in the main package. They should be downloaded separately and explicitly included in the compilation process through the proper options. Be cautious if you include some of those contributed modules, because some of them have dependencies on third-party software not included with OpenCV. The documentation site (at http://docs.opencv.org/master/) for each of the modules, including the contributed ones. The development repository (at https://github.com/Itseez/opencv) with the current development version of the library. It is intended for developers of the main features of the library and the "impatient" user who wishes to use the last update even before it is released. Rather than GNU/Linux and OS X, where OpenCV is distributed as source code only, in the Windows distribution, one can find precompiled (with Microsoft Visual C++ v10, v11, and v12) versions of the library. Each precompiled version is ready to be used with Microsoft compilers. However, if the primary intention is to develop projects with a different compiler framework, we need to compile the library for that specific compiler (for example, GNU GCC). The fastest route to working with OpenCV is to use one of the precompiled versions included with the distribution. Then, a better choice is to build a fine-tuned version of the library with the best settings for the local platform used for software development. This article provides the information to build and install OpenCV on Windows. Further information to set the library on Linux can be found at http://docs.opencv.org/doc/tutorials/introduction/linux_install and https://help.ubuntu.com/community/OpenCV. Getting a compiler and setting CMake A good choice for cross-platform development with OpenCV is to use the GNU toolkit (including gmake, g++, and gdb). The GNU toolkit can be easily obtained for the most popular operating systems. Our preferred choice for a development environment consists of the GNU toolkit and the cross-platform Qt framework, which includes the Qt library and the Qt Creator Integrated Development Environment(IDE). The Qt framework is freely available at http://qt-project.org/. After installing the compiler on Windows, remember to properly set the Path environment variable, adding the path for the compiler's executable, for example, C:QtQt5.2.15.2.1mingw48_32bin for the GNU/compilers included with the Qt framework. On Windows, the free Rapid Environment Editor tool (available at http://www.rapidee.com) provides a convenient way to change Path and other environment variables. To manage the build process for the OpenCV library in a compiler-independent way, CMake is the recommended tool. CMake is a free and open source cross-platform tool available at http://www.cmake.org/. Configuring OpenCV with CMake Once the sources of the library have been downloaded into the local disk, it is required that you configure the makefiles for the compilation process of the library. CMake is the key tool for an easy configuration of OpenCV's installation process. It can be used from the command line or in a more user-friendly way with its Graphical User Interface (GUI) version. The steps to configure OpenCV with CMake can be summarized as follows: Choose the source (let's call it OPENCV_SRC in what follows) and target (OPENCV_BUILD) directories. The target directory is where the compiled binaries will be located. Mark the Grouped and Advanced checkboxes and click on the Configure button. Choose the desired compiler (for example, GNU default compilers, MSVC, and so on). Set the preferred options and unset those not desired. Click on the Configure button and repeat steps 4 and 5 until no errors are obtained. Click on the Generate button and close CMake. The following screenshot shows you the main window of CMake with the source and target directories and the checkboxes to group all the available options:   The main window of CMake after the preconfiguration step For brevity, we use OPENCV_BUILD and OPENCV_SRC in this text to denote the target and source directories of the OpenCV local setup, respectively. Keep in mind that all directories should match your current local configuration. During the preconfiguration process, CMake detects the compilers present and many other local properties to set the build process of OpenCV. The previous screenshot displays the main CMake window after the preconfiguration process, showing the grouped options in red. It is possible to leave the default options unchanged and continue the configuration process. However, some convenient options can be set: BUILD_EXAMPLES: This is set to build some examples using OpenCV. BUILD_opencv_<module_name>: This is set to include the module (module_name) in the build process. OPENCV_EXTRA_MODULES_PATH: This is used when you need some extra contributed module; set the path for the source code of the extra modules here (for example, C:/opencv_contrib-master/modules). WITH_QT: This is turned on to include the Qt functionality into the library. WITH_IPP: This option is turned on by default. The current OpenCV 3.0 version includes a subset of the Intel Integrated Performance Primitives (IPP) that speed up the execution time of the library. If you compile the new OpenCV 3.0 (beta), be cautious because some unexpected errors have been reported related to the IPP inclusion (that is, with the default value of this option). We recommend that you unset the WITH_IPP option. If the configuration steps with CMake (loop through steps 4 and 5) don't produce any further errors, it is possible to generate the final makefiles for the build process. The following screenshot shows you the main window of CMake after a generation step without errors: Compiling and installing the library The next step after the generation process of makefiles with CMake is the compilation with the proper make tool. This tool is usually executed on the command line (the console) from the target directory (the one set at the CMake configuration step). For example, in Windows, the compilation should be launched from the command line as follows: OPENCV_BUILD>mingw32-make This command launches a build process using the makefiles generated by CMake. The whole compilation typically takes several minutes. If the compilation ends without errors, the installation continues with the execution of the following command: OPENCV_BUILD>mingw32-make install This command copies the OpenCV binaries to the OPENCV_BUILDinstall directory. If something went wrong during the compilation, we should run CMake again to change the options selected during the configuration. Then, we should regenerate the makefiles. The installation ends by adding the location of the library binaries (for example, in Windows, the resulting DLL files are located at OPENCV_BUILDinstallx64mingwbin) to the Path environment variable. Without this directory in the Path field, the execution of every OpenCV executable will give an error as the library binaries won't be found. To check the success of the installation process, it is possible to run some of the examples compiled along with the library (if the BUILD_EXAMPLES option was set using CMake). The code samples (written in C++) can be found at OPENCV_BUILDinstallx64mingwsamplescpp. The short instructions given to install OpenCV apply to Windows. A detailed description with the prerequisites for Linux can be read at http://docs.opencv.org/doc/tutorials/introduction/linux_install/linux_install.html. Although the tutorial applies to OpenCV 2.0, almost all the information is still valid for Version 3.0. The structure of OpenCV Once OpenCV is installed, the OPENCV_BUILDinstall directory will be populated with three types of files: Header files: These are located in the OPENCV_BUILDinstallinclude subdirectory and are used to develop new projects with OpenCV. Library binaries: These are static or dynamic libraries (depending on the option selected with CMake) with the functionality of each of the OpenCV modules. They are located in the bin subdirectory (for example, x64mingwbin when the GNU compiler is used). Sample binaries: These are executables with examples that use the libraries. The sources for these samples can be found in the source package (for example, OPENCV_SRCsourcessamples). OpenCV has a modular structure, which means that the package includes a static or dynamic (DLL) library for each module. The official documentation for each module can be found at http://docs.opencv.org/master/. The main modules included in the package are: core: This defines the basic functions used by all the other modules and the fundamental data structures including the important multidimensional array Mat. highgui: This provides simple user interface (UI) capabilities. Building the library with Qt support (the WITH_QT CMake option) allows UI compatibility with such a framework. imgproc: These are image processing functions that include filtering (linear and nonlinear), geometric transformations, color space conversion, histograms, and so on. imgcodecs: This is an easy-to-use interface to read and write images. Pay attention to the changes in modules since OpenCV 3.0 as some functionality has been moved to a new module (for example, reading and writing images functions were moved from highgui to imgcodecs). photo: This includes Computational Photography including inpainting, denoising, High Dynamic Range (HDR) imaging, and some others. stitching: This is used for image stitching. videoio: This is an easy-to-use interface for video capture and video codecs. video: This supplies the functionality of video analysis (motion estimation, background extraction, and object tracking). features2d: These are functions for feature detection (corners and planar objects), feature description, feature matching, and so on. objdetect: These are functions for object detection and instances of predefined detectors (such as faces, eyes, smile, people, cars, and so on). Some other modules are calib3d (camera calibration), flann (clustering and search), ml (machine learning), shape (shape distance and matching), superres (super resolution), video (video analysis), and videostab (video stabilization). As of Version 3.0 beta, the new contributed modules are distributed in a separate package (opencv_contrib-master.zip) that can be downloaded from https://github.com/itseez/opencv_contrib. These modules provide extra features that should be fully understood before using them. For a quick overview of the new functionality in the new release of OpenCV (Version 3.0), refer to the document at http://opencv.org/opencv-3-0-beta.html. Creating user projects with OpenCV In this article, we assume that C++ is the main language for programming image processing applications, although interfaces and wrappers for other programming languages are actually provided (for instance, Python, Java, MATLAB/Octave, and some more). In this section, we explain how to develop applications with OpenCV's C++ API using an easy-to-use cross-platform framework. General usage of the library To develop an OpenCV application with C++, we require our code to: Include the OpenCV header files with definitions Link the OpenCV libraries (binaries) to get the final executable The OpenCV header files are located in the OPENCV_BUILDinstallincludeopencv2 directory where there is a file (*.hpp) for each of the modules. The inclusion of the header file is done with the #include directive, as shown here: #include <opencv2/<module_name>/<module_name>.hpp>// Including the header file for each module used in the code With this directive, it is possible to include every header file needed by the user program. On the other hand, if the opencv.hpp header file is included, all the header files will be automatically included as follows: #include <opencv2/opencv.hpp>// Including all the OpenCV's header files in the code Remember that all the modules installed locally are defined in the OPENCV_BUILDinstallincludeopencv2opencv_modules.hpp header file, which is generated automatically during the building process of OpenCV. The use of the #include directive is not always a guarantee for the correct inclusion of the header files, because it is necessary to tell the compiler where to find the include files. This is achieved by passing a special argument with the location of the files (such as I<location> for GNU compilers). The linking process requires you to provide the linker with the libraries (dynamic or static) where the required OpenCV functionality can be found. This is usually done with two types of arguments for the linker: the location of the library (such as -L<location> for GNU compilers) and the name of the library (such as -l<module_name>). You can find a complete list of available online documentation for GNU GCC and Make at https://gcc.gnu.org/onlinedocs/ and https://www.gnu.org/software/make/manual/. Tools to develop new projects The main prerequisites to develop our own OpenCV C++ applications are: OpenCV header files and library binaries: Of course we need to compile OpenCV, and the auxiliary libraries are prerequisites for such a compilation. The package should be compiled with the same compiler used to generate the user application. A C++ compiler: Some associate tools are convenient as the code editor, debugger, project manager, build process manager (for instance CMake), revision control system (such as Git, Mercurial, SVN, and so on), and class inspector, among others. Usually, these tools are deployed together in a so-called Integrated Development Environment (IDE). Any other auxiliary libraries: Optionally, any other auxiliary libraries needed to program the final application, such as graphical, statistical, and so on will be required. The most popular available compiler kits to program OpenCV C++ applications are: Microsoft Visual C (MSVC): This is only supported on Windows and it is very well integrated with the IDE Visual Studio, although it can be also integrated with other cross-platform IDEs, such as Qt Creator or Eclipse. Versions of MSVC that currently compatible with the latest OpenCV release are VC 10, VC 11, and VC 12 (Visual Studio 2010, 2012, and 2013). GNU Compiler Collection GNU GCC: This is a cross-platform compiler system developed by the GNU project. For Windows, this kit is known as MinGW (Minimal GNU GCC). The version compatible with the current OpenCV release is GNU GCC 4.8. This kit may be used with several IDEs, such as Qt Creator, Code::Blocks, Eclipse, among others. For the examples presented in this article, we used the MinGW 4.8 compiler kit for Windows plus the Qt 5.2.1 library and the Qt Creator IDE (3.0.1). The cross-platform Qt library is required to compile OpenCV with the new UI capabilities provided by such a library. For Windows, it is possible to download a Qt bundle (including Qt library, Qt Creator, and the MinGW kit) from http://qt-project.org/. The bundle is approximately 700 MB. Qt Creator is a cross-platform IDE for C++ that integrates the tools we need to code applications. In Windows, it may be used with MinGW or MSVC. The following screenshot shows you the Qt Creator main window with the different panels and views for an OpenCV C++ project: The main window of Qt Creator with some views from an OpenCV C++ project Creating an OpenCV C++ program with Qt Creator Next, we explain how to create a code project with the Qt Creator IDE. In particular, we apply this description to an OpenCV example. We can create a project for any OpenCV application using Qt Creator by navigating to File | New File or File | Project… and then navigating to Non-Qt Project | Plain C++ Project. Then, we have to choose a project name and the location at which it will be stored. The next step is to pick a kit (that is, the compiler) for the project (in our case, Desktop Qt 5.2.1 MinGW 32 bit) and the location for the binaries generated. Usually, two possible build configurations (profiles) are used: debug and release. These profiles set the appropriate flags to build and run the binaries. When a project is created using Qt Creator, two special files (with .pro and .pro.user extensions) are generated to configure the build and run processes. The build process is determined by the kit chosen during the creation of the project. With the Desktop Qt 5.2.1 MinGW 32 bit kit, this process relies on the qmake and mingw32-make tools. Using the *.pro file as the input, qmake generates the makefile that drives the build process for each profile (that is, release and debug). The qmake tool is used from the Qt Creator IDE as an alternative to CMake to simplify the build process of software projects. It automates the generation of makefiles from a few lines of information. The following lines represent an example of a *.pro file (for example, showImage.pro): TARGET: showImage TEMPLATE = app CONFIG += console CONFIG -= app_bundle CONFIG -= qt SOURCES +=    showImage.cpp INCLUDEPATH += C:/opencv300-buildQt/install/include LIBS += -LC:/opencv300-buildQt/install/x64/mingw/lib    -lopencv_core300.dll    -lopencv_imgcodecs300.dll    -lopencv_highgui300.dll    -lopencv_imgproc300.dll The preceding file illustrates the options that qmake needs to generate the appropriate makefiles to build the binaries for our project. Each line starts with a tag indicating an option (TARGET, CONFIG, SOURCES, INCLUDEPATH, and LIBS) followed with a mark to add (+=) or remove (-=) the value of the option. In this sample project, we use the non-Qt console application. The executable file is showImage.exe (TARGET) and the source file is showImage.cpp (SOURCES). As this project is an OpenCV-based application, the two last tags indicate the location of the header files (INCLUDEPATH) and the OpenCV libraries (LIBS) used by this particular project (core, imgcodecs, highgui, and imgproc). Note that a backslash at the end of the line denotes continuation in the next line. For a detailed description of the tools (including Qt Creator and qmake) developed within the Qt project, visit http://doc.qt.io/. Summary This article has now covered an introduction to OpenCV, how to download and install OpenCV, the structure of OpenCV, and how to create user projects with OpenCV. Resources for Article: Further resources on this subject: Camera Calibration [article] OpenCV: Segmenting Images [article] New functionality in OpenCV 3.0 [article]
Read more
  • 0
  • 0
  • 2084

article-image-creating-custom-reports-and-notifications-vsphere
Packt
24 Mar 2015
34 min read
Save for later

Creating Custom Reports and Notifications for vSphere

Packt
24 Mar 2015
34 min read
In this article by Philip Sellers, the author of PowerCLI Cookbook, you will cover the following topics: Getting alerts from a vSphere environment Basics of formatting output from PowerShell objects Sending output to CSV and HTML Reporting VM objects created during a predefined time period from VI Events object Setting custom properties to add useful context to your virtual machines Using PowerShell native capabilities to schedule scripts (For more resources related to this topic, see here.) This article is all about leveraging the information available to you in PowerCLI. As much as any other topic, figuring out how to tap into the data that PowerCLI offers is as important as understanding the cmdlets and syntax of the language. However, once you obtain your data, you will need to alter the formatting and how it's returned to be used. PowerShell, and by extension PowerCLI, offers a big set of ways to control the formatting and the display of information returned by its cmdlets and data objects. You will explore all of these topics with this article. Getting alerts from a vSphere environment Discovering the data available to you is the most difficult thing that you will learn and adopt in PowerCLI after learning the initial cmdlets and syntax. There is a large amount of data available to you through PowerCLI, but there are techniques to extract the data in a way that you can use. The Get-Member cmdlet is a great tool for discovering the properties that you can use. Sometimes, just listing the data returned by a cmdlet is enough; however, when the property contains other objects, Get-Member can provide context to know that the Alarm property is a Managed Object Reference (MoRef) data type. As your returned objects have properties that contain other objects, you can have multiple layers of data available for you to expose using PowerShell dot notation ($variable.property.property). The ExtensionData property found on most objects has a lot of related data and objects to the primary data. Sometimes, the data found in the property is an object identifier that doesn't mean much to an administrator but represents an object in vSphere. In these cases, the Get-View cmdlet can refer to that identifier and return human-readable data. This topic will walk you through the methods of accessing data and converting it to usable, human-readable data wherever needed so that you can leverage it in scripts. To explore these methods, we will take a look at vSphere's built-in alert system. While PowerCLI has native cmdlets to report on the defined alarm states and actions, it doesn't have a native cmdlet to retrieve the triggered alarms on a particular object. To do this, you must get the datacenter, VMhost, VM, and other objects directly and look at data from the ExtensionData property. Getting ready To begin this topic, you will need a PowerCLI window and an active connection to vCenter. You should also check the vSphere Web Client or the vSphere Windows Client to see whether you have any active alarms and to know what to expect. If you do not have any active VM alarms, you can simulate an alarm condition using a utility such as HeavyLoad. For more information on generating an alarm, see the There's more... section of this topic. How to do it… In order to access data and convert it to usable, human-readable data, perform the following steps: The first step is to retrieve all of the VMs on the system. A simple Get-VM cmdlet will return all VMs on the vCenter you're connected to. Within the VM object returned by Get-VM, one of the properties is ExtensionData. This property is an object that contains many additional properties and objects. One of the properties is TriggeredAlarmState: Get-VM | Where {$_.ExtensionData.TriggeredAlarmState -ne $null} To dig into TriggeredAlarmState more, take the output of the previous cmdlet and store it into a variable. This will allow you to enumerate the properties without having to wait for the Get-VM cmdlet to run. Add a Select -First 1 cmdlet to the command string so that only a single object is returned. This will help you look inside without having to deal with multiple VMs in the variable: $alarms = Get-VM | Where {$_.ExtensionData.TriggeredAlarmState -ne $null} | Select -First 1 Now that you have extracted an alarm, how do you get useful data about what type of alarm it is and which vSphere object has a problem? In this case, you have VM objects since you used Get-VM to find the alarms. To see what is in the TriggeredAlarmState property, output the contents of TriggeredAlarmState and pipe it to Get-Member or its shortcut GM: $alarms.ExtensionData.TriggeredAlarmState | GM The following screenshot shows the output of the preceding command line: List the data in the $alarms variable without the Get-Member cmdlet appended and view the data in a real alarm. The data returned does tell you the time when the alarm was triggered, the OverallStatus property or severity of the alarm, and whether the alarm has been acknowledged by an administrator, who acknowledged it and at what time. You will see that the Entity property contains a reference to a virtual machine. You can use the Get-View cmdlet on a reference to a VM, in this case, the Entity property, and return the virtual machine name and other properties. You will also see that Alarm is referred to in a similar way and we can extract usable information using Get-View also: Get-View $alarms.ExtensionData.TriggeredAlarmState.Entity Get-View $alarms.ExtensionData.TriggeredAlarmState.Alarm You can see how the output from these two views differs. The Entity view provides the name of the VM. You don't really need this data since the top-level object contains the VM name, but it's good to understand how to use Get-View with an entity. On the other hand, the data returned by the Alarm view does not show the name or type of the alarm, but it does include an Info property. Since this is the most likely property with additional information, you should list its contents. To do so, enclose the Get-View cmdlet in parenthesis and then use dot notation to access the Info variable: (Get-View $alarms.ExtensionData.TriggeredAlarmState.Alarm).Info In the output from the Info property, you can see that the example alarm in the screenshot is a Virtual Machine CPU usage alarm. Your alarm can be different, but it should appear similar to this. After retrieving PowerShell objects that contain the data that you need, the easiest way to return the data is to use calculated expressions. Since the Get-VM cmdlet was the source for all lookup data, you will need to use this object with the calculated expressions to display the data. To do this, you will append a Select statement after the Get-VM and Where statement. Notice that you use the same Get-View statement, except that you change your variable to $_, which is the current object being passed into Select: Get-VM | Where {$_.ExtensionData.TriggeredAlarmState -ne $null} | Select Name, @{N="AlarmName";E={(Get-View $_.ExtensionData. TriggeredAlarmState.Alarm).Info.Name}}, @{N="AlarmDescription";E={(Get-View $_.ExtensionData. TriggeredAlarmState.Alarm).Info.Description}}, @ {N="TimeTriggered"; E={$_.ExtensionData.TriggeredAlarmState. Time}}, @{N="AlarmOverallStatus"; E={$_.ExtensionData. TriggeredAlarmState. OverallStatus}} How it works When the data you really need is several levels below the top-level properties of a data object, you need to use calculated expressions to return these at the top level. There are other techniques where you can build your own object with only the data you want returned, but in a large environment with thousands of objects in vSphere, the method in this topic will execute faster than looping through many objects to build a custom object. Calculated expressions are extremely powerful since nearly anything can be done with expressions. More than that, you explored techniques to discover the data you want. Data exploration can provide you with incredible new capabilities. The point is you need to know where the data is and how to pull that data back to the top level. There's more… It is likely that your test environment has no alarms and in this case, it might be up to you to create an alarm situation. One of the easiest to control and create is heavy CPU load with a CPU load-testing tool. JAM Software created software named HeavyLoad that is a stress-testing tool. This utility can be loaded into any Windows VM on your test systems and can consume all of the available CPU that the VM is configured with. To be safe, configure the VM with a single vCPU and the utility will consume all of the available CPU. Once you install the utility, go to the Test Options menu and you can uncheck the Stress GPU option, ensure that Stress CPU and Allocate Memory are checked. The utility also has shortcut buttons on the Menu bar to allow you to set these options. Click on the Start button (which looks like a Play button) and the utility begins to stress the VM. For users who wish to do the same test, but utilize Linux, StressLinux is a great option. StressLinux is a minimal distribution designed to create high load on an operating system. See also You can read more about the HeavyLoad Utility available under the JAM Software page at http://www.jam-software.com/heavyload/ You can read more about StressLinux at http://www.stresslinux.org/sl/ Basics of formatting output from PowerShell objects Anything that exists in a PowerShell object can be output as a report, e-mail, or editable file. Formatting the output is a simple task in PowerShell. Sometimes, the information you receive in the object is in a long decimal number format, but to make it more readable, you want to truncate the output to just a couple decimal places. In this section, you will take a look at the Format-Table, Format-Wide, and Format-List cmdlets. You will dig into the Format-Custom cmdlet and also take a look at the -f format operator. The truth is that native cmdlets do a great job returning data using default formatting. When we start changing and adding our own data to the list of properties returned, the formatting can become unoptimized. Even in the returned values of a native cmdlet, some columns might be too narrow to display all of the information. Getting ready To begin this topic, you will need the PowerShell ISE. How to do it… In order to format the output from PowerShell objects, perform the following steps: Run Add-PSSnapIn VMware.VimAutomation.Core in the PowerShell ISE to initialize a PowerCLI session and bring in the VMware cmdlet. Connect to your vCenter server. Start with a simple object from a Get-VM cmdlet. The default output is in a table format. If you pipe the object to Format-Wide, it will change the default output into a multicolumn with a single property, just like running a dir /w command at the Windows Command Prompt. You can also use FW, an alias for Format-Wide: Get-VM | Format-Wide Get-VM | FW If you take the same object and pipe it to Format-Table or its alias FT, you will receive the same output if you use the default output for Get-VM: Get-VM Get-VM | Format-Table However, as soon as you begin to select a different order of properties, the default formatting disappears. Select the same four properties and watch the formatting change. The default formatting disappears. Get-VM | Select Name, PowerState, NumCPU, MemoryGB | FT To restore formatting to table output, you have a few choices. You can change the formatting on the data in the object using the Select statement and calculated expressions. You can also pass formatting through the Format-Table cmdlet. While setting formatting in the Select statement changes the underlying data, using Format-Table doesn't change the data, but only its display. The formatting looks essentially like a calculated expression in a Select statement. You provide Label, Expression, and formatting commands: Get-VM | Select * | FT Name, PowerState, NumCPU, @{Label="MemoryGB"; Expression={$_.MemoryGB}; FormatString="N2"; Alignment="left"} If you have data in a number data type, you can convert it into a string using the ToString() method on the object. You can try this method on NumCPU: Get-VM | Select * | FT Name, PowerState, @{Label="Num CPUs"; Expression={($_.NumCpu).ToString()}; Alignment="left"}, @{Label="MemoryGB"; Expression={$_.MemoryGB}; FormatString="N2"; Alignment="left"} The other method is to format with the -f operator, which is basically a .NET derivative. To better understand the formatting and string, the structure is {<index>[,<alignment>][:<formatString>]}. Index sets that are a part of the data being passed, will be transformed. The alignment is a numeric value. A positive number will right-align those number of characters. A negative number will left-align those number of characters. The formatString parameter is the part that defines the format to apply. In this example, let's take a datastore and compute the percentage of free disk space. The format for percent is p: Get-Datastore | Select Name, @{N="FreePercent";E={"{0:p} -f ($_.FreeSpaceGB / $_.CapacityGB)}} To make the FreePercent column 15 characters wide, you add 0,15:p to the format string: Get-Datastore | Select Name, @{N="FreePercent";E={"{0,15:p} -f ($_.FreeSpaceGB / $_.CapacityGB)}} How it works… With the Format-Table, Format-List, and Format-Wide cmdlets, you can change the display of data coming from a PowerCLI object. These cmdlets all apply basic transformations without changing the data in the object. This is important to note because once the data is changed, it can prevent you from making changes. For instance, if you take the percentage example, after transforming the FreePercent property, it is stored as a string and no longer as a number, which means that you can't reformat it again. Applying a similar transformation from the Format-Table cmdlet would not alter your data. This doesn't matter when you're performing a one-liner, but in a more complex script or in a routine, where you might need to not only output the data but also reuse it, changing the data in the object is a big deal. There's more… This topic only begins to tap the full potential of PowerShell's native -f format operator. There are hundreds of blog posts about this topic, and there are use cases and examples of how to produce the formatting that you are looking for. The following link also gives you more details about the operator and formatting strings that you can use in your own code. See also For more information, refer to the PowerShell -f Format operator page available at http://ss64.com/ps/syntax-f-operator.html Sending output to CSV and HTML On the screen the output is great, but there are many times when you need to share your results with other people. When looking at sharing information, you want to choose a format that is easy to view and interpret. You might also want a format that is easy to manipulate and change. Comma Separated Values (CSV) files allow the user to take the output you generate and use it easily within a spreadsheet software. This allows you the ability to compare the results from vSphere versus internal tracking databases or other systems easily to find differences. It can also be useful to compare against service contracts for physical hosts as examples. HTML is a great choice for displaying information for reading, but not manipulation. Since e-mails can be in an HTML format, converting the output from PowerCLI (or PowerShell) into an e-mail is an easy way to assemble an e-mail to other areas of the business. What's even better about these cmdlets is the ease of use. If you have a data object in PowerCLI, all that you need to do is pipe that data object into the ConvertTo-CSV or ConvertTo-HTML cmdlets and you instantly get the formatted data. You might not be satisfied with the HTML-generated version alone, but like any other HTML, you can transform the look and formatting of the HTML using CSS by adding a header. In this topic, you will examine the conversion cmdlets with a simple set of Get- cmdlets. You will also take a look at trimming results using the Select statements and formatting HTML results with CSS. This topic will pull a list of virtual machines and their basic properties to send to a manager who can reconcile it against internal records or system monitoring. It will export to a CSV file that will be attached to the e-mail and you will use the HTML to format a list in an e-mail to send to the manager. Getting ready To begin this topic, you will need to use the PowerShell ISE. How to do it… In order to examine the conversion cmdlets using Get- cmdlets, trim results using the Select statements, and format HTML results with CSS, perform the following steps: Open the PowerShell ISE and run Add-PSSnapIn VMware.VimAutomation.Core to initialize a PowerCLI session within the ISE. Again, you will use the Get-VM cmdlet as the base for this topic. The fields that we care about are the name of the VM, the number of CPUs, the amount of memory, and the description: $VMs = Get-VM | Select Name, NumCPU, MemoryGB, Description In addition to the top-level data, you also want to provide the IP address, hostname, and the operating system. These are all available from the ExtensionData.Guest property: $VMs = Get-VM | Select Name, NumCPU, MemoryGB, Description, @{N="Hostname";E={$_.ExtensionData.Guest.HostName}}, @{N="IP";E={$_.ExtensionData.Guest.IPAddress}}, @{N="OS";E={$_.ExtensionData.Guest.GuestFullName}} The next step is to take this data and format it to be sent as an HTML e-mail. Converting the information to HTML is actually easy. Pipe the variable you created with the data into ConvertTo-HTML and store in a new variable. You will need to reuse the data to convert it to a CSV file to attach: $HTMLBody = $VMs | ConvertTo-HTML If you were to output the contents of $HTMLBody, you will see that it is very plain, inheriting the defaults of the browser or e-mail program used to display it. To dress this up, you need to define some basic CSS to add some style for the <body>, <table>, <tr>, <td>, and <th> tags. You can add this by running the ConvertTo-HTML cmdlet again with the -PreContent parameter: $css = "<style> body { font-family: Verdana, sans-serif; fontsize: 14px; color: #666; background: #FFF; } table{ width:100%; border-collapse:collapse; } table td, table th { border:1px solid #333; padding: 4px; } table th { text-align:left; padding: 4px; background-color:#BBB; color:#FFF;} </style>" $HTMLBody = $VMs | ConvertTo-HTML -PreContent $css It might also be nice to add the date and time generated to the end of the file. You can use the -PostContent parameter to add this: $HTMLBody = $VMs | ConvertTo-HTML -PreContent $css -PostContent "<div><strong>Generated:</strong> $(Get-Date)</div>" Now, you have the HTML body of your message. To take the same data from $VMs and save it to a CSV file that you can use, you will need a writable directory, and a good choice is to use your My Documents folder. You can obtain this using an environment variable: $tempdir = [environment]::getfolderpath("mydocuments") Now that you have a temp directory, you can perform your export. Pipe $VMs to Export-CSV and specify the path and filename: $VMs | Export-CSV $tempdirVM_Inventory.csv At this point, you are ready to assemble an e-mail and send it along with your attachment. Most of the cmdlets are straightforward. You set up a $msg variable that is a MailMessage object. You create an Attachment object and populate it with your temporary filename and then create an SMTP server with the server name: $msg = New-Object Net.Mail.MailMessage $attachment = new-object Net.Mail.Attachment("$tempdirVM_Inventory.csv") $smtpServer = New-Object Net.Mail.SmtpClient("hostname") You set the From, To, and Subject parameters of the message variable. All of these are set with dot notation on the $msg variable: $msg.From = "fromaddress@yourcompany.com" $msg.To.Add("admin@yourcompany.com") $msg.Subject = "Weekly VM Report" You set the body you created earlier, as $HTMLBody, but you need to run it through Out-String to convert any other data types to a pure string for e-mailing. This prevents an error where System.String[] appears instead of your content in part of the output: $msg.Body = $HTMLBody | Out-String You need to take the attachment and add it to the message: $msg.Attachments.Add($attachment) You need to set the message to an HTML format; otherwise, the HTML will be sent as plain text and not displayed as an HTML message: $msg.IsBodyHtml = $true Finally, you are ready to send the message using the $smtpServer variable that contains the mail server object. Pass in the $msg variable to the server object using the Send method and it transmits the message via SMTP to the mail server: $smtpServer.Send($msg) Don't forget to clean up the temporary CSV file you generated. To do this, use the PowerShell Remove-Item cmdlet that will remove the file from the filesystem. Add a -Confirm parameter to suppress any prompts: Remove-Item $tempdirVM_Inventory.csv -Confirm:$false How it works… Most of this topic relies on native PowerShell and less on the PowerCLI portions of the language. This is the beauty of PowerCLI. Since it is based on PowerShell and only an extension, you lose none of the functions of PowerShell, a very powerful set of commands in its own right. The ConvertTo-HTML cmdlet is very easy to use. It requires no parameters to produce HTML, but the HTML it produces isn't the most legible if you display it. However, a bit of CSS goes a long way to improve the look of the output. Add some colors and style to the table and it becomes a really easy and quick way to format a mail message of data to be sent to a manager on a weekly basis. The Export-CSV cmdlet lets you take the data returned by a cmdlet and convert that into an editable format for use. You can place this onto a file share for use or you can e-mail it along, as you did in this topic. This topic takes you step by step through the process of creating a mail message, formatting it in HTML, and making sure that it's relayed as an HTML message. You also looked at how to attach a file. To send a mail, you define a mail server as an object and store it in a variable for reuse. You create a message object and store it in a variable and then set all of the appropriate configuration on the message. For an attachment, you create a third object and define a file to be attached. That is set as a property on the message object and then finally, the message object is sent using the server object. There's more… ConvertTo-HTML is just one of four conversion cmdlets in PowerShell. In addition to ConvertTo-HTML, you can convert data objects into XML. ConvertTo-JSON allows you to convert a data object into an XML format specific for web applications. ConvertTo-CSV is identical to Export-CSV except that it doesn't save the content immediately to a defined file. If you had a use case to manipulate the CSV before saving it, such as stripping the double quotes or making other alternations to the contents, you can use ConvertTo-CSV and then save it to a file at a later point in your script. Reporting VM objects created during a predefined time period from VI Events object An important auditing tool in your environment can be a report of when virtual machines were created, cloned, or deleted. Unlike snapshots, that store a created date on the snapshot, virtual machines don't have this property associated with them. Instead, you have to rely on the events log in vSphere to let you know when virtual machines were created. PowerCLI has the Get-VIEvents cmdlet that allows you to retrieve the last 1,000 events on the vCenter, by default. The cmdlet can accept a parameter to include more than the last 1,000 events. The cmdlet also allows you to specify a start date, and this can allow you to search for things within the past week or the past month. At a high level, this topic works the same in both PowerCLI and the vSphere SDK for Perl (VIPerl). They both rely on getting the vSphere events and selecting the specific events that match your criteria. Even though you are looking for VM creation events in this topic, you will see that the code can be easily adapted to look for many other types of events. Getting ready To begin this topic, you will need a PowerCLI window and an active connection to a vCenter server. How to do it… In order to report VM objects created during a predefined time period from VI Events object, perform the following steps: You will use the Get-VIEvent cmdlet to retrieve the VM creation events for this topic. To begin, get a list of the last 50 events from the vCenter host using the -MaxSamples parameter: Get-VIEvent -MaxSamples 50 If you pipe the output from the preceding cmdlet to Get-Member, you will see that this cmdlet can return a lot of different objects. However, the type of object isn't really what you need to find the VM's created events. Looking through the objects, they all include a GetType() method that returns the type of event. Inside the type, there is a name parameter. Create a calculated expression using GetType() and then group it based on this expression, you will get a usable list of events you can search for. This list is also good for tracking the number of events your systems have encountered or generated: Get-VIEvent -MaxSamples 2000 | Select @{N="Type";E={$_.GetType().Name}} | Group Type In the preceding screenshot, you will see that there are VMClonedEvent, VmRemovedEvent, and VmCreatedEvent listed. All of these have to do with creating or removing virtual machines in vSphere. Since you are looking for created events, VMClonedEvent and VmCreatedEvent are the two needed for this script. Write a Where statement to return only these events. To do this, we can use a regular expression with both the event names and the -match PowerShell comparison parameter: Get-VIEvent -MaxSamples 2000 | Where {$_.GetType().Name -match "(VmCreatedEvent|VmClonedEvent)"} Next, you want to select just the properties that you want in your output. To do this, add a Select statement and you will reuse the calculated expression from Step 3. If you want to return the VM name, which is in a Vm property with the type of VMware.Vim.VVmeventArgument, you can create a calculated expression to return the VM name. To round out the output, you can include the FullFormattedMessage, CreatedTime, and UserName properties: Get-VIEvent -MaxSamples 2000 | Where {$_.GetType().Name -match "(VmCreatedEvent|VmClonedEvent)"} | Select @{N="Type",E={$_.GetType().Name}}, @{N="VMName",E={$_.Vm.Name}},FullFormattedMessage, CreatedTime, UserName The last thing you will want to do is go back and add a time frame to the Get-VIEvent cmdlet. You can do this by specifying the -Start parameter along with (Get-Date).AddMonths(-1) to return the last month's events: Get-VIEvent -MaxSamples 2000 | Where {$_.GetType().Name -match "(VmCreatedEvent|VmClonedEvent)"} | Select @{N="Type",E={$_.GetType().Name}}, @{N="VMName",E={$_.Vm.Name}},FullFormattedMessage, CreatedTime, UserName How it works… The Get-VIEvent cmdlet drives a majority of this topic, but in this topic you only scratched the surface of the information you can unearth with Get-VIEvent. As you saw in the screenshot, there are so many different types of events that can be reported, queried, and acted upon from the vCenter server. Once you discover and know which events you are looking for specifically, then it's a matter of scoping down the results with a Where statement. Last, you use calculated expressions to pull data that is several levels deep in the returned data object. One of the primary things employed here is a regular expression used to search for the types of events you were interested in: VmCreatedEvent and VmClonedEvent. By combining a regular expression with the -match operator, you were able to use a quick and very understandable bit of code to find more than one type of object you needed to return. There's more… Regular Expressions (RegEx) are big topics on their own. These types of searches can match any type of pattern that you can establish or in the case of this topic, a number of defined values that you are searching for. RegEx are beyond the scope of this article, but they can be a big help anytime you have a pattern you need to search for and match, or perhaps more importantly, replace. You can use the -replace operator instead of –match to not only to find things that match your pattern, but also change them. See also For more information on Regular Expressions refer to http://ss64.com/ps/syntax-regex.html The PowerShell.com page: Text and Regular Expressions is available at http://powershell.com/cs/blogs/ebookv2/archive/2012/03/20/chapter-13-text-and-regular-expressions.aspx Setting custom properties to add useful context to your virtual machines Building on the use case for the Get-VIEvent cmdlet, Alan Renouf of VMware's PowerCLI team has a useful script posted on his personal blog (refer to the See also section) that helps you pull the created date and the user who created a virtual machine and populate this into a custom attribute. This is a great use for a custom attribute on virtual machines and makes some useful information available that is not normally visible. This is a process that needs to be run often to pick up details for virtual machines that have been created. Rather than looking specifically at a VM and trying to go back and find its creation date as Alan's script does, in this article, you will take a different approach building on the previous article and populate the information from the found creation events. Maintenance in this form would be easier by finding creation events for the last week, running the script weekly, and updating the VMs with the data in the object rather than looking for VMs with missing data and searching through all of the events. This article assumes that you are using a Windows system that is joined to AD on the same domain as your vCenter. It also assumes that you have loaded the Remote Server Administration Tools for Windows so that the Active Directory PowerShell modules are available. This is a separate download for Windows 7. The Active Directory Module for PowerShell can be enabled on Windows 7, Windows 8, Windows Server 2008, and Windows Server 2012 in the Programs and Features control panel under Turn Windows features on or off. Getting ready To begin this script, you will need the PowerShell ISE. How to do it… I order to set custom properties to add useful context to your virtual machines, perform the following steps: Open the PowerShell ISE and run Add-PSSnapIn VMware.VimAutomation.Core to initialize a PowerCLI session within the ISE. The first step is to create a custom attribute in vCenter for the CreatedBy and CreateDate attributes: New-CustomAttribute -TargetType VirtualMachine -Name CreatedBy New-CustomAttribute -TargetType VirtualMachine -Name CreateDate Before you begin the scripting, you will need to run ImportSystemModules to bring in the Active Directory cmdlets that you will use later to lookup the username and reference it back to a display name: ImportSystemModules Next, you need to locate and pull out all of the creation events with the same code as the Reporting VM objects created during a predefined time period from VI Events object topic. You will assign the events to a variable for processing in a loop in this case; however, you will also want to change the period to 1 week (7 days) instead of 1 month: $Events = Get-VIEvent -Start (Get-Date).AddDays(-7) -MaxSamples25000 | Where {$_.GetType().Name -match "(VmCreatedEvent|VmClonedEvent)"} The next step is to begin a ForEach loop to pull the data and populate it into a custom attribute: ForEach ($Event in $Events) { The first thing to do in the loop is to look up the VM referenced in the Event's Vm parameter by name using Get-VM: $VM = Get-VM -Name $Event.Vm.Name Next, you can use the CreatedTime parameter on the event and set this as a custom attribute on the VM using the Set-Annotation cmdlet: $VM | Set-Annotation -CustomAttribute "CreateDate" -Value $Event.CreatedTime Next, you can use the Username parameter to lookup the display name of the user account who created the VM using Active Directory cmdlets. For the Active Directory cmdlets to be available, your client system or server needs to have the Microsoft Remote Server Administration Tools (RSAT) installed to make the Active Directory cmdlets available. The data coming from $Event.Username is in DOMAINusername format. You need just the username to perform a lookup with Get-AdUser, so that you can split on the backslash and return only the second item in the array resulting from the split command. After the lookup, the display name that you will want to use is in the Name property. You can retrieve it with dot notation: $User = (($Event.UserName.split(""))[1]) $DisplayName = (Get-AdUser $User).Name To do this, you need to use a built-in on the event and set this as a custom attribute on the VM using the Set-Annotation cmdlet: $VM | Set-Annotation -CustomAttribute "CreatedBy" -Value $DisplayName Finally, close the ForEach loop. } <# End ForEach #> How it works… This topic works by leveraging the Get-VIEvent cmdlet to search for events in the log from the last number of days. In larger environments, you might need to expand the -MaxSamples cmdlet well beyond the number in this example. There might be thousands of events per day in larger environments. The topic looks through the log and the Where statement returns only the creation events. Once you have the object with all of the creation events, you can loop through this and pull out the username of the person who created each virtual machine and the time they were created. Then, you just need to populate the data into the custom attributes created. There's more… Combine this script with the next topic and you have a great solution for scheduling this routine to run on a daily basis. Running it daily would certainly cut down on the number of events you need to process through to find and update the virtual machines that have been created with the information. You should absolutely go and read Alan Renouf's blog post on which this topic is based. This primary difference between this topic and the one Alan presents is the use of native Windows Active Directory PowerShell lookups in this topic instead of the Quest Active Directory PowerShell cmdlets. See also Virtu-Al.net: Who created that VM? is available at http://www.virtu-al.net/2010/02/23/who-created-that-vm/ Using PowerShell native capabilities to schedule scripts There is potentially a better and easier way to schedule your processes to run from PowerShell and PowerCLI and those are known as Scheduled Jobs. Scheduled Jobs were introduced in PowerShell 3.0 and distributed as part of the Windows Management Framework 3.0 and higher. While Scheduled Tasks can execute any Windows batch file or executable, Scheduled Jobs are specific to PowerShell and are used to generate and create background jobs that run once or on a specified schedule. Scheduled Jobs appear in the Windows Task Scheduler and can be managed with the scheduled task cmdlets of PowerShell. The only difference is that the scheduled jobs cmdlets cannot manage scheduled tasks. These jobs are stored in the MicrosoftWindowsPowerShellScheduledJobs path of the Windows Task Scheduler. You can see and edit them through the management console in Windows after creation. What's even greater about Scheduled Jobs in PowerShell is that you are not forced into creating a .ps1 file for every new job you need to run. If you have a PowerCLI one-liner that provides all of the functionality you need, you can simply include it in a job creation cmdlet without ever needing to save it anywhere. Getting ready To being this topic, you will need a PowerCLI window with an active connection to a vCenter server. How to do it… In order to schedule scripts using the native capabilities of PowerShell, perform the following steps: If you are running PowerCLI on systems lower than Windows 8 or Windows Server 2012, there's a chance that you are running PowerShell 2.0 and you will need to upgrade in order to use this. To check, run Get-PSVersion to see which version is installed on your system. If less than version 3.0, upgrade before continuing this topic. Throw back a script you have already written, the script to find and remove snapshots over 30 days old: Get-Snapshot -VM * | Where {$_.Created -LT (Get-Date).AddDays(-30)} | Remove-Snapshot -Confirm:$false To schedule a new job, the first thing you need to think about is what triggers your job to run. To define a new trigger, you use the New-JobTrigger cmdlet: $WeeklySundayAt6AM = New-JobTrigger -Weekly -At "6:00 AM" -DaysOfWeek Sunday –WeeksInterval 1 Like scheduled tasks, there are some options that can be set for a scheduled job. These include whether to wake the system to run: $Options = New-ScheduledJobOption –WakeToRun –StartIfIdle –MultipleInstancePolicy Queue Next, you will use the Register-ScheduledJob cmdlet. This cmdlet accepts a parameter named ScriptBlock and this is where you will specify the script that you have written. This method works best with one-liners, or scripts that execute in a single line of piped cmdlets. Since this is PowerCLI and not just PowerShell, you will need to add the VMware cmdlets and connect to vCenter at the beginning of the script block. You also need to specify the -Trigger and -ScheduledJobOption parameters that are defined in the previous two steps: Register-ScheduledJob -Name "Cleanup 30 Day Snapshots"-ScriptBlock { Add-PSSnapIn VMware.VimAutomation.Core; Connect-VIServer servers; Get-Snapshot -VM * | Where {$_.Created -LT (Get-Date).AddDays(-30)} | Remove-Snapshot -Confirm:$false} -Trigger$WeeklySundayAt6AM-ScheduledJobOption $Options You are not restricted to only running a script block. If you have a routine in a .ps1 file, you can easily run it from ScheduledJob also. For illustration, if you have a .ps1 file stored in c:Scripts named 30DaySnaps.ps1, you can use the following cmdlet to register a job: Register-ScheduledJob -Name "Cleanup 30 Day Snapshots"–FilePath c:Scripts30DaySnaps.ps1 -Trigger $WeeklySundayAt6AM-ScheduledJobOption $Options Rather than scheduling the scheduled job and defining the PowerShell in the job, a more maintainable method can be to write the module and then call the function from the scheduled job. One other requirement is that Single Sign-On should be configured so that the Connect-VIServer works correctly in the script: Register-ScheduledJob -Name "Cleanup 30 Day Snapshots"-ScriptBlock {Add-PSSnapIn VMware.VimAutomation.Core; Connect-ViServer server; Import-Module 30DaySnaps; Remove-30DaySnaps -VM*} - How it works… This topic leverages the scheduled jobs framework developed specifically for running PowerShell as scheduled tasks. It doesn't require you to configure all of the extra settings as you have seen in previous examples of scheduled tasks. These are PowerShell native cmdlets that know how to implement PowerShell on a schedule. One thing to keep in mind is that these jobs will begin with a normal PowerShell session—one that knows nothing about PowerCLI, by default. You will need to include Add-PSSnapIn VMware.VimAutomation.Core in each script block or the .ps1 file that you use with a scheduled job. There's more… There is a full library of cmdlets to implement and maintain scheduled jobs. You have Set-ScheduleJob that allows you to change the settings of a registered scheduled job on a Windows system. You can disable and enable scheduled jobs using the Disable-ScheduledJob and Enable-Scheduled job cmdlets. This allows you to pause the execution of a job during maintenance, or for other reasons, without needing to remove and resetup the job. This is especially helpful since the script blocks are inside the job and not saved in a separate .ps1 file. You can also configure remote scheduled jobs on other systems using the Invoke-Command PowerShell cmdlet. This concept is shown in examples on Microsoft TechNet in the documentation for the Register-ScheduledJob cmdlet. In addition to scheduling new jobs, you can remove jobs using the Unregister-ScheduledJob cmdlet. This cmdlet requires one of three identifying properties to unschedule a job. You can pass -Name with a string, -ID with the number identifying the job, or an object reference to the scheduled job with -InputObject. You can combine the Get-ScheduledJob cmdlet to find and pass the object by pipeline. See also To read more about Microsoft TechNet: PSScheduledJob Cmdlets, refer to http://technet.microsoft.com/en-us/library/hh849778.aspx Summary This article was all about leveraging the information and data from PowerCLI as well as how we can format and display this information. Resources for Article: Further resources on this subject: Introduction to vSphere Distributed switches [article] Creating an Image Profile by cloning an existing one [article] VMware View 5 Desktop Virtualization [article]
Read more
  • 0
  • 0
  • 8874

article-image-classifying-real-world-examples
Packt
24 Mar 2015
32 min read
Save for later

Classifying with Real-world Examples

Packt
24 Mar 2015
32 min read
This article by the authors, Luis Pedro Coelho and Willi Richert, of the book, Building Machine Learning Systems with Python - Second Edition, focuses on the topic of classification. (For more resources related to this topic, see here.) You have probably already used this form of machine learning as a consumer, even if you were not aware of it. If you have any modern e-mail system, it will likely have the ability to automatically detect spam. That is, the system will analyze all incoming e-mails and mark them as either spam or not-spam. Often, you, the end user, will be able to manually tag e-mails as spam or not, in order to improve its spam detection ability. This is a form of machine learning where the system is taking examples of two types of messages: spam and ham (the typical term for "non spam e-mails") and using these examples to automatically classify incoming e-mails. The general method of classification is to use a set of examples of each class to learn rules that can be applied to new examples. This is one of the most important machine learning modes and is the topic of this article. Working with text such as e-mails requires a specific set of techniques and skills. For the moment, we will work with a smaller, easier-to-handle dataset. The example question for this article is, "Can a machine distinguish between flower species based on images?" We will use two datasets where measurements of flower morphology are recorded along with the species for several specimens. We will explore these small datasets using a few simple algorithms. At first, we will write classification code ourselves in order to understand the concepts, but we will quickly switch to using scikit-learn whenever possible. The goal is to first understand the basic principles of classification and then progress to using a state-of-the-art implementation. The Iris dataset The Iris dataset is a classic dataset from the 1930s; it is one of the first modern examples of statistical classification. The dataset is a collection of morphological measurements of several Iris flowers. These measurements will enable us to distinguish multiple species of the flowers. Today, species are identified by their DNA fingerprints, but in the 1930s, DNA's role in genetics had not yet been discovered. The following four attributes of each plant were measured: sepal length sepal width petal length petal width In general, we will call the individual numeric measurements we use to describe our data features. These features can be directly measured or computed from intermediate data. This dataset has four features. Additionally, for each plant, the species was recorded. The problem we want to solve is, "Given these examples, if we see a new flower out in the field, could we make a good prediction about its species from its measurements?" This is the supervised learning or classification problem: given labeled examples, can we design a rule to be later applied to other examples? A more familiar example to modern readers who are not botanists is spam filtering, where the user can mark e-mails as spam, and systems use these as well as the non-spam e-mails to determine whether a new, incoming message is spam or not. For the moment, the Iris dataset serves our purposes well. It is small (150 examples, four features each) and can be easily visualized and manipulated. Visualization is a good first step Datasets will grow to thousands of features. With only four in our starting example, we can easily plot all two-dimensional projections on a single page. We will build intuitions on this small example, which can then be extended to large datasets with many more features. Visualizations are excellent at the initial exploratory phase of the analysis as they allow you to learn the general features of your problem as well as catch problems that occurred with data collection early. Each subplot in the following plot shows all points projected into two of the dimensions. The outlying group (triangles) are the Iris Setosa plants, while Iris Versicolor plants are in the center (circle) and Iris Virginica are plotted with x marks. We can see that there are two large groups: one is of Iris Setosa and another is a mixture of Iris Versicolor and Iris Virginica.   In the following code snippet, we present the code to load the data and generate the plot: >>> from matplotlib import pyplot as plt >>> import numpy as np   >>> # We load the data with load_iris from sklearn >>> from sklearn.datasets import load_iris >>> data = load_iris()   >>> # load_iris returns an object with several fields >>> features = data.data >>> feature_names = data.feature_names >>> target = data.target >>> target_names = data.target_names   >>> for t in range(3): ...   if t == 0: ...       c = 'r' ...       marker = '>' ...   elif t == 1: ...       c = 'g' ...       marker = 'o' ...   elif t == 2: ...       c = 'b' ...       marker = 'x' ...   plt.scatter(features[target == t,0], ...               features[target == t,1], ...               marker=marker, ...               c=c) Building our first classification model If the goal is to separate the three types of flowers, we can immediately make a few suggestions just by looking at the data. For example, petal length seems to be able to separate Iris Setosa from the other two flower species on its own. We can write a little bit of code to discover where the cut-off is: >>> # We use NumPy fancy indexing to get an array of strings: >>> labels = target_names[target]   >>> # The petal length is the feature at position 2 >>> plength = features[:, 2]   >>> # Build an array of booleans: >>> is_setosa = (labels == 'setosa')   >>> # This is the important step: >>> max_setosa =plength[is_setosa].max() >>> min_non_setosa = plength[~is_setosa].min() >>> print('Maximum of setosa: {0}.'.format(max_setosa)) Maximum of setosa: 1.9.   >>> print('Minimum of others: {0}.'.format(min_non_setosa)) Minimum of others: 3.0. Therefore, we can build a simple model: if the petal length is smaller than 2, then this is an Iris Setosa flower; otherwise it is either Iris Virginica or Iris Versicolor. This is our first model and it works very well in that it separates Iris Setosa flowers from the other two species without making any mistakes. In this case, we did not actually do any machine learning. Instead, we looked at the data ourselves, looking for a separation between the classes. Machine learning happens when we write code to look for this separation automatically. The problem of recognizing Iris Setosa apart from the other two species was very easy. However, we cannot immediately see what the best threshold is for distinguishing Iris Virginica from Iris Versicolor. We can even see that we will never achieve perfect separation with these features. We could, however, look for the best possible separation, the separation that makes the fewest mistakes. For this, we will perform a little computation. We first select only the non-Setosa features and labels: >>> # ~ is the boolean negation operator >>> features = features[~is_setosa] >>> labels = labels[~is_setosa] >>> # Build a new target variable, is_virginica >>> is_virginica = (labels == 'virginica') Here we are heavily using NumPy operations on arrays. The is_setosa array is a Boolean array and we use it to select a subset of the other two arrays, features and labels. Finally, we build a new boolean array, virginica, by using an equality comparison on labels. Now, we run a loop over all possible features and thresholds to see which one results in better accuracy. Accuracy is simply the fraction of examples that the model classifies correctly. >>> # Initialize best_acc to impossibly low value >>> best_acc = -1.0 >>> for fi in range(features.shape[1]): ... # We are going to test all possible thresholds ... thresh = features[:,fi] ... for t in thresh: ...   # Get the vector for feature `fi` ...   feature_i = features[:, fi] ...   # apply threshold `t` ...   pred = (feature_i > t) ...   acc = (pred == is_virginica).mean() ...   rev_acc = (pred == ~is_virginica).mean() ...   if rev_acc > acc: ...       reverse = True ...       acc = rev_acc ...   else: ...       reverse = False ... ...   if acc > best_acc: ...     best_acc = acc ...     best_fi = fi ...     best_t = t ...     best_reverse = reverse We need to test two types of thresholds for each feature and value: we test a greater than threshold and the reverse comparison. This is why we need the rev_acc variable in the preceding code; it holds the accuracy of reversing the comparison. The last few lines select the best model. First, we compare the predictions, pred, with the actual labels, is_virginica. The little trick of computing the mean of the comparisons gives us the fraction of correct results, the accuracy. At the end of the for loop, all the possible thresholds for all the possible features have been tested, and the variables best_fi, best_t, and best_reverse hold our model. This is all the information we need to be able to classify a new, unknown object, that is, to assign a class to it. The following code implements exactly this method: def is_virginica_test(fi, t, reverse, example):    "Apply threshold model to a new example"    test = example[fi] > t    if reverse:        test = not test    return test What does this model look like? If we run the code on the whole data, the model that is identified as the best makes decisions by splitting on the petal width. One way to gain intuition about how this works is to visualize the decision boundary. That is, we can see which feature values will result in one decision versus the other and exactly where the boundary is. In the following screenshot, we see two regions: one is white and the other is shaded in grey. Any datapoint that falls on the white region will be classified as Iris Virginica, while any point that falls on the shaded side will be classified as Iris Versicolor. In a threshold model, the decision boundary will always be a line that is parallel to one of the axes. The plot in the preceding screenshot shows the decision boundary and the two regions where points are classified as either white or grey. It also shows (as a dashed line) an alternative threshold, which will achieve exactly the same accuracy. Our method chose the first threshold it saw, but that was an arbitrary choice. Evaluation – holding out data and cross-validation The model discussed in the previous section is a simple model; it achieves 94 percent accuracy of the whole data. However, this evaluation may be overly optimistic. We used the data to define what the threshold will be, and then we used the same data to evaluate the model. Of course, the model will perform better than anything else we tried on this dataset. The reasoning is circular. What we really want to do is estimate the ability of the model to generalize to new instances. We should measure its performance in instances that the algorithm has not seen at training. Therefore, we are going to do a more rigorous evaluation and use held-out data. For this, we are going to break up the data into two groups: on one group, we'll train the model, and on the other, we'll test the one we held out of training. The full code, which is an adaptation of the code presented earlier, is available on the online support repository. Its output is as follows: Training accuracy was 96.0%. Testing accuracy was 90.0% (N = 50). The result on the training data (which is a subset of the whole data) is apparently even better than before. However, what is important to note is that the result in the testing data is lower than that of the training error. While this may surprise an inexperienced machine learner, it is expected that testing accuracy will be lower than the training accuracy. To see why, look back at the plot that showed the decision boundary. Consider what would have happened if some of the examples close to the boundary were not there or that one of them between the two lines was missing. It is easy to imagine that the boundary will then move a little bit to the right or to the left so as to put them on the wrong side of the border. The accuracy on the training data, the training accuracy, is almost always an overly optimistic estimate of how well your algorithm is doing. We should always measure and report the testing accuracy, which is the accuracy on a collection of examples that were not used for training. These concepts will become more and more important as the models become more complex. In this example, the difference between the accuracy measured on training data and on testing data is not very large. When using a complex model, it is possible to get 100 percent accuracy in training and do no better than random guessing on testing! One possible problem with what we did previously, which was to hold out data from training, is that we only used half the data for training. Perhaps it would have been better to use more training data. On the other hand, if we then leave too little data for testing, the error estimation is performed on a very small number of examples. Ideally, we would like to use all of the data for training and all of the data for testing as well, which is impossible. We can achieve a good approximation of this impossible ideal by a method called cross-validation. One simple form of cross-validation is leave-one-out cross-validation. We will take an example out of the training data, learn a model without this example, and then test whether the model classifies this example correctly. This process is then repeated for all the elements in the dataset. The following code implements exactly this type of cross-validation: >>> correct = 0.0 >>> for ei in range(len(features)):      # select all but the one at position `ei`:      training = np.ones(len(features), bool)      training[ei] = False      testing = ~training      model = fit_model(features[training], is_virginica[training])      predictions = predict(model, features[testing])      correct += np.sum(predictions == is_virginica[testing]) >>> acc = correct/float(len(features)) >>> print('Accuracy: {0:.1%}'.format(acc)) Accuracy: 87.0% At the end of this loop, we will have tested a series of models on all the examples and have obtained a final average result. When using cross-validation, there is no circularity problem because each example was tested on a model which was built without taking that datapoint into account. Therefore, the cross-validated estimate is a reliable estimate of how well the models would generalize to new data. The major problem with leave-one-out cross-validation is that we are now forced to perform many times more work. In fact, you must learn a whole new model for each and every example and this cost will increase as our dataset grows. We can get most of the benefits of leave-one-out at a fraction of the cost by using x-fold cross-validation, where x stands for a small number. For example, to perform five-fold cross-validation, we break up the data into five groups, so-called five folds. Then you learn five models: each time you will leave one fold out of the training data. The resulting code will be similar to the code given earlier in this section, but we leave 20 percent of the data out instead of just one element. We test each of these models on the left-out fold and average the results.   The preceding figure illustrates this process for five blocks: the dataset is split into five pieces. For each fold, you hold out one of the blocks for testing and train on the other four. You can use any number of folds you wish. There is a trade-off between computational efficiency (the more folds, the more computation is necessary) and accurate results (the more folds, the closer you are to using the whole of the data for training). Five folds is often a good compromise. This corresponds to training with 80 percent of your data, which should already be close to what you will get from using all the data. If you have little data, you can even consider using 10 or 20 folds. In the extreme case, if you have as many folds as datapoints, you are simply performing leave-one-out cross-validation. On the other hand, if computation time is an issue and you have more data, 2 or 3 folds may be the more appropriate choice. When generating the folds, you need to be careful to keep them balanced. For example, if all of the examples in one fold come from the same class, then the results will not be representative. We will not go into the details of how to do this, because the machine learning package scikit-learn will handle them for you. We have now generated several models instead of just one. So, "What final model do we return and use for new data?" The simplest solution is now to train a single overall model on all your training data. The cross-validation loop gives you an estimate of how well this model should generalize. A cross-validation schedule allows you to use all your data to estimate whether your methods are doing well. At the end of the cross-validation loop, you can then use all your data to train a final model. Although it was not properly recognized when machine learning was starting out as a field, nowadays, it is seen as a very bad sign to even discuss the training accuracy of a classification system. This is because the results can be very misleading and even just presenting them marks you as a newbie in machine learning. We always want to measure and compare either the error on a held-out dataset or the error estimated using a cross-validation scheme. Building more complex classifiers In the previous section, we used a very simple model: a threshold on a single feature. Are there other types of systems? Yes, of course! Many others. To think of the problem at a higher abstraction level, "What makes up a classification model?" We can break it up into three parts: The structure of the model: How exactly will a model make decisions? In this case, the decision depended solely on whether a given feature was above or below a certain threshold value. This is too simplistic for all but the simplest problems. The search procedure: How do we find the model we need to use? In our case, we tried every possible combination of feature and threshold. You can easily imagine that as models get more complex and datasets get larger, it rapidly becomes impossible to attempt all combinations and we are forced to use approximate solutions. In other cases, we need to use advanced optimization methods to find a good solution (fortunately, scikit-learn already implements these for you, so using them is easy even if the code behind them is very advanced). The gain or loss function: How do we decide which of the possibilities tested should be returned? Rarely do we find the perfect solution, the model that never makes any mistakes, so we need to decide which one to use. We used accuracy, but sometimes it will be better to optimize so that the model makes fewer errors of a specific kind. For example, in spam filtering, it may be worse to delete a good e-mail than to erroneously let a bad e-mail through. In that case, we may want to choose a model that is conservative in throwing out e-mails rather than the one that just makes the fewest mistakes overall. We can discuss these issues in terms of gain (which we want to maximize) or loss (which we want to minimize). They are equivalent, but sometimes one is more convenient than the other. We can play around with these three aspects of classifiers and get different systems. A simple threshold is one of the simplest models available in machine learning libraries and only works well when the problem is very simple, such as with the Iris dataset. In the next section, we will tackle a more difficult classification task that requires a more complex structure. In our case, we optimized the threshold to minimize the number of errors. Alternatively, we might have different loss functions. It might be that one type of error is much costlier than the other. In a medical setting, false negatives and false positives are not equivalent. A false negative (when the result of a test comes back negative, but that is false) might lead to the patient not receiving treatment for a serious disease. A false positive (when the test comes back positive even though the patient does not actually have that disease) might lead to additional tests to confirm or unnecessary treatment (which can still have costs, including side effects from the treatment, but are often less serious than missing a diagnostic). Therefore, depending on the exact setting, different trade-offs can make sense. At one extreme, if the disease is fatal and the treatment is cheap with very few negative side-effects, then you want to minimize false negatives as much as you can. What the gain/cost function should be is always dependent on the exact problem you are working on. When we present a general-purpose algorithm, we often focus on minimizing the number of mistakes, achieving the highest accuracy. However, if some mistakes are costlier than others, it might be better to accept a lower overall accuracy to minimize the overall costs. A more complex dataset and a more complex classifier We will now look at a slightly more complex dataset. This will motivate the introduction of a new classification algorithm and a few other ideas. Learning about the Seeds dataset We now look at another agricultural dataset, which is still small, but already too large to plot exhaustively on a page as we did with Iris. This dataset consists of measurements of wheat seeds. There are seven features that are present, which are as follows: area A perimeter P compactness C = 4pA/P² length of kernel width of kernel asymmetry coefficient length of kernel groove There are three classes, corresponding to three wheat varieties: Canadian, Koma, and Rosa. As earlier, the goal is to be able to classify the species based on these morphological measurements. Unlike the Iris dataset, which was collected in the 1930s, this is a very recent dataset and its features were automatically computed from digital images. This is how image pattern recognition can be implemented: you can take images, in digital form, compute a few relevant features from them, and use a generic classification system. For the moment, we will work with the features that are given to us. UCI Machine Learning Dataset Repository The University of California at Irvine (UCI) maintains an online repository of machine learning datasets (at the time of writing, they list 233 datasets). Both the Iris and the Seeds dataset used in this article were taken from there. The repository is available online at http://archive.ics.uci.edu/ml/. Features and feature engineering One interesting aspect of these features is that the compactness feature is not actually a new measurement, but a function of the previous two features, area and perimeter. It is often very useful to derive new combined features. Trying to create new features is generally called feature engineering. It is sometimes seen as less glamorous than algorithms, but it often matters more for performance (a simple algorithm on well-chosen features will perform better than a fancy algorithm on not-so-good features). In this case, the original researchers computed the compactness, which is a typical feature for shapes. It is also sometimes called roundness. This feature will have the same value for two kernels, one of which is twice as big as the other one, but with the same shape. However, it will have different values for kernels that are very round (when the feature is close to one) when compared to kernels that are elongated (when the feature is closer to zero). The goals of a good feature are to simultaneously vary with what matters (the desired output) and be invariant with what does not. For example, compactness does not vary with size, but varies with the shape. In practice, it might be hard to achieve both objectives perfectly, but we want to approximate this ideal. You will need to use background knowledge to design good features. Fortunately, for many problem domains, there is already a vast literature of possible features and feature-types that you can build upon. For images, all of the previously mentioned features are typical and computer vision libraries will compute them for you. In text-based problems too, there are standard solutions that you can mix and match. When possible, you should use your knowledge of the problem to design a specific feature or to select which ones from the literature are more applicable to the data at hand. Even before you have data, you must decide which data is worthwhile to collect. Then, you hand all your features to the machine to evaluate and compute the best classifier. A natural question is whether we can select good features automatically. This problem is known as feature selection. There are many methods that have been proposed for this problem, but in practice very simple ideas work best. For the small problems we are currently exploring, it does not make sense to use feature selection, but if you had thousands of features, then throwing out most of them might make the rest of the process much faster. Nearest neighbor classification For use with this dataset, we will introduce a new classifier: the nearest neighbor classifier. The nearest neighbor classifier is very simple. When classifying a new element, it looks at the training data for the object that is closest to it, its nearest neighbor. Then, it returns its label as the answer. Notice that this model performs perfectly on its training data! For each point, its closest neighbor is itself, and so its label matches perfectly (unless two examples with different labels have exactly the same feature values, which will indicate that the features you are using are not very descriptive). Therefore, it is essential to test the classification using a cross-validation protocol. The nearest neighbor method can be generalized to look not at a single neighbor, but to multiple ones and take a vote amongst the neighbors. This makes the method more robust to outliers or mislabeled data. Classifying with scikit-learn We have been using handwritten classification code, but Python is a very appropriate language for machine learning because of its excellent libraries. In particular, scikit-learn has become the standard library for many machine learning tasks, including classification. We are going to use its implementation of nearest neighbor classification in this section. The scikit-learn classification API is organized around classifier objects. These objects have the following two essential methods: fit(features, labels): This is the learning step and fits the parameters of the model predict(features): This method can only be called after fit and returns a prediction for one or more inputs Here is how we could use its implementation of k-nearest neighbors for our data. We start by importing the KneighborsClassifier object from the sklearn.neighbors submodule: >>> from sklearn.neighbors import KNeighborsClassifier The scikit-learn module is imported as sklearn (sometimes you will also find that scikit-learn is referred to using this short name instead of the full name). All of the sklearn functionality is in submodules, such as sklearn.neighbors. We can now instantiate a classifier object. In the constructor, we specify the number of neighbors to consider, as follows: >>> classifier = KNeighborsClassifier(n_neighbors=1) If we do not specify the number of neighbors, it defaults to 5, which is often a good choice for classification. We will want to use cross-validation (of course) to look at our data. The scikit-learn module also makes this easy: >>> from sklearn.cross_validation import KFold   >>> kf = KFold(len(features), n_folds=5, shuffle=True) >>> # `means` will be a list of mean accuracies (one entry per fold) >>> means = [] >>> for training,testing in kf: ...   # We fit a model for this fold, then apply it to the ...   # testing data with `predict`: ...   classifier.fit(features[training], labels[training]) ...   prediction = classifier.predict(features[testing]) ... ...   # np.mean on an array of booleans returns fraction ...     # of correct decisions for this fold: ...   curmean = np.mean(prediction == labels[testing]) ...   means.append(curmean) >>> print("Mean accuracy: {:.1%}".format(np.mean(means))) Mean accuracy: 90.5% Using five folds for cross-validation, for this dataset, with this algorithm, we obtain 90.5 percent accuracy. As we discussed in the earlier section, the cross-validation accuracy is lower than the training accuracy, but this is a more credible estimate of the performance of the model. Looking at the decision boundaries We will now examine the decision boundary. In order to plot these on paper, we will simplify and look at only two dimensions. Take a look at the following plot:   Canadian examples are shown as diamonds, Koma seeds as circles, and Rosa seeds as triangles. Their respective areas are shown as white, black, and grey. You might be wondering why the regions are so horizontal, almost weirdly so. The problem is that the x axis (area) ranges from 10 to 22, while the y axis (compactness) ranges from 0.75 to 1.0. This means that a small change in x is actually much larger than a small change in y. So, when we compute the distance between points, we are, for the most part, only taking the x axis into account. This is also a good example of why it is a good idea to visualize our data and look for red flags or surprises. If you studied physics (and you remember your lessons), you might have already noticed that we had been summing up lengths, areas, and dimensionless quantities, mixing up our units (which is something you never want to do in a physical system). We need to normalize all of the features to a common scale. There are many solutions to this problem; a simple one is to normalize to z-scores. The z-score of a value is how far away from the mean it is, in units of standard deviation. It comes down to this operation: In this formula, f is the old feature value, f' is the normalized feature value, μ is the mean of the feature, and σ is the standard deviation. Both μ and σ are estimated from training data. Independent of what the original values were, after z-scoring, a value of zero corresponds to the training mean, positive values are above the mean, and negative values are below it. The scikit-learn module makes it very easy to use this normalization as a preprocessing step. We are going to use a pipeline of transformations: the first element will do the transformation and the second element will do the classification. We start by importing both the pipeline and the feature scaling classes as follows: >>> from sklearn.pipeline import Pipeline >>> from sklearn.preprocessing import StandardScaler Now, we can combine them. >>> classifier = KNeighborsClassifier(n_neighbors=1) >>> classifier = Pipeline([('norm', StandardScaler()), ('knn', classifier)]) The Pipeline constructor takes a list of pairs (str,clf). Each pair corresponds to a step in the pipeline: the first element is a string naming the step, while the second element is the object that performs the transformation. Advanced usage of the object uses these names to refer to different steps. After normalization, every feature is in the same units (technically, every feature is now dimensionless; it has no units) and we can more confidently mix dimensions. In fact, if we now run our nearest neighbor classifier, we obtain 93 percent accuracy, estimated with the same five-fold cross-validation code shown previously! Look at the decision space again in two dimensions:   The boundaries are now different and you can see that both dimensions make a difference for the outcome. In the full dataset, everything is happening on a seven-dimensional space, which is very hard to visualize, but the same principle applies; while a few dimensions are dominant in the original data, after normalization, they are all given the same importance. Binary and multiclass classification The first classifier we used, the threshold classifier, was a simple binary classifier. Its result is either one class or the other, as a point is either above the threshold value or it is not. The second classifier we used, the nearest neighbor classifier, was a natural multiclass classifier, its output can be one of the several classes. It is often simpler to define a simple binary method than the one that works on multiclass problems. However, we can reduce any multiclass problem to a series of binary decisions. This is what we did earlier in the Iris dataset, in a haphazard way: we observed that it was easy to separate one of the initial classes and focused on the other two, reducing the problem to two binary decisions: Is it an Iris Setosa (yes or no)? If not, check whether it is an Iris Virginica (yes or no). Of course, we want to leave this sort of reasoning to the computer. As usual, there are several solutions to this multiclass reduction. The simplest is to use a series of one versus the rest classifiers. For each possible label ℓ, we build a classifier of the type is this ℓ or something else? When applying the rule, exactly one of the classifiers will say yes and we will have our solution. Unfortunately, this does not always happen, so we have to decide how to deal with either multiple positive answers or no positive answers.   Alternatively, we can build a classification tree. Split the possible labels into two, and build a classifier that asks, "Should this example go in the left or the right bin?" We can perform this splitting recursively until we obtain a single label. The preceding diagram depicts the tree of reasoning for the Iris dataset. Each diamond is a single binary classifier. It is easy to imagine that we could make this tree larger and encompass more decisions. This means that any classifier that can be used for binary classification can also be adapted to handle any number of classes in a simple way. There are many other possible ways of turning a binary method into a multiclass one. There is no single method that is clearly better in all cases. The scikit-learn module implements several of these methods in the sklearn.multiclass submodule. Some classifiers are binary systems, while many real-life problems are naturally multiclass. Several simple protocols reduce a multiclass problem to a series of binary decisions and allow us to apply the binary models to our multiclass problem. This means methods that are apparently only for binary data can be applied to multiclass data with little extra effort. Summary Classification means generalizing from examples to build a model (that is, a rule that can automatically be applied to new, unclassified objects). It is one of the fundamental tools in machine. In a sense, this was a very theoretical article, as we introduced generic concepts with simple examples. We went over a few operations with the Iris dataset. This is a small dataset. However, it has the advantage that we were able to plot it out and see what we were doing in detail. This is something that will be lost when we move on to problems with many dimensions and many thousands of examples. The intuitions we gained here will all still be valid. You also learned that the training error is a misleading, over-optimistic estimate of how well the model does. We must, instead, evaluate it on testing data that has not been used for training. In order to not waste too many examples in testing, a cross-validation schedule can get us the best of both worlds (at the cost of more computation). We also had a look at the problem of feature engineering. Features are not predefined for you, but choosing and designing features is an integral part of designing a machine learning pipeline. In fact, it is often the area where you can get the most improvements in accuracy, as better data beats fancier methods. Resources for Article: Further resources on this subject: Ridge Regression [article] The Spark programming model [article] Using cross-validation [article]
Read more
  • 0
  • 0
  • 17882
article-image-category-theory
Packt
24 Mar 2015
7 min read
Save for later

Category Theory

Packt
24 Mar 2015
7 min read
In this article written by Dan Mantyla, author of the book Functional Programming in JavaScript, you would get to learn about some of the JavaScript functors such as Maybes, Promises and Lenses. (For more resources related to this topic, see here.) Maybes Maybes allow us to gracefully work with data that might be null and to have defaults. A maybe is a variable that either has some value or it doesn't. And it doesn't matter to the caller. On its own, it might seem like this is not that big a deal. Everybody knows that null-checks are easily accomplished with an if-else statement: if (getUsername() == null ) { username = 'Anonymous') {    else {    username = getUsername(); } But with functional programming, we're breaking away from the procedural, line-by-line way of doing things and instead working with pipelines of functions and data. If we had to break the chain in the middle just to check if the value existed or not, we would have to create temporary variables and write more code. Maybes are just tools to help us keep the logic flowing through the pipeline. To implement maybes, we'll first need to create some constructors. // the Maybe monad constructor, empty for now var Maybe = function(){};   // the None instance, a wrapper for an object with no value var None = function(){}; None.prototype = Object.create(Maybe.prototype); None.prototype.toString = function(){return 'None';};   // now we can write the `none` function // saves us from having to write `new None()` all the time var none = function(){return new None()};   // and the Just instance, a wrapper for an object with a value var Just = function(x){return this.x = x;}; Just.prototype = Object.create(Maybe.prototype); Just.prototype.toString = function(){return "Just "+this.x;}; var just = function(x) {return new Just(x)}; Finally, we can write the maybe function. It returns a new function that either returns nothing or a maybe. It is a functor. var maybe = function(m){ if (m instanceof None) {    return m; } else if (m instanceof Just) {    return just(m.x);   } else {  throw new TypeError("Error: Just or None expected, " + m.toString() + " given."); } } And we can also create a functor generator just like we did with arrays. var maybeOf = function(f){ return function(m) {    if (m instanceof None) {      return m;    }    else if (m instanceof Just) {      return just(f(m.x));    }    else {      throw new TypeError("Error: Just or None expected, " + m.toString() + " given.");    } } } So Maybe is a monad, maybe is a functor, and maybeOf returns a functor that is already assigned to a morphism. We'll need one more thing before we can move forward. We'll need to add a method to the Maybe monad object that helps us use it more intuitively. Maybe.prototype.orElse = function(y) { if (this instanceof Just) {    return this.x; } else {    return y; } } In its raw form, maybes can be used directly. maybe(just(123)).x; // Returns 123 maybeOf(plusplus)(just(123)).x; // Returns 124 maybe(plusplus)(none()).orElse('none'); // returns 'none' Anything that returns a method that is then executed is complicated enough to be begging for trouble. So we can make it a little cleaner by calling on our curry() function. maybePlusPlus = maybeOf.curry()(plusplus); maybePlusPlus(just(123)).x; // returns 123 maybePlusPlus(none()).orElse('none'); // returns none But the real power of maybes will become clear when the dirty business of directly calling the none() and just() functions is abstracted. We'll do this with an example object User, that uses maybes for the username. var User = function(){ this.username = none(); // initially set to `none` }; User.prototype.setUsername = function(name) { this.username = just(str(name)); // it's now a `just }; User.prototype.getUsernameMaybe = function() { var usernameMaybe = maybeOf.curry()(str); return usernameMaybe(this.username).orElse('anonymous'); };   var user = new User(); user.getUsernameMaybe(); // Returns 'anonymous'   user.setUsername('Laura'); user.getUsernameMaybe(); // Returns 'Laura' And now we have a powerful and safe way to define defaults. Keep this User object in mind because we'll be using it later. Promises The nature of promises is that they remain immune to changing circumstances. - Frank Underwood, House of Cards In functional programming, we're often working with pipelines and data flows: chains of functions where each function produces a data type that is consumed by the next. However, many of these functions are asynchronous: readFile, events, AJAX, and so on. Instead of using a continuation-passing style and deeply nested callbacks, how can we modify the return types of these functions to indicate the result? By wrapping them in promises. Promises are like the functional equivalent of callbacks. Obviously, callbacks are not all that functional because, if more than one function is mutating the same data, then there can be race conditions and bugs. Promises solve that problem. You should use promises to turn this: fs.readFile("file.json", function(err, val) { if( err ) {    console.error("unable to read file"); } else {    try {      val = JSON.parse(val);      console.log(val.success);    }    catch( e ) {      console.error("invalid json in file");    } } }); Into the following code snippet: fs.readFileAsync("file.json").then(JSON.parse) .then(function(val) {    console.log(val.success); }) .catch(SyntaxError, function(e) {    console.error("invalid json in file"); }) .catch(function(e){    console.error("unable to read file") }); The preceding code is from the README for bluebird: a full featured Promises/A+ implementation with exceptionally good performance. Promises/A+ is a specification for implementing promises in JavaScript. Given its current debate within the JavaScript community, we'll leave the implementations up to the Promises/A+ team, as it is much more complex than maybes. But here's a partial implementation: // the Promise monad var Promise = require('bluebird');   // the promise functor var promise = function(fn, receiver) { return function() {    var slice = Array.prototype.slice,    args = slice.call(arguments, 0, fn.length - 1),    promise = new Promise();    args.push(function() {      var results = slice.call(arguments),      error = results.shift();      if (error) promise.reject(error);      else promise.resolve.apply(promise, results);    });    fn.apply(receiver, args);    return promise; }; }; Now we can use the promise() functor to transform functions that take callbacks into functions that return promises. var files = ['a.json', 'b.json', 'c.json']; readFileAsync = promise(fs.readFile); var data = files .map(function(f){    readFileAsync(f).then(JSON.parse) }) .reduce(function(a,b){    return $.extend({}, a, b) }); Lenses Another reason why programmers really like monads is that they make writing libraries very easy. To explore this, let's extend our User object with more functions for getting and setting values but, instead of using getters and setters, we'll use lenses. Lenses are first-class getters and setters. They allow us to not just get and set variables, but also to run functions over it. But instead of mutating the data, they clone and return the new data modified by the function. They force data to be immutable, which is great for security and consistency as well for libraries. They're great for elegant code no matter what the application, so long as the performance-hit of introducing additional array copies is not a critical issue. Before we write the lens() function, let's look at how it works. var first = lens( function (a) { return arr(a)[0]; }, // get function (a, b) { return [b].concat(arr(a).slice(1)); } // set ); first([1, 2, 3]); // outputs 1 first.set([1, 2, 3], 5); // outputs [5, 2, 3] function tenTimes(x) { return x * 10 } first.modify(tenTimes, [1,2,3]); // outputs [10,2,3] And here's how the lens() function works. It returns a function with get, set and mod defined. The lens() function itself is a functor. var lens = fuction(get, set) { var f = function (a) {return get(a)}; f.get = function (a) {return get(a)}; f.set = set; f.mod = function (f, a) {return set(a, f(get(a)))}; return f; }; Let's try an example. We'll extend our User object from the previous example. // userName :: User -> str var userName = lens( function (u) {return u.getUsernameMaybe()}, // get function (u, v) { // set    u.setUsername(v);    return u.getUsernameMaybe(); } );   var bob = new User(); bob.setUsername('Bob'); userName.get(bob); // returns 'Bob' userName.set(bob, 'Bobby'); //return 'Bobby' userName.get(bob); // returns 'Bobby' userName.mod(strToUpper, bob); // returns 'BOBBY' strToUpper.compose(userName.set)(bob, 'robert'); // returns 'ROBERT' userName.get(bob); // returns 'robert' Summary In this article, we got to learn some of the different functors that are used in JavaScript such as Maybes, Promises and Lenses along with their corresponding examples, thus making it easy for us to understand the concept clearly. Resources for Article: Further resources on this subject: Alfresco Web Scrpits [article] Implementing Stacks using JavaScript [article] Creating a basic JavaScript plugin [article]
Read more
  • 0
  • 0
  • 3556

Packt
24 Mar 2015
15 min read
Save for later

REST – What You Didn't Know

Packt
24 Mar 2015
15 min read
Nowadays, topics such as cloud computing and mobile device service feeds, and other data sources being powered by cutting-edge, scalable, stateless, and modern technologies such as RESTful web services, leave the impression that REST has been invented recently. Well, to be honest, it is definitely not! In fact, REST was defined at the end of the 20th century. This article by Valentin Bojinov, author of the book RESTful Web API Design with Node.js, will walk you through REST's history and will teach you how REST couples with the HTTP protocol. You will look at the five key principles that need to be considered while turning an HTTP application into a RESTful-service-enabled application. You will also look at the differences between RESTful and SOAP-based services. Finally, you will learn how to utilize already existing infrastructure for your benefit. In this article, we will cover the following topics: A brief history of REST REST with HTTP RESTful versus SOAP-based services Taking advantage of existing infrastructure (For more resources related to this topic, see here.) A brief history of REST Let's look at a time when the madness around REST made everybody talk restlessly about it! This happened back in 1999, when a request for comments was submitted to the Internet Engineering Task Force (IETF: http://www.ietf.org/) via RFC 2616: "Hypertext Transfer Protocol - HTTP/1.1." One of its authors, Roy Fielding, later defined a set of principles built around the HTTP and URI standards. This gave birth to REST as we know it today. Let's look at the key principles around the HTTP and URI standards, sticking to which will make your HTTP application a RESTful-service-enabled application: Everything is a resource Each resource is identifiable by a unique identifier (URI) Use the standard HTTP methods Resources can have multiple representation Communicate statelessly Principle 1 – everything is a resource To understand this principle, one must conceive the idea of representing data by a specific format and not by a physical file. Each piece of data available on the Internet has a format that could be described by a content type. For example, JPEG Images; MPEG videos; html, xml, and text documents; and binary data are all resources with the following content types: image/jpeg, video/mpeg, text/html, text/xml, and application/octet-stream. Principle 2 – each resource is identifiable by a unique identifier Since the Internet contains so many different resources, they all should be accessible via URIs and should be identified uniquely. Furthermore, the URIs can be in a human-readable format (frankly I do believe they should be), despite the fact that their consumers are more likely to be software programmers rather than ordinary humans. The URI keeps the data self-descriptive and eases further development on it. In addition, using human-readable URIs helps you to reduce the risk of logical errors in your programs to a minimum. Here are a few sample examples of such URIs: http://www.mydatastore.com/images/vacation/2014/summer http://www.mydatastore.com/videos/vacation/2014/winter http://www.mydatastore.com/data/documents/balance?format=xml http://www.mydatastore.com/data/archives/2014 These human-readable URIs expose different types of resources in a straightforward manner. In the example, it is quite clear that the resource types are as follows: Images Videos XML documents Some kinds of binary archive documents Principle 3 – use the standard HTTP methods The native HTTP protocol (RFC 2616) defines eight actions, also known as verbs: GET POST PUT DELETE HEAD OPTIONS TRACE CONNECT The first four of them feel just natural in the context of resources, especially when defining actions for resource data manipulation. Let's make a parallel with relative SQL databases where the native language for data manipulation is CRUD (short for Create, Read, Update, and Delete) originating from the different types of SQL statements: INSERT, SELECT, UPDATE and DELETE respectively. In the same manner, if you apply the REST principles correctly, the HTTP verbs should be used as shown here: HTTP verb Action Response status code GET Request an existing resource "200 OK" if the resource exists, "404 Not Found" if it does not exist, and "500 Internal Server Error" for other errors PUT Create or update a resource "201 CREATED" if a new resource is created, "200 OK" if updated, and "500 Internal Server Error" for other errors POST Update an existing resource "200 OK" if the resource has been updated successfully, "404 Not Found" if the resource to be updated does not exist, and "500 Internal Server Error" for other errors DELETE Delete a resource "200 OK" if the resource has been deleted successfully, "404 Not Found" if the resource to be deleted does not exist, and "500 Internal Server Error" for other errors There is an exception in the usage of the verbs, however. I just mentioned that POST is used to create a resource. For instance, when a resource has to be created under a specific URI, then PUT is the appropriate request: PUT /data/documents/balance/22082014 HTTP/1.1 Content-Type: text/xml Host: www.mydatastore.com <?xml version="1.0" encoding="utf-8"?> <balance date="22082014"> <Item>Sample item</Item> <price currency="EUR">100</price> </balance> HTTP/1.1 201 Created Content-Type: text/xml Location: /data/documents/balance/22082014 However, in your application you may want to leave it up to the server REST application to decide where to place the newly created resource, and thus create it under an appropriate but still unknown or non-existing location. For instance, in our example, we might want the server to create the date part of the URI based on the current date. In such cases, it is perfectly fine to use the POST verb to the main resource URI and let the server respond with the location of the newly created resource: POST /data/documents/balance HTTP/1.1Content-Type: text/xmlHost: www.mydatastore.com<?xml version="1.0" encoding="utf-8"?><balance date="22082014"><Item>Sample item</Item><price currency="EUR">100</price></balance>HTTP/1.1 201 CreatedContent-Type: text/xmlLocation: /data/documents/balance Principle 4 – resources can have multiple representations A key feature of a resource is that they may be represented in a different form than the one it is stored. Thus, it can be requested or posted in different representations. As long as the specified format is supported, the REST-enabled endpoint should use it. In the preceding example, we posted an xml representation of a balance, but if the server supported the JSON format, the following request would have been valid as well: POST /data/documents/balance HTTP/1.1Content-Type: application/jsonHost: www.mydatastore.com{"balance": {"date": ""22082014"","Item": "Sample item","price": {"-currency": "EUR","#text": "100"}}}HTTP/1.1 201 CreatedContent-Type: application/jsonLocation: /data/documents/balance Principle 5 – communicate statelessly Resource manipulation operations through HTTP requests should always be considered atomic. All modifications of a resource should be carried out within an HTTP request in isolation. After the request execution, the resource is left in a final state, which implicitly means that partial resource updates are not supported. You should always send the complete state of the resource. Back to the balance example, updating the price field of a given balance would mean posting a complete JSON document that contains all of the balance data, including the updated price field. Posting only the updated price is not stateless, as it implies that the application is aware that the resource has a price field, that is, it knows its state. Another requirement for your RESTful application is to be stateless; the fact that once deployed in a production environment, it is likely that incoming requests are served by a load balancer, ensuring scalability and high availability. Once exposed via a load balancer, the idea of keeping your application state at server side gets compromised. This doesn't mean that you are not allowed to keep the state of your application. It just means that you should keep it in a RESTful way. For example, keep a part of the state within the URI. The statelessness of your RESTful API isolates the caller against changes at the server side. Thus, the caller is not expected to communicate with the same server in consecutive requests. This allows easy application of changes within the server infrastructure, such as adding or removing nodes. Remember that it is your responsibility to keep your RESTful APIs stateless, as the consumers of the API would expect it to be. Now that you know that REST is around 15 years old, a sensible question would be, "why has it become so popular just quite recently?" My answer to the question is that we as humans usually reject simple, straightforward approaches, and most of the time, we prefer spending more time on turning complex solutions into even more complex and sophisticated solutions. Take classical SOAP web services for example. Their various WS-* specifications are so many and sometimes loosely defined in order to make different solutions from different vendors interoperable. The WS-* specifications need to be unified by another specification, WS-BasicProfile. This mechanism defines extra interoperability rules in order to ensure that all WS-* specifications in SOAP-based web services transported over HTTP provide different means of transporting binary data. This is again described in other sets of specifications such as SOAP with Attachment References (SwaRef) and Message Transmission Optimisation Mechanism (MTOM), mainly because the initial idea of the web service was to execute business logic and return its response remotely, not to transport large amounts of data. Well, I personally think that when it comes to data transfer, things should not be that complex. This is where REST comes into place by introducing the concept of resources and standard means to manipulate them. The REST goals Now that we've covered the main REST principles, let's dive deeper into what can be achieved when they are followed: Separation of the representation and the resource Visibility Reliability Scalability Performance Separation of the representation and the resource A resource is just a set of information, and as defined by principle 4, it can have multiple representations. However, the state of the resource is atomic. It is up to the caller to specify the content-type header of the http request, and then it is up to the server application to handle the representation accordingly and return the appropriate HTTP status code: HTTP 200 OK in the case of success HTTP 400 Bad request if a unsupported content type is requested, or for any other invalid request HTTP 500 Internal Server error when something unexpected happens during the request processing For instance, let's assume that at the server side, we have balance resources stored in an XML file. We can have an API that allows a consumer to request the resource in various formats, such as application/json, application/zip, application/octet-stream, and so on. It would be up to the API itself to load the requested resource, transform it into the requested type (for example, json or xml), and either use zip to compress it or directly flush it to the HTTP response output. It is the Accept HTTP header that specifies the expected representation of the response data. So, if we want to request our balance data inserted in the previous section in XML format, the following request should be executed: GET /data/balance/22082014 HTTP/1.1Host: my-computer-hostnameAccept: text/xmlHTTP/1.1 200 OKContent-Type: text/xmlContent-Length: 140<?xml version="1.0" encoding="utf-8"?><balance date="22082014"><Item>Sample item</Item><price currency="EUR">100</price></balance> To request the same balance in JSON format, the Accept header needs to be set to application/json: GET /data/balance/22082014 HTTP/1.1Host: my-computer-hostnameAccept: application/jsonHTTP/1.1 200 OKContent-Type: application/jsonContent-Length: 120{"balance": {"date": "22082014","Item": "Sample item","price": {"-currency": "EUR","#text": "100"}}} Visibility REST is designed to be visible and simple. Visibility of the service means that every aspect of it should self-descriptive and should follow the natural HTTP language according to principles 3, 4, and 5. Visibility in the context of the outer world would mean that monitoring applications would be interested only in the HTTP communication between the REST service and the caller. Since the requests and responses are stateless and atomic, nothing more is needed to flow the behavior of the application and to understand whether anything has gone wrong. Remember that caching reduces the visibility of you restful applications and should be avoided. Reliability Before talking about reliability, we need to define which HTTP methods are safe and which are idempotent in the REST context. So let's first define what safe and idempotent methods are: An HTTP method is considered to be safe provided that when requested, it does not modify or cause any side effects on the state of the resource An HTTP method is considered to be idempotent if its response is always the same, no matter how many times it is requested The following table lists shows you which HTTP method is safe and which is idempotent: HTTP Method Safe Idempotent GET Yes Yes POST No No PUT No Yes DELETE No Yes Scalability and performance So far, I have often stressed on the importance of having stateless implementation and stateless behavior for a RESTful web application. The World Wide Web is an enormous universe, containing a huge amount of data and a few times more users eager to get that data. Its evolution has brought about the requirement that applications should scale easily as their load increases. Scaling applications that have a state is hardly possible, especially when zero or close-to-zero downtime is needed. That's why being stateless is crucial for any application that needs to scale. In the best-case scenario, scaling your application would require you to put another piece of hardware for a load balancer. There would be no need for the different nodes to sync between each other, as they should not care about the state at all. Scalability is all about serving all your clients in an acceptable amount of time. Its main idea is keep your application running and to prevent Denial of Service (DoS) caused by a huge amount of incoming requests. Scalability should not be confused with performance of an application. Performance is measured by the time needed for a single request to be processed, not by the total number of requests that the application can handle. The asynchronous non-blocking architecture and event-driven design of Node.js make it a logical choice for implementing a well-scalable application that performs well. Working with WADL If you are familiar with SOAP web services, you may have heard of the Web Service Definition Language (WSDL). It is an XML description of the interface of the service. It is mandatory for a SOAP web service to be described by such a WSDL definition. Similar to SOAP web services, RESTful services also offer a description language named WADL. WADL stands for Web Application Definition Language. Unlike WSDL for SOAP web services, a WADL description of a RESTful service is optional, that is, consuming the service has nothing to do with its description. Here is a sample part of a WADL file that describes the GET operation of our balance service: <application ><grammer><include href="balance.xsd"/><include href="error.xsd"/></grammer><resources base="http://localhost:8080/data/balance/"><resource path="{date}"><method name="GET"><request><param name="date" type="xsd:string" style="template"/></request><response status="200"><representation mediaType="application/xml"element="service:balance"/><representation mediaType="application/json" /></response><response status="404"><representation mediaType="application/xml"element="service:balance"/></response></method></resource></resources></application> This extract of a WADL file shows how application-exposing resources are described. Basically, each resource must be a part of an application. The resource provides the URI where it is located with the base attribute, and describes each of its supported HTTP methods in a method. Additionally, an optional doc element can be used at resource and application to provide additional documentation about the service and its operations. Though WADL is optional, it significantly reduces the efforts of discovering RESTful services. Taking advantage of the existing infrastructure The best part of developing and distributing RESTful applications is that the infrastructure needed is already out there waiting restlessly for you. As RESTful applications use the existing web space heavily, you need to do nothing more than following the REST principles when developing. In addition, there are a plenty of libraries available out there for any platform, and I do mean any given platform. This eases development of RESTful applications, so you just need to choose the preferred platform for you and start developing. Summary In this article, you learned about the history of REST, and we made a slight comparison between RESTful services and classical SOAP Web services. We looked at the five key principles that would turn our web application into a REST-enabled application, and finally took a look at how RESTful services are described and how we can simplify the discovery of the services we develop. Now that you know the REST basics, we are ready to dive into the Node.js way of implementing RESTful services. Resources for Article: Further resources on this subject: Creating a RESTful API [Article] So, what is Node.js? [Article] CreateJS – Performing Animation and Transforming Function [Article]
Read more
  • 0
  • 0
  • 1393

article-image-spritekit-framework-and-physics-simulation
Packt
24 Mar 2015
15 min read
Save for later

SpriteKit Framework and Physics Simulation

Packt
24 Mar 2015
15 min read
In this article by Bhanu Birani, author of the book iOS Game Programming Cookbook, you will learn about the SpriteKit game framework and about the physics simulation. (For more resources related to this topic, see here.) Getting started with the SpriteKit game framework With the release of iOS 7.0, Apple has introduced its own native 2D game framework called SpriteKit. SpriteKit is a great 2D game engine, which has support for sprite, animations, filters, masking, and most important is the physics engine to provide a real-world simulation for the game. Apple provides a sample game to get started with the SpriteKit called Adventure Game. The download URL for this example project is http://bit.ly/Rqaeda. This sample project provides a glimpse of the capability of this framework. However, the project is complicated to understand and for learning you just want to make something simple. To have a deeper understanding of SpriteKit-based games, we will be building a bunch of mini games in this book. Getting ready To get started with iOS game development, you have the following prerequisites for SpriteKit: You will need the Xcode 5.x The targeted device family should be iOS 7.0+ You should be running OS X 10.8.X or later If all the above requisites are fulfilled, then you are ready to go with the iOS game development. So let's start with game development using iOS native game framework. How to do it... Let's start building the AntKilling game. Perform the following steps to create your new SpriteKit project: Start your Xcode. Navigate to File | New | Project.... Then from the prompt window, navigate to iOS | Application | SpriteKit Game and click on Next. Fill all the project details in the prompt window and provide AntKilling as the project name with your Organization Name, device as iPhone, and Class Prefix as AK. Click on Next. Select a location on the drive to save the project and click on Create. Then build the sample project to check the output of the sample project. Once you build and run the project with the play button, you should see the following on your device: How it works... The following are the observations of the starter project: As you have seen, the sample project of SpriteKit plays a label with a background color. SpriteKit works on the concept of scenes, which can be understood as the layers or the screens of the game. There can be multiple scenes working at the same time; for example, there can be a gameplay scene, hud scene, and the score scene running at the same time in the game. Now we can look into the project for more detail arrangements of the starter project. The following are the observations: In the main directory, you already have one scene created by default called AKMyScene. Now click on AKMyScene.m to explore the code to add the label on the screen. You should see something similar to the following screenshot: Now we will be updating this file with our code to create our AntKilling game in the next sections. We have to fulfill a few prerequisites to get started with the code, such as locking the orientation to landscape as we want a landscape orientation game. To change the orientation of the game, navigate to AntKilling project settings | TARGETS | General. You should see something similar to the following screenshot: Now in the General tab, uncheck Portrait from the device orientation so that the final settings should look similar to the following screenshot: Now build and run the project. You should be able to see the app running in landscape orientation. The bottom-right corner of the screen shows the number of nodes with the frame rate. Introduction to physics simulation We all like games that have realistic effects and actions. In this article we will learn about the ways to make our games more realistic. Have you ever wondered how to provide realistic effect to game objects? It is physics that provides a realistic effect to the games and their characters. In this article, we will learn how to use physics in our games. While developing the game using SpriteKit, you will need to change the world of your game frequently. The world is the main object in the game that holds all the other game objects and physics simulations. We can also update the gravity of the gaming world according to our need. The default world gravity is 9.8, which is also the earth's gravity, World gravity makes all bodies fall down to the ground as soon as they are created. More about SpriteKit can be explored using the following link: https://developer.apple.com/library/ios/documentation/GraphicsAnimation/Conceptual/SpriteKit_PG/Physics/Physics.html Getting ready The first task is to create the world and then add bodies to it, which can interact according to the principles of physics. You can create game objects in the form of sprites and associate physics bodies to them. You can also set various properties of the object to specify its behavior. How to do it... In this section, we will learn about the basic components that are used to develop games. We will also learn how to set game configurations, including the world settings such as gravity and boundary. The initial step is to apply the gravity to the scene. Every scene has a physics world associated with it. We can update the gravity of the physics world in our scene using the following line of code: self.physicsWorld.gravity = CGVectorMake(0.0f, 0.0f); Currently we have set the gravity of the scene to 0, which means the bodies will be in a state of free fall. They will not experience any force due to gravity in the world. In several games we also need to set a boundary to the games. Usually, the bounds of the view can serve as the bounds for our physics world. The following code will help us to set up the boundary for our game, which will be as per the bounds of our game scene: // 1 Create a physics body that borders the screenSKPhysicsBody* gameBorderBody = [SKPhysicsBody   bodyWithEdgeLoopFromRect:self.frame];// 2 Set physicsBody of scene to gameBorderBodyself.physicsBody = gameBorderBody;// 3 Set the friction of that physicsBody to 0self.physicsBody.friction = 0.0f; In the first line of code we are initializing a SKPhysicsBody object. This object is used to add the physics simulation to any SKSpriteNode. We have created the gameBorderBody as a rectangle with the dimensions equal to the current scene frame. Then we assign that physics object to the physicsBody of our current scene (every SKSpriteNode object has the physicsBody property through which we can associate physics bodies to any node). After this we update the physicsBody.friction. This line of code updates the friction property of our world. The friction property defines the friction value of one physics body with another physics body. Here we have set this to 0, in order to make the objects move freely, without slowing down. Every game object is inherited from the SKSpriteNode class, which allows the physics body to hold on to the node. Let us take an example and create a game object using the following code: // 1SKSpriteNode* gameObject = [SKSpriteNode   spriteNodeWithImageNamed: @"object.png"];gameObject.name = @"game_object";gameObject.position = CGPointMake(self.frame.size.width/3,   self.frame.size.height/3);[self addChild:gameObject]; // 2gameObject.physicsBody = [SKPhysicsBody   bodyWithCircleOfRadius:gameObject.frame.size.width/2];// 3gameObject.physicsBody.friction = 0.0f; We are already familiar with the first few lines of code wherein we are creating the sprite reference and then adding it to the scene. Now in the next line of code, we are associating a physics body with that sprite. We are initializing the circular physics body with radius and associating it with the sprite object. Then we can update various other properties of the physics body such as friction, restitution, linear damping, and so on. The physics body properties also allow you to apply force. To apply force you need to provide the direction where you want to apply force. [gameObject.physicsBody applyForce:CGVectorMake(10.0f,   -10.0f)]; In the code we are applying force in the bottom-right corner of the world. To provide the direction coordinates we have used CGVectorMake, which accepts the vector coordinates of the physics world. You can also apply impulse instead of force. Impulse can be defined as a force that acts for a specific interval of time and is equal to the change in linear momentum produced over that interval. [gameObject.physicsBody applyImpulse:CGVectorMake(10.0f,   -10.0f)]; While creating games, we frequently use static objects. To create a rectangular static object we can use the following code: SKSpriteNode* box = [[SKSpriteNode alloc]   initWithImageNamed: @"box.png"];box.name = @"box_object";box.position = CGPointMake(CGRectGetMidX(self.frame),   box.frame.size.height * 0.6f);[self addChild:box];box.physicsBody = [SKPhysicsBody   bodyWithRectangleOfSize:box.frame.size];box.physicsBody.friction = 0.4f;// make physicsBody staticbox.physicsBody.dynamic = NO; So all the code is the same except one special property, which is dynamic. By default this property is set to YES, which means that all the physics bodies will be dynamic by default and can be converted to static after setting this Boolean to NO. Static bodies do not react to any force or impulse. Simply put, dynamic physics bodies can move while the static physics bodies cannot . Integrating physics engine with games From this section onwards, we will develop a mini game that will have a dynamic moving body and a static body. The basic concept of the game will be to create an infinite bouncing ball with a moving paddle that will be used to give direction to the ball. Getting ready... To develop a mini game using the physics engine, start by creating a new project. Open Xcode and go to File | New | Project and then navigate to iOS | Application | SpriteKit Game. In the pop-up screen, provide the Product Name as PhysicsSimulation, navigate to Devices | iPhone and click on Next as shown in the following screenshot: Click on Next and save the project on your hard drive. Once the project is saved, you should be able to see something similar to the following screenshot: In the project settings page, just uncheck the Portrait from Device Orientation section as we are supporting only landscape mode for this game. Graphics and games cannot be separated for long; you will also need some graphics for this game. Download the graphics folder, drag it and import it into the project. Make sure that the Copy items into destination group's folder (if needed) is checked and then click on Finish button. It should be something similar to the following screenshot: How to do it... Now your project template is ready for a physics-based mini game. We need to update the game template project to get started with code game logic. Take the following steps to integrate the basic physics object in the game. Open the file GameScene.m .This class creates a scene that will be plugged into the games. Remove all code from this class and just add the following function: -(id)initWithSize:(CGSize)size { if (self = [super initWithSize:size]) { SKSpriteNode* background = [SKSpriteNode spriteNodeWithImageNamed:@"bg.png"]; background.position = CGPointMake(self.frame.size.width/2, self.frame.size.height/2); [self addChild:background]; } } This initWithSize method creates an blank scene of the specified size. The code written inside the init function allows you to add the background image at the center of the screen in your game. Now when you compile and run the code, you will observe that the background image is not placed correctly on the scene. To resolve this, open GameViewController.m. Remove all code from this file and add the following function; -(void)viewWillLayoutSubviews {   [super viewWillLayoutSubviews];     // Configure the view.   SKView * skView = (SKView *)self.view;   if (!skView.scene) {       skView.showsFPS = YES;       skView.showsNodeCount = YES;             // Create and configure the scene.       GameScene * scene = [GameScene sceneWithSize:skView.bounds.size];       scene.scaleMode = SKSceneScaleModeAspectFill;             // Present the scene.       [skView presentScene:scene];   }} To ensure that the view hierarchy is properly laid out, we have implemented the viewWillLayoutSubviews method. It does not work perfectly in viewDidLayoutSubviews method because the size of the scene is not known at that time. Now compile and run the app. You should be able to see the background image correctly. It will look something similar to the following screenshot: So now that we have the background image in place, let us add gravity to the world. Open GameScene.m and add the following line of code at the end of the initWithSize method: self.physicsWorld.gravity = CGVectorMake(0.0f, 0.0f); This line of code will set the gravity of the world to 0, which means there will be no gravity. Now as we have removed the gravity to make the object fall freely, it's important to create a boundary around the world, which will hold all the objects of the world and prevent them to go off the screen. Add the following line of code to add the invisible boundary around the screen to hold the physics objects: // 1 Create a physics body that borders the screenSKPhysicsBody* gameborderBody = [SKPhysicsBody bodyWithEdgeLoopFromRect:self.frame];// 2 Set physicsBody of scene to borderBodyself.physicsBody = gameborderBody;// 3 Set the friction of that physicsBody to 0self.physicsBody.friction = 0.0f; In the first line, we are are creating an edge-based physics boundary object, with a screen size frame. This type of a physics body does not have any mass or volume and also remains unaffected by force and impulses. Then we associate the object with the physics body of the scene. In the last line we set the friction of the body to 0, for a seamless interaction between objects and the boundary surface. The final file should look something like the following screenshot: Now we have our surface ready to hold the physics world objects. Let us create a new physics world object using the following line of code: // 1SKSpriteNode* circlularObject = [SKSpriteNode spriteNodeWithImageNamed: @"ball.png"];circlularObject.name = ballCategoryName;circlularObject.position = CGPointMake(self.frame.size.width/3, self.frame.size.height/3);[self addChild:circlularObject]; // 2circlularObject.physicsBody = [SKPhysicsBody bodyWithCircleOfRadius:circlularObject.frame.size.width/2];// 3circlularObject.physicsBody.friction = 0.0f;// 4circlularObject.physicsBody.restitution = 1.0f;// 5circlularObject.physicsBody.linearDamping = 0.0f;// 6circlularObject.physicsBody.allowsRotation = NO; Here we have created the sprite and then we have added it to the scene. Then in the later steps we associate the circular physics body with the sprite object. Finally, we alter the properties of that physics body. Now compile and run the application; you should be able to see the circular ball on the screen as shown in screenshot below: The circular ball is added to the screen, but it does nothing. So it's time to add some action in the code. Add the following line of code at the end of the initWithSize method: [circlularObject.physicsBody applyImpulse:CGVectorMake(10.0f, -10.0f)]; This will apply the force on the physics body, which in turn will move the associated ball sprite as well. Now compile and run the project. You should be able to see the ball moving and then collide with the boundary and bounce back, as there is no friction between the boundary and the ball. So now we have the infinite bouncing ball in the game. How it works… There are several properties used while creating physics bodies to define their behavior in the physics world. The following is a detailed description of the properties used in the preceding code: Restitution property defines the bounciness of an object. Setting the restitution to 1.0f, means that the ball collision will be perfectly elastic with any object. This means that the ball will bounce back with a force equal to the impact. Linear Damping property allows the simulation of fluid or air friction. This is accomplished by reducing the linear velocity of the body. In our case, we do not want the ball to slow down while moving and hence we have set the restitution to 0.0f. There's more… You can read about all these properties in detail at Apple's developer documentation: https://developer.apple.com/library/IOs/documentation/SpriteKit/Reference/SKPhysicsBody_Ref/index.html. Summary In this article, you have learned about the SpriteKit game framework, how to create a simple game using SpriteKit framework, physics simulation, and also how to integrate physics engine with games. Resources for Article: Further resources on this subject: Code Sharing Between iOS and Android [article] Linking OpenCV to an iOS project [article] Interface Designing for Games in iOS [article]
Read more
  • 0
  • 0
  • 3025
article-image-introducing-gamemaker
Packt
24 Mar 2015
5 min read
Save for later

Introducing GameMaker

Packt
24 Mar 2015
5 min read
In this article by Nathan Auckett, author of the book GameMaker Essentials, you will learn what GameMaker is all about, who made it, what it is used for, and more. You will then also be learning how to install GameMaker on your computer that is ready for use. (For more resources related to this topic, see here.) In this article, we will cover the following topics: Understanding GameMaker Installing GameMaker: Studio What is this article about? Understanding GameMaker Before getting started with GameMaker, it is best to know exactly what it is and what it's designed to do. GameMaker is a 2D game creation software by YoYo Games. It was designed to allow anyone to easily develop games without having to learn complex programming languages such as C++ through the use of its drag and drop functionality. The drag and drop functionality allows the user to create games by visually organizing icons on screen, which represent actions and statements that will occur during the game. GameMaker also has a built-in programming language called GameMaker Language, or GML for short. GML allows users to type out code to be run during their game. All drag and drop actions are actually made up of this GML code. GameMaker is primarily designed for 2D games, and most of its features and functions are designed for 2D game creation. However, GameMaker does have the ability to create 3D games and has a number of functions dedicated to this. GameMaker: Studio There are a number of different versions of GameMaker available, most of which are unsupported because they are outdated; however, support can still be found in the GameMaker Community Forums. GameMaker: Studio is the first version of GameMaker after GameMaker HTML5 to allow users to create games and export them for use on multiple devices and operating systems including PC, Mac, Linux, and Android, on both mobile and desktop versions. GameMaker: Studio is designed to allow one code base (GML) to run on any device with minimal changes to the base code. Users are able to export their games to run on any supported device or system such as HTML5 without changing any code to make things work. GameMaker: Studio was also the first version available for download and use through the Steam marketplace. YoYo Games took advantage of the Steam workshop and allowed Steam-based users to post and share their creations through the service. GameMaker: Studio is sold in a number of different versions, which include several enhanced features and capabilities as the price gets higher. The standard version is free to download and use. However, it lacks some advanced features included in higher versions and only allows for exporting to the Windows desktop. The professional version is the second cheapest from the standard version. It includes all features, but only has the Windows desktop and Windows app exports. Other exports can be purchased at an extra cost ranging from $99.99 to $300. The master version is the most expensive of all the options. It comes with every feature and every export, including all future export modules in version 1.x. If you already own exports in the professional version, you can get the prices of those exports taken off the price of the master version. Installing GameMaker: Studio Installing GameMaker is performed much like any other program. In this case, we will be installing GameMaker: Studio as this is the most up-to-date version at this point. You can find the download at the YoYo Games website, https://www.yoyogames.com/. From the site, you can pick the free version or purchase one of the others. All the installations are basically the same. Once the installer is downloaded, we are ready to install GameMaker: Studio. This is just like installing any other program. Just run the file, and then follow the on-screen instructions to accept the license agreement, choose an install location, and install the software. On the first run, you may see a progress bar appear at the top left of your screen. This is just GameMaker running its first time setup. It will also do this during the update process as YoYo Games releases new features. Once it is done, you should see a welcome screen and will be prompted to enter your registration code. The key should be e-mailed to you when you make an account during the purchase/download process. Enter this key and your copy of GameMaker: Studio should be registered. You may be prompted to restart GameMaker at this time. Close GameMaker and re-open it and you should see the welcome screen and be able to choose from a number of options on it: What is this article about? We now have GameMaker: Studio installed and are ready to get started with it. In this article, we will be covering the essential things to know about GameMaker: Studio. This includes everything from drag and drop actions to programming in GameMaker using GameMaker Language (GML). You will learn about how things in GameMaker are structured, and how to organize resources to keep things as clean as possible. Summary In this article, we looked into what GameMaker actually is and learned that there are different versions available. We also looked at the different types of GameMaker: Studio available for download and purchase. We then learned how to install GameMaker: Studio, which is the final step in getting ready to learn the essential skills and starting to make our very own games. Resources for Article: Further resources on this subject: Getting Started – An Introduction to GML [Article] Animating a Game Character [Article] Why should I make cross-platform games? [Article]
Read more
  • 0
  • 0
  • 8915

article-image-understanding-core-data-concepts
Packt
23 Mar 2015
10 min read
Save for later

Understanding Core Data concepts

Packt
23 Mar 2015
10 min read
In this article by Gibson Tang and Maxim Vasilkov, authors of the book Objective-C Memory Management Essentials, you will learn what Core Data is and why you should use it. (For more resources related to this topic, see here.) Core Data allows you to store your data in a variety of storage types. So, if you want to use other types of memory store, such as XML or binary store, you can use the following store types: NSSQLiteStoreType: This is the option you most commonly use as it just stores your database in a SQLite database. NSXMLStoreType: This will store your data in an XML file, which is slower, but you can open the XML file and it will be human readable. This has the option of helping you debug errors relating to storage of data. However, do note that this storage type is only available for Mac OS X. NSBinaryStoreType: This occupies the least amount of space and also produces the fastest speed as it stores all data as a binary file, but the entire database binary need to be able to fit into memory in order to work properly. NSInMemoryStoreType: This stores all data in memory and provides the fastest access speed. However, the size of your database to be saved cannot exceed the available free space in memory since the data is stored in memory. However, do note that memory storage is ephemeral and is not stored permanently to disk. Next, there are two concepts that you need to know, and they are: Entity Attributes Now, these terms may be foreign to you. However, for those of you who have knowledge of databases, you will know it as tables and columns. So, to put it in an easy-to-understand picture, think of Core Data entities as your database tables and Core Data attributes as your database columns. So, Core Data handles data persistence using the concepts of entity and attributes, which are abstract data types, and actually saving the data into plists, SQLite databases, or even XML files (applicable only to the Mac OS). Going back a bit in time, Core Data is a descendant of Apple's Enterprise Objects Framework (EOF) , which was introduced by NeXT, Inc in 1994, and EOF is an Object-relational mapper (ORM), but Core Data itself is not an ORM. Core Data is a framework for managing the object graph, and one of it's powerful capabilities is that it allows you to work with extremely large datasets and object instances that normally would not fit into memory by putting objects in and out of memory when necessary. Core Data will map the Objective-C data type to the related data types, such as string, date, and integer, which will be represented by NSString, NSDate, and NSNumber respectively. So, as you can see, Core Data is not a radically new concept that you need to learn as it is grounded in the simple database concepts that we all know. Since entity and attributes are abstract data types, you cannot access them directly as they do not exist in physical terms. So to access them, you need to use the Core Data classes and methods provided by Apple. The number of classes for Core Data is actually pretty long, and you won't be using all of them regularly. So, here is a list of the more commonly used classes: CLASS NAME EXAMPLE USE CASE NSManagedObject Accessing attributes and rows of data NSManagedObjectContext Fetching data and saving data NSManagedObjectModel Storage NSFetchRequest Requesting data NSPersistentStoreCoordinator Persisting data NSPredicate Data query Now, let's go in-depth into the description of each of these classes: NSManagedObject: This is a record that you will use and perform operations on and all entities will extend this class. NSManagedObjectContext: This can be thought of as an intelligent scratchpad where temporary copies are brought into it after you fetch objects from the persistent store. So, any modifications done in this intelligent scratchpad are not saved until you save those changes into the persistent store, NSManagedObjectModel. Think of this as a collection of entities or a database schema, if you will. NSFetchRequest: This is an operation that describes the search criteria, which you will use to retrieve data from the persistent store, a kind of the common SQL query that most developers are familiar with. NSPersistentStoreCoordinator: This is like the glue that associates your managed object context and persistent. NSPersistentStoreCoordinator: Without this, your modifications will not be saved to the persistent store. NSPredicate: This is used to define logical conditions used in a search or for filtering in-memory. Basically, it means that NSPredicate is used to specify how data is to be fetched or filtered and you can use it together with NSFetchRequest as NSFetchRequest has a predicate property. Putting it into practice Now that we have covered the basics of Core Data, let's proceed with some code examples of how to use Core Data, where we use Core Data to store customer details in a Customer table and the information we want to store are: name email phone_number address age Do note that all attribute names must be in lowercase and have no spaces in them. For example, we will use Core Data to store customer details mentioned earlier as well as retrieve, update, and delete the customer records using the Core Data framework and methods. First, we will select File | New | File and then select iOS | Core Data: Then, we will proceed to create a new Entity called Customer by clicking on the Add Entity button on the bottom left of the screen, as shown here: Then, we will proceed to add in the attributes for our Customer entity and give them the appropriate Type, which can be String for attributes such as name or address and Integer 16 for age. Lastly, we need to add CoreData.framework, as seen in the following screenshot: So with this, we have created a Core Data model class consisting of a Customer entity and some attributes. Do note that all core model classes have the .xcdatamodeld file extension and for us, we can save our Core Data model as Model.xcdatamodeld. Next, we will create a sample application that uses Core Data in the following ways:     Saving a record     Searching for a record     Deleting a record     Loading records Now, I won't cover the usage of UIKit and storyboard, but instead focus on the core code needed to give you an example of Core Data works. So, to start things off, here are a few images of the application for you to have a feel of what we will do: This is the main screen when you start the app: The screen to insert record is shown here: The screen to list all records from our persistent store is as follows: By deleting a record from the persistent store, you will get the following output: Getting into the code Let's get started with our code examples: For our code, we will first declare some Core Data objects in our AppDelegate class inside our AppDelegate.h file such as: @property (readonly, strong, nonatomic) NSManagedObjectContext *managedObjectContext; @property (readonly, strong, nonatomic) NSManagedObjectModel *managedObjectModel; @property (readonly, strong, nonatomic) NSPersistentStoreCoordinator *persistentStoreCoordinator; Next, we will declare the code for each of the objects in AppDelegate.m such as the following lines of code that will create an instance of NSManagedObjectContext and return an existing instance if the instance already exists. This is important as you want only one instance of the context to be present to avoid conflicting access to the context: - (NSManagedObjectContext *)managedObjectContext { if (_managedObjectContext != nil) { return _managedObjectContext; } NSPersistentStoreCoordinator *coordinator = [self persistentStoreCoordinator]; if (coordinator != nil) { _managedObjectContext = [[NSManagedObjectContext alloc] init]; [_managedObjectContext setPersistentStoreCoordinator:coordinator]; } if (_managedObjectContext == nil) NSLog(@"_managedObjectContext is nil"); return _managedObjectContext; } This method will create the NSManagedObjectModel instance and then return the instance, but it will return an existing NSManagedObjectModel if it already exists: // Returns the managed object model for the application. - (NSManagedObjectModel *)managedObjectModel { if (_managedObjectModel != nil) { return _managedObjectModel;//return model since it already exists } //else create the model and return it //CustomerModel is the filename of your *.xcdatamodeld file NSURL *modelURL = [[NSBundle mainBundle] URLForResource:@"CustomerModel" withExtension:@"momd"]; _managedObjectModel = [[NSManagedObjectModel alloc] initWithContentsOfURL:modelURL]; if (_managedObjectModel == nil) NSLog(@"_managedObjectModel is nil"); return _managedObjectModel; } This method will create an instance of the NSPersistentStoreCoordinator class if it does not exist, and also return an existing instance if it already exists. We will also put some logging via NSLog to tell the user if the instance of NSPersistentStoreCoordinator is nil and use the NSSQLiteStoreType keyword to signify to the system that we intend to store the data in a SQLite database: // Returns the persistent store coordinator for the application. - (NSPersistentStoreCoordinator *)persistentStoreCoordinator { NSPersistentStoreCoordinator if (_persistentStoreCoordinator != nil) { return _persistentStoreCoordinator;//return persistent store }//coordinator since it already exists NSURL *storeURL = [[self applicationDocumentsDirectory] URLByAppendingPathComponent:@"CustomerModel.sqlite"]; NSError *error = nil; _persistentStoreCoordinator = [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:[self managedObjectModel]]; if (_persistentStoreCoordinator == nil) NSLog(@"_persistentStoreCoordinator is nil"); if (![_persistentStoreCoordinator addPersistentStoreWithTy pe:NSSQLiteStoreType configuration:nil URL:storeURL options:nil error:&error]) { NSLog(@"Error %@, %@", error, [error userInfo]); abort(); } return _persistentStoreCoordinator; } The following lines of code will return a URL of the location to store your data on the device: #pragma mark - Application's Documents directory// Returns the URL to the application's Documents directory. - (NSURL *)applicationDocumentsDirectory { return [[[NSFileManager defaultManager] URLsForDirectory:NSDocumentDirectory inDomains:NSUserDomainMask] lastObject]; } As you can see, what we have done is to check whether the objects such as _managedObjectModel are nil and if it is not nil, then we return the object, else we will create the object and then return it. This concept is exactly the same concept of lazy loading. We apply the same methodology to managedObjectContext and persistentStoreCoordinator. We did this so that we know that we only have one instance of managedObjectModel, managedObjectContext, and persistentStoreCoordinator created and present at any given time. This is to help us avoid having multiple copies of these objects, which will increase the chance of a memory leak. Note that memory management is still a real issue in the post-ARC world. So what we have done is follow best practices that will help us avoid memory leaks. In the example code that was shown, we adopted a structure so that only one instance of managedObjectModel, managedObjectContext and persistentStoreCoordinator is available at any given time. Next, let's move on to showing you how to store data into our persistent store. As you can see in the preceding screenshot, we have fields such as name, age, address, email, and phone_number, which corresponds to the appropriate fields in our Customer entity. Summary In this article, you learned about Core Data and why you should use it. Resources for Article: Further resources on this subject: BSD Socket Library [article] Marker-based Augmented Reality on iPhone or iPad [article] User Interactivity – Mini Golf [article]
Read more
  • 0
  • 0
  • 5249
Modal Close icon
Modal Close icon