
How-To Tutorials


Understanding Core Data concepts

Packt
23 Mar 2015
10 min read
In this article by Gibson Tang and Maxim Vasilkov, authors of the book Objective-C Memory Management Essentials, you will learn what Core Data is and why you should use it.

Core Data allows you to store your data in a variety of storage types. If you want to use a store other than the default, the following store types are available:

NSSQLiteStoreType: This is the option you will most commonly use, as it simply stores your database in a SQLite database.
NSXMLStoreType: This stores your data in an XML file. It is slower, but the file is human readable, which can help you debug errors related to the storage of data. Do note that this storage type is only available on Mac OS X.
NSBinaryStoreType: This occupies the least amount of space and offers the fastest speed, as it stores all data in a binary file. However, the entire binary store needs to fit into memory in order to work properly.
NSInMemoryStoreType: This stores all data in memory and provides the fastest access. The size of the database cannot exceed the available free memory and, because the data lives only in memory, it is ephemeral and is never persisted to disk.

Next, there are two concepts that you need to know:

Entity
Attributes

These terms may be new to you, but if you have worked with databases, you will know them as tables and columns. To put it in an easy-to-understand picture, think of Core Data entities as your database tables and Core Data attributes as your database columns. Core Data handles data persistence using entities and attributes, which are abstract data types, and actually saves the data into plists, SQLite databases, or even XML files (the last applicable only to Mac OS X).

Going back a bit in time, Core Data is a descendant of Apple's Enterprise Objects Framework (EOF), which was introduced by NeXT, Inc. in 1994. EOF is an object-relational mapper (ORM), but Core Data itself is not an ORM. Core Data is a framework for managing the object graph, and one of its powerful capabilities is that it lets you work with extremely large datasets and object instances that would not normally fit into memory, by paging objects in and out of memory when necessary. Core Data maps Objective-C data types to related data types; for example, string, date, and integer values are represented by NSString, NSDate, and NSNumber respectively. So, as you can see, Core Data is not a radically new concept that you need to learn; it is grounded in the simple database concepts that we all know.

Since entities and attributes are abstract data types, you cannot access them directly, as they do not exist in physical terms. To access them, you need to use the Core Data classes and methods provided by Apple. The full list of Core Data classes is actually pretty long, and you won't be using all of them regularly.
So, here is a list of the more commonly used classes:

CLASS NAME                      EXAMPLE USE CASE
NSManagedObject                 Accessing attributes and rows of data
NSManagedObjectContext          Fetching and saving data
NSManagedObjectModel            Storage (the model/schema)
NSFetchRequest                  Requesting data
NSPersistentStoreCoordinator    Persisting data
NSPredicate                     Data query

Now, let's go in-depth into the description of each of these classes:

NSManagedObject: This is a record that you will use and perform operations on; all entities extend this class.
NSManagedObjectContext: This can be thought of as an intelligent scratchpad. When you fetch objects from the persistent store, temporary copies are brought into the context, and any modifications made in this scratchpad are not saved until you save those changes back to the persistent store.
NSManagedObjectModel: Think of this as a collection of entities, or a database schema, if you will.
NSFetchRequest: This is an operation that describes the search criteria you will use to retrieve data from the persistent store, akin to the common SQL query that most developers are familiar with.
NSPersistentStoreCoordinator: This is the glue that associates your managed object context with the persistent store. Without it, your modifications will not be saved to the persistent store.
NSPredicate: This is used to define logical conditions for a search or for in-memory filtering. Basically, NSPredicate specifies how data is to be fetched or filtered, and you can use it together with NSFetchRequest, as NSFetchRequest has a predicate property.

Putting it into practice

Now that we have covered the basics of Core Data, let's proceed with some code examples. We will use Core Data to store customer details in a Customer entity, and the information we want to store is:

name
email
phone_number
address
age

Do note that all attribute names must be in lowercase and have no spaces in them. We will use Core Data to store the customer details mentioned earlier, as well as retrieve, update, and delete the customer records using the Core Data framework and methods.

First, we select File | New | File and then select iOS | Core Data. Then, we create a new entity called Customer by clicking on the Add Entity button at the bottom left of the screen. Next, we add the attributes for our Customer entity and give them the appropriate Type, which can be String for attributes such as name or address and Integer 16 for age. Lastly, we need to add CoreData.framework to the project.

With this, we have created a Core Data model consisting of a Customer entity and some attributes. Do note that all Core Data model files have the .xcdatamodeld extension; here, we save our model as Model.xcdatamodeld.

Next, we will create a sample application that uses Core Data in the following ways:

Saving a record
Searching for a record
Deleting a record
Loading records

Now, I won't cover the usage of UIKit and storyboards, but will instead focus on the core code needed to give you an example of how Core Data works.
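Before we dive into the sample app, here is a minimal sketch (not taken from the book's sample project) of how the classes listed above fit together once the Core Data stack is in place. It assumes a valid NSManagedObjectContext named context, such as the one built in AppDelegate later in this article, and the Customer entity described earlier:

// Insert a new Customer record into the context and persist it.
NSError *error = nil;
NSManagedObject *newCustomer =
    [NSEntityDescription insertNewObjectForEntityForName:@"Customer"
                                   inManagedObjectContext:context];
[newCustomer setValue:@"Alice" forKey:@"name"];
[newCustomer setValue:@"alice@example.com" forKey:@"email"];
[newCustomer setValue:@25 forKey:@"age"];
if (![context save:&error]) {
    NSLog(@"Save failed: %@", error);
}

// Fetch Customer records older than 21, sorted by name, using
// NSFetchRequest with an NSPredicate.
NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Customer"];
request.predicate = [NSPredicate predicateWithFormat:@"age > %d", 21];
request.sortDescriptors = @[[NSSortDescriptor sortDescriptorWithKey:@"name"
                                                          ascending:YES]];
NSArray *results = [context executeFetchRequest:request error:&error];
if (results == nil) {
    NSLog(@"Fetch failed: %@", error);
} else {
    for (NSManagedObject *customer in results) {
        NSLog(@"%@ <%@>", [customer valueForKey:@"name"],
                          [customer valueForKey:@"email"]);
    }
}

The names and values here are purely illustrative; the point is simply that NSManagedObject rows are inserted and read through the context, while NSFetchRequest and NSPredicate describe what you want back.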
So, to start things off, the application has a main screen, a screen to insert a record, a screen to list all records from our persistent store, and output shown when a record is deleted from the persistent store (the original article illustrates each of these with a screenshot).

Getting into the code

Let's get started with our code examples. First, we declare some Core Data objects in our AppDelegate class, inside the AppDelegate.h file:

@property (readonly, strong, nonatomic) NSManagedObjectContext *managedObjectContext;
@property (readonly, strong, nonatomic) NSManagedObjectModel *managedObjectModel;
@property (readonly, strong, nonatomic) NSPersistentStoreCoordinator *persistentStoreCoordinator;

Next, we implement the accessor for each of these objects in AppDelegate.m. The following method creates an instance of NSManagedObjectContext, or returns the existing instance if one has already been created. This is important, as you want only one instance of the context to be present in order to avoid conflicting access to it:

- (NSManagedObjectContext *)managedObjectContext
{
    if (_managedObjectContext != nil) {
        return _managedObjectContext;
    }
    NSPersistentStoreCoordinator *coordinator = [self persistentStoreCoordinator];
    if (coordinator != nil) {
        _managedObjectContext = [[NSManagedObjectContext alloc] init];
        [_managedObjectContext setPersistentStoreCoordinator:coordinator];
    }
    if (_managedObjectContext == nil)
        NSLog(@"_managedObjectContext is nil");
    return _managedObjectContext;
}

This method creates the NSManagedObjectModel instance and returns it, or returns the existing NSManagedObjectModel if it already exists:

// Returns the managed object model for the application.
- (NSManagedObjectModel *)managedObjectModel
{
    if (_managedObjectModel != nil) {
        return _managedObjectModel; // return the model since it already exists
    }
    // else create the model and return it
    // CustomerModel is the filename of your *.xcdatamodeld file
    NSURL *modelURL = [[NSBundle mainBundle] URLForResource:@"CustomerModel"
                                              withExtension:@"momd"];
    _managedObjectModel = [[NSManagedObjectModel alloc] initWithContentsOfURL:modelURL];
    if (_managedObjectModel == nil)
        NSLog(@"_managedObjectModel is nil");
    return _managedObjectModel;
}

The next method creates an instance of the NSPersistentStoreCoordinator class if it does not exist, and returns the existing instance if it does. We also add some logging via NSLog to tell us if the NSPersistentStoreCoordinator instance is nil, and we use the NSSQLiteStoreType constant to signify to the system that we intend to store the data in a SQLite database:
// Returns the persistent store coordinator for the application.
- (NSPersistentStoreCoordinator *)persistentStoreCoordinator
{
    if (_persistentStoreCoordinator != nil) {
        return _persistentStoreCoordinator; // return the coordinator since it already exists
    }
    NSURL *storeURL = [[self applicationDocumentsDirectory]
                          URLByAppendingPathComponent:@"CustomerModel.sqlite"];
    NSError *error = nil;
    _persistentStoreCoordinator = [[NSPersistentStoreCoordinator alloc]
                                      initWithManagedObjectModel:[self managedObjectModel]];
    if (_persistentStoreCoordinator == nil)
        NSLog(@"_persistentStoreCoordinator is nil");
    if (![_persistentStoreCoordinator addPersistentStoreWithType:NSSQLiteStoreType
                                                    configuration:nil
                                                              URL:storeURL
                                                          options:nil
                                                            error:&error]) {
        NSLog(@"Error %@, %@", error, [error userInfo]);
        abort();
    }
    return _persistentStoreCoordinator;
}

The following method returns the URL of the location where your data will be stored on the device:

#pragma mark - Application's Documents directory
// Returns the URL to the application's Documents directory.
- (NSURL *)applicationDocumentsDirectory
{
    return [[[NSFileManager defaultManager] URLsForDirectory:NSDocumentDirectory
                                                   inDomains:NSUserDomainMask] lastObject];
}

As you can see, what we have done is check whether objects such as _managedObjectModel are nil; if an object is not nil, we return it, else we create it and then return it. This is exactly the concept of lazy loading. We apply the same methodology to managedObjectContext and persistentStoreCoordinator, so that only one instance of managedObjectModel, managedObjectContext, and persistentStoreCoordinator is created and present at any given time. This helps us avoid having multiple copies of these objects, which would increase the chance of a memory leak. Note that memory management is still a real issue in the post-ARC world, so what we have done here simply follows best practices that help us avoid memory leaks.

Next, let's move on to storing data in our persistent store. The insert screen has fields such as name, age, address, email, and phone_number, which correspond to the appropriate attributes in our Customer entity.

Summary

In this article, you learned about Core Data and why you should use it.


An Overview of Automation and Advent of Chef

Packt
23 Mar 2015
14 min read
In this article by Rishabh Sharma and Mitesh Soni, authors of the book Learning Chef, before moving to the details of the different Chef components and other practical matters, it is recommended that you understand the foundation of automation and some of the existing automation tools. This article will give you a conceptual understanding of automation and a comparative view of Chef against existing automation tools.

In this article, we will cover the following topics:

An overview of automation
The need for automation
A brief introduction to Chef
The salient features of Chef

Automation

Automation is the process of automating operations that control, regulate, and administrate machines, disparate systems, or software with little or no human intervention. In simple English, automation means automatic processing with little or no human involvement. An automated system is expected to perform a function more reliably, efficiently, and accurately than a human operator, and at a lower cost. As a result, automation is becoming more and more widespread across various service industries as well as in the IT and software industry.

Automation basically helps a business in the following ways:

It reduces the complexity of processes and sequential steps
It reduces the possibility of human error in repeatable tasks
It consistently and predictably improves the performance of a system
It lets customers focus on their business rather than on managing the complexities of their systems, which increases productivity and the scope for innovation
It improves robustness and the agility of application deployment in different environments, and reduces the time to market an application

Automation has already helped to solve various engineering problems, such as information gathering and the automatic preparation of bills and reports; with its help, we get high-quality products while saving cost. IT operations are very much dependent on automation: a high degree of automation in IT operations results in a reduced need for manual work, improved quality of service, and higher productivity.

Why automation is needed

Automation has served industries such as agriculture and food and drink for many years, and its usage there is well known; here, we will concentrate on automation related to the information technology (IT) service and software industry. The escalation of innovation in information technology has created tremendous opportunities for growth in large organizations as well as small- and medium-sized businesses. IT automation is the process of automated integration and management of multifaceted compute resources, middleware, enterprise applications, and services based on workflow. Large organizations with large profits can afford costly IT resources, manpower, and sophisticated management tools, while for small- and medium-scale organizations this is often not feasible. In addition, huge investments are at stake in all of these resources, and most of the time this resource management is a manual process, which is prone to errors. Hence, automation in the IT industry can prove to be a boon, given how many repeatable and error-prone tasks it involves.
Let's drill down into the reasons for needing automation in more detail:

Agile methodology: An agile approach to developing an application results in frequent deployments. Multiple deployments in a short interval involve a lot of manual effort and repeatable activities.
Continuous delivery: A large number of application releases within a short span of time, driven by the agile approach of business units or organizations, requires speedy and frequent deployment to a production environment. The delivery process involves development and operations teams that have different responsibilities for the proper delivery of the outcome.
Ineffective transition between development and production environments: In a traditional environment, the transition of the latest application build from development to production can last weeks. The steps taken to do this are manual, so they are likely to create problems. The complete process is extremely inefficient and becomes exhausting, with a lot of manual effort involved.
Inefficient communication and collaboration between teams: The priorities of development and IT operations teams differ across organizations. A development team is focused on the latest releases and considers new feature development, fixing existing bugs, or developing innovative concepts, while an operations team cares about the stability of the production environment. Often, the first deployment to a production-like environment takes place only once the development team has completed its part. The operations team manages the deployment environment for the application independently, and there is hardly any interaction between the two teams. More often than not, ineffective or virtually nonexistent collaboration and communication between the teams causes many problems during the transition of the application package from the deployment environment to the production environment, because of the different roles and responsibilities of the respective teams.
Cloud computing: The emergence of cloud computing in the last decade has changed the perspective of business stakeholders. Organizations are attempting to develop and deploy cloud-based applications to keep pace with current market and technology trends. Cloud computing helps to manage a complex IT infrastructure that includes physical, consolidated, virtualized, and cloud resources, and it helps to manage the constant pressure to reduce costs. Infrastructure as code is an innovative concept that models the infrastructure as code, pooling resources in an abstract manner with seamless operations to provision and deprovision infrastructure in the flexible environment of the cloud. Hence, we can consider the infrastructure to be redeployable using configuration management tools. Such agility in resources provides the best platform for developing innovative applications with an agile methodology, rather than the slow and linear waterfall Software Development Life Cycle (SDLC) model.

Automation brings the following benefits to the IT industry by addressing the preceding concerns:

Agility: It provides promptness and agility to your IT infrastructure. Productivity and flexibility are significant advantages of automation, which help us to compete in the current agile economic conditions.
Scalability: Using automation, we can manage the complications of the infrastructure and leverage the scalability of resources in order to fulfill our customers' demands. It helps to transform infrastructure into simple code, which means that building, rebuilding, configuring, and scaling the infrastructure is possible in just a few minutes, according to the needs of customers in a real-world environment.
Efficiency and consistency: Automation can handle repeated tasks very easily, so that you can concentrate on innovating for your business. It increases the agility and efficiency of managing a deployment environment and the application deployment itself.
Effective management of resources: It helps to maintain a consistent model of the infrastructure. It provides a code-based design framework, which leads to a flexible and manageable way of understanding all the fundamentals of a complex network.
Deployment accuracy: Application development and delivery is a multifaceted, cumbersome, repetitive, and time-bound endeavor. With automation, the testability of a deployment environment, the discipline of accurately scripting the changes to be made to an environment, and the repeatability of those changes can all be achieved very quickly.

We have touched on DevOps-related aspects in the previous section, where we discussed the need for automation and its benefits. Let's understand it in a more precise manner. Recently, the DevOps culture has become very popular. DevOps-based application development can handle quick changes, frequent releases, bug fixes, and continuous delivery-related issues across the entire SDLC process. In simple English, we can say that DevOps is a blend of the tasks undertaken by the development and operations teams to make application delivery faster and more effective. Development activities (coding, testing, continuous integration of applications, and version releases) and various IT operations (change, incident, and problem management, escalation, and monitoring) can work together in a highly collaborative environment. This means there must be strong collaboration, integration, and communication between software developers and the IT operations team.

The original article includes a figure showing an applied view of DevOps: how development and operations teams collaborate with the help of different types of tools. For operations such as configuration management and deployment, both Chef and Puppet are used, and cloud management tools such as Dell Cloud Manager (formerly known as Enstratius), RightScale, and Scalr can be used to manage cloud resources for development and operations activities.

DevOps is not a technology or a product; it is a combination of culture, people, process, and technology. Everyone involved in the software development process, including managers, works together and collaboratively on all aspects of a project. DevOps represents an important opportunity for organizations to stay ahead of their competition by building better applications and services, opening the door to increased revenue and improved customer experiences. DevOps is the solution to the problems that arise from the interdependence of IT operations and software development.
There are various benefits of DevOps:

It targets application delivery, new feature development, bug fixing, testing, and the maintenance of new releases
It provides stable operating environments similar to the actual deployment environment, resulting in fewer errors and unknown scenarios
It supports an effective application release management process by providing better control over distributed development efforts, and by regulating development and deployment environments
It provides continuous delivery of applications and hence faster solutions to problems
It provides faster development and deployment cycles, which help us respond to customer feedback in a timely manner and enhance customer experience and loyalty
It improves efficiency, security, reliability, and the predictability of outcomes

The original article includes a figure summarizing the necessities on which DevOps is built; to serve most of them, we need a configuration management tool such as Chef.

In order to support a DevOps-based application development and delivery approach, infrastructure automation is mandatory, considering the extreme need for agility. The entire infrastructure and platform layer should be configurable in the form of code or scripts. These scripts manage the installation of operating systems, install and configure servers on different instances or virtual machines, and install and configure the required software and services on particular machines. Hence, it is an opportune time for organizations that need to deliver innovative business value in the form of a working outcome: deployment-ready applications.

With an automation script, the same configuration can be applied to a single server or to thousands of identical servers simultaneously. Thereby, error-prone manual tasks can be handled more efficiently without any intervention, and horizontal scalability can be managed efficiently and easily. In the past few years, several open source and commercial tools have emerged for infrastructure automation, of which Bcfg2, Cobbler, CFEngine, Puppet, and Chef are the most popular. These automation tools can be used to manage all types of infrastructure environments, such as physical or virtual machines, or clouds. Our objective is to understand Chef in detail, so we will look at an overview of the Chef tool in the next section.

Introduction to Chef

Chef is an open source configuration management tool developed by the Opscode community in 2008; its first edition was launched in January 2009. Opscode is run by individuals from the data center teams of Amazon and Microsoft. Chef supports a variety of operating systems: it typically runs on Linux, but supports Windows 7 and Windows Server too. Chef is written in Ruby and Erlang.

The Chef server, workstation, and nodes are the three major components of Chef. The Chef server stores the data needed to configure and manage nodes effectively. A Chef workstation works as a local Chef repository, and Knife is installed on the workstation. Knife is used to upload cookbooks to a Chef server. A cookbook is a collection of recipes, and recipes execute the actions that are meant to be automated. A node communicates with the Chef server, gets the configuration data relevant to it, and executes it to install packages or to perform any other configuration management operations.
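To make the cookbook and recipe terminology concrete, here is a minimal, hypothetical recipe, not taken from the book: the cookbook name ("webserver") and file paths are made up for illustration, while package, template, and service are standard resources in Chef's recipe DSL:

# recipes/default.rb in a hypothetical "webserver" cookbook
# Install the Apache package using the platform's package manager.
package 'apache2' do
  action :install
end

# Render a site configuration from a template shipped inside the cookbook.
template '/etc/apache2/sites-available/000-default.conf' do
  source 'default-site.conf.erb'   # templates/default/default-site.conf.erb
  mode   '0644'
  notifies :reload, 'service[apache2]'
end

# Make sure the service is enabled at boot and running now.
service 'apache2' do
  action [:enable, :start]
end

A workstation would upload this cookbook to the Chef server (for example, with knife cookbook upload webserver), and any node with the recipe in its run list would converge to this state on its next chef-client run.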
Most of the outages that impact the core services of business organizations are caused by human errors during configuration changes and release management. Chef helps software developers and engineers to manage server and application configurations, and provisions hardware or virtual resources by writing code rather than running commands manually. Hence, it is possible to apply coding best practices and design patterns to automating infrastructure. Chef was developed to handle the most critical infrastructure challenges of the current scenario; it makes the deployment of servers and applications to any physical, virtual, or cloud instance easy. Chef transforms infrastructure into code. Considering virtual machines in a cloud environment, we can easily see the possibility of keeping versions of the infrastructure and its configurations, and of recreating infrastructure repeatedly and proficiently. Additionally, Chef also supports system administration, network management, and the continuous delivery of applications.

The salient features of Chef

Based on a comparative analysis with Chef's competitors, the following are the salient features of Chef, which make it an outstanding and highly popular choice among developers in the current IT infrastructure automation scenario:

Chef has different flavors of automated solutions for current IT operations: Open Source Chef, Hosted Chef, and Private Chef. Chef enables highly scalable, secure, and fault-tolerant automation of your infrastructure, and every flavor offers a specific solution for different kinds of infrastructure needs. For example, the Open Source Chef server is freely available to all but supports limited features, while the Hosted Chef server is managed by Opscode as a service, with subscription fees for standard and premium support. The Private Chef server provides an on-premise automated solution with subscription pricing and licensing plans.
Chef gives us flexibility: according to the industry use case at hand, we can choose among the Open Source, Hosted, and Private Chef servers as per our requirements.
Chef can integrate with third-party tools such as Test Kitchen, Vagrant, and Foodcritic. These integrations help developers to test Chef scripts and perform a proof of concept (POC) before deploying actual automation. These tools are very useful for learning and testing Chef scripting.
Chef has a very strong community. The website https://www.chef.io/ can help you get started with Chef and publish things. Opscode has hosted numerous webinars, publishes training material, and makes it very easy for developers to contribute new patches and releases.
Chef can quickly handle all types of traditional dependencies and manual processes across the entire network. Chef has a strong dependency management approach, which means that only the sequence of order matters: all dependencies will be met as long as they are specified in the proper order.
Chef is well suited to cloud instances, and it is the first choice of developers working on cloud infrastructure automation. The demand for Chef automation is therefore growing exponentially, and within a short span of time, Chef has acquired a good market reputation for reliability.

These key features of Chef automation make it one of the most popular choices among developers in the current industry scenario (the original article summarizes them in a figure).
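As a rough sketch of how these pieces are exercised in practice, the day-to-day workflow from a workstation usually looks something like the following shell session. The cookbook name ("webserver" from the earlier sketch), node name, IP address, and SSH user are placeholders, not values from the book:

# Upload the cookbook from the workstation's local chef-repo to the Chef server.
knife cookbook upload webserver

# Bootstrap a brand-new machine: installs chef-client and registers it as a node.
knife bootstrap 203.0.113.10 --ssh-user ubuntu --sudo --node-name web01

# Add the recipe to the node's run list; it is applied on the next chef-client run.
knife node run_list add web01 'recipe[webserver]'

# On the node itself, trigger a run immediately instead of waiting for the schedule.
sudo chef-client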
Summary

Here, we gained a fundamental understanding of automation and the various ways in which it helps the IT industry. DevOps is popular nowadays because it brings a highly collaborative environment to the entire software development process. We got an overview of several traditional and advanced automation tools that have been used over the past 15 years, a clear idea of why Chef is needed in the current IT automation scenario and why it is often preferred, and a comparison of some of the more advanced automation tools. We also discussed the salient features of Chef.


GPS-enabled Time-lapse Recorder

Packt
23 Mar 2015
17 min read
In this article by Dan Nixon, the author of the book Raspberry Pi Blueprints, we will look at recording time-lapse captures using the Raspberry Pi camera module.

One possible use of the Raspberry Pi camera module is recording time-lapse captures, which take a still image at a set interval over a long period of time. This can then be used to create an accelerated video of a long-term event (for example, a building being constructed). A variation on this is to have the camera mounted on a moving vehicle and use the time lapse to record a journey; with the addition of GPS data, this can provide an interesting record of a reasonably long trip.

In this article, we will use the Raspberry Pi camera module board to create a location-aware time-lapse recorder that stores the GPS position with each image in the EXIF metadata. To do this, we will use a GPS module that connects to the Pi over the serial connection on the GPIO port, and a custom Python program that listens for new GPS data during the time lapse. For this project, we will use the Raspbian distribution.

What you will need

This is a list of things you will need to complete this project. All of these are available at most electronic component stores and online retailers:

The Raspberry Pi
A relatively large SD card (at least 8 GB is recommended)
The Pi camera board
A GPS module (http://www.adafruit.com/product/746)
0.1 inch female-to-female pin jumper wires
A USB power bank (optional; used to power the Pi when no other power is available)

Setting up the hardware

The first thing we will do is set up the two pieces of hardware and verify that they are working correctly before moving on to the software.

The camera board

The first (and most important) piece of hardware we need is the camera board, so start by connecting it to the Pi.

Connecting the camera module to the Pi

The camera is connected to the Pi via a 15-pin flat flex ribbon cable, which can physically connect to either of two connectors on the Pi. However, it should be connected to the one nearest the Ethernet jack; the other connector is for a display. To connect the cable, first lift the top retention clip on the connector. Insert the flat flex cable with the silver contacts facing the HDMI port and the rigid blue plastic part of the ribbon connector facing the Ethernet port on the Pi. Finally, press down the cable retention clip to secure the cable in the connector. If this is done correctly, the cable should be perpendicular to the printed circuit board (PCB) and should remain seated in the connector if you try to pull it out with a little force.

Next, we will move on to setting up the camera driver, libraries, and software within Raspbian.

Setting up the Raspberry Pi camera

First, we need to enable support for the camera in the operating system itself. This is done with the raspi-config utility from a terminal (either locally or over SSH). Enter the following command:

sudo raspi-config

This will load the configuration utility. Scroll down to the Enable Camera option using the arrow keys and select it using Enter. Next, highlight Enable and select it using Enter. Once this is done, you will be taken back to the main raspi-config menu. Exit raspi-config, and reboot the Pi to continue.
Next, we will look for any updates to the Pi kernel, as using an out-of-date kernel can sometimes cause issues with low-level hardware such as the camera module and GPIO. We also need a library that allows control of the camera from Python. Both installations can be done with the following two commands:

sudo rpi-update
sudo apt-get install python-picamera

Once this is complete, reboot the Pi using the following command:

sudo reboot

Next, we will test the camera using the python-picamera library we just installed. To do this, create a simple test script using nano:

nano camera_test.py

The following code will capture a still image after opening the preview for 5 seconds. Having the preview open before a capture is a good idea, as this gives the camera time to adjust its capture parameters to the environment:

import sys
import time
import picamera

with picamera.PiCamera() as cam:
    cam.resolution = (1280, 1024)
    cam.start_preview()
    time.sleep(5)
    cam.capture(sys.argv[1])
    cam.stop_preview()

Save the script using Ctrl + X and enter Y to confirm. Now, test it using the following command:

python camera_test.py image.jpg

This will capture a single still image and save it to image.jpg. It is worth downloading the image over SFTP to verify that the camera is working properly.

The GPS module

Before connecting the GPS module to the Pi, there are a couple of important modifications that need to be made to the way the Pi boots. By default, Raspbian uses the on-board serial port on the GPIO header as a serial terminal for the Pi (this allows you to connect to the Pi and run commands in a similar way to SSH). However, this is of little use to us here and can interfere with communication between the GPS module and the Pi if the serial terminal is left enabled. It can be disabled by modifying a couple of configuration files.

First, start with:

sudo nano /boot/cmdline.txt

Here, you will need to remove any references to ttyAMA0 (the name of the on-board serial port). In my case, there was a single entry of console=ttyAMA0,115200, which had to be removed.

Next, we need to stop the Pi from using the serial port for a TTY session. To do this, edit this file:

sudo nano /etc/inittab

Here, look for the following line and comment it out:

T0:23:respawn:/sbin/getty -L ttyAMA0 115200 vt100

After both files have been changed, power down the Pi using the following command:

sudo shutdown -h now

Next, we need to connect the GPS module to the Pi GPIO port. One important thing to note when you do this is that the GPS module must be able to run on 3.3 V, or at least be able to use a 3.3 V logic level (such as the Adafruit module I am using here). As with any device that connects to the Pi GPIO header, using a 5 V logic device can cause irreparable damage to the Pi. Next, connect the GPS module to the Pi. If you are using the Adafruit module, all the pins are labeled on the PCB itself.
For other modules, you may need to check the data sheet to find which pins to connect. After the GPS module is connected and the Pi is powered up, we will install, configure, and test the driver and libraries needed to access the data sent to the Pi from the GPS module.

Start by installing the required packages. Here, gpsd is the daemon that manages data from GPS devices connected to a system, gpsd-clients contains a client that we will use to test the GPS module, and python-gps contains the Python client for gpsd, which is used in the time-lapse capture application (we will sketch what reading positions with this client looks like a little later):

sudo apt-get install gpsd gpsd-clients python-gps

Once they are installed, we need to configure gpsd to work the way we want. To do this, use the following command:

sudo dpkg-reconfigure gpsd

This opens a configuration page similar to raspi-config:

First, you will be asked whether you want gpsd to start on boot. Select Yes here.
Next, it will ask whether we are using USB GPS receivers. Since we are not using one, select No here.
Next, it will ask for the device (that is, the serial port) the GPS receiver is connected to. Since we are using the on-board serial port on the Pi GPIO header, enter /dev/ttyAMA0 here.
Next, it will ask for any custom parameters to pass to gpsd when it is executed. Here, we will enter -n -G. The -n option tells gpsd to poll the GPS module even before a client has requested any data (not doing so has been known to cause problems with some applications), and -G tells gpsd to accept connections from devices other than the Pi itself (this is not strictly required, but it is a good debugging tool).
Finally, you will be asked for the location of the control socket. The default value should be kept here, so just select Ok.

When you start gpsd with the -G option, you can then use cgps to view the GPS data from any device by using the following command, where [IP] is the IP address of the Pi:

cgps [IP]

After the configuration is done, reboot the Pi and use the following command to test the configuration:

cgps -s

If everything works, this will display the current GPS data. If the status indication reads NO FIX, you may need to move the GPS module to an area with a clear view of the sky for testing. If cgps times out and exits, gpsd has failed to communicate with your GPS module; go back and double-check the configuration and wiring.

Setting up the capture software

Now, we need to get the capture software installed on the Pi. First, copy the recorder folder onto the Pi using FileZilla and SFTP. We need to install some packages and Python libraries that are used by the capture application.
To do this, first install the Python setuptools, which I have used to package the capture application:

sudo apt-get install python-setuptools git

Next, run the following commands to download and install the pexif library, which is used to save the GPS position at which each image was taken into the image's EXIF data:

git clone https://github.com/bennoleslie/pexif.git pexif
cd pexif
sudo python setup.py install

Once this is done, SSH into the Pi, change directory to the recorder folder, and run the following command:

sudo python setup.py install

Now that the application is installed, we can take a look at the list of options it accepts using:

gpstimelapse -h

A few of the options can be ignored; --log-file, --log-level, and --verbose were mainly added for debugging while I was writing the application. The --gps option does not need to be set, as it defaults to connecting to the local gpsd instance, which will always be correct when the application runs on the Pi. The --width and --height options simply set the resolution of the captured image; without them, the capture software defaults to capturing 1248 x 1024 images.

The --interval option specifies how long, in seconds, to wait before capturing another time-lapse frame. It is recommended that you set this value to at least 10 seconds, both to avoid filling the SD card too quickly (especially if the time lapse will run over a long period of time) and to ensure that any video created from the frames is of a reasonable length (that is, not too long).

The --distance option allows you to specify a minimum distance, in kilometers, that must have been travelled since the last image before another image is captured. This is useful when whatever is holding the Pi may stop in the same position for periods of time (for example, with the camera on a car dashboard, it prevents capturing several identical frames while the car waits in traffic). This option can also be used to capture a set of images based on distance travelled alone, disregarding the amount of time that has passed: simply set the --interval option to 1 (a value of 1 is used because data is only taken from the GPS module every second, so checking the distance travelled more often than that would be a waste of time).

The folder structure used to store the frames, while slightly complex at first sight, is a good method that allows you to start multiple captures without ever having to SSH into the Pi. Using the --folder option, you set the folder under which all captures are saved. In this folder, the application looks for folders with numerical names and creates a new folder whose name is one higher than the highest number it finds; this is where it saves the images for the current capture. The filename for each image is given by the --filename option, which must contain %d to indicate the frame number (for example, image_%d.jpg). For example, if I pass --folder captures --filename image_%d.jpg to the program, the first frame will be saved as ./captures/0/image_0.jpg, and the second as ./captures/0/image_1.jpg.
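For a rough idea of how the recorder gets its positions, here is a simplified sketch of polling gpsd with the python-gps client installed earlier. This is illustrative only and is not the application's actual code:

import gps

# Connect to the local gpsd instance and ask it to stream reports.
session = gps.gps(mode=gps.WATCH_ENABLE)

while True:
    report = session.next()
    # TPV (time-position-velocity) reports carry the actual position fix.
    if report['class'] == 'TPV':
        lat = getattr(report, 'lat', None)
        lon = getattr(report, 'lon', None)
        if lat is not None and lon is not None:
            print("Position: %f, %f" % (lat, lon))

The real application does essentially this in the background and attaches the most recent fix to each frame's EXIF data via pexif.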
Here are some examples of how the application can be used:

gpstimelapse --folder captures --filename i_%d.jpg --interval 30: This captures a frame every 30 seconds.
gpstimelapse --folder captures --filename i_%d.jpg --interval 30 --distance 0.05: This captures a frame every 30 seconds, provided that 50 meters have been travelled.
gpstimelapse --folder captures --filename i_%d.jpg --interval 1 --distance 0.05: This captures a frame for every 50 meters travelled.

Now that you are able to run the time-lapse recorder application, you are ready to configure it to start as soon as the Pi boots, removing the need for an active network connection and a way to interface with the Pi just to start the capture. To do this, we will add a command to the /etc/rc.local file, which can be edited using the following command:

sudo nano /etc/rc.local

The line you add will depend on exactly how you want the recorder to behave. In this case, I have set it to record an image at the default resolution every minute. As before, ensure that the command is placed just before the line containing exit 0.

Now, you can reboot the Pi and test out the recorder. A good indication that the capture is working is the red LED on the camera board lighting up constantly; this shows that the camera preview is open, which should always be the case with this application. Also note that the capture will not begin until the GPS module has a fix. On the Adafruit module, this is indicated by a quick blink every 15 seconds on the fix LED (no fix is indicated by a steady blink once per second).

One issue you may have with this project is the amount of power required to run the camera and GPS module on top of the Pi. To power everything while on the move, I recommend using one of the USB power banks that have a 2 A output (such power banks are readily available on Amazon).

Using the captures

Now that we have a set of recorded time-lapse frames, each with a GPS position attached, there are a number of things that can be done with this data. Here, we will take a quick look at a couple of uses for the captured frames.

Creating a time-lapse video

The first, and probably the most obvious, thing that can be done with the images is to create a time-lapse video, in which each time-lapse image is shown as a single frame of the video, and the length (or speed) of the video is controlled by changing the number of frames per second.

One of the simplest ways to do this is with either the ffmpeg or avconv utility (which one depends on your version of Linux; the parameters to each are identical in our case). This utility is available on most Linux distributions, including Raspbian, and precompiled executables are also available for Mac and Windows. Here I will only discuss using it on Linux but, rest assured, any instructions given here will also work on the Pi itself.

To create a time lapse from a set of images, you can use the following command:

avconv -framerate FPS -i FILENAME -c:v libx264 -r 30 -pix_fmt yuv420p OUTPUT

Here, FPS is the number of time-lapse frames you want to display every second, FILENAME is the filename format with %d marking the frame number, and OUTPUT is the output filename.

Exporting GPS data as CSV

We can also extract the GPS data from each of the captured time-lapse images and save it as a comma-separated value (CSV) file.
This will allow us to import the data into third-party applications, such as Google Maps and Google Earth. To do this, we can use the frames_to_gps_points.py Python script, which takes the file format of the time-lapse frames and a name for the output file. For example, to create a CSV file called gps_points.csv for images in the frame_%d.jpg format, you can use the following command:

python frames_to_gps_points.py -f frame_%d.jpg -o gps_points.csv

The output is a CSV file in the following format:

[frame number],[latitude],[longitude],[image filename]

The script also has an option to restrict the maximum number of output points: passing the --max-points N parameter ensures that no more than N points end up in the CSV file. This can be useful when importing data into applications that limit the number of points that can be imported.

Summary

In this article, we had a look at how to use the serial interface on the GPIO port to interface with some external hardware. Knowing how to do this will allow you to interface the Pi with a much wider range of hardware in future projects. We also took a look at the camera board and how it can be used from within Python; the camera is a very versatile device with a very wide range of uses in portable projects and ubiquitous computing. You are encouraged to take a deeper look at the source code of the time-lapse recorder application. This will get you on your way to understanding the structure of moderately complex Python programs and the way they can be packaged and distributed.


Making Games with Pixi.js

Alvin Ourrad
23 Mar 2015
6 min read
In this post I will introduce you to pixi.js, a super-fast rendering engine that is also a swiss-army-knife tool with a friendly API.

What?

Pixi.js is a rendering engine that allows you to use the power of WebGL and canvas to render your content on the screen in a completely seamless way. In fact, pixi.js features both a WebGL and a canvas renderer, and can fall back to the latter for lower-end devices. You can then harness the power of WebGL and hardware-accelerated graphics on devices that are powerful enough to use it. If one of your users is on an older device, the engine falls back to the canvas renderer automatically; there is no difference for the person browsing your website, so you don't have to worry about those users any more.

WebGL for 2D?

If you have heard of or browsed a web product showcased as using WebGL, you probably remember a 3D game, a 3D earth visualization, or something similar. WebGL was originally highlighted and brought to the public for its ability to render 3D graphics in the browser, because it was the only way fast enough to do it. But the underlying technology is neither 3D-only nor 2D-only; you make it do what you want. The idea behind pixi.js was to bring this speed and quality of rendering to 2D graphics and games, and of course to the general public. You might argue that you do not need this level of accuracy and fine-grained control for 2D, and that the WebGL API might be a bit of an overhead for a 2D application, but with browsers becoming more powerful, users' expectations are getting higher and higher, and this technology, with its speed, allows you to compete with applications that used to be Flash-only.

Tour/Overview

Pixi.js was created by a former Flash developer, so its syntax is very similar to ActionScript 3. Here is a little tour of the core components that you need to create when using pixi.

The renderer

I already gave you a description of its features and capabilities, so the only thing to bear in mind is that there are two ways of creating a renderer. You can specify the renderer that you want, or let the engine decide according to the current device.

// When you let the engine decide:
var renderer = PIXI.autoDetectRenderer(800, 600);

// When you specifically want one or the other renderer:
var renderer = new PIXI.WebGLRenderer(800, 600);
// and for canvas you'd write:
// var renderer = new PIXI.CanvasRenderer(800, 600);

The stage

Pixi mimics the Flash API in how it deals with object positioning: an object's coordinates are always relative to its parent container. Flash and pixi allow you to create special objects called containers. They are not images or graphics; they are abstract ways to group objects together. Say you have a landscape made of various things such as trees, rocks, and so on. If you add them to a container and move this container, you move all of these objects together (a short code sketch of this appears below).

Don't run away just yet; this is where the stage comes in. The Stage is the root container that everything is added to. The stage isn't meant to move, so when a sprite is added directly to the stage, you can be sure its position will be the same as its position on screen (well, within your canvas).

// here is how you create a stage
var stage = new PIXI.Stage();

Let's make a thing

Ok, enough of the scene-graph theory, it's time to make something.
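First, the promised container sketch. This is illustrative only: the image paths are made up, it assumes the stage created above, and it uses PIXI.DisplayObjectContainer, the container class in the Pixi versions of this era:

// Group several sprites so they can be moved as one unit.
var landscape = new PIXI.DisplayObjectContainer();

var tree = new PIXI.Sprite.fromImage("assets/tree.png");
var rock = new PIXI.Sprite.fromImage("assets/rock.png");
rock.position.x = 120;           // relative to the container, not the screen

landscape.addChild(tree);
landscape.addChild(rock);
stage.addChild(landscape);

// Moving the container moves every child with it;
// each child's coordinates stay relative to the container.
landscape.position.x += 50;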
As I wrote before, pixi is a rendering engine, so you need to tell the renderer to render its stage, otherwise nothing will happen. So this is the bare-bones template you'll use for anything pixi:

// create a new instance of a pixi stage
var stage = new PIXI.Stage(0x0212223);

// create a renderer instance
var renderer = PIXI.autoDetectRenderer(window.innerWidth, window.innerHeight);

// add the renderer view element to the DOM
document.body.appendChild(renderer.view);

// create a new Sprite using the texture
var bunny = new PIXI.Sprite.fromImage("assets/bunny.png");
bunny.position.set(200, 230);
stage.addChild(bunny);

animate();

function animate() {
    // render the stage
    renderer.render(stage);
    requestAnimFrame(animate);
}

First, you create a renderer and a stage, just as I showed you before. Then you create the most important pixi object, a Sprite, which is basically an image rendered on your screen:

var sprite = new PIXI.Sprite.fromImage("assets/image.png");

Sprites are the core of your game, and the thing you will use the most in pixi and in any major game framework. However, pixi being not really a game framework but a level lower, you need to manually add your sprites to the stage. So whenever something is not visible, make sure to double-check that you have added it to the stage like this:

stage.addChild(sprite);

Then, you can create a function that creates a bunch of sprites:

function createParticles() {
    for (var i = 0; i < 40; i++) {
        // create a new Sprite using the texture
        var bunny = new PIXI.Sprite.fromImage("assets/bunny.png");
        bunny.xSpeed = (Math.random() * 20) - 10;
        bunny.ySpeed = (Math.random() * 20) - 10;
        bunny.tint = Math.random() * 0xffffff;
        bunny.rotation = Math.random() * 6;
        stage.addChild(bunny);
    }
}

And then, you can leverage the update loop to move these sprites around randomly:

if (count > 10) {
    createParticles();
    count = 0;
}
if (stage.children.length > 20000) {
    stage.children.shift();
}
for (var i = 0; i < stage.children.length; i++) {
    var sprite = stage.children[i];
    sprite.position.x += sprite.xSpeed;
    sprite.position.y += sprite.ySpeed;
    if (sprite.position.x > renderer.width) {
        sprite.position.x = 0;
    }
    if (sprite.position.y > renderer.height) {
        sprite.position.y = 0;
    }
}

That's it for this blog post. Feel free to have a play with pixi and browse the dedicated website.

About the author

Alvin is a web developer fond of the web and the power of open standards. A lover of open source, he likes experimenting with interactivity in the browser. He currently works as an HTML5 game developer.


Introducing Interactive Plotting

Packt
20 Mar 2015
29 min read
This article is written by Benjamin V. Root, the author of Interactive Applications using Matplotlib.

The goal of any interactive application is to provide as much information as possible while minimizing complexity. If it can't provide the information the users need, then it is useless to them. However, if the application is too complex, then the information's signal gets lost in the noise of the complexity. A graphical presentation often strikes the right balance.

The Matplotlib library can help you present your data as graphs in your application. Anybody can make a simple interactive application without knowing anything about draw buffers, event loops, or even what a GUI toolkit is. And yet, the Matplotlib library will cede as much control as desired to allow even the most savvy GUI developer to create a masterful application from scratch. Like much of the Python language, Matplotlib's philosophy is to give the developer full control, but without being stupidly unhelpful and tedious.

Installing Matplotlib

There are many ways to install Matplotlib on your system. While the library used to have a reputation for being difficult to install on non-Linux systems, it has come a long way since then, along with the rest of the Python ecosystem. Refer to the following command:

$ pip install matplotlib

Most likely, the preceding command will work just fine from the command line. Python Wheels (the next-generation Python package format that has replaced "eggs") for Matplotlib are now available from PyPI for Windows and Mac OS X systems. This method also works for Linux users; however, it might be more favorable to install it via the system's built-in package manager.

While the core Matplotlib library can be installed with few dependencies, it is part of a much larger scientific computing ecosystem known as SciPy. Displaying your data is often the easiest part of your application; processing it is much more difficult, and the SciPy ecosystem most likely has the packages you need to do that. For basic numerical processing and N-dimensional data arrays, there is NumPy. For more advanced but general data processing tools, there is the SciPy package (the name was so catchy, it ended up being used to refer to many different things in the community). For more domain-specific needs, there are "Sci-Kits" such as scikit-learn for artificial intelligence, scikit-image for image processing, and statsmodels for statistical modeling. Another very useful library for data processing is pandas.

This was just a short summary of the packages available in the SciPy ecosystem. Manually managing all of their installations, updates, and dependencies would be difficult for many who simply want to use the tools. Luckily, there are several distributions of the SciPy Stack available that can keep the menagerie under control. The following Python distributions include the SciPy Stack along with many other popular Python packages, or make the packages easily available through package management software:

Anaconda from Continuum Analytics
Canopy from Enthought
SciPy Superpack
Python(x, y) (Windows only)
WinPython (Windows only)
Pyzo (Python 3 only)
Algorete Loopy from Dartmouth College

Show() your work

With Matplotlib installed, you are now ready to make your first simple plot. Matplotlib has multiple layers. Pylab is the topmost layer, often used for quick one-off plotting from within a live Python session.
Start up your favorite Python interpreter and type the following:

>>> from pylab import *
>>> plot([1, 2, 3, 2, 1])

Nothing happened! This is because Matplotlib, by default, will not display anything until you explicitly tell it to do so. The Matplotlib library is often used for automated image generation from within Python scripts, with no need for any interactivity. Also, most users would not be done with their plotting yet and would find it distracting to have a plot come up automatically. When you are ready to see your plot, use the following command:

>>> show()

Interactive navigation

A figure window should now appear, and the Python interpreter is not available for any additional commands. By default, showing a figure will block the execution of your scripts and interpreter. However, this does not mean that the figure is not interactive. As you mouse over the plot, you will see the plot coordinates in the lower right-hand corner. The figure window will also have a toolbar with the following tools, from left to right:

Home, Back, and Forward: These work like the corresponding buttons in a web browser and help you navigate through the previous views of your plot. The Home button takes you back to the first view shown when the figure was opened, Back takes you to the previous view, and Forward undoes the Back actions, returning you to more recent views.

Pan (and zoom): This button has two modes: pan and zoom. Press and hold the left mouse button to pan the figure. If you press x or y while panning, the motion will be constrained to just the x or y axis, respectively. Press the right mouse button to zoom; the plot will be zoomed in or out in proportion to the right/left and up/down movements. Use the x, y, or Ctrl key to constrain the zoom to the x axis, the y axis, or to preserve the aspect ratio, respectively.

Zoom-to-rectangle: Press the left mouse button, drag the cursor to a new location, and release. The axes view limits will be zoomed to the rectangle you just drew. Zooming out with the right mouse button instead places the current view into the region defined by the rectangle you drew.

Subplot configuration: This button brings up a tool to modify plot spacing.

Save: This button brings up a dialog that allows you to save the current figure.

The figure window is also responsive to the keyboard. The default keymap is fairly extensive (and will be covered fully later), but some of the basic hot keys are the Home key for resetting the plot view, the left and right arrow keys for the back and forward actions, p for pan/zoom mode, o for zoom-to-rectangle mode, and Ctrl + s to trigger a file save. When you are done viewing your figure, close the window as you would close any other application window, or use Ctrl + w.

Interactive plotting

When we ran the previous example, no plots appeared until show() was called. Furthermore, no new commands could be entered into the Python interpreter until all the figures were closed. As you will soon learn, once a figure is closed, the plot it contains is lost, which means that you would have to repeat all the commands again in order to show() it again, perhaps with some modification or an additional plot. Matplotlib ships with its interactive plotting mode off by default. There are a couple of ways to turn the interactive plotting mode on. The main way is by calling the ion() function (for Interactive ON). Interactive plotting mode can be turned on at any time and turned off with ioff().
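As a minimal sketch of how such an interactive session might look (this is an illustration rather than the book's own example; exact redraw behavior depends on your backend and REPL environment):

from pylab import *

ion()                    # interactive mode on: figures appear without show()
plot([1, 2, 3, 2, 1])    # a figure window pops up immediately
title("still editable")  # the open figure updates as commands are issued
ioff()                   # interactive mode back off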
Once this mode is turned on, the next plotting command will automatically trigger an implicit show() command. Furthermore, you can continue typing commands into the Python interpreter. You can modify the current figure, create new figures, and close existing ones at any time, all from the current Python session.

Scripted plotting

Python is known for more than just its interactive interpreters; it is also a fully fledged programming language that allows its users to easily create programs. Having a script to display plots from daily reports can greatly improve your productivity. Alternatively, you perhaps need a tool that can produce some simple plots of the data from whatever mystery data file you have come across on the network share. Here is a simple example of how to use Matplotlib's pyplot API and the argparse Python standard library tool to create a simple CSV plotting script called plotfile.py.

Code: chp1/plotfile.py

#!/usr/bin/env python
from argparse import ArgumentParser
import matplotlib.pyplot as plt

if __name__ == '__main__':
    parser = ArgumentParser(description="Plot a CSV file")
    parser.add_argument("datafile", help="The CSV File")
    # Require at least one column name
    parser.add_argument("columns", nargs='+',
                        help="Names of columns to plot")
    parser.add_argument("--save", help="Save the plot as...")
    parser.add_argument("--no-show", action="store_true",
                        help="Don't show the plot")
    args = parser.parse_args()

    plt.plotfile(args.datafile, args.columns)
    if args.save:
        plt.savefig(args.save)
    if not args.no_show:
        plt.show()

Note the two optional command-line arguments: --save and --no-show. With the --save option, the user can have the plot automatically saved (the graphics format is determined automatically from the filename extension). Also, the user can choose not to display the plot, which, when coupled with the --save option, might be desirable if the user is trying to plot several CSV files. When calling this script to show a plot, the execution of the script will stop at the call to plt.show(). If the interactive plotting mode were on, then the execution of the script would continue past show(), terminating the script and thus automatically closing out any figures before the user has had a chance to view them. This is why the interactive plotting mode is turned off by default in Matplotlib. Also note that the call to plt.savefig() comes before the call to plt.show(). As mentioned before, when the figure window is closed, the plot is lost. You cannot save a plot after it has been closed.

Getting help

We have covered how to install Matplotlib and gone over how to make very simple plots from a Python session or a Python script. Most likely, this went very smoothly for you. You may be very curious and want to learn more about the many kinds of plots this library has to offer, or maybe you want to learn how to make new kinds of plots. Help comes in many forms. The Matplotlib website (http://matplotlib.org) is the primary online resource for Matplotlib. It contains examples, FAQs, API documentation, and, most importantly, the gallery.

Gallery

Many users of Matplotlib are often faced with the question, "I want to make a plot that has this data along with that data in the same figure, but it needs to look like this other plot I have seen." Text-based searches on graphing concepts are difficult, especially if you are unfamiliar with the terminology.
The gallery showcases the variety of ways in which one can make plots, all using the Matplotlib library. Browse through the gallery, click on any figure that has pieces of what you want in your plot, and see the code that generated it. Soon enough, you will be like a chef, mixing and matching components to produce that perfect graph. Mailing lists and forums When you are just simply stuck and cannot figure out how to get something to work or just need some hints on how to get started, you will find much of the community at the Matplotlib-users mailing list. This mailing list is an excellent resource of information with many friendly members who just love to help out newcomers. Be persistent! While many questions do get answered fairly quickly, some will fall through the cracks. Try rephrasing your question or with a plot showing your attempts so far. The people at Matplotlib-users love plots, so an image that shows what is wrong often gets the quickest response. A newer community resource is StackOverflow, which has many very knowledgeable users who are able to answer difficult questions. From front to backend So far, we have shown you bits and pieces of two of Matplotlib's topmost abstraction layers: pylab and pyplot. The layer below them is the object-oriented layer (the OO layer). To develop any type of application, you will want to use this layer. Mixing the pylab/pyplot layers with the OO layer will lead to very confusing behaviors when dealing with multiple plots and figures. Below the OO layer is the backend interface. Everything above this interface level in Matplotlib is completely platform-agnostic. It will work the same regardless of whether it is in an interactive GUI or comes from a driver script running on a headless server. The backend interface abstracts away all those considerations so that you can focus on what is most important: writing code to visualize your data. There are several backend implementations that are shipped with Matplotlib. These backends are responsible for taking the figures represented by the OO layer and interpreting it for whichever "display device" they implement. The backends are chosen automatically but can be explicitly set, if needed. Interactive versus non-interactive There are two main classes of backends: ones that provide interactive figures and ones that don't. Interactive backends are ones that support a particular GUI, such as Tcl/Tkinter, GTK, Qt, Cocoa/Mac OS X, wxWidgets, and Cairo. With the exception of the Cocoa/Mac OS X backend, all interactive backends can be used on Windows, Linux, and Mac OS X. Therefore, when you make an interactive Matplotlib application that you wish to distribute to users of any of those platforms, unless you are embedding Matplotlib, you will not have to concern yourself with writing a single line of code for any of these toolkits—it has already been done for you! Non-interactive backends are used to produce image files. There are backends to produce Postscript/EPS, Adobe PDF, and Scalable Vector Graphics (SVG) as well as rasterized image files such as PNG, BMP, and JPEGs. Anti-grain geometry The open secret behind the high quality of Matplotlib's rasterized images is its use of the Anti-Grain Geometry (AGG) library (http://agg.sourceforge.net/antigrain.com/index.html). The quality of the graphics generated from AGG is far superior than most other toolkits available. Therefore, not only is AGG used to produce rasterized image files, but it is also utilized in most of the interactive backends as well. 
Matplotlib maintains and ships with its own fork of the library in order to ensure you have consistent, high-quality image products across all platforms and toolkits. What you see on your screen in your interactive figure window will be the same as the PNG file that is produced when you call savefig().

Selecting your backend

When you install Matplotlib, a default backend is chosen for you based upon your OS and the available GUI toolkits. For example, on Mac OS X systems, your installation of the library will most likely set the default interactive backend to MacOSX, or CocoaAgg for older Macs. Meanwhile, Windows users will most likely have a default of TkAgg or Qt5Agg. In most situations, the choice of interactive backends will not matter. However, in certain situations, it may be necessary to force a particular backend to be used. For example, on a headless server without an active graphics session, you would most likely need to force the use of the non-interactive Agg backend:

import matplotlib
matplotlib.use("Agg")

When done prior to any plotting commands, this will avoid loading any GUI toolkits, thereby bypassing problems that occur when a GUI fails on a headless server. Any call to show() effectively becomes a no-op (and the execution of the script is not blocked). Another purpose of setting your backend is for scenarios when you want to embed your plot in a native GUI application. In that case, you will need to explicitly state which GUI toolkit you are using. Finally, some users simply like the look and feel of some GUI toolkits better than others. They may wish to change the default backend via the backend parameter in the matplotlibrc configuration file. Most likely, your rc file can be found in the .matplotlib directory or the .config/matplotlib directory under your home folder. If you can't find it, then use the following set of commands:

>>> import matplotlib
>>> matplotlib.matplotlib_fname()
u'/home/ben/.config/matplotlib/matplotlibrc'

Here is an example of the relevant section in my matplotlibrc file:

#### CONFIGURATION BEGINS HERE

# the default backend; one of GTK GTKAgg GTKCairo GTK3Agg
# GTK3Cairo CocoaAgg MacOSX QtAgg Qt4Agg TkAgg WX WXAgg Agg Cairo
# PS PDF SVG
# You can also deploy your own backend outside of matplotlib by
# referring to the module name (which must be in the PYTHONPATH)
# as 'module://my_backend'
#backend     : GTKAgg
#backend     : QT4Agg
backend      : TkAgg

# If you are using the Qt4Agg backend, you can choose here
# to use the PyQt4 bindings or the newer PySide bindings to
# the underlying Qt4 toolkit.
#backend.qt4 : PyQt4       # PyQt4 | PySide

This is the global configuration file that is used if one isn't found in the current working directory when Matplotlib is imported. The settings contained in this configuration file serve as default values for many parts of Matplotlib. In particular, we see that the choice of backend can be easily set without having to use a single line of code.

The Matplotlib figure-artist hierarchy

Everything that can be drawn in Matplotlib is called an artist. Any artist can have child artists that are also drawable. This forms the basis of a hierarchy of artist objects that Matplotlib sends to a backend for rendering. At the root of this artist tree is the figure. In the examples so far, we have not explicitly created any figures. The pylab and pyplot interfaces will create the figures for us. However, when creating advanced interactive applications, it is highly recommended that you explicitly create your figures.
You will especially want to do this if you have multiple figures being displayed at the same time. This is the entry into the OO layer of Matplotlib:

fig = plt.figure()

Canvassing the figure

The figure is, quite literally, your canvas. Its primary component is the FigureCanvas instance upon which all drawing occurs. Unless you are embedding your Matplotlib figures into a GUI application, it is very unlikely that you will need to interact with this object directly. Instead, as plotting commands are issued, artist objects are added to the canvas automatically. While any artist can be added directly to the figure, usually only Axes objects are added. A figure can have many axes objects, typically called subplots. Much like the figure object, our examples so far have not explicitly created any axes objects to use. This is because the pylab and pyplot interfaces will also automatically create and manage axes objects for a figure if needed. For the same reason as for figures, you will want to explicitly create these objects when building your interactive applications. If an axes or a figure is not provided, then the pyplot layer will have to make assumptions about which axes or figure you mean to apply a plotting command to. While this might be fine for simple situations, these assumptions get hairy very quickly in non-trivial applications. Luckily, it is easy to create both your figure and its axes using a single command:

fig, axes = plt.subplots(2, 1) # 2x1 grid of subplots

These objects are highly advanced, complex units that most developers will utilize for their plotting needs. Once placed on the figure canvas, the axes object will provide the ticks, axis labels, axes title(s), and the plotting area. An axes is an artist that manages all of its scale and coordinate transformations (for example, log scaling and polar coordinates), automated tick labeling, and automated axis limits. In addition to these responsibilities, an axes object provides a wide assortment of plotting functions. A sampling of plotting functions is as follows:

Function      Description
bar           Make a bar plot
barbs         Plot a two-dimensional field of barbs
boxplot       Make a box and whisker plot
cohere        Plot the coherence between x and y
contour       Plot contours
errorbar      Plot an errorbar graph
hexbin        Make a hexagonal binning plot
hist          Plot a histogram
imshow        Display an image on the axes
pcolor        Create a pseudocolor plot of a two-dimensional array
pcolormesh    Plot a quadrilateral mesh
pie           Plot a pie chart
plot          Plot lines and/or markers
quiver        Plot a two-dimensional field of arrows
sankey        Create a Sankey flow diagram
scatter       Make a scatter plot of x versus y
stem          Create a stem plot
streamplot    Draw streamlines of a vector flow

The application we will build in this article is a storm track editing application. Given a series of radar images, the user can circle each storm cell they see in the radar image and link those storm cells across time. The application will need the ability to save and load track data and provide the user with mechanisms to edit the data. Along the way, we will learn about Matplotlib's structure, its artists, the callback system, doing animations, and finally, embedding this application within a larger GUI application. So, to begin, we first need to be able to view a radar image. There are many ways to load data into a Python program, but one particular favorite among meteorologists is the Network Common Data Form (NetCDF) file.
The SciPy package has built-in support for NetCDF version 3, so we will be using an hour's worth of radar reflectivity data prepared in this format from a NEXRAD site near Oklahoma City, OK on the evening of May 10, 2010, which produced numerous tornadoes and severe storms. The NetCDF binary file is particularly nice to work with because it can hold multiple data variables in a single file, with each variable having an arbitrary number of dimensions. Furthermore, metadata can be attached to each variable and to the dataset itself, allowing you to self-document data files. This particular data file has three variables, namely Reflectivity, lat, and lon, to record the radar reflectivity values and the latitude and longitude coordinates of each pixel in the reflectivity data. The reflectivity data is three-dimensional, with the first dimension as time and the other two dimensions as latitude and longitude. The following code example shows how easy it is to load this data and display the first image frame using SciPy and Matplotlib.

Code: chp1/simple_radar_viewer.py

import matplotlib.pyplot as plt
from scipy.io import netcdf_file

ncf = netcdf_file('KTLX_20100510_22Z.nc')
data = ncf.variables['Reflectivity']
lats = ncf.variables['lat']
lons = ncf.variables['lon']
i = 0

cmap = plt.get_cmap('gist_ncar')
cmap.set_under('lightgrey')

fig, ax = plt.subplots(1, 1)
im = ax.imshow(data[i], origin='lower',
               extent=(lons[0], lons[-1], lats[0], lats[-1]),
               vmin=0.1, vmax=80, cmap='gist_ncar')
cb = fig.colorbar(im)

cb.set_label('Reflectivity (dBZ)')
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
plt.show()

Running this script should result in a figure window that will display the first frame of our storms. The plot has a colorbar and the axes ticks label the latitudes and longitudes of our data. What is probably most important in this example is the imshow() call. Being an image, traditionally, the origin of the image data is shown in the upper-left corner, and Matplotlib follows this tradition by default. However, this particular dataset was saved with its origin in the lower-left corner, so we need to state this with the origin parameter. The extent parameter is a tuple describing the data extent of the image. By default, it is assumed to be at (0, 0) and (N – 1, M – 1) for an MxN shaped image. The vmin and vmax parameters are a good way to ensure consistency of your colormap regardless of your input data. If these two parameters are not supplied, then imshow() will use the minimum and maximum of the input data to determine the colormap. This would be undesirable as we move towards displaying arbitrary frames of radar data. Finally, one can explicitly specify the colormap to use for the image. The gist_ncar colormap is very similar to the official NEXRAD colormap for radar data, so we will use it here.

Note that the gist_ncar colormap, along with some other colormaps packaged with Matplotlib such as the default jet colormap, is actually terrible for visualization. See the Choosing Colormaps page of the Matplotlib website for an explanation of why, and for guidance on how to choose a better colormap.

The menagerie of artists

Whenever a plotting function is called, the input data and parameters are processed to produce new artists to represent the data. These artists are either primitives or collections thereof. They are called primitives because they represent basic drawing components such as lines, images, polygons, and text.
It is with these primitives that your data can be represented as bar charts, line plots, errorbars, or any other kinds of plots. Primitives There are four drawing primitives in Matplotlib: Line2D, AxesImage, Patch, and Text. It is through these primitive artists that all other artist objects are derived from, and they comprise everything that can be drawn in a figure. A Line2D object uses a list of coordinates to draw line segments in between. Typically, the individual line segments are straight, and curves can be approximated with many vertices; however, curves can be specified to draw arcs, circles, or any other Bezier-approximated curves. An AxesImage class will take two-dimensional data and coordinates and display an image of that data with a colormap applied to it. There are actually other kinds of basic image artists available besides AxesImage, but they are typically for very special uses. AxesImage objects can be very tricky to deal with, so it is often best to use the imshow() plotting method to create and return these objects. A Patch object is an arbitrary two-dimensional object that has a single color for its "face." A polygon object is a specific instance of the slightly more general patch. These objects have a "path" (much like a Line2D object) that specifies segments that would enclose a face with a single color. The path is known as an "edge," and can have its own color as well. Besides the Polygons that one sees for bar plots and pie charts, Patch objects are also used to create arrows, legend boxes, and the markers used in scatter plots and elsewhere. Finally, the Text object takes a Python string, a point coordinate, and various font parameters to form the text that annotates plots. Matplotlib primarily uses TrueType fonts. It will search for fonts available on your system as well as ship with a few FreeType2 fonts, and it uses Bitstream Vera by default. Additionally, a Text object can defer to LaTeX to render its text, if desired. While specific artist classes will have their own set of properties that make sense for the particular art object they represent, there are several common properties that can be set. The following table is a listing of some of these properties. Property Meaning alpha 0 represents transparent and 1 represents opaque color Color name or other color specification visible boolean to flag whether to draw the artist or not zorder value of the draw order in the layering engine Let's extend the radar image example by loading up already saved polygons of storm cells in the tutorial.py file. Code: chp1/simple_storm_cell_viewer.py import matplotlib.pyplot as plt from scipy.io import netcdf_file from matplotlib.patches import Polygon from tutorial import polygon_loader   ncf = netcdf_file('KTLX_20100510_22Z.nc') data = ncf.variables['Reflectivity'] lats = ncf.variables['lat'] lons = ncf.variables['lon'] i = 0   cmap = plt.get_cmap('gist_ncar') cmap.set_under('lightgrey')   fig, ax = plt.subplots(1, 1) im = ax.imshow(data[i], origin='lower',                extent=(lons[0], lons[-1], lats[0], lats[-1]),                vmin=0.1, vmax=80, cmap='gist_ncar') cb = fig.colorbar(im)   polygons = polygon_loader('polygons.shp') for poly in polygons[i]:    p = Polygon(poly, lw=3, fc='k', ec='w', alpha=0.45)    ax.add_artist(p) cb.set_label("Reflectivity (dBZ)") ax.set_xlabel("Longitude") ax.set_ylabel("Latitude") plt.show() The polygon data returned from polygon_loader() is a dictionary of lists keyed by a frame index. 
The list contains Nx2 numpy arrays of vertex coordinates in longitude and latitude. The vertices form the outline of a storm cell. The Polygon constructor, like all other artist objects, takes many optional keyword arguments. First, lw is short for linewidth, (referring to the outline of the polygon), which we specify to be three points wide. Next is fc, which is short for facecolor, and is set to black ('k'). This is the color of the filled-in region of the polygon. Then edgecolor (ec) is set to white ('w') to help the polygons stand out against a dark background. Finally, we set the alpha argument to be slightly less than half to make the polygon fairly transparent so that one can still see the reflectivity data beneath the polygons. Note a particular difference between how we plotted the image using imshow() and how we plotted the polygons using polygon artists. For polygons, we called a constructor and then explicitly called ax.add_artist() to add each polygon instance as a child of the axes. Meanwhile, imshow() is a plotting function that will do all of the hard work in validating the inputs, building the AxesImage instance, making all necessary modifications to the axes instance (such as setting the limits and aspect ratio), and most importantly, adding the artist object to the axes. Finally, all plotting functions in Matplotlib return artists or a list of artist objects that it creates. In most cases, you will not need to save this return value in a variable because there is nothing else to do with them. In this case, we only needed the returned AxesImage so that we could pass it to the fig.colorbar() method. This is so that it would know what to base the colorbar upon. The plotting functions in Matplotlib exist to provide convenience and simplicity to what can often be very tricky to get right by yourself. They are not magic! They use the same OO interface that is accessible to application developers. Therefore, anyone can write their own plotting functions to make complicated plots easier to perform. Collections Any artist that has child artists (such as a figure or an axes) is called a container. A special kind of container in Matplotlib is called a Collection. A collection usually contains a list of primitives of the same kind that should all be treated similarly. For example, a CircleCollection would have a list of Circle objects, all with the same color, size, and edge width. Individual values for artists in the collection can also be set. A collection makes management of many artists easier. This becomes especially important when considering the number of artist objects that may be needed for scatter plots, bar charts, or any other kind of plot or diagram. Some collections are not just simply a list of primitives, but are artists in their own right. These special kinds of collections take advantage of various optimizations that can be assumed when rendering similar or identical things. RegularPolyCollection, for example, just needs to know the points of a single polygon relative to its center (such as a star or box) and then just needs a list of all the center coordinates, avoiding the need to store all the vertices of every polygon in its collection in memory. In the following example, we will display storm tracks as LineCollection. Note that instead of using ax.add_artist() (which would work), we will use ax.add_collection() instead. 
This has the added benefit of performing special handling on the object to determine its bounding box so that the axes object can incorporate the limits of this collection with any other plotted objects to automatically set its own limits which we trigger with the ax.autoscale(True) call. Code: chp1/linecoll_track_viewer.py import matplotlib.pyplot as plt from matplotlib.collections import LineCollection from tutorial import track_loader   tracks = track_loader('polygons.shp') # Filter out non-tracks (unassociated polygons given trackID of -9) tracks = {tid: t for tid, t in tracks.items() if tid != -9}   fig, ax = plt.subplots(1, 1) lc = LineCollection(tracks.values(), color='b') ax.add_collection(lc) ax.autoscale(True) ax.set_xlabel("Longitude") ax.set_ylabel("Latitude") plt.show() Much easier than the radar images, Matplotlib took care of all the limit setting automatically. Such features are extremely useful for writing generic applications that do not wish to concern themselves with such details. Summary In this article, we introduced you to the foundational concepts of Matplotlib. Using show(), you showed your first plot with only three lines of Python. With this plot up on your screen, you learned some of the basic interactive features built into Matplotlib, such as panning, zooming, and the myriad of key bindings that are available. Then we discussed the difference between interactive and non-interactive plotting modes and the difference between scripted and interactive plotting. You now know where to go online for more information, examples, and forum discussions of Matplotlib when it comes time for you to work on your next Matplotlib project. Next, we discussed the architectural concepts of Matplotlib: backends, figures, axes, and artists. Then we started our construction project, an interactive storm cell tracking application. We saw how to plot a radar image using a pre-existing plotting function, as well as how to display polygons and lines as artists and collections. While creating these objects, we had a glimpse of how to customize the properties of these objects for our display needs, learning some of the property and styling names. We also learned some of the steps one needs to consider when creating their own plotting functions, such as autoscaling. Resources for Article: Further resources on this subject: The plot function [article] First Steps [article] Machine Learning in IPython with scikit-learn [article]

Finding People and Things

Packt
19 Mar 2015
18 min read
In this article by Richard M Reese, author of the book Natural Language Processing with Java, we will see how to use NLP APIs. Using NLP APIs We will demonstrate the NER process using OpenNLP, Stanford API, and LingPipe. Each of these provide alternate techniques that can often do a good job of identifying entities in the text. The following declaration will serve as the sample text to demonstrate the APIs: String sentences[] = {"Joe was the last person to see Fred. ", "He saw him in Boston at McKenzie's pub at 3:00 where he " + " paid $2.45 for an ale. ", "Joe wanted to go to Vermont for the day to visit a cousin who " + "works at IBM, but Sally and he had to look for Fred"}; Using OpenNLP for NER We will demonstrate the use of the TokenNameFinderModel class to perform NLP using the OpenNLP API. Additionally, we will demonstrate how to determine the probability that the entity identified is correct. The general approach is to convert the text into a series of tokenized sentences, create an instance of the TokenNameFinderModel class using an appropriate model, and then use the find method to identify the entities in the text. The following example demonstrates the use of the TokenNameFinderModel class. We will use a simple sentence initially and then use multiple sentences. The sentence is defined here: String sentence = "He was the last person to see Fred."; We will use the models found in the en-token.bin and en-ner-person.bin files for the tokenizer and name finder models, respectively. The InputStream object for these files is opened using a try-with-resources block, as shown here: try (InputStream tokenStream = new FileInputStream(        new File(getModelDir(), "en-token.bin"));        InputStream modelStream = new FileInputStream(            new File(getModelDir(), "en-ner-person.bin"));) {    ...   } catch (Exception ex) {    // Handle exceptions } Within the try block, the TokenizerModel and Tokenizer objects are created:    TokenizerModel tokenModel = new TokenizerModel(tokenStream);    Tokenizer tokenizer = new TokenizerME(tokenModel); Next, an instance of the NameFinderME class is created using the person model: TokenNameFinderModel entityModel =    new TokenNameFinderModel(modelStream); NameFinderME nameFinder = new NameFinderME(entityModel); We can now use the tokenize method to tokenize the text and the find method to identify the person in the text. The find method will use the tokenized String array as input and return an array of Span objects, as shown: String tokens[] = tokenizer.tokenize(sentence); Span nameSpans[] = nameFinder.find(tokens); The Span class holds positional information about the entities found. The actual string entities are still in the tokens array: The following for statement displays the person found in the sentence. Its positional information and the person are displayed on separate lines: for (int i = 0; i < nameSpans.length; i++) {    System.out.println("Span: " + nameSpans[i].toString());    System.out.println("Entity: "        + tokens[nameSpans[i].getStart()]); } The output is as follows: Span: [7..9) person Entity: Fred We will often work with multiple sentences. To demonstrate this, we will use the previously defined sentences string array. The previous for statement is replaced with the following sequence. 
The tokenize method is invoked against each sentence, and then the entity information is displayed as earlier:

for (String sentence : sentences) {
    String tokens[] = tokenizer.tokenize(sentence);
    Span nameSpans[] = nameFinder.find(tokens);
    for (int i = 0; i < nameSpans.length; i++) {
        System.out.println("Span: " + nameSpans[i].toString());
        System.out.println("Entity: "
            + tokens[nameSpans[i].getStart()]);
    }
    System.out.println();
}

The output is as follows. There is an extra blank line between the two people detected because the second sentence did not contain a person:

Span: [0..1) person
Entity: Joe
Span: [7..9) person
Entity: Fred

Span: [0..1) person
Entity: Joe
Span: [19..20) person
Entity: Sally
Span: [26..27) person
Entity: Fred

Determining the accuracy of the entity

When the TokenNameFinderModel identifies entities in text, it computes a probability for that entity. We can access this information using the probs method as shown in the following line of code. This method returns an array of doubles, which corresponds to the elements of the nameSpans array:

double[] spanProbs = nameFinder.probs(nameSpans);

Add this statement to the previous example immediately after the use of the find method. Then add the next statement at the end of the nested for statement:

System.out.println("Probability: " + spanProbs[i]);

When the example is executed, you will get the following output. The probability fields reflect the confidence level of the entity assignment. For the first entity, the model is 80.529 percent confident that "Joe" is a person:

Span: [0..1) person
Entity: Joe
Probability: 0.8052914774025202
Span: [7..9) person
Entity: Fred
Probability: 0.9042160889302772

Span: [0..1) person
Entity: Joe
Probability: 0.9620970782763985
Span: [19..20) person
Entity: Sally
Probability: 0.964568603518126
Span: [26..27) person
Entity: Fred
Probability: 0.990383039618594

Using other entity types

OpenNLP provides several other pretrained models, as listed in the following table. These models can be downloaded from http://opennlp.sourceforge.net/models-1.5/. The prefix, en, specifies English as the language, and ner indicates that the model is for NER.

English finder models             Filename
Location name finder model        en-ner-location.bin
Money name finder model           en-ner-money.bin
Organization name finder model    en-ner-organization.bin
Percentage name finder model      en-ner-percentage.bin
Person name finder model          en-ner-person.bin
Time name finder model            en-ner-time.bin

If we modify the statement to use a different model file, we can see how they work against the sample sentences:

InputStream modelStream = new FileInputStream(
    new File(getModelDir(), "en-ner-time.bin"));) {

When the en-ner-money.bin model is used, the index into the tokens array in the earlier code sequence has to be increased by one. Otherwise, all that is returned is the dollar sign. The various outputs are shown in the following table.

Model                       Output
en-ner-location.bin         Span: [4..5) location
                            Entity: Boston
                            Probability: 0.8656908776583051
                            Span: [5..6) location
                            Entity: Vermont
                            Probability: 0.9732488014011262
en-ner-money.bin            Span: [14..16) money
                            Entity: 2.45
                            Probability: 0.7200919701507937
en-ner-organization.bin     Span: [16..17) organization
                            Entity: IBM
                            Probability: 0.9256970736336729
en-ner-time.bin             The model was not able to detect time in this text sequence

The model failed to find the time entities in the sample text. This illustrates that the model did not have enough confidence that it found any time entities in the text.
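As the money example suggests, a Span can cover more than one token (the dollar sign and the amount are separate tokens), so it is often more robust to print every token a span covers rather than only the token at getStart(). The following helper is a sketch of that idea rather than part of the original example; it assumes the same tokens, nameSpans, and spanProbs arrays produced by the earlier code:

// Hypothetical helper, not from the original example: prints every token
// covered by each span so that multi-token entities come out whole.
static void displaySpans(String[] tokens, Span[] nameSpans, double[] spanProbs) {
    for (int i = 0; i < nameSpans.length; i++) {
        StringBuilder entity = new StringBuilder();
        // getEnd() is exclusive, so this walks every token inside the span
        for (int j = nameSpans[i].getStart(); j < nameSpans[i].getEnd(); j++) {
            if (entity.length() > 0) {
                entity.append(' ');
            }
            entity.append(tokens[j]);
        }
        System.out.println("Span: " + nameSpans[i]
            + " Entity: [" + entity + "]"
            + " Probability: " + spanProbs[i]);
    }
}

With the en-ner-money.bin model, this would print the entity as "$ 2.45" without any manual index arithmetic.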
Processing multiple entity types We can also handle multiple entity types at the same time. This involves creating instances of the NameFinderME class based on each model within a loop and applying the model against each sentence, keeping track of the entities as they are found. We will illustrate this process with the following example. It requires rewriting the previous try block to create the InputStream instance within the block, as shown here: try {    InputStream tokenStream = new FileInputStream(        new File(getModelDir(), "en-token.bin"));    TokenizerModel tokenModel = new TokenizerModel(tokenStream);    Tokenizer tokenizer = new TokenizerME(tokenModel);    ... } catch (Exception ex) {    // Handle exceptions } Within the try block, we will define a string array to hold the names of the model files. As shown here, we will use models for people, locations, and organizations: String modelNames[] = {"en-ner-person.bin",    "en-ner-location.bin", "en-ner-organization.bin"}; An ArrayList instance is created to hold the entities as they are discovered: ArrayList<String> list = new ArrayList(); A for-each statement is used to load one model at a time and then to create an instance of the NameFinderME class: for(String name : modelNames) {    TokenNameFinderModel entityModel = new TokenNameFinderModel(        new FileInputStream(new File(getModelDir(), name)));    NameFinderME nameFinder = new NameFinderME(entityModel);    ... } Previously, we did not try to identify which sentences the entities were found in. This is not hard to do but we need to use a simple for statement instead of a for-each statement to keep track of the sentence indexes. This is shown in the following example, where the previous example has been modified to use the integer variable index to keep the sentences. Otherwise, the code works the same way as earlier: for (int index = 0; index < sentences.length; index++) {    String tokens[] = tokenizer.tokenize(sentences[index]);    Span nameSpans[] = nameFinder.find(tokens);    for(Span span : nameSpans) {        list.add("Sentence: " + index            + " Span: " + span.toString() + " Entity: "            + tokens[span.getStart()]);    } } The entities discovered are then displayed: for(String element : list) {    System.out.println(element); } The output is as follows: Sentence: 0 Span: [0..1) person Entity: Joe Sentence: 0 Span: [7..9) person Entity: Fred Sentence: 2 Span: [0..1) person Entity: Joe Sentence: 2 Span: [19..20) person Entity: Sally Sentence: 2 Span: [26..27) person Entity: Fred Sentence: 1 Span: [4..5) location Entity: Boston Sentence: 2 Span: [5..6) location Entity: Vermont Sentence: 2 Span: [16..17) organization Entity: IBM Using the Stanford API for NER We will demonstrate the CRFClassifier class as used to perform NER. This class implements what is known as a linear chain Conditional Random Field (CRF) sequence model. To demonstrate the use of the CRFClassifier class, we will start with a declaration of the classifier file string, as shown here: String model = getModelDir() +    "\english.conll.4class.distsim.crf.ser.gz"; The classifier is then created using the model: CRFClassifier<CoreLabel> classifier =    CRFClassifier.getClassifierNoExceptions(model); The classify method takes a single string representing the text to be processed. To use the sentences text, we need to convert it to a simple string: String sentence = ""; for (String element : sentences) {    sentence += element; } The classify method is then applied to the text. 
List<List<CoreLabel>> entityList = classifier.classify(sentence); A List instance of List instances of CoreLabel objects is returned. The object returned is a list that contains another list. The contained list is a List instance of CoreLabel objects. The CoreLabel class represents a word with additional information attached to it. The "internal" list contains a list of these words. In the outer for-each statement in the following code sequence, the reference variable, internalList, represents one sentence of the text. In the inner for-each statement, each word in that inner list is displayed. The word method returns the word and the get method returns the type of the word. The words and their types are then displayed: for (List<CoreLabel> internalList: entityList) {    for (CoreLabel coreLabel : internalList) {        String word = coreLabel.word();        String category = coreLabel.get(            CoreAnnotations.AnswerAnnotation.class);        System.out.println(word + ":" + category);    } } Part of the output follows. It has been truncated because every word is displayed. The O represents the "Other" category: Joe:PERSON was:O the:O last:O person:O to:O see:O Fred:PERSON .:O He:O ... look:O for:O Fred:PERSON To filter out the words that are not relevant, replace the println statement with the following statements. This will eliminate the other categories: if (!"O".equals(category)) {    System.out.println(word + ":" + category); } The output is simpler now: Joe:PERSON Fred:PERSON Boston:LOCATION McKenzie:PERSON Joe:PERSON Vermont:LOCATION IBM:ORGANIZATION Sally:PERSON Fred:PERSON Using LingPipe for NER We will demonstrate how name entity models and the ExactDictionaryChunker class are used to perform NER analysis. Using LingPipe's name entity models LingPipe has a few named entity models that we can use with chunking. These files consist of a serialized object that can be read from a file and then applied to text. These objects implement the Chunker interface. The chunking process results in a series of Chunking objects that identify the entities of interest. A list of the NER models is found in the following table. These models can be downloaded from http://alias-i.com/lingpipe/web/models.html: Genre Corpus File English News MUC-6 ne-en-news-muc6.AbstractCharLmRescoringChunker English Genes GeneTag ne-en-bio-genetag.HmmChunker English Genomics GENIA ne-en-bio-genia.TokenShapeChunker We will use the model found in the ne-en-news-muc6.AbstractCharLmRescoringChunker file to demonstrate how this class is used. We start with a try-catch block to deal with exceptions as shown in the following example. The file is opened and used with the AbstractExternalizable class' static readObject method to create an instance of a Chunker class. This method will read in the serialized model: try {    File modelFile = new File(getModelDir(),        "ne-en-news-muc6.AbstractCharLmRescoringChunker");      Chunker chunker = (Chunker)        AbstractExternalizable.readObject(modelFile);    ... } catch (IOException | ClassNotFoundException ex) {    // Handle exception } The Chunker and Chunking interfaces provide methods that work with a set of chunks of text. Its chunk method returns an object that implements the Chunking instance. 
The following sequence displays the chunks found in each sentence of the text, as shown here: for (int i = 0; i < sentences.length; ++i) {    Chunking chunking = chunker.chunk(sentences[i]);    System.out.println("Chunking=" + chunking); } The output of this sequence is as follows: Chunking=Joe was the last person to see Fred. : [0-3:PERSON@-Infinity, 31-35:ORGANIZATION@-Infinity] Chunking=He saw him in Boston at McKenzie's pub at 3:00 where he paid $2.45 for an ale. : [14-20:LOCATION@-Infinity, 24-32:PERSON@-Infinity] Chunking=Joe wanted to go to Vermont for the day to visit a cousin who works at IBM, but Sally and he had to look for Fred : [0-3:PERSON@-Infinity, 20-27:ORGANIZATION@-Infinity, 71-74:ORGANIZATION@-Infinity, 109-113:ORGANIZATION@-Infinity] Instead, we can use methods of the Chunk class to extract specific pieces of information as illustrated here. We will replace the previous for statement with the following for-each statement. This calls a displayChunkSet method: for (String sentence : sentences) {    displayChunkSet(chunker, sentence); } The output that follows shows the result. However, it does not always match the entity type correctly. Type: PERSON Entity: [Joe] Score: -Infinity Type: ORGANIZATION Entity: [Fred] Score: -Infinity Type: LOCATION Entity: [Boston] Score: -Infinity Type: PERSON Entity: [McKenzie] Score: -Infinity Type: PERSON Entity: [Joe] Score: -Infinity Type: ORGANIZATION Entity: [Vermont] Score: -Infinity Type: ORGANIZATION Entity: [IBM] Score: -Infinity Type: ORGANIZATION Entity: [Fred] Score: -Infinity Using the ExactDictionaryChunker class The ExactDictionaryChunker class provides an easy way to create a dictionary of entities and their types, which can be used to find them later in text. It uses a MapDictionary object to store entries and then the ExactDictionaryChunker class is used to extract chunks based on the dictionary. The AbstractDictionary interface supports basic operations for entities, categories, and scores. The score is used in the matching process. The MapDictionary and TrieDictionary classes implement the AbstractDictionary interface. The TrieDictionary class stores information using a character trie structure. This approach uses less memory when it is a concern. We will use the MapDictionary class for our example. To illustrate this approach, we start with a declaration of the MapDictionary class: private MapDictionary<String> dictionary; The dictionary will contain the entities that we are interested in finding. We need to initialize the model as performed in the following initializeDictionary method. The DictionaryEntry constructor used here accepts three arguments: String: The name of the entity String: The category of the entity Double: Represent a score for the entity The score is used when determining matches. A few entities are declared and added to the dictionary. 
private static void initializeDictionary() {    dictionary = new MapDictionary<String>();    dictionary.addEntry(        new DictionaryEntry<String>("Joe","PERSON",1.0));    dictionary.addEntry(        new DictionaryEntry<String>("Fred","PERSON",1.0));    dictionary.addEntry(        new DictionaryEntry<String>("Boston","PLACE",1.0));    dictionary.addEntry(        new DictionaryEntry<String>("pub","PLACE",1.0));    dictionary.addEntry(        new DictionaryEntry<String>("Vermont","PLACE",1.0));    dictionary.addEntry(        new DictionaryEntry<String>("IBM","ORGANIZATION",1.0));    dictionary.addEntry(        new DictionaryEntry<String>("Sally","PERSON",1.0)); } An ExactDictionaryChunker instance will use this dictionary. The arguments of the ExactDictionaryChunker class are detailed here: Dictionary<String>: It is a dictionary containing the entities TokenizerFactory: It is a tokenizer used by the chunker boolean: If it is true, the chunker should return all matches boolean: If it is true, matches are case sensitive Matches can be overlapping. For example, in the phrase "The First National Bank", the entity "bank" could be used by itself or in conjunction with the rest of the phrase. The third parameter determines if all of the matches are returned. In the following sequence, the dictionary is initialized. We then create an instance of the ExactDictionaryChunker class using the Indo-European tokenizer, where we return all matches and ignore the case of the tokens: initializeDictionary(); ExactDictionaryChunker dictionaryChunker    = new ExactDictionaryChunker(dictionary,        IndoEuropeanTokenizerFactory.INSTANCE, true, false); The dictionaryChunker object is used with each sentence, as shown in the following code sequence. We will use the displayChunkSet method: for (String sentence : sentences) {    System.out.println("nTEXT=" + sentence);    displayChunkSet(dictionaryChunker, sentence); } On execution, we get the following output: TEXT=Joe was the last person to see Fred. Type: PERSON Entity: [Joe] Score: 1.0 Type: PERSON Entity: [Fred] Score: 1.0   TEXT=He saw him in Boston at McKenzie's pub at 3:00 where he paid $2.45 for an ale. Type: PLACE Entity: [Boston] Score: 1.0 Type: PLACE Entity: [pub] Score: 1.0   TEXT=Joe wanted to go to Vermont for the day to visit a cousin who works at IBM, but Sally and he had to look for Fred Type: PERSON Entity: [Joe] Score: 1.0 Type: PLACE Entity: [Vermont] Score: 1.0 Type: ORGANIZATION Entity: [IBM] Score: 1.0 Type: PERSON Entity: [Sally] Score: 1.0 Type: PERSON Entity: [Fred] Score: 1.0 This does a pretty good job but it requires a lot of effort to create the dictionary for a large vocabulary. Training a model We will use OpenNLP to demonstrate how a model is trained. The training file used must: Contain marks to demarcate the entities Have one sentence per line We will use the following model file named en-ner-person.train: <START:person> Joe <END> was the last person to see <START:person> Fred <END>. He saw him in Boston at McKenzie's pub at 3:00 where he paid $2.45 for an ale. <START:person> Joe <END> wanted to go to Vermont for the day to visit a cousin who works at IBM, but <START:person> Sally <END> and he had to look for <START:person> Fred <END>. Several methods of this example are capable of throwing exceptions. 
These statements will be placed in a try-with-resource block as shown here, where the model's output stream is created: try (OutputStream modelOutputStream = new BufferedOutputStream(        new FileOutputStream(new File("modelFile")));) {    ... } catch (IOException ex) {    // Handle exception } Within the block, we create an OutputStream<String> object using the PlainTextByLineStream class. This class' constructor takes a FileInputStream instance and returns each line as a String object. The en-ner-person.train file is used as the input file, as shown here. The UTF-8 string refers to the encoding sequence used: ObjectStream<String> lineStream = new PlainTextByLineStream(    new FileInputStream("en-ner-person.train"), "UTF-8"); The lineStream object contains streams that are annotated with tags delineating the entities in the text. These need to be converted to the NameSample objects so that the model can be trained. This conversion is performed by the NameSampleDataStream class as shown here. A NameSample object holds the names of the entities found in the text: ObjectStream<NameSample> sampleStream =    new NameSampleDataStream(lineStream); The train method can now be executed as follows: TokenNameFinderModel model = NameFinderME.train(    "en", "person", sampleStream,    Collections.<String, Object>emptyMap(), 100, 5); The arguments of the method are as detailed in the following table: Parameter Meaning "en" Language Code "person" Entity type sampleStream Sample data null Resources 100 The number of iterations 5 The cutoff The model is then serialized to an output file: model.serialize(modelOutputStream); The output of this sequence is as follows. It has been shortened to conserve space. Basic information about the model creation is detailed: Indexing events using cutoff of 5   Computing event counts... done. 53 events Indexing... done. Sorting and merging events... done. Reduced 53 events to 46. Done indexing. Incorporating indexed data for training... Number of Event Tokens: 46      Number of Outcomes: 2    Number of Predicates: 34 ...done. Computing model parameters ... Performing 100 iterations. 1: ... loglikelihood=-36.73680056967707 0.05660377358490566 2: ... loglikelihood=-17.499660626361216 0.9433962264150944 3: ... loglikelihood=-13.216835449617108 0.9433962264150944 4: ... loglikelihood=-11.461783667999262 0.9433962264150944 5: ... loglikelihood=-10.380239416084963 0.9433962264150944 6: ... loglikelihood=-9.570622475692486 0.9433962264150944 7: ... loglikelihood=-8.919945779143012 0.9433962264150944 ... 99: ... loglikelihood=-3.513810438211968 0.9622641509433962 100: ... loglikelihood=-3.507213816708068 0.9622641509433962 Evaluating a model The model can be evaluated using the TokenNameFinderEvaluator class. The evaluation process uses marked up sample text to perform the evaluation. For this simple example, a file called en-ner-person.eval was created that contained the following text: <START:person> Bill <END> went to the farm to see <START:person> Sally <END>. Unable to find <START:person> Sally <END> he went to town. There he saw <START:person> Fred <END> who had seen <START:person> Sally <END> at the book store with <START:person> Mary <END>. The following code is used to perform the evaluation. The previous model is used as the argument of the TokenNameFinderEvaluator constructor. A NameSampleDataStream instance is created based on the evaluation file. 
The TokenNameFinderEvaluator class' evaluate method performs the evaluation:

TokenNameFinderEvaluator evaluator =
    new TokenNameFinderEvaluator(new NameFinderME(model));

lineStream = new PlainTextByLineStream(
    new FileInputStream("en-ner-person.eval"), "UTF-8");
sampleStream = new NameSampleDataStream(lineStream);
evaluator.evaluate(sampleStream);

To determine how well the model worked with the evaluation data, the getFMeasure method is executed. The results are then displayed:

FMeasure result = evaluator.getFMeasure();
System.out.println(result.toString());

The following output displays the precision, recall, and F-measure. The precision indicates that 50 percent of the entities found exactly match the evaluation data. The recall is the percentage of entities defined in the corpus that were found in the same location. The F-measure is the harmonic mean of precision and recall and is defined as:

F1 = 2 * Precision * Recall / (Recall + Precision)

Plugging in the values below gives F1 = 2 * 0.5 * 0.25 / (0.25 + 0.5) = 0.25 / 0.75 ≈ 0.333, which matches the reported figure:

Precision: 0.5
Recall: 0.25
F-Measure: 0.3333333333333333

The data and evaluation sets should be much larger to create a better model. The intent here was to demonstrate the basic approach used to train and evaluate an NER model.

Summary

We investigated several techniques for performing NER. Regular expressions are one approach that is supported by both core Java classes and NLP APIs. This technique is useful for many applications, and there are a large number of regular expression libraries available. Dictionary-based approaches are also possible and work well for some applications. However, they require considerable effort to populate at times. We used LingPipe's MapDictionary class to illustrate this approach.

Resources for Article:

Further resources on this subject:

Tuning Solr JVM and Container [article]
Model-View-ViewModel [article]
AngularJS Performance [article]

The Observer Pattern

Packt
19 Mar 2015
13 min read
In this article, written by Leonardo Borges, the author of Clojure Reactive Programming, we will:

Explore Rx's main abstraction: Observables
Learn about the duality between iterators and Observables
Create and manipulate Observable sequences

(For more resources related to this topic, see here.)

The Observer pattern revisited

Let's take a look at an example:

(def numbers (atom []))

(defn adder [key ref old-state new-state]
  (print "Current sum is " (reduce + new-state)))

(add-watch numbers :adder adder)

In the preceding example, our Observable subject is the var numbers. The Observer is the adder watch. When the Observable changes, it pushes its changes to the Observer synchronously. Now, contrast this with working with sequences:

(->> [1 2 3 4 5 6]
     (map inc)
     (filter even?)
     (reduce +))

This time around, the vector is the subject being observed and the functions processing it can be thought of as the Observers. However, this works in a pull-based model. The vector doesn't push any elements down the sequence. Instead, map and friends ask the sequence for more elements. This is a synchronous operation. Rx makes sequences, and more, behave like Observables so that you can still map, filter, and compose them just as you would compose functions over normal sequences.

Observer – an Iterator's dual

Clojure's sequence operators such as map, filter, reduce, and so on support Java Iterables. As the name implies, an Iterable is an object that can be iterated over. At a low level, this is supported by retrieving an Iterator reference from such an object. Java's Iterator interface looks like the following:

public interface Iterator<E> {
    boolean hasNext();
    E next();
    void remove();
}

When passed an object that implements this interface, Clojure's sequence operators pull data from it by using the next method, while using the hasNext method to know when to stop. The remove method is required to remove its last element from the underlying collection. This in-place mutation is clearly unsafe in a multithreaded environment. Whenever Clojure implements this interface for the purposes of interoperability, the remove method simply throws UnsupportedOperationException. An Observable, on the other hand, has Observers subscribed to it. Observers have the following interface:

public interface Observer<T> {
    void onCompleted();
    void onError(Throwable e);
    void onNext(T t);
}

As we can see, an Observer implementing this interface will have its onNext method called with the next value available from whatever Observable it's subscribed to. Hence, it is a push-based notification model. This duality becomes clearer if we look at both the interfaces side by side:

Iterator<E> {                  Observer<T> {
    boolean hasNext();             void onCompleted();
    E next();                      void onError(Throwable e);
    void remove();                 void onNext(T t);
}                              }

Observables provide the ability to have producers push items asynchronously to consumers. A few examples will help solidify our understanding.

Creating Observables

This article is all about Reactive Extensions, so let's go ahead and create a project called rx-playground that we will be using in our exploratory tour.
We will use RxClojure (see https://github.com/ReactiveX/RxClojure), a library that provides Clojure bindings for RxJava (see https://github.com/ReactiveX/RxJava):

$ lein new rx-playground

Open the project file and add a dependency on RxJava's Clojure bindings:

(defproject rx-playground "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [io.reactivex/rxclojure "1.0.0"]])

Now, fire up a REPL in the project's root directory so that we can start creating some Observables:

$ lein repl

The first thing we need to do is import RxClojure, so let's get this out of the way by typing the following in the REPL:

(require '[rx.lang.clojure.core :as rx])
(import '(rx Observable))

The simplest way to create a new Observable is by calling the return function (RxClojure's binding for RxJava's just):

(def obs (rx/return 10))

Now, we can subscribe to it:

(rx/subscribe obs
              (fn [value]
                (prn (str "Got value: " value))))

This will print the string "Got value: 10" to the REPL. The subscribe function of an Observable allows us to register handlers for three main things that happen throughout its life cycle: new values, errors, or a notification that the Observable is done emitting values. This corresponds to the onNext, onError, and onCompleted methods of the Observer interface, respectively. In the preceding example, we are simply subscribing to onNext, which is why we get notified about the Observable's only value, 10. A single-value Observable isn't terribly interesting though. Let's create and interact with one that emits multiple values:

(-> (rx/seq->o [1 2 3 4 5 6 7 8 9 10])
    (rx/subscribe prn))

This will print the numbers from 1 to 10, inclusive, to the REPL. seq->o is a way to create Observables from Clojure sequences. It just so happens that the preceding snippet can be rewritten using Rx's own range operator:

(-> (rx/range 1 10)
    (rx/subscribe prn))

Of course, this doesn't yet present any advantages over working with raw values or sequences in Clojure. But what if we need an Observable that emits an undefined number of integers at a given interval? This becomes challenging to represent as a sequence in Clojure, but Rx makes it trivial:

(import '(java.util.concurrent TimeUnit))
(rx/subscribe (Observable/interval 100 TimeUnit/MILLISECONDS)
              prn)

RxClojure doesn't yet provide bindings to all of RxJava's API. The interval method is one such example. We're required to use interoperability and call the method directly on the Observable class from RxJava. Observable/interval takes as arguments a number and a time unit. In this case, we are telling it to emit an integer—starting from zero—every 100 milliseconds. If we type this in a REPL-connected editor, however, two things will happen:

We will not see any output (depending on your REPL; this is true for Emacs)
We will have a rogue thread emitting numbers indefinitely

Both issues arise from the fact that Observable/interval is the first factory method we have used that doesn't emit values synchronously. Instead, it returns an Observable that defers the work to a separate thread. The first issue is simple enough to fix. Functions such as prn will print to whatever the dynamic var *out* is bound to. When working in certain REPL environments such as Emacs', this is bound to the REPL stream, which is why we can generally see everything we print.
However, since Rx is deferring the work to a separate thread, *out* isn't bound to the REPL stream anymore so we don't see the output. In order to fix this, we need to capture the current value of *out* and bind it in our subscription. This will be incredibly useful as we experiment with Rx in the REPL. As such, let's create a helper function for it: (def repl-out *out*) (defn prn-to-repl [& args] (binding [*out* repl-out]    (apply prn args))) The first thing we do is create a var repl-out that contains the current REPL stream. Next, we create a function prn-to-repl that works just like prn, except it uses the binding macro to create a new binding for *out* that is valid within that scope. This still leaves us with the rogue thread problem. Now is the appropriate time to mention that the subscribe method from an Observable returns a subscription object. By holding onto a reference to it, we can call its unsubscribe method to indicate that we are no longer interested in the values produced by that Observable. Putting it all together, our interval example can be rewritten like the following: (def subscription (rx/subscribe (Observable/interval 100 TimeUnit/MILLISECONDS)                                prn-to-repl))   (Thread/sleep 1000)   (rx/unsubscribe subscription) We create a new interval Observable and immediately subscribe to it, just as we did before. This time, however, we assign the resulting subscription to a local var. Note that it now uses our helper function prn-to-repl, so we will start seeing values being printed to the REPL straight away. Next, we sleep the current—the REPL—thread for a second. This is enough time for the Observable to produce numbers from 0 to 9. That's roughly when the REPL thread wakes up and unsubscribes from that Observable, causing it to stop emitting values. Custom Observables Rx provides many more factory methods to create Observables (see https://github.com/ReactiveX/RxJava/wiki/Creating-Observables), but it is beyond the scope of this article to cover them all. Nevertheless, sometimes, none of the built-in factories is what you want. For such cases, Rx provides the create method. We can use it to create a custom Observable from scratch. As an example, we'll create our own version of the just Observable we used earlier in this article: (defn just-obs [v] (rx/Observable*    (fn [Observer]      (rx/on-next Observer v)      (rx/on-completed Observer))))   (rx/subscribe (just-obs 20) prn) First, we create a function, just-obs, which implements our Observable by calling the Observable* function. When creating an Observable this way, the function passed to Observable* will get called with an Observer as soon as one subscribes to us. When this happens, we are free to do whatever computation—and even I/O—we need in order to produce values and push them to the Observer. We should remember to call the Observer's onCompleted method whenever we're done producing values. The preceding snippet will print 20 to the REPL. While creating custom Observables is fairly straightforward, we should make sure we exhaust the built-in factory functions first, only then resorting to creating our own. Manipulating Observables Now that we know how to create Observables, we should look at what kinds of interesting things we can do with them. In this section, we will see what it means to treat Observables as sequences. We'll start with something simple. 
Let's print the sum of the first five positive even integers from an Observable of all integers: (rx/subscribe (->> (Observable/interval 1 TimeUnit/MICROSECONDS)                    (rx/filter even?)                    (rx/take 5)                    (rx/reduce +))                    prn-to-repl) This is starting to look awfully familiar to us. We create an interval that will emit all positive integers starting at zero every 1 microsecond. Then, we filter all even numbers in this Observable. Obviously, this is too big a list to handle, so we simply take the first five elements from it. Finally, we reduce the value using +. The result is 20. To drive home the point that programming with Observables really is just like operating on sequences, we will look at one more example where we will combine two different Observable sequences. One contains the names of musicians I'm a fan of and the other the names of their respective bands: (defn musicians [] (rx/seq->o ["James Hetfield" "Dave Mustaine" "Kerry King"]))   (defn bands     [] (rx/seq->o ["Metallica" "Megadeth" "Slayer"])) We would like to print to the REPL a string of the format Musician name – from: band name. An added requirement is that the band names should be printed in uppercase for impact. We'll start by creating another Observable that contains the uppercased band names: (defn uppercased-obs [] (rx/map (fn [s] (.toUpperCase s)) (bands))) While not strictly necessary, this makes a reusable piece of code that can be handy in several places of the program, thus avoiding duplication. Subscribers interested in the original band names can keep subscribing to the bands Observable. With the two Observables in hand, we can proceed to combine them: (-> (rx/map vector           (musicians)            (uppercased-obs))    (rx/subscribe (fn [[musician band]]                    (prn-to-repl (str musician " - from: " band))))) Once more, this example should feel familiar. The solution we were after was a way to zip the two Observables together. RxClojure provides zip behavior through map, much like Clojure's core map function does. We call it with three arguments: the two Observables to zip and a function that will be called with both elements, one from each Observable, and should return an appropriate representation. In this case, we simply turn them into a vector. Next, in our subscriber, we simply destructure the vector in order to access the musician and band names. We can finally print the final result to the REPL: "James Hetfield - from: METALLICA" "Dave Mustaine - from: MEGADETH" "Kerry King - from: SLAYER" Summary In this article, we took a deep dive into RxJava, a port form Microsoft's Reactive Extensions from .NET. We learned about its main abstraction, the Observable, and how it relates to iterables. We also learned how to create, manipulate, and combine Observables in several ways. The examples shown here were contrived to keep things simple. Resources for Article: Further resources on this subject: A/B Testing – Statistical Experiments for the Web [article] Working with Incanter Datasets [article] Using cross-validation [article]

article-image-drupal-8-configuration-management
Packt
18 Mar 2015
14 min read
Save for later

Drupal 8 Configuration Management

In this article, by Stefan Borchert and Anja Schirwinski, the authors of the book Drupal 8 Configuration Management, we will learn the inner workings of the Configuration Management system in Drupal 8. You will learn about config and schema files and read about the difference between simple configuration and configuration entities. (For more resources related to this topic, see here.)

The config directory

During installation, Drupal adds a directory within sites/default/files called config_HASH, where HASH is a long random string of letters and numbers, as shown in the following screenshot: This sequence is a random hash generated during the installation of your Drupal site. It is used to add some protection to your configuration files, in addition to the default restriction enforced by the .htaccess file within the subdirectories of the config directory, which prevents unauthorized users from seeing the content of the directories. As a result, it would be really hard for someone to guess the folder's name. Within the config directory, you will see two additional directories that are empty by default (leaving the .htaccess and README.txt files aside). One of the directories is called active. If you change the configuration system to use file storage instead of the database for the active Drupal site configuration, this directory will contain the active configuration. If you did not customize the storage mechanism of the active configuration (we will learn later how to do this), Drupal 8 uses the database to store the active configuration. The other directory is called staging. This directory is empty by default, but can host the configuration you want to be imported into your Drupal site from another installation. You will learn how to use this later on in this article.

A simple configuration example

First, we want to become familiar with configuration itself. If you look into the database of your Drupal installation and open up the config table, you will find the entire active configuration of your site, as shown in the following screenshot: Depending on your site's configuration, table names may be prefixed with a custom string, so you'll have to look for a table name that ends with config. Don't worry about the strange-looking text in the data column; this is the serialized content of the corresponding configuration. It expands to single configuration values—that is, system.site.name, which holds the value of the name of your site. Changing the site's name in the user interface on admin/config/system/site-information will immediately update the record in the database; thus, put simply, the records in the table are the current state of your site's configuration, as shown in the following screenshot: But where does the initial configuration of your site come from? Drupal itself and the modules you install must use some kind of default configuration that gets added to the active storage during installation.

Config and schema files – what are they and what are they used for?

In order to provide a default configuration during the installation process, Drupal (modules and profiles) comes with a bunch of files that hold the configuration needed to run your site. To make parsing of these files simple and to enhance the readability of these configuration files, the configuration is stored using the YAML format. YAML (http://yaml.org/) is a data-orientated serialization standard that aims for simplicity. With YAML, it is easy to map common data types such as lists, arrays, or scalar values.
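As a quick, hypothetical illustration of the format (this snippet is not a file shipped with Drupal; the keys are invented purely to show YAML's basic building blocks), scalars, lists, and nested mappings look like this:

# example.settings.yml – hypothetical file, for illustration only
site_name: 'My example site'   # scalar string
items_per_page: 10             # scalar integer
caching_enabled: true          # scalar Boolean
allowed_extensions:            # list (sequence) of strings
  - jpg
  - png
  - gif
notification:                  # nested mapping (key-value pairs)
  email: 'admin@example.com'
  send_daily: false

Every configuration file you will encounter in Drupal 8 is assembled from these few constructs, which is what keeps the files both machine-parsable and human-readable.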
Config files Directly beneath the root directory of each module and profile defining or overriding configuration (either core or contrib), you will find a directory named config. Within this directory, there may be two more directories (although both are optional): install and schema. Check the image module inside core/modules and take a look at its config directory, as shown in the following screenshot: The install directory shown in the following screenshot contains all configuration values that the specific module defines or overrides and that are stored in files with the extension .yml (one of the default extensions for files in the YAML format): During installation, the values stored in these files are copied to the active configuration of your site. In the case of default configuration storage, the values are added to the config table; in file-based configuration storage mechanisms, on the other hand, the files are copied to the appropriate directories. Looking at the filenames, you will see that they follow a simple convention: <module name>.<type of configuration>[.<machine name of configuration object>].yml (setting aside <module name>.settings.yml for now). The explanation is as follows: <module name>: This is the name of the module that defines the settings included in the file. For instance, the image.style.large.yml file contains settings defined by the image module. <type of configuration>: This can be seen as a type of group for configuration objects. The image module, for example, defines several image styles. These styles are a set of different configuration objects, so the group is defined as style. Hence, all configuration files that contain image styles defined by the image module itself are named image.style.<something>.yml. The same structure applies to blocks (<block.block.*.yml>), filter formats (<filter.format.*.yml>), menus (<system.menu.*.yml>), content types (<node.type.*.yml>), and so on. <machine name of configuration object>: The last part of the filename is the unique machine-readable name of the configuration object itself. In our examples from the image module, you see three different items: large, medium, and thumbnail. These are exactly the three image styles you will find on admin/config/media/image-styles after installing a fresh copy of Drupal 8. The image styles are shown in the following screenshot: Schema files The primary reason schema files were introduced into Drupal 8 is multilingual support. A tool was needed to identify all translatable strings within the shipped configuration. The secondary reason is to provide actual translation forms for configuration based on your data and to expose translatable configuration pieces to external tools. Each module can have as many configuration the .yml files as needed. All of these are explained in one or more schema files that are shipped with the module. As a simple example of how schema files work, let's look at the system's maintenance settings in the system.maintenance.yml file at core/modules/system/config/install. The file's contents are as follows: message: '@site is currently under maintenance. We should be back shortly. Thank you for your patience.' langcode: en The system module's schema files live in core/modules/system/config/schema. These define the basic types but, for our example, the most important aspect is that they define the schema for the maintenance settings. 
The corresponding schema section from the system.schema.yml file is as follows: system.maintenance: type: mapping label: 'Maintenance mode' mapping:    message:      type: text      label: 'Message to display when in maintenance mode'    langcode:      type: string      label: 'Default language' The first line corresponds to the filename for the .yml file, and the nested lines underneath the first line describe the file's contents. Mapping is a basic type for key-value pairs (always the top-level type in .yml). The system.maintenance.yml file is labeled as label: 'Maintenance mode'. Then, the actual elements in the mapping are listed under the mapping key. As shown in the code, the file has two items, so the message and langcode keys are described. These are a text and a string value, respectively. Both values are given a label as well in order to identify them in configuration forms. Learning the difference between active and staging By now, you know that Drupal works with the two directories active and staging. But what is the intention behind those directories? And how do we use them? The configuration used by your site is called the active configuration since it's the configuration that is affecting the site's behavior right now. The current (active) configuration is stored in the database and direct changes to your site's configuration go into the specific tables. The reason Drupal 8 stores the active configuration in the database is that it enhances performance and security. Source: https://www.drupal.org/node/2241059. However, sometimes you might not want to store the active configuration in the database and might need to use a different storage mechanism. For example, using the filesystem as configuration storage will enable you to track changes in the site's configuration using a versioning system such as Git or SVN. Changing the active configuration storage If you do want to switch your active configuration storage to files, here's how: Note that changing the configuration storage is only possible before installing Drupal. After installing it, there is no way to switch to another configuration storage! To use a different configuration storage mechanism, you have to make some modifications to your settings.php file. First, you'll need to find the section named Active configuration settings. Now you will have to uncomment the line that starts with $settings['bootstrap_config_storage'] to enable file-based configuration storage. Additionally, you need to copy the existing default.services.yml (next to your settings.php file) to a file named services.yml and enable the new configuration storage: services: # Override configuration storage. config.storage:    class: DrupalCoreConfigCachedStorage    arguments: ['@config.storage.active', '@cache.config'] config.storage.active:    # Use file storage for active configuration.    alias: config.storage.file This tells Drupal to override the default service used for configuration storage and use config.storage.file as the active configuration storage mechanism instead of the default database storage. After installing the site with these settings, we will take another look at the config directory in sites/default/files (assuming you didn't change to the location of the active and staging directory): As you can see, the active directory contains the entire site's configuration. The files in this directory get copied here during the website's installation process. 
Whenever you make a change to your website, the change is reflected in these files. Exporting a configuration always exports a snapshot of the active configuration, regardless of the storage method. The staging directory contains the changes you want to add to your site. Drupal compares the staging directory to the active directory and checks for changes between them. When you upload your compressed export file, it actually gets placed inside the staging directory. This means you can save yourself the trouble of using the interface to export and import the compressed file if you're comfortable enough with copy-and-pasting files to another directory. Just make sure you copy all of the files to the staging directory even if only one of the files was changed. Any missing files are interpreted as deleted configuration, and will mess up your site. In order to get the contents of staging into active, we simply have to use the synchronize option at admin/config/development/configuration again. This page will show us what was changed and allows us to import the changes. On importing, your active configuration will get overridden with the configuration in your staging directory. Note that the files inside the staging directory will not be removed after the synchronization is finished. The next time you want to copy-and-paste from your active directory, make sure you empty staging first. Note that you cannot override files directly in the active directory. The changes have to be made inside staging and then synchronized. Changing the storage location of the active and staging directories In case you do not want Drupal to store your configuration in sites/default/files, you can set the path according to your wishes. Actually, this is recommended for security reasons, as these directories should never be accessible over the Web or by unauthorized users on your server. Additionally, it makes your life easier if you work with version control. By default, the whole files directory is usually ignored in version-controlled environments because Drupal writes to it, and having the active and staging directory located within sites/default/files would result in them being ignored too. So how do we change the location of the configuration directories? Before installing Drupal, you will need to create and modify the settings.php file that Drupal uses to load its basic configuration data from (that is, the database connection settings). If you haven't done it yet, copy the default.settings.php file and rename the copy to settings.php. Afterwards, open the new file with the editor of your choice and search for the following line: $config_directories = array(); Change the preceding line to the following (or simply insert your addition at the bottom of the file). $config_directories = array( CONFIG_ACTIVE_DIRECTORY => './../config/active', // folder outside the webroot CONFIG_STAGING_DIRECTORY => './../config/staging', // folder outside the webroot ); The directory names can be chosen freely, but it is recommended that you at least use similar names to the default ones so that you or other developers don't get confused when looking at them later. Remember to put these directories outside your webroot, or at least protect the directories using an .htaccess file (if using Apache as the server). Directly after adding the paths to your settings.php file, make sure you remove write permissions from the file as it would be a security risk if someone could change it. 
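For example, on a Unix-like server you could drop the write permission with a single command (the path shown assumes the default location of the file; adjust it to your own setup):

$ chmod 444 sites/default/settings.php

This leaves the file readable by Drupal but prevents accidental or malicious changes until you deliberately restore write access.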
Drupal will now use your custom location for its configuration files on installation. You can also change the location of the configuration directories after installing Drupal. Open up your settings.php file and find the two lines near the end of the file that start with $config_directories. Change their paths to something like this:

$config_directories['active'] = './../config/active';
$config_directories['staging'] = './../config/staging';

These paths place the directories above your Drupal root. Now that you know about active and staging, let's learn more about the different types of configuration you can create on your own.

Simple configuration versus configuration entities

As soon as you want to start storing your own configuration, you need to understand the differences between simple configuration and configuration entities. Here's a short definition of the two types of configuration used in Drupal.

Simple configuration

This configuration type is easier to implement and therefore ideal for basic configuration settings that result in Boolean values, integers, or simple strings of text being stored, as well as global variables that are used throughout your site. A good example would be the value of an on/off toggle for a specific feature in your module, or our previously used example of the site name configured by the system module:

name: 'Configuration Management in Drupal 8'

Simple configuration also includes any settings that your module requires in order to operate correctly. For example, JavaScript aggregation has to be either on or off. If it doesn't exist, the system module won't be able to determine the appropriate course of action.

Configuration entities

Configuration entities are much more complicated to implement but far more flexible. They are used to store information about objects that users can create and destroy without breaking the code. A good example of configuration entities is an image style provided by the image module. Take a look at the image.style.thumbnail.yml file:

uuid: fe1fba86-862c-49c2-bf00-c5e1f78a0f6c
langcode: en
status: true
dependencies: { }
name: thumbnail
label: 'Thumbnail (100×100)'
effects:
  1cfec298-8620-4749-b100-ccb6c4500779:
    uuid: 1cfec298-8620-4749-b100-ccb6c4500779
    id: image_scale
    weight: 0
    data:
      width: 100
      height: 100
      upscale: false
third_party_settings: { }

This defines a specific style for images, so the system is able to create derivatives of images that a user uploads to the site. Configuration entities also come with a complete set of create, read, update, and delete (CRUD) hooks that are fired just like any other entity in Drupal, making them an ideal candidate for configuration that might need to be manipulated or responded to by other modules. As an example, the Views module uses configuration entities that allow for a scenario where, at runtime, hooks are fired that allow any other module to provide configuration (in this case, custom views) to the Views module.

Summary

In this article, you learned about how to store configuration and briefly got to know the two different types of configuration.

Resources for Article: Further resources on this subject: Tabula Rasa: Nurturing your Site for Tablets [article] Components - Reusing Rules, Conditions, and Actions [article] Introduction to Drupal Web Services [article]

article-image-building-mobile-games-craftyjs-and-phonegap-part-1
Robi Sen
18 Mar 2015
7 min read
Save for later

Building Mobile Games with Crafty.js and PhoneGap: Part 1

In this post, we will build a mobile game using HTML5, CSS, and JavaScript. To make things easier, we are going to make use of the Crafty.js JavaScript game engine, which is both free and open source. In this first part of a two-part series, we will look at making a simple turn-based RPG-like game based on Pascal Rettig's Crafty Workshop presentation. You will learn how to add sprites to a game, control them, and work with mouse/touch events.

Setting up

To get started, first create a new PhoneGap project wherever you want to do your development. For this article, let's call the project simplerpg.

Figure 1: Creating the simplerpg project in PhoneGap.

Navigate to the www directory in your PhoneGap project and then add a new directory called lib. This is where we are going to put several JavaScript libraries we will use for the project. Now, download the JQuery library to the lib directory. For this project, we will use JQuery 2.1. Once you have downloaded JQuery, you need to download the Crafty.js library and add it to your lib directory as well. For later parts of this series, you will want to be using a web server such as Apache or IIS to make development easier. For the first part of the post, you can just drag-and-drop the HTML files into your browser to test, but later, you will need to use a web server to avoid Same Origin Policy errors. This article assumes you are using Chrome to develop in. While IE or FireFox will work just fine, Chrome and its debugging environment are used in this article. Finally, the source code for this article can be found here on GitHub. In the lessons directory, you will see a series of index files with a listing number matching each code listing in this article.

Crafty

PhoneGap allows you to take almost any HTML5 application and turn it into a mobile app with little to no extra work. Perhaps the most complex of all mobile apps are video games. Video games often have complex routines, graphics, and controls. As such, developing a video game from the ground up is very difficult. So much so that even major video game companies rarely do so. What they usually do, and what we will do here, is make use of libraries and game engines that take care of many of the complex tasks of managing objects, animation, collision detection, and more. For our project, we will be making use of the open source JavaScript game engine Crafty. Before you get started with the code, it's recommended to quickly review the Crafty website and the Crafty API.

Bootstrapping Crafty and creating an entity

Crafty is very simple to start working with. All you need to do is load the Crafty.js library and initialize Crafty. Let's try that. Create an index.html file in your www root directory, if one does not exist; if you already have one, go ahead and overwrite it. Then, cut and paste listing 1 into it.

Listing 1: Creating an entity

<!DOCTYPE html>
<html>
<head></head>
<body>
<div id="game"></div>
<script type="text/javascript" src="lib/crafty.js"></script>
<script>
// Height and Width
var WIDTH = 500, HEIGHT = 320;
// Initialize Crafty
Crafty.init(WIDTH, HEIGHT);
var player = Crafty.e();
player.addComponent("2D, Canvas, Color");
player.color("red").attr({w:50, h:50});
</script>
</body>
</html>

As you can see in listing 1, we are creating an HTML5 document and loading the Crafty.js library. Then, we initialize Crafty and pass it a width and height. Next, we create a Crafty entity called player.
Crafty, like many other game engines, follows a design pattern called Entity-Component-System or (ECS). Entities are objects that you can attach things like behaviors and data to. For ourplayerentity, we are going to add several components including 2D, Canvas, and Color. Components can be data, metadata, or behaviors. Finally, we will add a specific color and position to our entity. If you now save your file and drag-and-drop it into the browser, you should see something like figure 2. Figure 2: A simple entity in Crafty.  Moving a box Now,let’s do something a bit more complex in Crafty. Let’s move the red box based on where we move our mouse, or if you have a touch-enabled device, where we touch the screen. To do this, open your index.html file and edit it so it looks like listing 2. Listing 2: Moving the box <!DOCTYPE html> <html> <head></head> <body> <div id="game"></div> <script type="text/javascript" src="lib/crafty.js"></script> <script> var WIDTH = 500, HEIGHT = 320; Crafty.init(WIDTH, HEIGHT); // Background Crafty.background("black"); //add mousetracking so block follows your mouse Crafty.e("mouseTracking, 2D, Mouse, Touch, Canvas") .attr({ w:500, h:320, x:0, y:0 }) .bind("MouseMove", function(e) { console.log("MouseDown:"+ Crafty.mousePos.x +", "+ Crafty.mousePos.y); // when you touch on the canvas redraw the player player.x = Crafty.mousePos.x; player.y = Crafty.mousePos.y; }); // Create the player entity var player = Crafty.e(); player.addComponent("2D, DOM"); //set where your player starts player.attr({ x : 10, y : 10, w : 50, h : 50 }); player.addComponent("Color").color("red"); </script> </body> </html> As you can see, there is a lot more going on in this listing. The first difference is that we are using Crafty.background to set the background to black, but we are also creating a new entity called mouseTracking that is the same size as the whole canvas. We assign several components to the entity so that it can inherit their methods and properties. We then use .bind to bind the mouse’s movements to our entity. Then, we tell Crafty to reposition our player entity to wherever the mouse’s x and y position is. So, if you save this code and run it, you will find that the red box will go wherever your mouse moves or wherever you touch or drag as in figure 3.    Figure 3: Controlling the movement of a box in Crafty.  Summary In this post, you learned about working with Crafty.js. Specifically, you learned how to work with the Crafty API and create entities. In Part 2, you will work with sprites, create components, and control entities via mouse/touch.  About the author Robi Sen, CSO at Department 13, is an experienced inventor, serial entrepreneur, and futurist whose dynamic twenty-plus-year career in technology, engineering, and research has led him to work on cutting edge projects for DARPA, TSWG, SOCOM, RRTO, NASA, DOE, and the DOD. Robi also has extensive experience in the commercial space, including the co-creation of several successful start-up companies. He has worked with companies such as UnderArmour, Sony, CISCO, IBM, and many others to help build new products and services. Robi specializes in bringing his unique vision and thought process to difficult and complex problems, allowing companies and organizations to find innovative solutions that they can rapidly operationalize or go to market with.

article-image-entity-framework-code-first-accessing-database-views-and-stored-procedures
Packt
18 Mar 2015
15 min read
Save for later

Entity Framework Code-First: Accessing Database Views and Stored Procedures

Packt
18 Mar 2015
15 min read
In this article by Sergey Barskiy, author of the book Code-First Development using Entity Framework, you will learn how to integrate Entity Framework with additional database objects, specifically views and stored procedures. We will see how to take advantage of existing stored procedures and functions to retrieve and change the data. You will learn how to persist changed entities from our context using stored procedures. We will gain an understanding of the advantages of asynchronous processing and see how Entity Framework supports this concept via its built-in API. Finally, you will learn why concurrency is important for a multi-user application and what options are available in Entity Framework to implement optimistic concurrency. In this article, we will cover how to: Get data from a view Get data from a stored procedure or table-valued function Map create, update, and delete operations on a table to a set of stored procedures Use the asynchronous API to get and save the data Implement multi-user concurrency handling Working with views Views in an RDBMS fulfill an important role. They allow developers to combine data from multiple tables into a structure that looks like a table, but do not provide persistence. Thus, we have an abstraction on top of raw table data. One can use this approach to provide different security rights, for example. We can also simplify queries we have to write, especially if we access the data defined by views quite frequently in multiple places in our code. Entity Framework Code-First does not fully support views as of now. As a result, we have to use a workaround. One approach would be to write code as if a view was really a table, that is, let Entity Framework define this table, then drop the table, and create a replacement view. We will still end up with strongly typed data with full query support. Let's start with the same database structure we used before, including person and person type. Our view will combine a few columns from the Person table and Person type name, as shown in the following code snippet: public class PersonViewInfo { public int PersonId { get; set; } public string TypeName { get; set; } public string FirstName { get; set; } public string LastName { get; set; } } Here is the same class in VB.NET: Public Class PersonViewInfo Public Property PersonId() As Integer Public Property TypeName() As String Public Property FirstName() As String Public Property LastName() As String End Class Now, we need to create a configuration class for two reasons. We need to specify a primary key column because we do not follow the naming convention that Entity Framework assumes for primary keys. Then, we need to specify the table name, which will be our view name, as shown in the following code: public class PersonViewInfoMap : EntityTypeConfiguration<PersonViewInfo> { public PersonViewInfoMap() {    HasKey(p => p.PersonId);    ToTable("PersonView"); } } Here is the same class in VB.NET: Public Class PersonViewInfoMap Inherits EntityTypeConfiguration(Of PersonViewInfo) Public Sub New()    HasKey(Function(p) p.PersonId)    ToTable("PersonView") End Sub End Class Finally, we need to add a property to our context that exposes this data, as shown here: public DbSet<PersonViewInfo> PersonView { get; set; } The same property in VB.NET looks quite familiar to us, as shown in the following code: Property PersonView() As DbSet(Of PersonViewInfo) Now, we need to work with our initializer to drop the table and create a view in its place. 
We are using one of the initializers we created before. When we cover migrations, we will see that the same approach works there as well, with virtually identical code. Here is the code we added to the Seed method of our initializer, as shown in the following code: public class Initializer : DropCreateDatabaseIfModelChanges<Context> { protected override void Seed(Context context) {    context.Database.ExecuteSqlCommand("DROP TABLE PersonView");    context.Database.ExecuteSqlCommand(      @"CREATE VIEW [dbo].[PersonView]      AS      SELECT        dbo.People.PersonId,        dbo.People.FirstName,        dbo.People.LastName,        dbo.PersonTypes.TypeName      FROM        dbo.People      INNER JOIN dbo.PersonTypes        ON dbo.People.PersonTypeId = dbo.PersonTypes.PersonTypeId      "); } } In the preceding code, we first drop the table using the ExecuteSqlCommand method of the Database object. This method is useful because it allows the developer to execute arbitrary SQL code against the backend. We call this method twice, the first time to drop the tables and the second time to create our view. The same initializer code in VB.NET looks as follows: Public Class Initializer Inherits DropCreateDatabaseIfModelChanges(Of Context) Protected Overrides Sub Seed(ByVal context As Context)    context.Database.ExecuteSqlCommand("DROP TABLE PersonView")    context.Database.ExecuteSqlCommand( <![CDATA[      CREATE VIEW [dbo].[PersonView]      AS      SELECT        dbo.People.PersonId,        dbo.People.FirstName,        dbo.People.LastName,        dbo.PersonTypes.TypeName      FROM        dbo.People      INNER JOIN dbo.PersonTypes        ON dbo.People.PersonTypeId = dbo.PersonTypes.PersonTypeId]]>.Value()) End Sub End Class Since VB.NET does not support multiline strings such as C#, we are using XML literals instead, getting a value of a single node. This just makes SQL code more readable. We are now ready to query our data. This is shown in the following code snippet: using (var context = new Context()) { var people = context.PersonView    .Where(p => p.PersonId > 0)    .OrderBy(p => p.LastName)    .ToList(); foreach (var personViewInfo in people) {    Console.WriteLine(personViewInfo.LastName); } As we can see, there is literally no difference in accessing our view or any other table. Here is the same code in VB.NET: Using context = New Context() Dim people = context.PersonView _      .Where(Function(p) p.PersonId > 0) _      .OrderBy(Function(p) p.LastName) _      .ToList() For Each personViewInfo In people    Console.WriteLine(personViewInfo.LastName) Next End Using Although the view looks like a table, if we try to change and update an entity defined by this view, we will get an exception. If we do not want to play around with tables in such a way, we can still use the initializer to define our view, but query the data using a different method of the Database object, SqlQuery. This method has the same parameters as ExecuteSqlCommand, but is expected to return a result set, in our case, a collection of PersonViewInfo objects, as shown in the following code: using (var context = new Context()) { var sql = @"SELECT * FROM PERSONVIEW WHERE PERSONID > {0} "; var peopleViaCommand = context.Database.SqlQuery<PersonViewInfo>(    sql,    0); foreach (var personViewInfo in peopleViaCommand) {    Console.WriteLine(personViewInfo.LastName); } } The SqlQuery method takes generic type parameters, which define what data will be materialized when a raw SQL command is executed. 
The text of the command itself is simply parameterized SQL. We need to use parameters to ensure that our dynamic code is not subject to SQL injection. SQL injection is a process in which a malicious user can execute arbitrary SQL code by providing specific input values. Entity Framework is not subject to such attacks on its own. Here is the same code in VB.NET: Using context = New Context() Dim sql = "SELECT * FROM PERSONVIEW WHERE PERSONID > {0} " Dim peopleViaCommand = context.Database.SqlQuery(Of PersonViewInfo)(sql, 0)    For Each personViewInfo In peopleViaCommand    Console.WriteLine(personViewInfo.LastName) Next End Using We not only saw how to use views in Entity Framework, but saw two extremely useful methods of the Database object, which allows us to execute arbitrary SQL statements and optionally materialize the results of such queries. The generic type parameter does not have to be a class. You can use the native .NET type, such as a string or an integer. It is not always necessary to use views. Entity Framework allows us to easily combine multiple tables in a single query. Working with stored procedures The process of working with stored procedures in Entity Framework is similar to the process of working with views. We will use the same two methods we just saw on the Database object—SqlQuery and ExecuteSqlCommand. In order to read a number of rows from a stored procedure, we simply need a class that we will use to materialize all the rows of retrieved data into a collection of instances of this class. For example, to read the data from the stored procedure, consider this query: CREATE PROCEDURE [dbo].[SelectCompanies] @dateAdded as DateTime AS BEGIN SELECT CompanyId, CompanyName FROM Companies WHERE DateAdded > @dateAdded END We just need a class that matches the results of our stored procedure, as shown in the following code: public class CompanyInfo { public int CompanyId { get; set; } public string CompanyName { get; set; } } The same class looks as follows in VB.NET: Public Class CompanyInfo Property CompanyId() As Integer Property CompanyName() As String End Class We are now able to read the data using the SqlQuery method, as shown in the following code: sql = @"SelectCompanies {0}"; var companies = context.Database.SqlQuery<CompanyInfo>( sql, DateTime.Today.AddYears(-10)); foreach (var companyInfo in companies) { We specified which class we used to read the results of the query call. We also provided a formatted placeholder when we created our SQL statement for a parameter that the stored procedure takes. We provided a value for that parameter when we called SqlQuery. If one has to provide multiple parameters, one just needs to provide an array of values to SqlQuery and provide formatted placeholders, separated by commas as part of our SQL statement. We could have used a table values function instead of a stored procedure as well. Here is how the code looks in VB.NET: sql = "SelectCompanies {0}" Dim companies = context.Database.SqlQuery(Of CompanyInfo)( sql, DateTime.Today.AddYears(-10)) For Each companyInfo As CompanyInfo In companies Another use case is when our stored procedure does not return any values, but instead simply issues a command against one or more tables in the database. It does not matter as much what a procedure does, just that it does not need to return a value. 
For example, here is a stored procedure that updates multiple rows in a table in our database: CREATE PROCEDURE dbo.UpdateCompanies @dateAdded as DateTime, @activeFlag as Bit AS BEGIN UPDATE Companies Set DateAdded = @dateAdded, IsActive = @activeFlag END In order to call this procedure, we will use ExecuteSqlCommand. This method returns a single value—the number of rows affected by the stored procedure or any other SQL statement. You do not need to capture this value if you are not interested in it, as shown in this code snippet: var sql = @"UpdateCompanies {0}, {1}"; var rowsAffected = context.Database.ExecuteSqlCommand(    sql, DateTime.Now, true); We see that we needed to provide two parameters. We needed to provide them in the exact same order the stored procedure expected them. They are passed into ExecuteSqlCommand as the parameter array, except we did not need to create an array explicitly. Here is how the code looks in VB.NET: Dim sql = "UpdateCompanies {0}, {1}" Dim rowsAffected = context.Database.ExecuteSqlCommand( _    sql, DateTime.Now, True) Entity Framework eliminates the need for stored procedures to a large extent. However, there may still be reasons to use them. Some of the reasons include security standards, legacy database, or efficiency. For example, you may need to update thousands of rows in a single operation and retrieve them through Entity Framework; updating each row at a time and then saving those instances is not efficient. You could also update data inside any stored procedure, even if you call it with the SqlQuery method. Developers can also execute any arbitrary SQL statements, following the exact same technique as stored procedures. Just provide your SQL statement, instead of the stored procedure name to the SqlQuery or ExecuteSqlCommand method. Create, update, and delete entities with stored procedures So far, we have always used the built-in functionality that comes with Entity Framework that generates SQL statements to insert, update, or delete the entities. There are use cases when we would want to use stored procedures to achieve the same result. Developers may have requirements to use stored procedures for security reasons. You may be dealing with an existing database that has these procedures already built in. Entity Framework Code-First now has full support for such updates. We can configure the support for stored procedures using the familiar EntityTypeConfiguration class. We can do so simply by calling the MapToStoredProcedures method. Entity Framework will create stored procedures for us automatically if we let it manage database structures. We can override a stored procedure name or parameter names, if we want to, using appropriate overloads of the MapToStoredProcedures method. Let's use the Company table in our example: public class CompanyMap : EntityTypeConfiguration<Company> { public CompanyMap() {    MapToStoredProcedures(); } } If we just run the code to create or update the database, we will see new procedures created for us, named Company_Insert for an insert operation and similar names for other operations. 
Here is how the same class looks in VB.NET: Public Class CompanyMap Inherits EntityTypeConfiguration(Of Company) Public Sub New()    MapToStoredProcedures() End Sub End Class Here is how we can customize our procedure names if necessary: public class CompanyMap : EntityTypeConfiguration<Company> { public CompanyMap() {    MapToStoredProcedures(config =>      {        config.Delete(          procConfig =>          {            procConfig.HasName("CompanyDelete");            procConfig.Parameter(company => company.CompanyId, "companyId");          });        config.Insert(procConfig => procConfig.HasName("CompanyInsert"));        config.Update(procConfig => procConfig.HasName("CompanyUpdate"));      }); } } In this code, we performed the following: Changed the stored procedure name that deletes a company to CompanyDelete Changed the parameter name that this procedure accepts to companyId and specified that the value comes from the CompanyId property Changed the stored procedure name that performs insert operations on CompanyInsert Changed the stored procedure name that performs updates to CompanyUpdate Here is how the code looks in VB.NET: Public Class CompanyMap Inherits EntityTypeConfiguration(Of Company) Public Sub New()    MapToStoredProcedures( _      Sub(config)        config.Delete(          Sub(procConfig)            procConfig.HasName("CompanyDelete")            procConfig.Parameter(Function(company) company.CompanyId, "companyId")          End Sub        )        config.Insert(Function(procConfig) procConfig.HasName("CompanyInsert"))        config.Update(Function(procConfig) procConfig.HasName("CompanyUpdate"))      End Sub    ) End Sub End Class Of course, if you do not need to customize the names, your code will be much simpler. Summary Entity Framework provides a lot of value to the developers, allowing them to use C# or VB.NET code to manipulate database data. However, sometimes we have to drop a level lower, accessing data a bit more directly through views, dynamic SQL statements and/or stored procedures. We can use the ExecuteSqlCommand method to execute any arbitrary SQL code, including raw SQL or stored procedure. We can use the SqlQuery method to retrieve data from a view, stored procedure, or any other SQL statement, and Entity Framework takes care of materializing the data for us, based on the result type we provide. It is important to follow best practices when providing parameters to those two methods to avoid SQL injection vulnerability. Entity Framework also supports environments where there are requirements to perform all updates to entities via stored procedures. The framework will even write them for us, and we would only need to write one line of code per entity for this type of support, assuming we are happy with naming conventions and coding standards for such procedures. Resources for Article: Further resources on this subject: Developing with Entity Metadata Wrappers [article] Entity Framework DB First – Inheritance Relationships between Entities [article] The .NET Framework Primer [article]

Signing an application in Android using Maven

Packt
18 Mar 2015
10 min read
In this article written by Patroklos Papapetrou and Jonathan LALOU, authors of the book Android Application Development with Maven, we'll learn different modes of digital signing and also using Maven to digitally sign the applications. The topics that we will explore in this article are: (For more resources related to this topic, see here.) Signing an application Android requires that all packages, in order to be valid for installation in devices, need to be digitally signed with a certificate. This certificate is used by the Android ecosystem to validate the author of the application. Thankfully, the certificate is not required to be issued by a certificate authority. It would be a total nightmare for every Android developer and it would increase the cost of developing applications. However, if you want to sign the certificate by a trusted authority like the majority of the certificates used in web servers, you are free to do it. Android supports two modes of signing: debug and release. Debug mode is used by default during the development of the application, and the release mode when we are ready to release and publish it. In debug mode, when building and packaging an application the Android SDK automatically generates a certificate and signs the package. So don't worry; even though we haven't told Maven to do anything about signing, Android knows what to do and behind the scenes signs the package with the autogenerated key. When it comes to distributing an application, debug mode is not enough; so, we need to prepare our own self-signed certificate and instruct Maven to use it instead of the default one. Before we dive to Maven configuration, let us quickly remind you how to issue your own certificate. Open a command window, and type the following command: keytool -genkey -v -keystore my-android-release-key.keystore -alias my-android-key -keyalg RSA -keysize 2048 -validity 10000 If the keytool command line utility is not in your path, then it's a good idea to add it. It's located under the %JAVA_HOME%/bin directory. Alternatively, you can execute the command inside this directory. Let us explain some parameters of this command. We use the keytool command line utility to create a new keystore file under the name my-android-release-key.keystore inside the current directory. The -alias parameter is used to define an alias name for this key, and it will be used later on in Maven configuration. We also specify the algorithm, RSA, and the key size, 2048; finally we set the validity period in days. The generated key will be valid for 10,000 days—long enough for many many new versions of our application! After running the command, you will be prompted to answer a series of questions. First, type twice a password for the keystore file. It's a good idea to note it down because we will use it again in our Maven configuration. Type the word :secret in both prompts. Then, we need to provide some identification data, like name, surname, organization details, and location. Finally, we need to set a password for the key. If we want to keep the same password with the keystore file, we can just hit RETURN. If everything goes well, we will see the final message that informs us that the key is being stored in the keystore file with the name we just defined. After this, our key is ready to be used to sign our Android application. 
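If you want to double-check the keystore before wiring it into Maven, the same keytool utility can list its contents. This is an optional sanity check; the keystore and alias names below are simply the ones we chose earlier:

keytool -list -v -keystore my-android-release-key.keystore -alias my-android-key

After you enter the keystore password, keytool prints the certificate owner, validity period, and fingerprints, confirming that the key was stored correctly.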
The key used in debug mode can be found in this file: ~/.android.debug.keystore and contains the following information: Keystore name: "debug.keystore" Keystore password: "android" Key alias: "androiddebugkey" Key password: "android" CN: "CN=Android Debug,O=Android,C=US" Now, it's time to let Maven use the key we just generated. Before we add the necessary configuration to our pom.xml file, we need to add a Maven profile to the global Maven settings. The profiles defined in the user settings.xml file can be used by all Maven projects in the same machine. This file is usually located under this folder: %M2_HOME%/conf/settings.xml. One fundamental advantage of defining global profiles in user's Maven settings is that this configuration is not shared in the pom.xml file to all developers that work on the application. The settings.xml file should never be kept under the Source Control Management (SCM) tool. Users can safely enter personal or critical information like passwords and keys, which is exactly the case of our example. Now, edit the settings.xml file and add the following lines inside the <profiles> attribute: <profile> <id>release</id> <properties>    <sign.keystore>/path/to/my/keystore/my-android-release-    key.keystore</sign.keystore>    <sign.alias>my-android-key</sign.alias>    <sign.storepass>secret</sign.storepass>    <sign.keypass>secret</sign.keypass> </properties> </profile> Keep in mind that the keystore name, the alias name, the keystore password, and the key password should be the ones we used when we created the keystore file. Clearly, storing passwords in plain text, even in a file that is normally protected from other users, is not a very good practice. A quick way to make it slightly less easy to read the password is to use XML entities to write the value. Some sites on the internet like this one http://coderstoolbox.net/string/#!encoding=xml&action=encode&charset=none provide such encryptions. It will be resolved as plain text when the file is loaded; so Maven won't even notice it. In this case, this would become: <sign.storepass>&#115;&#101;&#99;&#114;&#101;&#116;</sign.storepass> We have prepared our global profile and the corresponding properties, and so we can now edit the pom.xml file of the parent project and do the proper configuration. Adding common configuration in the parent file for all Maven submodules is a good practice in our case because at some point, we would like to release both free and paid versions, and it's preferable to avoid duplicating the same configuration in two files. We want to create a new profile and add all the necessary settings there, because the release process is not something that runs every day during the development phase. It should run only at a final stage, when we are ready to deploy our application. Our first priority is to tell Maven to disable debug mode. Then, we need to specify a new Maven plugin name: maven-jarsigner-plugin, which is responsible for driving the verification and signing process for custom/private certificates. 
You can find the complete release profile as follows: <profiles> <profile>    <id>release</id>    <build>      <plugins>        <plugin>          <groupId>com.jayway.maven.plugins.android.generation2          </groupId>          <artifactId>android-maven-plugin</artifactId>          <extensions>true</extensions>          <configuration>            <sdk>              <platform>19</platform>            </sdk>            <sign>              <debug>false</debug>            </sign>          </configuration>        </plugin>        <plugin>          <groupId>org.apache.maven.plugins</groupId>          <artifactId>maven-jarsigner-plugin</artifactId>          <executions>            <execution>              <id>signing</id>              <phase>package</phase>              <goals>                <goal>sign</goal>                <goal>verify</goal>              </goals>              <inherited>true</inherited>              <configuration>                <removeExistingSignatures>true                </removeExistingSignatures>                <archiveDirectory />                <includes>                  <include>${project.build.directory}/                  ${project.artifactId}.apk</include>                </includes>                <keystore>${sign.keystore}</keystore>                <alias>${sign.alias}</alias>                <storepass>${sign.storepass}</storepass>                <keypass>${sign.keypass}</keypass>                <verbose>true</verbose>              </configuration>            </execution>          </executions>        </plugin>      </plugins>    </build> </profile> </profiles> We instruct the JAR signer plugin to be triggered during the package phase and run the goals of verification and signing. Furthermore, we tell the plugin to remove any existing signatures from the package and use the variable values we have defined in our global profile, $sign.alias, $sign.keystore, $sign.storepass and $sign.keypass. The "verbose" setting is used here to verify that the private key is used instead of the debug key. Before we run our new profile, for comparison purposes, let's package our application without using the signing capability. Open a terminal window, and type the following Maven command: mvn clean package When the command finishes, navigate to the paid version target directory, /PaidVersion/target, and take a look at its contents. You will notice that there are two packaging files: a PaidVersion.jar (size 14KB) and PaidVersion.apk (size 46KB). Since we haven't discussed yet about releasing an application, we can just run the following command in a terminal window and see how the private key is used for signing the package: mvn clean package -Prelease You must have probably noticed that we use only one profile name, and that is the beauty of Maven. Profiles with the same ID are merged together, and so it's easier to understand and maintain the build scripts. If you want to double-check that the package is signed with your private certificate, you can monitor the Maven output, and at some point you will see something similar to the following image: This output verifies that the classes have been properly signed through the execution of the Maven JAR signer plugin. To better understand how signing and optimization affects the packages generation, we can navigate again to the /PaidVersion/target directory and take a look at the files created. You will be surprised to see that the same packages exist again but they have different sizes. 
The PaidVersion.jar file has a size of 18KB, which is greater than the file generated without signing. However, the PaidVersion.apk is smaller (size 44KB) than the first version. These differences happen because the .jar file is signed with the new certificate, so its size gets slightly bigger. But what about the .apk file? Shouldn't it also be bigger, since every file in it is signed with the certificate? The answer can easily be found if we open both .apk files and compare them. They are compressed files, so any well-known tool that opens compressed files can do this. If you take a closer look at the contents of the .apk files, you will notice that the contents of the .apk file that was generated using the private certificate are slightly larger, except for the resources.arsc file. This file, in the case of custom signing, is compressed, whereas in debug signing mode it is in raw format. This explains why the signed version of the .apk file is smaller than the original one. There's also one last thing that verifies the correct completion of signing. Keep the compressed .apk files open and navigate to the META-INF directory. This directory contains a couple of different files. The package signed with our personal certificate contains key files named after the alias we used when we created the certificate, while the package signed in debug mode contains the default certificate used by Android. Summary We know that Android developers struggle when it comes to properly packaging and releasing an application to the public. We have analyzed in detail the Maven configuration steps necessary for correct and complete packaging. After reading this article, you should have a basic knowledge of digitally signing Android packages with and without the help of Maven. Resources for Article: Further resources on this subject: Best Practices [article] Installing Apache Karaf [article] Apache Maven and m2eclipse [article]

Azure Storage

Packt
17 Mar 2015
7 min read
In this article by John Chapman and Aman Dhally, authors of the book, Automating Microsoft Azure with PowerShell, you will see that Microsoft Azure offers a variety of different services to store and retrieve data in the cloud. This includes File and Blob storage. Within Azure, each of these types of data is contained within an Azure storage account. While Azure SQL databases are also storage mechanisms, they are not part of an Azure storage account. (For more resources related to this topic, see here.) Azure File storage versus Azure Blob storage In a Microsoft Azure storage account, both the Azure File storage service and the Azure Blob storage service can be used to store files. Deciding which service to use depends on the purpose of the content and who will use the content. To break down the differences and similarities between these two services, we will cover the features, structure, and common uses for each service. Azure File storage Azure File storage provides shared storage using the Server Message Block (SMB) protocol. This allows clients, such as Windows Explorer, to connect and browse the File storage (such as a typical network file share). In a Windows file share, clients can add directory structures and files to the share. Similar to file shares, Azure File storage is typically used within an organization and not with users outside the organization. Azure File shares can only be mounted in Windows Explorer as a drive within virtual machines running in Azure. They cannot be mounted from computers outside of Azure. A few common uses of Azure File storage include: Sharing files between on-premise computers and Azure virtual machines Storing application configuration and diagnostic files in shared location Sharing documents and other files with users in the same organization but in different geographical locations Azure Blob storage A blob refers to a binary large object, which might not be an actual file. The Azure Blob storage service is used to store large amounts of unstructured data. This data can be accessed via HTTP or HTTPS, making it particularly useful to share large amounts of data publicly. Within an Azure storage account, blobs are stored within containers. Each container can be public or private, but it does not offer any directory structure as the File storage service does. A few common uses of Azure Blob storage include: Serving images, style sheets (CSS), and static web files for a website, much like a content delivery network Streaming media Backups and disaster recovery Sharing files to external users Getting the Azure storage account keys Managing services provided by Microsoft Azure storage accounts require two pieces of information: the storage account name and an access key. While we can obtain this information from the Microsoft Azure web portal, we will do so with PowerShell. Azure storage accounts have a primary and a secondary access key. If one of the access key is compromised, it can be regenerated without affecting the other. To obtain the Azure storage account keys, we will use the following steps: Open Microsoft Azure PowerShell from the Start menu and connect it to an Azure subscription. 
Use the Get-AzureStorageKey cmdlet with the name of the storage account to retrieve the storage account key information and assign it to a variable: PS C:> $accountKey = Get-AzureStorageKey -StorageAccountName psautomation Use the Format-List cmdlet (PS C:> $accountKey | Format-List –Property Primary,Secondary) to display the Primary and Secondary access key properties. Note that we are using the PowerShell pipeline to use the Format-List cmdlet on the $accountKey variable: Assign one of the keys (Primary or Secondary) to a variable for us to use: PS C:> $key = $accountKey.Primary Using Azure File storage As mentioned in the Azure File storage versus Azure Blob storage section, Azure File services act much like typical network files shares. To demonstrate Azure File services, we will first create a file share. After this, we will create a directory, upload a file, and list the files in a directory. To complete Azure File storage tasks, we will use the following steps: In the PowerShell session from the Getting the Azure storage account keys section in which we obtained an access key, use the New-AzureStorageContext cmdlet to connect to the Azure storage account and assign it to a variable. Note that the first parameter is the name of the storage account, whereas the second parameter is the access key: PS C:> $context = New-AzureStorageContext psautomation $key Create a new file share using the New-AzureStorageShare cmdlet and assign it to a variable: PS C:> $share = New-AzureStorageShare psautomationshare –Context $context Create a new directory in the file share using the New-AzureStorageDirectory cmdlet: PS C:> New-AzureStorageDirectory –Share $share –Path TextFiles Before uploading a file to the newly created directory, we need to ensure that we have a file to upload. To create a sample file, we can use the Set-Content cmdlet to create a new text file: PS C:> Set-Content C:FilesMyFile.txt –Value "Hello" Upload a file to the newly created directory using the Set-AzureStorageFileContent cmdlet: PS C:> Set-AzureStorageFileContent –Share $share –Source C:FilesMyFile.txt –Path TextFiles Use the Get-AzureStorageFile cmdlet (PS C:> Get-AzureStorageFile –Share $share –Path TextFiles) to list the files in the directory (similar to executing the dir or ls commands), as shown in the following screenshot: Using Azure Blog storage As mentioned in the Azure File storage versus Azure Blob storage section, Azure Blob storage can be used to store any unstructured data, including file content. Blobs are stored within containers, whereas permissions are set at the container level. The permission levels that can be assigned to a container are shown in the following table: Permission level Access provided Container This provides anonymous read access to the container and all blobs in the container. In addition, it allows anonymous users to list the blobs in the container. Blob This provides anonymous read access to blobs within the container. Anonymous users cannot list all of the blobs in the container. Off This does not provide anonymous access. It is only accessible with the Azure storage account keys. To illustrate Azure Blob storage, we will use the following steps to create a public container, upload a file, and access the file from a web browser: In the PowerShell session from the Getting Azure storage account keys section in which we obtained an access key, use the New-AzureStorageContext cmdlet to connect to the Azure storage account and assign it to a variable. 
Note that the first parameter is the name of the storage account, whereas the second parameter is the access key: PS C:> $context = New-AzureStorageContext psautomation $key Use the New-AzureStorageContainer cmdlet to create a new public container. Note that the name must contain only numbers and lowercase letters. No special characters, spaces, or uppercase letters are permitted: PS C:> New-AzureStorageContainer –Name textfiles –Context $context –Permission Container Before uploading a file to the newly created directory, we need to ensure that we have a file to upload. To create a sample file, we can use the Set-Content cmdlet to create a new text file: PS C:> Set-Content C:FilesMyFile.txt –Value "Hello" Upload a file using the Set-AzureStorageBlobContent cmdlet: PS C:> Set-AzureStorageBlobContent –File C:FilesMyFile.txt –Blob "MyFile.txt" –Container textfiles –Context $context Navigate to the newly uploaded blob in Internet Explorer. The URL for the blob is formatted as https://<StorageAccountName>.blob.core.windows.net/<ContainerName>/<BlobName>. In our example, the URL is https://psautomation.blob.core.windows.net/textfiles/MyFile.txt, as shown in the following screenshot: Summary In this article, you learned about Microsoft Azure storage accounts and how to interact with the storage account services with PowerShell. This included the Azure File storage and Azure Blob storage. Resources for Article: Further resources on this subject: Using Azure BizTalk Features [Article] Windows Azure Mobile Services - Implementing Push Notifications using [Article] How to use PowerShell Web Access to manage Windows Server [Article]

Code Sharing Between iOS and Android

Packt
17 Mar 2015
24 min read
In this article by Jonathan Peppers, author of the book Xamarin Cross-platform Application Development, we will see how Xamarin's tools promise to share a good portion of your code between iOS and Android while taking advantage of the native APIs on each platform where possible. Doing so is more an exercise in software engineering than in programming skill or knowledge of each platform. To architect a Xamarin application to enable code sharing, it is essential to separate your application into distinct layers. We'll cover the basics of this in this article as well as specific options to consider in certain situations. In this article, we will cover: The MVVM design pattern for code sharing Project and solution organization strategies Portable Class Libraries (PCLs) Preprocessor statements for platform-specific code Dependency injection (DI) simplified Inversion of Control (IoC) (For more resources related to this topic, see here.) Learning the MVVM design pattern The Model-View-ViewModel (MVVM) design pattern was originally invented for Windows Presentation Foundation (WPF) applications using XAML for separating the UI from business logic and taking full advantage of data binding. Applications architected in this way have a distinct ViewModel layer that has no dependencies on its user interface. This architecture in itself is optimized for unit testing as well as cross-platform development. Since an application's ViewModel classes have no dependencies on the UI layer, you can easily swap an iOS user interface for an Android one and write tests against the ViewModel layer. The MVVM design pattern is also very similar to the MVC design pattern. The MVVM design pattern includes the following: Model: The Model layer is the backend business logic that drives the application and any business objects to go along with it. This can be anything from making web requests to a server to using a backend database. View: This layer is the actual user interface seen on the screen. In the case of cross-platform development, it includes any platform-specific code for driving the user interface of the application. On iOS, this includes controllers used throughout an application, and on Android, an application's activities. ViewModel: This layer acts as the glue in MVVM applications. The ViewModel layer coordinates operations between the View and Model layers. A ViewModel layer will contain properties that the View will get or set, and functions for each operation that can be made by the user on each View. The ViewModel layer will also invoke operations on the Model layer if needed. The following figure shows you the MVVM design pattern: It is important to note that the interaction between the View and ViewModel layers is traditionally achieved by data binding in WPF. However, iOS and Android do not have built-in data binding mechanisms, so our general approach throughout the article will be to manually call the ViewModel layer from the View layer. There are a few frameworks out there that provide data binding functionality, such as MVVMCross and Xamarin.Forms. Implementing MVVM in an example To understand this pattern better, let's implement a common scenario. Let's say we have a search box on the screen and a search button. When the user enters some text and clicks on the button, a list of products and prices will be displayed to the user. In our example, we use the async and await keywords that are available in C# 5 to simplify asynchronous programming.
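For readers who have not used async and await before, here is a minimal, self-contained C# sketch of the pattern that the repository and ViewModel classes below rely on; the class name, method names, and delay are purely illustrative and not part of the book's sample:

using System;
using System.Threading.Tasks;

public class AsyncSketch
{
    // An async method returns a Task and can await other asynchronous work
    public async Task<string> LoadGreetingAsync()
    {
        await Task.Delay(1000); // simulates a slow call without blocking the caller
        return "Hello";
    }

    // Callers await the returned task; execution resumes here when it completes
    public async Task RunAsync()
    {
        string greeting = await LoadGreetingAsync();
        Console.WriteLine(greeting);
    }
}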
To implement this feature, we will start with a simple model class (also called a business object) as follows: public class Product{   public int Id { get; set; } //Just a numeric identifier   public string Name { get; set; } //Name of the product   public float Price { get; set; } //Price of the product} Next, we will implement our Model layer to retrieve products based on the searched term. This is where the business logic is performed, expressing how the search needs to actually work. This is seen in the following lines of code: // An example class, in the real world would talk to a web// server or database.public class ProductRepository{// a sample list of products to simulate a databaseprivate Product[] products = new[]{   new Product { Id = 1, Name = “Shoes”, Price = 19.99f },   new Product { Id = 2, Name = “Shirt”, Price = 15.99f },   new Product { Id = 3, Name = “Hat”, Price = 9.99f },};public async Task<Product[]> SearchProducts(   string searchTerm){   // Wait 2 seconds to simulate web request   await Task.Delay(2000);    // Use Linq-to-objects to search, ignoring case   searchTerm = searchTerm.ToLower();   return products.Where(p =>      p.Name.ToLower().Contains(searchTerm))   .ToArray();}} It is important to note here that the Product and ProductRepository classes are both considered as a part of the Model layer of a cross-platform application. Some might consider ProductRepository as a service that is generally a self-contained class to retrieve data. It is a good idea to separate this functionality into two classes. The Product class's job is to hold information about a product, while the ProductRepository class is in charge of retrieving products. This is the basis for the single responsibility principle, which states that each class should only have one job or concern. Next, we will implement a ViewModel class as follows: public class ProductViewModel{private readonly ProductRepository repository =    new ProductRepository(); public string SearchTerm{   get;   set;}public Product[] Products{   get;   private set;}public async Task Search(){   if (string.IsNullOrEmpty(SearchTerm))     Products = null;   else     Products = await repository.SearchProducts(SearchTerm);}} From here, your platform-specific code starts. Each platform will handle managing an instance of a ViewModel class, setting the SearchTerm property, and calling Search when the button is clicked. When the task completes, the user interface layer will update a list displayed on the screen. If you are familiar with the MVVM design pattern used with WPF, you might notice that we are not implementing INotifyPropertyChanged for data binding. Since iOS and Android don't have the concept of data binding, we omitted this functionality. If you plan on having a WPF or Windows 8 version of your mobile application or are using a framework that provides data binding, you should implement support for it where needed. Comparing project organization strategies You might be asking yourself at this point, how do I set up my solution in Xamarin Studio to handle shared code and also have platform-specific projects? Xamarin.iOS applications can only reference Xamarin.iOS class libraries, so setting up a solution can be problematic. There are several strategies for setting up a cross-platform solution, each with its own advantages and disadvantages. 
Options for cross-platform solutions are as follows: File Linking: For this option, you will start with either a plain .NET 4.0 or .NET 4.5 class library that contains all the shared code. You would then have a new project for each platform you want your app to run on. Each platform-specific project will have a subdirectory with all of the files linked in from the first class library. To set this up, add the existing files to the project and select the Add a link to the file option. Any unit tests can run against the original class library. The advantages and disadvantages of file linking are as follows: Advantages: This approach is very flexible. You can choose to link or not link certain files and can also use preprocessor directives such as #if IPHONE. You can also reference different libraries on Android versus iOS. Disadvantages: You have to manage a file's existence in three projects: core library, iOS, and Android. This can be a hassle if it is a large application or if many people are working on it. This option is also a bit outdated since the arrival of shared projects. Cloned Project Files: This is very similar to file linking. The main difference being that you have a class library for each platform in addition to the main project. By placing the iOS and Android projects in the same directory as the main project, the files can be added without linking. You can easily add files by right-clicking on the solution and navigating to Display Options | Show All Files. Unit tests can run against the original class library or the platform-specific versions: Advantages: This approach is just as flexible as file linking, but you don't have to manually link any files. You can still use preprocessor directives and reference different libraries on each platform. Disadvantages: You still have to manage a file's existence in three projects. There is additionally some manual file arranging required to set this up. You also end up with an extra project to manage on each platform. This option is also a bit outdated since the arrival of shared projects. Shared Projects: Starting with Visual Studio 2013 Update 2, Microsoft created the concept of shared projects to enable code sharing between Windows 8 and Windows Phone apps. Xamarin has also implemented shared projects in Xamarin Studio as another option to enable code sharing. Shared projects are virtually the same as file linking, since adding a reference to a shared project effectively adds its files to your project: Advantages: This approach is the same as file linking, but a lot cleaner since your shared code is in a single project. Xamarin Studio also provides a dropdown to toggle between each referencing project, so that you can see the effect of preprocessor statements in your code. Disadvantages: Since all the files in a shared project get added to each platform's main project, it can get ugly to include platform-specific code in a shared project. Preprocessor statements can quickly get out of hand if you have a large team or have team members that do not have a lot of experience. A shared project also doesn't compile to a DLL, so there is no way to share this kind of project without the source code. Portable Class Libraries: This is the most optimal option; you begin the solution by making a Portable Class Library (PCL) project for all your shared code. This is a special project type that allows multiple platforms to reference the same project, allowing you to use the smallest subset of C# and the .NET framework available in each platform. 
Each platform-specific project will reference this library directly as well as any unit test projects: Advantages: All your shared code is in one project, and all platforms use the same library. Since preprocessor statements aren't possible, PCL libraries generally have cleaner code. Platform-specific code is generally abstracted away by interfaces or abstract classes. Disadvantages: You are limited to a subset of .NET depending on how many platforms you are targeting. Platform-specific code requires use of dependency injection, which can be a more advanced topic for developers not familiar with it. Setting up a cross-platform solution To understand each option completely and what different situations call for, let's define a solution structure for each cross-platform solution. Let's use the product search example and set up a solution for each approach. To set up file linking, perform the following steps: Open Xamarin Studio and start a new solution. Select a new Library project under the general C# section. Name the project ProductSearch.Core, and name the solution ProductSearch. Right-click on the newly created project and select Options. Navigate to Build | General, and set the Target Framework option to .NET Framework 4.5. Add the Product, ProductRepository, and ProductViewModel classes to the project. You will need to add using System.Threading.Tasks; and using System.Linq; where needed. Navigate to Build | Build All from the menu at the top to be sure that everything builds properly. Now, let's create a new iOS project by right-clicking on the solution and navigating to Add | Add New Project. Then, navigate to iOS | iPhone | Single View Application and name the project ProductSearch.iOS. Create a new Android project by right-clicking on the solution and navigating to Add | Add New Project. Create a new project by navigating to Android | Android Application and name it ProductSearch.Droid. Add a new folder named Core to both the iOS and Android projects. Right-click on the new folder for the iOS project and navigate to Add | Add Files from Folder. Select the root directory for the ProductSearch.Core project. Check the three C# files in the root of the project. An Add File to Folder dialog will appear. Select Add a link to the file and make sure that the Use the same action for all selected files checkbox is selected. Repeat this process for the Android project. Navigate to Build | Build All from the menu at the top to double-check everything. You have successfully set up a cross-platform solution with file linking. When all is done, you will have a solution tree that looks something like what you can see in the following screenshot: You should consider using this technique when you have to reference different libraries on each platform. You might consider using this option if you are using MonoGame, or other frameworks that require you to reference a different library on iOS versus Android. Setting up a solution with the cloned project files approach is similar to file linking, except that you will have to create an additional class library for each platform. To do this, create an Android library project and an iOS library project in the same ProductSearch.Core directory. You will have to create the projects and move them to the proper folder manually, then re-add them to the solution. Right-click on the solution and navigate to Display Options | Show All Files to add the required C# files to these two projects. 
Your main iOS and Android projects can reference these projects directly. Your project will look like what is shown in the following screenshot, with ProductSearch.iOS referencing ProductSearch.Core.iOS and ProductSearch.Droid referencing ProductSearch.Core.Droid: Working with Portable Class Libraries A Portable Class Library (PCL) is a C# library project that can be supported on multiple platforms, including iOS, Android, Windows, Windows Store apps, Windows Phone, Silverlight, and Xbox 360. PCLs have been an effort by Microsoft to simplify development across different versions of the .NET framework. Xamarin has also added support for iOS and Android for PCLs. Many popular cross-platform frameworks and open source libraries are starting to develop PCL versions such as Json.NET and MVVMCross. Using PCLs in Xamarin Let's create our first portable class library: Open Xamarin Studio and start a new solution. Select a new Portable Library project under the general C# section. Name the project ProductSearch.Core and name the solution ProductSearch. Add the Product, ProductRepository, and ProductViewModel classes to the project. You will need to add using System.Threading.Tasks; and using System.Linq; where needed. Navigate to Build | Build All from the menu at the top to be sure that everything builds properly. Now, let's create a new iOS project by right-clicking on the solution and navigating to Add | Add New Project. Create a new project by navigating to iOS | iPhone | Single View Application and name it ProductSearch.iOS. Create a new Android project by right-clicking on the solution and navigating to Add | Add New Project. Then, navigate to Android | Android Application and name the project ProductSearch.Droid. Simply add a reference to the portable class library from the iOS and Android projects. Navigate to Build | Build All from the top menu and you have successfully set up a simple solution with a portable library. Each solution type has its distinct advantages and disadvantages. PCLs are generally better, but there are certain cases where they can't be used. For example, if you were using a library such as MonoGame, which is a different library for each platform, you would be much better off using a shared project or file linking. Similar issues would arise if you needed to use a preprocessor statement such as #if IPHONE or a native library such as the Facebook SDK on iOS or Android. Setting up a shared project is almost the same as setting up a portable class library. In step 2, just select Shared Project under the general C# section and complete the remaining steps as stated. Using preprocessor statements When using shared projects, file linking, or cloned project files, one of your most powerful tools is the use of preprocessor statements. If you are unfamiliar with them, C# has the ability to define preprocessor variables such as #define IPHONE , allowing you to use #if IPHONE or #if !IPHONE. The following is a simple example of using this technique: #if IPHONEConsole.WriteLine(“I am running on iOS”);#elif ANDROIDConsole.WriteLine(“I am running on Android”);#elseConsole.WriteLine(“I am running on ???”);#endif In Xamarin Studio, you can define preprocessor variables in your project's options by navigating to Build | Compiler | Define Symbols, delimited with semicolons. These will be applied to the entire project. Be warned that you must set up these variables for each configuration setting in your solution (Debug and Release); this can be an easy step to miss. 
You can also define these variables at the top of any C# file by declaring #define IPHONE, but they will only be applied within the C# file. Let's go over another example, assuming that we want to implement a class to open URLs on each platform: public static class Utility{public static void OpenUrl(string url){   //Open the url in the native browser}} The preceding example is a perfect candidate for using preprocessor statements, since it is very specific to each platform and is a fairly simple function. To implement the method on iOS and Android, we will need to take advantage of some native APIs. Refactor the class to look as follows: #if IPHONE//iOS using statementsusing MonoTouch.Foundation;using MonoTouch.UIKit;#elif ANDROID//Android using statementsusing Android.App;using Android.Content;using Android.Net;#else//Standard .Net using statementusing System.Diagnostics;#endif public static class Utility{#if ANDROID   public static void OpenUrl(Activity activity, string url)#else   public static void OpenUrl(string url)#endif{   //Open the url in the native browser   #if IPHONE     UIApplication.SharedApplication.OpenUrl(       NSUrl.FromString(url));   #elif ANDROID     var intent = new Intent(Intent.ActionView,       Uri.Parse(url));     activity.StartActivity(intent);   #else     Process.Start(url);   #endif}} The preceding class supports three different types of projects: Android, iOS, and a standard Mono or .NET framework class library. In the case of iOS, we can perform the functionality with static classes available in Apple's APIs. Android is a little more problematic and requires an Activity object to launch a browser natively. We get around this by modifying the input parameters on Android. Lastly, we have a plain .NET version that uses Process.Start() to launch a URL. It is important to note that using the third option would not work on iOS or Android natively, which necessitates our use of preprocessor statements. Using preprocessor statements is not normally the cleanest or the best solution for cross-platform development. They are generally best used in a tight spot or for very simple functions. Code can easily get out of hand and can become very difficult to read with many #if statements, so it is always better to use it in moderation. Using inheritance or interfaces is generally a better solution when a class is mostly platform specific. Simplifying dependency injection Dependency injection at first seems like a complex topic, but for the most part it is a simple concept. It is a design pattern aimed at making your code within your applications more flexible so that you can swap out certain functionality when needed. The idea builds around setting up dependencies between classes in an application so that each class only interacts with an interface or base/abstract class. This gives you the freedom to override different methods on each platform when you need to fill in native functionality. The concept originated from the SOLID object-oriented design principles, which is a set of rules you might want to research if you are interested in software architecture. There is a good article about SOLID on Wikipedia, (http://en.wikipedia.org/wiki/SOLID_%28object-oriented_design%29) if you would like to learn more. The D in SOLID, which we are interested in, stands for dependencies. Specifically, the principle declares that a program should depend on abstractions, not concretions (concrete types). 
To build upon this concept, let's walk you through the following example: Let's assume that we need to store a setting in an application that determines whether the sound is on or off. Now let's declare a simple interface for the setting: interface ISettings { bool IsSoundOn { get; set; } }. On iOS, we'd want to implement this interface using the NSUserDefaults class. Likewise, on Android, we will implement this using SharedPreferences. Finally, any class that needs to interact with this setting will only reference ISettings so that the implementation can be replaced on each platform. For reference, the full implementation of this example will look like the following snippet: public interface ISettings{bool IsSoundOn{   get;   set;}}//On iOSusing MonoTouch.UIKit;using MonoTouch.Foundation; public class AppleSettings : ISettings{public bool IsSoundOn{   get   {     return NSUserDefaults.StandardUserDefaults     BoolForKey(“IsSoundOn”);   }   set   {     var defaults = NSUserDefaults.StandardUserDefaults;     defaults.SetBool(value, “IsSoundOn”);     defaults.Synchronize();   }}}//On Androidusing Android.Content; public class DroidSettings : ISettings{private readonly ISharedPreferences preferences; public DroidSettings(Context context){   preferences = context.GetSharedPreferences(     context.PackageName, FileCreationMode.Private);}public bool IsSoundOn{   get   {     return preferences.GetBoolean(“IsSoundOn”, true”);   }   set   {     using (var editor = preferences.Edit())     {       editor.PutBoolean(“IsSoundOn”, value);       editor.Commit();     }   }}} Now you will potentially have a ViewModel class that will only reference ISettings when following the MVVM pattern. It can be seen in the following snippet: public class SettingsViewModel{  private readonly ISettings settings;  public SettingsViewModel(ISettings settings)  {    this.settings = settings;  }  public bool IsSoundOn  {    get;    set;  }  public void Save()  {    settings.IsSoundOn = IsSoundOn;  }} Using a ViewModel layer for such a simple example is not necessarily needed, but you can see it would be useful if you needed to perform other tasks such as input validation. A complete application might have a lot more settings and might need to present the user with a loading indicator. Abstracting out your setting's implementation has other benefits that add flexibility to your application. Let's say you suddenly need to replace NSUserDefaults on iOS with the iCloud instead; you can easily do so by implementing a new ISettings class and the remainder of your code will remain unchanged. This will also help you target new platforms such as Windows Phone, where you might choose to implement ISettings in a platform-specific way. Implementing Inversion of Control You might be asking yourself at this point in time, how do I switch out different classes such as the ISettings example? Inversion of Control (IoC) is a design pattern meant to complement dependency injection and solve this problem. The basic principle is that many of the objects created throughout your application are managed and created by a single class. Instead of using the standard C# constructors for your ViewModel or Model classes, a service locator or factory class will manage them throughout the application. 
There are many different implementations and styles of IoC, so let's implement a simple service locator class as follows: public static class ServiceContainer{  static readonly Dictionary<Type, Lazy<object>> services =    new Dictionary<Type, Lazy<object>>();  public static void Register<T>(Func<T> function)  {    services[typeof(T)] = new Lazy<object>(() => function());  }  public static T Resolve<T>()  {    return (T)Resolve(typeof(T));  }  public static object Resolve(Type type)  {    Lazy<object> service;    if (services.TryGetValue(type, out service)    {      return service.Value;    }    throw new Exception(“Service not found!”);  }} This class is inspired by the simplicity of XNA/MonoGame's GameServiceContainer class and follows the service locator pattern. The main differences are the heavy use of generics and the fact that it is a static class. To use our ServiceContainer class, we will declare the version of ISettings or other interfaces that we want to use throughout our application by calling Register, as seen in the following lines of code: //iOS version of ISettingsServiceContainer.Register<ISettings>(() => new AppleSettings());//Android version of ISettingsServiceContainer.Register<ISettings>(() => new DroidSettings());//You can even register ViewModelsServiceContainer.Register<SettingsViewMode>(() =>   new SettingsViewModel()); On iOS, you can place this registration code in either your static void Main() method or in the FinishedLaunching method of your AppDelegate class. These methods are always called before the application is started. On Android, it is a little more complicated. You cannot put this code in the OnCreate method of your activity that acts as the main launcher. In some situations, the Android OS can close your application but restart it later in another activity. This situation is likely to cause an exception somewhere. The guaranteed safe place to put this is in a custom Android Application class which has an OnCreate method that is called prior to any activities being created in your application. The following lines of code show you the use of the Application class: [Application]public class Application : Android.App.Application{  //This constructor is required  public Application(IntPtr javaReference, JniHandleOwnership     transfer): base(javaReference, transfer)  {  }  public override void OnCreate()  {    base.OnCreate();    //IoC Registration here  }} To pull a service out of the ServiceContainer class, we can rewrite the constructor of the SettingsViewModel class so that it is similar to the following lines of code: public SettingsViewModel(){  this.settings = ServiceContainer.Resolve<ISettings>();} Likewise, you will use the generic Resolve method to pull out any ViewModel classes you would need to call from within controllers on iOS or activities on Android. This is a great, simple way to manage dependencies within your application. There are, of course, some great open source libraries out there that implement IoC for C# applications. You might consider switching to one of them if you need more advanced features for service location or just want to graduate to a more complicated IoC container. 
There are, of course, some great open source libraries out there that implement IoC for C# applications. You might consider switching to one of them if you need more advanced features for service location, or if you just want to graduate to a more full-featured IoC container. Here are a few libraries that have been used with Xamarin projects:

TinyIoC: https://github.com/grumpydev/TinyIoC
Ninject: http://www.ninject.org/
MvvmCross: https://github.com/slodge/MvvmCross (includes a full MVVM framework as well as IoC)
Simple Injector: http://simpleinjector.codeplex.com
OpenNETCF.IoC: http://ioc.codeplex.com

Summary

In this article, we learned about the MVVM design pattern and how it can be used to better architect cross-platform applications. We compared several project organization strategies for managing a Xamarin Studio solution that contains both iOS and Android projects, and we went over portable class libraries as the preferred option for sharing code, with preprocessor statements as a quick and dirty way to implement platform-specific code. After completing this article, you should be comfortable with several techniques for sharing code between iOS and Android applications using Xamarin Studio. Using the MVVM design pattern will help you separate your shared code from code that is platform specific. We also covered several options for setting up cross-platform Xamarin solutions, and you should now have a firm understanding of using dependency injection and Inversion of Control to give your shared code access to the native APIs on each platform.

Text Mining with R: Part 1

Robi Sen
16 Mar 2015
7 min read
R is rapidly becoming the platform of choice for programmers, scientists, and others who need to perform statistical analysis and data mining. In part this is because R is incredibly easy to learn, and with just a few commands you can perform data mining and analysis functions that would be very hard in more general-purpose languages like Ruby, .NET, Java, or C++. To demonstrate R's ease, flexibility, and power, we will look at how to use R to examine a collection of tweets from the 2014 Super Bowl: we will clean up the data, turn it into a document matrix so we can analyze it, and then create a "word cloud" so we can visualize our analysis and look for interesting words.

Getting Started

To get started, you need to download both R and RStudio. R can be found here and RStudio can be found here. R and RStudio are available for most major operating systems, and you should follow the up-to-date installation guides on their respective websites. For this example we are going to be using a data set from Techtunk, which is rather large. For this article I have taken a small excerpt of Techtunk's SuperBowl 2014 data, over 8 million tweets, and cleaned it up; you can download the full set from the original data source here. Finally, you will need to install the R text mining package (tm) and word cloud package (wordcloud). You can install both with the standard install.packages() command, or just use RStudio's package manager.

Preparing our Data

As already stated, you can find the complete SuperBowl 2014 dataset at the original source. That being said, it is very large and split across many pipe-delimited files that carry the .csv file extension but are not actually comma-separated, which can be awkward to work with. This is a common problem when working with large data sets. Luckily, the data is already broken up into fragments: when working with large datasets, you usually do not want to start developing against the whole set; rather, you want a small, workable sample that lets you quickly develop your scripts without being so large and unwieldy that it delays development. Indeed, you will find that the large files provided by Techtunk can take tens of minutes to process as is. In cases like this, it is good to look at the data, figure out what you want from it, take a sample set, massage it as needed, and then work from there until your code behaves exactly how you want. In our case, I took a subset of 4,600 tweets from one of the pipe-delimited files, converted it to comma-separated value (.csv) format, and saved it as a sample file to work from. You can do the same thing with whatever subset you like, although you should consider keeping it under 5,000 records, or you can simply use the file created for this post here.
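If you would rather carve out your own sample than use the prepared file, the step looks roughly like the sketch below. The input file name is a placeholder for whichever Techtunk fragment you downloaded, and the 4,600-row cutoff simply matches the sample used in this article; adjust both as needed.

# Install the two packages used in this article, if you have not already
install.packages(c("tm", "wordcloud"))

# Placeholder path: point this at one of the raw pipe-delimited Techtunk files
raw_loc <- "yourfilelocation/raw_superbowl_fragment.csv"

# The raw files use "|" as the separator despite the .csv extension;
# set header = FALSE if your fragment has no heading row
raw_tweets <- read.delim(raw_loc, sep = "|", header = TRUE,
                         stringsAsFactors = FALSE, quote = "")

# Keep only the first few thousand rows so the scripts stay quick to iterate on
sample_tweets <- head(raw_tweets, 4600)

# Write out a genuine comma-separated file to use for the rest of the article
write.csv(sample_tweets, "yourfilelocation/largerset11.csv", row.names = FALSE)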
Visualizing our Data

For this post, all we want to do is get a general sense of the more common words being tweeted during the Super Bowl. A common way to visualize this is with a word cloud, which shows the frequency of a term by drawing it larger, relative to the other words, the more often it is mentioned in the body of tweets being analyzed. To do this we first need to do a few things with our data. We need to read in our file and turn our collection of tweets into a corpus. In general, a corpus is a large body of text documents; in R's tm package it is an object that holds our tweets in memory.

So, to load our tweets as a corpus into R, you can do as shown here:

# change this file location to suit your machine
file_loc <- "yourfilelocation/largerset11.csv"
# change TRUE to FALSE if you have no column headings in the CSV
myfile <- read.csv(file_loc, header = TRUE, stringsAsFactors = FALSE)

require(tm)
mycorpus <- Corpus(DataframeSource(myfile[c("username", "tweet")]))

You can now simply print your corpus to get a sense of it:

> print(mycorpus)
<<VCorpus (documents: 4688, metadata (corpus/indexed): 0/0)>>

In this case, VCorpus means the corpus is a volatile object stored in memory. If you want, you can make the corpus permanent using PCorpus. You might do this if you were analyzing actual documents, such as PDFs or even databases, in which case R keeps pointers to the documents instead of holding full document structures in memory.

Another method you can use to look at your corpus is inspect(), which provides a variety of ways to examine the documents it contains. For example, using:

inspect(mycorpus[1:2])

might give you a result like:

> inspect(mycorpus[1:2])
<<VCorpus (documents: 2, metadata (corpus/indexed): 0/0)>>

[[1]]
<<PlainTextDocument (metadata: 7)>>
sstanley84 wow rt thestanchion news fleury just made fraudulent investment karlsson httptco5oi6iwashg

[[2]]
<<PlainTextDocument (metadata: 7)>>
judemgreen 2 hour wait train superbowl time traffic problemsnice job chris christie

As such, inspect() is very useful for quickly getting a sense of the data in your corpus without having to print the whole thing.

Now that we have our corpus in memory, let's clean it up a little before we do our analysis. Usually you want to remove words that are not relevant to your analysis, such as "stopwords": words such as and, like, but, if, and the, which you don't care about. To do this with the tm package you use transforms. A transform applies a function to every document in a corpus and takes the form tm_map(your corpus, some function). For example, we can use tm_map like this:

mycorpus <- tm_map(mycorpus, removePunctuation)

which removes all the punctuation marks from our tweets. We can apply some other transforms to clean up our data by converting all the text to lower case, removing stopwords, stripping extra whitespace, and the like:

mycorpus <- tm_map(mycorpus, removePunctuation)
mycorpus <- tm_map(mycorpus, content_transformer(tolower))
mycorpus <- tm_map(mycorpus, stripWhitespace)
mycorpus <- tm_map(mycorpus, removeWords, c(stopwords("english"), "news"))

Note the last line. Here we are using the stopwords() method but also adding our own word, news, to the list. You could append your own list of stopwords in this manner.
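The introduction promised a document matrix and a word cloud, and Part 2 digs into interpreting one, so what follows is only a minimal sketch of how that final step could look with the tm and wordcloud packages, picking up from the cleaned mycorpus above. The frequency cutoffs are arbitrary values chosen just to keep the plot readable, not settings from the original article.

# Sketch: build a term-document matrix from the cleaned corpus
# and draw a simple word cloud from the term frequencies
library(wordcloud)

tdm <- TermDocumentMatrix(mycorpus)

# Sum the counts for each term across all tweets, most frequent first
term_freq <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)

# min.freq and max.words are arbitrary cutoffs to keep the cloud legible
wordcloud(words = names(term_freq), freq = term_freq,
          min.freq = 10, max.words = 100, random.order = FALSE)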
Summary

In this post we have looked at the basics of text mining in R: selecting data, preparing it, cleaning it, and then performing various operations on it so it can be visualized. In the next post, Part 2, we will look at a simple use case showing how we can derive real meaning and value from a visualization, by seeing how a simple word cloud can help you understand the impact of an advertisement.

About the author

Robi Sen, CSO at Department 13, is an experienced inventor, serial entrepreneur, and futurist whose dynamic twenty-plus-year career in technology, engineering, and research has led him to work on cutting-edge projects for DARPA, TSWG, SOCOM, RRTO, NASA, DOE, and the DOD. Robi also has extensive experience in the commercial space, including the co-creation of several successful start-up companies. He has worked with companies such as UnderArmour, Sony, CISCO, IBM, and many others to help build out new products and services. Robi specializes in bringing his unique vision and thought process to difficult and complex problems, allowing companies and organizations to find innovative solutions that they can rapidly operationalize or go to market with.