
How-To Tutorials

7010 Articles

Drupal 8 and Configuration Management

Packt
18 Mar 2015
15 min read
In this article by Stefan Borchert and Anja Schirwinski, authors of the book Drupal 8 Configuration Management, we will learn the inner workings of the Configuration Management system in Drupal 8. You will learn about config and schema files and read about the difference between simple configuration and configuration entities.

The config directory

During installation, Drupal adds a directory within sites/default/files called config_HASH, where HASH is a long random string of letters and numbers, as shown in the following screenshot:

This sequence is a random hash generated during the installation of your Drupal site. It adds a layer of protection to your configuration files, in addition to the default restriction enforced by the .htaccess file within the subdirectories of the config directory, which prevents unauthorized users from seeing their contents. As a result, it would be very hard for someone to guess the folder's name.

Within the config directory, you will see two additional directories that are empty by default (leaving the .htaccess and README.txt files aside). One of the directories is called active. If you change the configuration system to use file storage instead of the database for the active Drupal site configuration, this directory will contain the active configuration. If you did not customize the storage mechanism of the active configuration (we will learn later how to do this), Drupal 8 uses the database to store the active configuration. The other directory is called staging. This directory is empty by default, but it can host the configuration you want to import into your Drupal site from another installation. You will learn how to use this later on in this article.

A simple configuration example

First, we want to become familiar with configuration itself. If you look into the database of your Drupal installation and open up the config table, you will find the entire active configuration of your site, as shown in the following screenshot:

Depending on your site's configuration, table names may be prefixed with a custom string, so you'll have to look for a table name that ends with config.

Don't worry about the strange-looking text in the data column; this is the serialized content of the corresponding configuration. It expands to single configuration values, such as system.site.name, which holds the name of your site. Changing the site's name in the user interface on admin/config/system/site-information will immediately update the record in the database; put simply, the records in the table are the current state of your site's configuration, as shown in the following screenshot:

But where does the initial configuration of your site come from? Drupal itself and the modules you install must provide some kind of default configuration that gets added to the active storage during installation.

Config and schema files – what are they and what are they used for?

In order to provide a default configuration during the installation process, Drupal (modules and profiles) comes with a set of files that hold the configuration needed to run your site. To keep these files easy to parse and read, the configuration is stored in the YAML format. YAML (http://yaml.org/) is a data-oriented serialization standard that aims for simplicity. With YAML, it is easy to map common data types such as lists, arrays, or scalar values.
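To get a feel for the format before we look at real files from Drupal core, here is an abridged, illustrative sketch of what a configuration file such as system.site.yml (the file behind the system.site.name value mentioned above) might contain; the exact keys and default values depend on your Drupal version:

# Illustrative values only; actual keys and defaults vary between Drupal versions.
name: 'My Drupal 8 site'
mail: admin@example.com
slogan: ''
page:
  front: /node
langcode: en

Each line is a simple key-value pair, and nested keys (such as page.front) are expressed through indentation.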
Config files

Directly beneath the root directory of each module and profile that defines or overrides configuration (either core or contrib), you will find a directory named config. Within this directory, there may be two more directories (although both are optional): install and schema. Check the image module inside core/modules and take a look at its config directory, as shown in the following screenshot:

The install directory shown in the following screenshot contains all configuration values that the specific module defines or overrides, stored in files with the extension .yml (one of the default extensions for files in the YAML format):

During installation, the values stored in these files are copied to the active configuration of your site. In the case of the default configuration storage, the values are added to the config table; with file-based configuration storage mechanisms, on the other hand, the files are copied to the appropriate directories.

Looking at the filenames, you will see that they follow a simple convention: <module name>.<type of configuration>[.<machine name of configuration object>].yml (setting aside <module name>.settings.yml for now). The parts are as follows:

<module name>: This is the name of the module that defines the settings included in the file. For instance, the image.style.large.yml file contains settings defined by the image module.

<type of configuration>: This can be seen as a kind of group for configuration objects. The image module, for example, defines several image styles. These styles are a set of different configuration objects, so the group is defined as style. Hence, all configuration files that contain image styles defined by the image module itself are named image.style.<something>.yml. The same structure applies to blocks (block.block.*.yml), filter formats (filter.format.*.yml), menus (system.menu.*.yml), content types (node.type.*.yml), and so on.

<machine name of configuration object>: The last part of the filename is the unique machine-readable name of the configuration object itself. In our examples from the image module, you see three different items: large, medium, and thumbnail. These are exactly the three image styles you will find on admin/config/media/image-styles after installing a fresh copy of Drupal 8. The image styles are shown in the following screenshot:

Schema files

The primary reason schema files were introduced into Drupal 8 is multilingual support: a tool was needed to identify all translatable strings within the shipped configuration. The secondary reason is to provide actual translation forms for configuration based on your data and to expose translatable configuration pieces to external tools. Each module can have as many configuration .yml files as needed. All of these are described in one or more schema files that ship with the module.

As a simple example of how schema files work, let's look at the system module's maintenance settings in the system.maintenance.yml file at core/modules/system/config/install. The file's contents are as follows:

message: '@site is currently under maintenance. We should be back shortly. Thank you for your patience.'
langcode: en

The system module's schema files live in core/modules/system/config/schema. These define the basic types but, for our example, the most important aspect is that they define the schema for the maintenance settings.
The corresponding schema section from the system.schema.yml file is as follows:

system.maintenance:
  type: mapping
  label: 'Maintenance mode'
  mapping:
    message:
      type: text
      label: 'Message to display when in maintenance mode'
    langcode:
      type: string
      label: 'Default language'

The first line corresponds to the filename of the .yml file, and the nested lines underneath describe the file's contents. mapping is a basic type for key-value pairs (always the top-level type in a .yml file). The system.maintenance.yml file is labeled with label: 'Maintenance mode'. Then, the actual elements in the mapping are listed under the mapping key. As shown in the code, the file has two items, so the message and langcode keys are described. These are a text and a string value, respectively. Both values are also given a label in order to identify them in configuration forms.

Learning the difference between active and staging

By now, you know that Drupal works with the two directories active and staging. But what is the intention behind these directories, and how do we use them?

The configuration used by your site is called the active configuration, since it is the configuration that affects the site's behavior right now. The current (active) configuration is stored in the database, and direct changes to your site's configuration go into the corresponding tables. The reason Drupal 8 stores the active configuration in the database is that it enhances performance and security (source: https://www.drupal.org/node/2241059). However, sometimes you might not want to store the active configuration in the database and might need to use a different storage mechanism. For example, using the filesystem as configuration storage enables you to track changes in the site's configuration using a version control system such as Git or SVN.

Changing the active configuration storage

If you do want to switch your active configuration storage to files, here's how. Note that changing the configuration storage is only possible before installing Drupal. After installation, there is no way to switch to another configuration storage!

To use a different configuration storage mechanism, you have to make some modifications to your settings.php file. First, you'll need to find the section named Active configuration settings. Now you will have to uncomment the line that starts with $settings['bootstrap_config_storage'] to enable file-based configuration storage. Additionally, you need to copy the existing default.services.yml (next to your settings.php file) to a file named services.yml and enable the new configuration storage:

services:
  # Override configuration storage.
  config.storage:
    class: Drupal\Core\Config\CachedStorage
    arguments: ['@config.storage.active', '@cache.config']
  config.storage.active:
    # Use file storage for active configuration.
    alias: config.storage.file

This tells Drupal to override the default service used for configuration storage and use config.storage.file as the active configuration storage mechanism instead of the default database storage. After installing the site with these settings, we will take another look at the config directory in sites/default/files (assuming you didn't change the location of the active and staging directories):

As you can see, the active directory now contains the entire site's configuration. The files in this directory get copied here during the website's installation process.
Whenever you make a change to your website, the change is reflected in these files. Exporting a configuration always exports a snapshot of the active configuration, regardless of the storage method.

The staging directory contains the changes you want to add to your site. Drupal compares the staging directory to the active directory and checks for differences between them. When you upload your compressed export file, it actually gets placed inside the staging directory. This means you can save yourself the trouble of using the interface to export and import the compressed file if you're comfortable enough with copying files to another directory. Just make sure you copy all of the files to the staging directory, even if only one of the files was changed. Any missing files are interpreted as deleted configuration and will mess up your site.

In order to get the contents of staging into active, we simply have to use the synchronize option at admin/config/development/configuration again. This page shows us what was changed and allows us to import the changes. On importing, your active configuration will be overridden with the configuration in your staging directory. Note that the files inside the staging directory will not be removed after the synchronization is finished. The next time you want to copy files over from your active directory, make sure you empty staging first. Also note that you cannot override files directly in the active directory; the changes have to be made inside staging and then synchronized.

Changing the storage location of the active and staging directories

In case you do not want Drupal to store your configuration in sites/default/files, you can set the path according to your wishes. In fact, this is recommended for security reasons, as these directories should never be accessible over the Web or by unauthorized users on your server. Additionally, it makes your life easier if you work with version control. By default, the whole files directory is usually ignored in version-controlled environments because Drupal writes to it, and having the active and staging directories located within sites/default/files would result in them being ignored too.

So how do we change the location of the configuration directories? Before installing Drupal, you will need to create and modify the settings.php file that Drupal uses to load its basic configuration data from (for example, the database connection settings). If you haven't done so yet, copy the default.settings.php file and rename the copy to settings.php. Afterwards, open the new file with the editor of your choice and search for the following line:

$config_directories = array();

Change the preceding line to the following (or simply insert your addition at the bottom of the file):

$config_directories = array(
  CONFIG_ACTIVE_DIRECTORY => './../config/active',   // folder outside the webroot
  CONFIG_STAGING_DIRECTORY => './../config/staging', // folder outside the webroot
);

The directory names can be chosen freely, but it is recommended that you use names at least similar to the default ones so that you or other developers don't get confused when looking at them later. Directly after adding the paths to your settings.php file, make sure you remove write permissions from the file, as it would be a security risk if someone could change it. Also remember to put these directories outside your webroot or, at the very least, protect them using an .htaccess file (if you are using Apache as the server).
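As a minimal, illustrative sketch (assuming Apache and a fairly standard setup), an .htaccess file like the following placed in each configuration directory denies all direct web access; adjust it to match your Apache version and server configuration:

# Deny all web access to this directory (Apache 2.4 syntax)
<IfModule mod_authz_core.c>
  Require all denied
</IfModule>
# Fallback for Apache 2.2
<IfModule !mod_authz_core.c>
  Order allow,deny
  Deny from all
</IfModule>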
Drupal will now use your custom location for its configuration files on installation.

You can also change the location of the configuration directories after installing Drupal. Open up your settings.php file and find the two lines near the end of the file that start with $config_directories. Change their paths to something like this:

$config_directories['active'] = './../config/active';
$config_directories['staging'] = './../config/staging';

This places the directories above your Drupal root. Now that you know about active and staging, let's learn more about the different types of configuration you can create on your own.

Simple configuration versus configuration entities

As soon as you want to start storing your own configuration, you need to understand the differences between simple configuration and configuration entities. Here's a short definition of the two types of configuration used in Drupal.

Simple configuration

This configuration type is easier to implement and is therefore ideal for basic configuration settings that result in Boolean values, integers, or simple strings of text being stored, as well as global variables that are used throughout your site. A good example would be the value of an on/off toggle for a specific feature in your module, or our previously used example of the site name configured by the system module:

name: 'Configuration Management in Drupal 8'

Simple configuration also includes any settings that your module requires in order to operate correctly. For example, JavaScript aggregation has to be either on or off; if the setting doesn't exist, the system module won't be able to determine the appropriate course of action.

Configuration entities

Configuration entities are more complicated to implement but far more flexible. They are used to store information about objects that users can create and destroy without breaking the code. A good example of a configuration entity is an image style provided by the image module. Take a look at the image.style.thumbnail.yml file:

uuid: fe1fba86-862c-49c2-bf00-c5e1f78a0f6c
langcode: en
status: true
dependencies: {  }
name: thumbnail
label: 'Thumbnail (100×100)'
effects:
  1cfec298-8620-4749-b100-ccb6c4500779:
    uuid: 1cfec298-8620-4749-b100-ccb6c4500779
    id: image_scale
    weight: 0
    data:
      width: 100
      height: 100
      upscale: false
third_party_settings: {  }

This defines a specific style for images, so the system is able to create derivatives of images that a user uploads to the site. Configuration entities also come with a complete set of create, read, update, and delete (CRUD) hooks that are fired just like those of any other entity in Drupal, making them an ideal candidate for configuration that might need to be manipulated or responded to by other modules. As an example, the Views module uses configuration entities to allow a scenario where, at runtime, hooks are fired that let any other module provide configuration (in this case, custom views) to the Views module.

Summary

In this article, you learned how configuration is stored and briefly got to know the two different types of configuration.

Further resources on this subject:
Tabula Rasa: Nurturing your Site for Tablets [article]
Components - Reusing Rules, Conditions, and Actions [article]
Introduction to Drupal Web Services [article]


Building Mobile Games with Crafty.js and PhoneGap: Part 1

Robi Sen
18 Mar 2015
7 min read
In this post, we will build a mobile game using HTML5, CSS, and JavaScript. To make things easier, we are going to use the Crafty.js JavaScript game engine, which is both free and open source. In this first part of a two-part series, we will look at making a simple turn-based RPG-like game based on Pascal Rettig’s Crafty Workshop presentation. You will learn how to add sprites to a game, control them, and work with mouse/touch events.

Setting up

To get started, first create a new PhoneGap project wherever you want to do your development. For this article, let’s call the project simplerpg.

Figure 1: Creating the simplerpg project in PhoneGap.

Navigate to the www directory in your PhoneGap project and add a new directory called lib. This is where we are going to put several JavaScript libraries used in the project. Now, download the jQuery library to the lib directory; for this project, we will use jQuery 2.1. Once you have downloaded jQuery, download the Crafty.js library and add it to your lib directory as well.

For later parts of this series, you will want to use a web server such as Apache or IIS to make development easier. For this first part, you can just drag-and-drop the HTML files into your browser to test, but later you will need to serve them from a web server to avoid Same Origin Policy errors. This article assumes you are using Chrome for development; IE or Firefox will work just fine, but Chrome and its debugging environment are used here. Finally, the source code for this article can be found on GitHub. In the lessons directory, you will see a series of index files with a listing number matching each code listing in this article.

Crafty

PhoneGap allows you to take almost any HTML5 application and turn it into a mobile app with little to no extra work. Perhaps the most complex of all mobile apps are video games. Video games often have complex routines, graphics, and controls. As such, developing a video game from the ground up is very difficult; so much so that even major video game companies rarely do it. What they usually do, and what we will do here, is make use of libraries and game engines that take care of many of the complex tasks of managing objects, animation, collision detection, and more. For our project, we will be making use of the open source JavaScript game engine Crafty. Before you get started with the code, it’s recommended that you quickly review the Crafty website and its API documentation.

Bootstrapping Crafty and creating an entity

Crafty is very simple to start working with. All you need to do is load the Crafty.js library and initialize Crafty. Let’s try that. Create an index.html file in your www root directory if one does not exist; if you already have one, go ahead and overwrite it. Then, cut and paste listing 1 into it.

Listing 1: Creating an entity

<!DOCTYPE html>
<html>
<head></head>
<body>
<div id="game"></div>
<script type="text/javascript" src="lib/crafty.js"></script>
<script>
// Height and Width
var WIDTH = 500, HEIGHT = 320;
// Initialize Crafty
Crafty.init(WIDTH, HEIGHT);
var player = Crafty.e();
player.addComponent("2D, Canvas, Color");
player.color("red").attr({w:50, h:50});
</script>
</body>
</html>

As you can see in listing 1, we are creating an HTML5 document and loading the Crafty.js library. Then, we initialize Crafty and pass it a width and height. Next, we create a Crafty entity called player.
Crafty, like many other game engines, follows a design pattern called Entity-Component-System (ECS). Entities are objects that you can attach things like behaviors and data to; components can be data, metadata, or behaviors. For our player entity, we are going to add several components, including 2D, Canvas, and Color. Finally, we add a specific color and position to our entity. If you now save your file and drag-and-drop it into the browser, you should see something like figure 2.

Figure 2: A simple entity in Crafty.

Moving a box

Now, let’s do something a bit more complex in Crafty. Let’s move the red box based on where we move our mouse or, if you have a touch-enabled device, where we touch the screen. To do this, open your index.html file and edit it so that it looks like listing 2.

Listing 2: Moving the box

<!DOCTYPE html>
<html>
<head></head>
<body>
<div id="game"></div>
<script type="text/javascript" src="lib/crafty.js"></script>
<script>
var WIDTH = 500, HEIGHT = 320;
Crafty.init(WIDTH, HEIGHT);
// Background
Crafty.background("black");
// Add mouse tracking so the block follows your mouse
Crafty.e("mouseTracking, 2D, Mouse, Touch, Canvas")
  .attr({ w:500, h:320, x:0, y:0 })
  .bind("MouseMove", function(e) {
    console.log("MouseDown:" + Crafty.mousePos.x + ", " + Crafty.mousePos.y);
    // When you touch or move on the canvas, redraw the player
    player.x = Crafty.mousePos.x;
    player.y = Crafty.mousePos.y;
  });
// Create the player entity
var player = Crafty.e();
player.addComponent("2D, DOM");
// Set where your player starts
player.attr({ x : 10, y : 10, w : 50, h : 50 });
player.addComponent("Color").color("red");
</script>
</body>
</html>

As you can see, there is a lot more going on in this listing. The first difference is that we use Crafty.background to set the background to black, and we also create a new entity called mouseTracking that is the same size as the whole canvas. We assign several components to this entity so that it can inherit their methods and properties. We then use .bind to bind the mouse’s movements to our entity and tell Crafty to reposition our player entity to wherever the mouse’s x and y position is. So, if you save this code and run it, you will find that the red box goes wherever your mouse moves or wherever you touch or drag, as in figure 3.

Figure 3: Controlling the movement of a box in Crafty.

Summary

In this post, you learned about working with Crafty.js. Specifically, you learned how to work with the Crafty API and create entities. In Part 2, you will work with sprites, create components, and control entities via mouse/touch.

About the author

Robi Sen, CSO at Department 13, is an experienced inventor, serial entrepreneur, and futurist whose dynamic twenty-plus-year career in technology, engineering, and research has led him to work on cutting-edge projects for DARPA, TSWG, SOCOM, RRTO, NASA, DOE, and the DOD. Robi also has extensive experience in the commercial space, including the co-creation of several successful start-up companies. He has worked with companies such as UnderArmour, Sony, CISCO, IBM, and many others to help build new products and services. Robi specializes in bringing his unique vision and thought process to difficult and complex problems, allowing companies and organizations to find innovative solutions that they can rapidly operationalize or go to market with.


Azure Storage

Packt
17 Mar 2015
7 min read
In this article by John Chapman and Aman Dhally, authors of the book Automating Microsoft Azure with PowerShell, you will see that Microsoft Azure offers a variety of services to store and retrieve data in the cloud, including File and Blob storage. Within Azure, each of these types of data is contained within an Azure storage account. While Azure SQL databases are also storage mechanisms, they are not part of an Azure storage account.

Azure File storage versus Azure Blob storage

In a Microsoft Azure storage account, both the Azure File storage service and the Azure Blob storage service can be used to store files. Deciding which service to use depends on the purpose of the content and who will use it. To break down the differences and similarities between these two services, we will cover the features, structure, and common uses of each.

Azure File storage

Azure File storage provides shared storage using the Server Message Block (SMB) protocol. This allows clients, such as Windows Explorer, to connect to and browse the File storage (much like a typical network file share). As in a Windows file share, clients can add directory structures and files to the share. Similar to file shares, Azure File storage is typically used within an organization and not with users outside the organization. Azure File shares can only be mounted as a drive in Windows Explorer from virtual machines running in Azure; they cannot be mounted from computers outside of Azure.

A few common uses of Azure File storage include:

- Sharing files between on-premise computers and Azure virtual machines
- Storing application configuration and diagnostic files in a shared location
- Sharing documents and other files with users in the same organization but in different geographical locations

Azure Blob storage

A blob refers to a binary large object, which might not be an actual file. The Azure Blob storage service is used to store large amounts of unstructured data. This data can be accessed via HTTP or HTTPS, making it particularly useful for sharing large amounts of data publicly. Within an Azure storage account, blobs are stored within containers. Each container can be public or private, but it does not offer any directory structure the way the File storage service does.

A few common uses of Azure Blob storage include:

- Serving images, style sheets (CSS), and static web files for a website, much like a content delivery network
- Streaming media
- Backups and disaster recovery
- Sharing files with external users

Getting the Azure storage account keys

Managing the services provided by a Microsoft Azure storage account requires two pieces of information: the storage account name and an access key. While we could obtain this information from the Microsoft Azure web portal, we will do so with PowerShell. Azure storage accounts have a primary and a secondary access key; if one of the access keys is compromised, it can be regenerated without affecting the other.

To obtain the Azure storage account keys, we will use the following steps:

1. Open Microsoft Azure PowerShell from the Start menu and connect it to an Azure subscription.
2. Use the Get-AzureStorageKey cmdlet with the name of the storage account to retrieve the storage account key information and assign it to a variable:

PS C:\> $accountKey = Get-AzureStorageKey -StorageAccountName psautomation

3. Use the Format-List cmdlet to display the Primary and Secondary access key properties. Note that we are using the PowerShell pipeline to run the Format-List cmdlet on the $accountKey variable:

PS C:\> $accountKey | Format-List -Property Primary,Secondary

4. Assign one of the keys (Primary or Secondary) to a variable for us to use:

PS C:\> $key = $accountKey.Primary

Using Azure File storage

As mentioned in the Azure File storage versus Azure Blob storage section, the Azure File service acts much like a typical network file share. To demonstrate it, we will first create a file share. After this, we will create a directory, upload a file, and list the files in the directory. To complete these Azure File storage tasks, we will use the following steps:

1. In the PowerShell session from the Getting the Azure storage account keys section, in which we obtained an access key, use the New-AzureStorageContext cmdlet to connect to the Azure storage account and assign it to a variable. Note that the first parameter is the name of the storage account, whereas the second parameter is the access key:

PS C:\> $context = New-AzureStorageContext psautomation $key

2. Create a new file share using the New-AzureStorageShare cmdlet and assign it to a variable:

PS C:\> $share = New-AzureStorageShare psautomationshare -Context $context

3. Create a new directory in the file share using the New-AzureStorageDirectory cmdlet:

PS C:\> New-AzureStorageDirectory -Share $share -Path TextFiles

4. Before uploading a file to the newly created directory, we need to ensure that we have a file to upload. To create a sample file, we can use the Set-Content cmdlet to create a new text file:

PS C:\> Set-Content C:\Files\MyFile.txt -Value "Hello"

5. Upload the file to the newly created directory using the Set-AzureStorageFileContent cmdlet:

PS C:\> Set-AzureStorageFileContent -Share $share -Source C:\Files\MyFile.txt -Path TextFiles

6. Use the Get-AzureStorageFile cmdlet to list the files in the directory (similar to executing the dir or ls commands), as shown in the following screenshot:

PS C:\> Get-AzureStorageFile -Share $share -Path TextFiles

Using Azure Blob storage

As mentioned in the Azure File storage versus Azure Blob storage section, Azure Blob storage can be used to store any unstructured data, including file content. Blobs are stored within containers, and permissions are set at the container level. The permission levels that can be assigned to a container are as follows:

- Container: This provides anonymous read access to the container and all blobs in the container. In addition, it allows anonymous users to list the blobs in the container.
- Blob: This provides anonymous read access to blobs within the container. Anonymous users cannot list all of the blobs in the container.
- Off: This does not provide anonymous access. The container is only accessible with the Azure storage account keys.

To illustrate Azure Blob storage, we will use the following steps to create a public container, upload a file, and access the file from a web browser:

1. In the PowerShell session from the Getting the Azure storage account keys section, in which we obtained an access key, use the New-AzureStorageContext cmdlet to connect to the Azure storage account and assign it to a variable.
Note that the first parameter is the name of the storage account, whereas the second parameter is the access key:

PS C:\> $context = New-AzureStorageContext psautomation $key

2. Use the New-AzureStorageContainer cmdlet to create a new public container. Note that the name must contain only numbers and lowercase letters; no special characters, spaces, or uppercase letters are permitted:

PS C:\> New-AzureStorageContainer -Name textfiles -Context $context -Permission Container

3. Before uploading a file to the newly created container, we need to ensure that we have a file to upload. To create a sample file, we can use the Set-Content cmdlet to create a new text file:

PS C:\> Set-Content C:\Files\MyFile.txt -Value "Hello"

4. Upload the file using the Set-AzureStorageBlobContent cmdlet:

PS C:\> Set-AzureStorageBlobContent -File C:\Files\MyFile.txt -Blob "MyFile.txt" -Container textfiles -Context $context

5. Navigate to the newly uploaded blob in Internet Explorer. The URL for the blob is formatted as https://<StorageAccountName>.blob.core.windows.net/<ContainerName>/<BlobName>. In our example, the URL is https://psautomation.blob.core.windows.net/textfiles/MyFile.txt, as shown in the following screenshot:

Summary

In this article, you learned about Microsoft Azure storage accounts and how to interact with the storage account services using PowerShell, including Azure File storage and Azure Blob storage.

Further resources on this subject:
Using Azure BizTalk Features [Article]
Windows Azure Mobile Services - Implementing Push Notifications using [Article]
How to use PowerShell Web Access to manage Windows Server [Article]


Code Sharing Between iOS and Android

Packt
17 Mar 2015
24 min read
In this article by Jonathan Peppers, author of the book Xamarin Cross-platform Application Development, we will see how Xamarin's tools promise to share a good portion of your code between iOS and Android while taking advantage of the native APIs on each platform where possible. Doing so is more an exercise in software engineering than in programming skill or knowledge of each platform. To architect a Xamarin application for code sharing, you must separate your application into distinct layers. We'll cover the basics of this in this article, as well as specific options to consider in certain situations.

In this article, we will cover:

- The MVVM design pattern for code sharing
- Project and solution organization strategies
- Portable Class Libraries (PCLs)
- Preprocessor statements for platform-specific code
- Dependency injection (DI) simplified
- Inversion of Control (IoC)

Learning the MVVM design pattern

The Model-View-ViewModel (MVVM) design pattern was originally invented for Windows Presentation Foundation (WPF) applications using XAML, to separate the UI from business logic and take full advantage of data binding. Applications architected in this way have a distinct ViewModel layer that has no dependencies on the user interface. This architecture is in itself optimized for unit testing as well as cross-platform development: since an application's ViewModel classes have no dependencies on the UI layer, you can easily swap an iOS user interface for an Android one and write tests against the ViewModel layer. The MVVM design pattern is also very similar to the MVC design pattern.

The MVVM design pattern includes the following layers:

- Model: The Model layer is the backend business logic that drives the application, along with any business objects that go with it. This can be anything from making web requests to a server to using a backend database.
- View: This layer is the actual user interface seen on the screen. In the case of cross-platform development, it includes any platform-specific code for driving the user interface of the application. On iOS, this includes the controllers used throughout an application; on Android, the application's activities.
- ViewModel: This layer acts as the glue in MVVM applications. The ViewModel layer coordinates operations between the View and Model layers. A ViewModel class will contain properties that the View gets or sets, and functions for each operation the user can perform on each View. The ViewModel will also invoke operations on the Model layer if needed.

The following figure shows you the MVVM design pattern:

It is important to note that the interaction between the View and ViewModel layers is traditionally handled by data binding in WPF. However, iOS and Android do not have built-in data binding mechanisms, so our general approach throughout the article will be to manually call the ViewModel layer from the View layer. There are a few frameworks out there that provide data binding functionality, such as MVVMCross and Xamarin.Forms.

Implementing MVVM in an example

To understand this pattern better, let's implement a common scenario. Let's say we have a search box and a search button on the screen. When the user enters some text and clicks on the button, a list of products and prices will be displayed. In our example, we use the async and await keywords, available since C# 5, to simplify asynchronous programming.
To implement this feature, we will start with a simple model class (also called a business object), as follows:

public class Product
{
    public int Id { get; set; }      // Just a numeric identifier
    public string Name { get; set; } // Name of the product
    public float Price { get; set; } // Price of the product
}

Next, we will implement our Model layer to retrieve products based on the searched term. This is where the business logic is performed, expressing how the search actually works:

// An example class; in the real world, this would talk to a web
// server or database.
public class ProductRepository
{
    // A sample list of products to simulate a database
    private Product[] products = new[]
    {
        new Product { Id = 1, Name = "Shoes", Price = 19.99f },
        new Product { Id = 2, Name = "Shirt", Price = 15.99f },
        new Product { Id = 3, Name = "Hat", Price = 9.99f },
    };

    public async Task<Product[]> SearchProducts(string searchTerm)
    {
        // Wait 2 seconds to simulate a web request
        await Task.Delay(2000);

        // Use LINQ to Objects to search, ignoring case
        searchTerm = searchTerm.ToLower();
        return products.Where(p =>
            p.Name.ToLower().Contains(searchTerm))
            .ToArray();
    }
}

It is important to note here that the Product and ProductRepository classes are both considered part of the Model layer of a cross-platform application. Some might consider ProductRepository a service, which is generally a self-contained class for retrieving data. It is a good idea to separate this functionality into two classes: the Product class's job is to hold information about a product, while the ProductRepository class is in charge of retrieving products. This is the basis of the single responsibility principle, which states that each class should have only one job or concern.

Next, we will implement a ViewModel class as follows:

public class ProductViewModel
{
    private readonly ProductRepository repository =
        new ProductRepository();

    public string SearchTerm
    {
        get;
        set;
    }

    public Product[] Products
    {
        get;
        private set;
    }

    public async Task Search()
    {
        if (string.IsNullOrEmpty(SearchTerm))
            Products = null;
        else
            Products = await repository.SearchProducts(SearchTerm);
    }
}

From here, your platform-specific code starts. Each platform will manage an instance of a ViewModel class, set the SearchTerm property, and call Search when the button is clicked. When the task completes, the user interface layer will update the list displayed on the screen.

If you are familiar with the MVVM design pattern as used with WPF, you might notice that we are not implementing INotifyPropertyChanged for data binding. Since iOS and Android don't have the concept of data binding, we omitted this functionality. If you plan on having a WPF or Windows 8 version of your mobile application, or are using a framework that provides data binding, you should implement support for it where needed.
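To make the View-ViewModel interaction more concrete, here is a minimal sketch of what the platform-specific View layer could look like on iOS with Xamarin.iOS. The SearchButton and SearchBox outlets and the ReloadTable helper are hypothetical placeholders for this sketch, not part of the book's sample code:

using System;
using UIKit;

public partial class SearchController : UIViewController
{
    // The View layer owns an instance of the shared ViewModel
    private readonly ProductViewModel viewModel = new ProductViewModel();

    public override void ViewDidLoad()
    {
        base.ViewDidLoad();

        // Forward the button tap to the ViewModel
        SearchButton.TouchUpInside += async (sender, e) =>
        {
            viewModel.SearchTerm = SearchBox.Text;
            await viewModel.Search();

            // Refresh the on-screen list from viewModel.Products
            ReloadTable(viewModel.Products);
        };
    }
}

An Android activity would do the equivalent in its button click handler, keeping all of the search logic in the shared ProductViewModel class.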
Options for cross-platform solutions are as follows: File Linking: For this option, you will start with either a plain .NET 4.0 or .NET 4.5 class library that contains all the shared code. You would then have a new project for each platform you want your app to run on. Each platform-specific project will have a subdirectory with all of the files linked in from the first class library. To set this up, add the existing files to the project and select the Add a link to the file option. Any unit tests can run against the original class library. The advantages and disadvantages of file linking are as follows: Advantages: This approach is very flexible. You can choose to link or not link certain files and can also use preprocessor directives such as #if IPHONE. You can also reference different libraries on Android versus iOS. Disadvantages: You have to manage a file's existence in three projects: core library, iOS, and Android. This can be a hassle if it is a large application or if many people are working on it. This option is also a bit outdated since the arrival of shared projects. Cloned Project Files: This is very similar to file linking. The main difference being that you have a class library for each platform in addition to the main project. By placing the iOS and Android projects in the same directory as the main project, the files can be added without linking. You can easily add files by right-clicking on the solution and navigating to Display Options | Show All Files. Unit tests can run against the original class library or the platform-specific versions: Advantages: This approach is just as flexible as file linking, but you don't have to manually link any files. You can still use preprocessor directives and reference different libraries on each platform. Disadvantages: You still have to manage a file's existence in three projects. There is additionally some manual file arranging required to set this up. You also end up with an extra project to manage on each platform. This option is also a bit outdated since the arrival of shared projects. Shared Projects: Starting with Visual Studio 2013 Update 2, Microsoft created the concept of shared projects to enable code sharing between Windows 8 and Windows Phone apps. Xamarin has also implemented shared projects in Xamarin Studio as another option to enable code sharing. Shared projects are virtually the same as file linking, since adding a reference to a shared project effectively adds its files to your project: Advantages: This approach is the same as file linking, but a lot cleaner since your shared code is in a single project. Xamarin Studio also provides a dropdown to toggle between each referencing project, so that you can see the effect of preprocessor statements in your code. Disadvantages: Since all the files in a shared project get added to each platform's main project, it can get ugly to include platform-specific code in a shared project. Preprocessor statements can quickly get out of hand if you have a large team or have team members that do not have a lot of experience. A shared project also doesn't compile to a DLL, so there is no way to share this kind of project without the source code. Portable Class Libraries: This is the most optimal option; you begin the solution by making a Portable Class Library (PCL) project for all your shared code. This is a special project type that allows multiple platforms to reference the same project, allowing you to use the smallest subset of C# and the .NET framework available in each platform. 
Each platform-specific project will reference this library directly as well as any unit test projects: Advantages: All your shared code is in one project, and all platforms use the same library. Since preprocessor statements aren't possible, PCL libraries generally have cleaner code. Platform-specific code is generally abstracted away by interfaces or abstract classes. Disadvantages: You are limited to a subset of .NET depending on how many platforms you are targeting. Platform-specific code requires use of dependency injection, which can be a more advanced topic for developers not familiar with it. Setting up a cross-platform solution To understand each option completely and what different situations call for, let's define a solution structure for each cross-platform solution. Let's use the product search example and set up a solution for each approach. To set up file linking, perform the following steps: Open Xamarin Studio and start a new solution. Select a new Library project under the general C# section. Name the project ProductSearch.Core, and name the solution ProductSearch. Right-click on the newly created project and select Options. Navigate to Build | General, and set the Target Framework option to .NET Framework 4.5. Add the Product, ProductRepository, and ProductViewModel classes to the project. You will need to add using System.Threading.Tasks; and using System.Linq; where needed. Navigate to Build | Build All from the menu at the top to be sure that everything builds properly. Now, let's create a new iOS project by right-clicking on the solution and navigating to Add | Add New Project. Then, navigate to iOS | iPhone | Single View Application and name the project ProductSearch.iOS. Create a new Android project by right-clicking on the solution and navigating to Add | Add New Project. Create a new project by navigating to Android | Android Application and name it ProductSearch.Droid. Add a new folder named Core to both the iOS and Android projects. Right-click on the new folder for the iOS project and navigate to Add | Add Files from Folder. Select the root directory for the ProductSearch.Core project. Check the three C# files in the root of the project. An Add File to Folder dialog will appear. Select Add a link to the file and make sure that the Use the same action for all selected files checkbox is selected. Repeat this process for the Android project. Navigate to Build | Build All from the menu at the top to double-check everything. You have successfully set up a cross-platform solution with file linking. When all is done, you will have a solution tree that looks something like what you can see in the following screenshot: You should consider using this technique when you have to reference different libraries on each platform. You might consider using this option if you are using MonoGame, or other frameworks that require you to reference a different library on iOS versus Android. Setting up a solution with the cloned project files approach is similar to file linking, except that you will have to create an additional class library for each platform. To do this, create an Android library project and an iOS library project in the same ProductSearch.Core directory. You will have to create the projects and move them to the proper folder manually, then re-add them to the solution. Right-click on the solution and navigate to Display Options | Show All Files to add the required C# files to these two projects. 
Your main iOS and Android projects can reference these projects directly. Your project will look like what is shown in the following screenshot, with ProductSearch.iOS referencing ProductSearch.Core.iOS and ProductSearch.Droid referencing ProductSearch.Core.Droid:

Working with Portable Class Libraries

A Portable Class Library (PCL) is a C# library project that can be supported on multiple platforms, including iOS, Android, Windows, Windows Store apps, Windows Phone, Silverlight, and Xbox 360. PCLs have been an effort by Microsoft to simplify development across different versions of the .NET framework. Xamarin has also added support for iOS and Android for PCLs. Many popular cross-platform frameworks and open source libraries are starting to develop PCL versions such as Json.NET and MVVMCross.

Using PCLs in Xamarin

Let's create our first portable class library:

1. Open Xamarin Studio and start a new solution.
2. Select a new Portable Library project under the general C# section.
3. Name the project ProductSearch.Core and name the solution ProductSearch.
4. Add the Product, ProductRepository, and ProductViewModel classes to the project. You will need to add using System.Threading.Tasks; and using System.Linq; where needed.
5. Navigate to Build | Build All from the menu at the top to be sure that everything builds properly.
6. Now, let's create a new iOS project by right-clicking on the solution and navigating to Add | Add New Project. Create a new project by navigating to iOS | iPhone | Single View Application and name it ProductSearch.iOS.
7. Create a new Android project by right-clicking on the solution and navigating to Add | Add New Project. Then, navigate to Android | Android Application and name the project ProductSearch.Droid.
8. Simply add a reference to the portable class library from the iOS and Android projects.
9. Navigate to Build | Build All from the top menu and you have successfully set up a simple solution with a portable library.

Each solution type has its distinct advantages and disadvantages. PCLs are generally better, but there are certain cases where they can't be used. For example, if you were using a library such as MonoGame, which is a different library for each platform, you would be much better off using a shared project or file linking. Similar issues would arise if you needed to use a preprocessor statement such as #if IPHONE or a native library such as the Facebook SDK on iOS or Android. Setting up a shared project is almost the same as setting up a portable class library. In step 2, just select Shared Project under the general C# section and complete the remaining steps as stated.

Using preprocessor statements

When using shared projects, file linking, or cloned project files, one of your most powerful tools is the use of preprocessor statements. If you are unfamiliar with them, C# has the ability to define preprocessor variables such as #define IPHONE, allowing you to use #if IPHONE or #if !IPHONE. The following is a simple example of using this technique:

#if IPHONE
Console.WriteLine("I am running on iOS");
#elif ANDROID
Console.WriteLine("I am running on Android");
#else
Console.WriteLine("I am running on ???");
#endif

In Xamarin Studio, you can define preprocessor variables in your project's options by navigating to Build | Compiler | Define Symbols, delimited with semicolons. These will be applied to the entire project. Be warned that you must set up these variables for each configuration setting in your solution (Debug and Release); this can be an easy step to miss.
You can also define these variables at the top of any C# file by declaring #define IPHONE, but they will only be applied within the C# file. Let's go over another example, assuming that we want to implement a class to open URLs on each platform: public static class Utility{public static void OpenUrl(string url){   //Open the url in the native browser}} The preceding example is a perfect candidate for using preprocessor statements, since it is very specific to each platform and is a fairly simple function. To implement the method on iOS and Android, we will need to take advantage of some native APIs. Refactor the class to look as follows: #if IPHONE//iOS using statementsusing MonoTouch.Foundation;using MonoTouch.UIKit;#elif ANDROID//Android using statementsusing Android.App;using Android.Content;using Android.Net;#else//Standard .Net using statementusing System.Diagnostics;#endif public static class Utility{#if ANDROID   public static void OpenUrl(Activity activity, string url)#else   public static void OpenUrl(string url)#endif{   //Open the url in the native browser   #if IPHONE     UIApplication.SharedApplication.OpenUrl(       NSUrl.FromString(url));   #elif ANDROID     var intent = new Intent(Intent.ActionView,       Uri.Parse(url));     activity.StartActivity(intent);   #else     Process.Start(url);   #endif}} The preceding class supports three different types of projects: Android, iOS, and a standard Mono or .NET framework class library. In the case of iOS, we can perform the functionality with static classes available in Apple's APIs. Android is a little more problematic and requires an Activity object to launch a browser natively. We get around this by modifying the input parameters on Android. Lastly, we have a plain .NET version that uses Process.Start() to launch a URL. It is important to note that using the third option would not work on iOS or Android natively, which necessitates our use of preprocessor statements. Using preprocessor statements is not normally the cleanest or the best solution for cross-platform development. They are generally best used in a tight spot or for very simple functions. Code can easily get out of hand and can become very difficult to read with many #if statements, so it is always better to use it in moderation. Using inheritance or interfaces is generally a better solution when a class is mostly platform specific. Simplifying dependency injection Dependency injection at first seems like a complex topic, but for the most part it is a simple concept. It is a design pattern aimed at making your code within your applications more flexible so that you can swap out certain functionality when needed. The idea builds around setting up dependencies between classes in an application so that each class only interacts with an interface or base/abstract class. This gives you the freedom to override different methods on each platform when you need to fill in native functionality. The concept originated from the SOLID object-oriented design principles, which is a set of rules you might want to research if you are interested in software architecture. There is a good article about SOLID on Wikipedia, (http://en.wikipedia.org/wiki/SOLID_%28object-oriented_design%29) if you would like to learn more. The D in SOLID, which we are interested in, stands for dependencies. Specifically, the principle declares that a program should depend on abstractions, not concretions (concrete types). 
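Before moving on to the article's own example, here is a small, hypothetical sketch of what "depending on abstractions" can look like in practice; it is also the inheritance-based alternative to the preprocessor-heavy Utility class shown earlier. The class names here are my own, not from the book:

using System.Diagnostics;

//Shared code only ever sees this abstraction
public abstract class UrlOpener
{
    public abstract void OpenUrl(string url);
}

//A plain .NET/Mono implementation; an iOS project would instead call
//UIApplication.SharedApplication.OpenUrl in its own subclass, and an
//Android project would start an Intent, so no #if blocks are needed.
public class DesktopUrlOpener : UrlOpener
{
    public override void OpenUrl(string url)
    {
        Process.Start(url);
    }
}

Each platform project supplies its own subclass, and the shared code never has to know which concrete type it is actually using.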
To build upon this concept, let's walk you through the following example: Let's assume that we need to store a setting in an application that determines whether the sound is on or off. Now let's declare a simple interface for the setting: interface ISettings { bool IsSoundOn { get; set; } }. On iOS, we'd want to implement this interface using the NSUserDefaults class. Likewise, on Android, we will implement this using SharedPreferences. Finally, any class that needs to interact with this setting will only reference ISettings so that the implementation can be replaced on each platform. For reference, the full implementation of this example will look like the following snippet:

public interface ISettings
{
    bool IsSoundOn
    {
        get;
        set;
    }
}

//On iOS
using MonoTouch.UIKit;
using MonoTouch.Foundation;

public class AppleSettings : ISettings
{
    public bool IsSoundOn
    {
        get
        {
            return NSUserDefaults.StandardUserDefaults
                .BoolForKey("IsSoundOn");
        }
        set
        {
            var defaults = NSUserDefaults.StandardUserDefaults;
            defaults.SetBool(value, "IsSoundOn");
            defaults.Synchronize();
        }
    }
}

//On Android
using Android.Content;

public class DroidSettings : ISettings
{
    private readonly ISharedPreferences preferences;

    public DroidSettings(Context context)
    {
        preferences = context.GetSharedPreferences(
            context.PackageName, FileCreationMode.Private);
    }

    public bool IsSoundOn
    {
        get
        {
            return preferences.GetBoolean("IsSoundOn", true);
        }
        set
        {
            using (var editor = preferences.Edit())
            {
                editor.PutBoolean("IsSoundOn", value);
                editor.Commit();
            }
        }
    }
}

Now you will potentially have a ViewModel class that will only reference ISettings when following the MVVM pattern. It can be seen in the following snippet:

public class SettingsViewModel
{
    private readonly ISettings settings;

    public SettingsViewModel(ISettings settings)
    {
        this.settings = settings;
    }

    public bool IsSoundOn
    {
        get;
        set;
    }

    public void Save()
    {
        settings.IsSoundOn = IsSoundOn;
    }
}

Using a ViewModel layer for such a simple example is not necessarily needed, but you can see it would be useful if you needed to perform other tasks such as input validation. A complete application might have a lot more settings and might need to present the user with a loading indicator. Abstracting out your setting's implementation has other benefits that add flexibility to your application. Let's say you suddenly need to replace NSUserDefaults on iOS with iCloud instead; you can easily do so by implementing a new ISettings class and the remainder of your code will remain unchanged. This will also help you target new platforms such as Windows Phone, where you might choose to implement ISettings in a platform-specific way.

Implementing Inversion of Control

You might be asking yourself at this point in time, how do I switch out different classes such as the ISettings example? Inversion of Control (IoC) is a design pattern meant to complement dependency injection and solve this problem. The basic principle is that many of the objects created throughout your application are managed and created by a single class. Instead of using the standard C# constructors for your ViewModel or Model classes, a service locator or factory class will manage them throughout the application.
There are many different implementations and styles of IoC, so let's implement a simple service locator class as follows:

public static class ServiceContainer
{
    static readonly Dictionary<Type, Lazy<object>> services =
        new Dictionary<Type, Lazy<object>>();

    public static void Register<T>(Func<T> function)
    {
        services[typeof(T)] = new Lazy<object>(() => function());
    }

    public static T Resolve<T>()
    {
        return (T)Resolve(typeof(T));
    }

    public static object Resolve(Type type)
    {
        Lazy<object> service;
        if (services.TryGetValue(type, out service))
        {
            return service.Value;
        }
        throw new Exception("Service not found!");
    }
}

This class is inspired by the simplicity of XNA/MonoGame's GameServiceContainer class and follows the service locator pattern. The main differences are the heavy use of generics and the fact that it is a static class. To use our ServiceContainer class, we will declare the version of ISettings or other interfaces that we want to use throughout our application by calling Register, as seen in the following lines of code:

//iOS version of ISettings
ServiceContainer.Register<ISettings>(() => new AppleSettings());

//Android version of ISettings
ServiceContainer.Register<ISettings>(() => new DroidSettings());

//You can even register ViewModels
ServiceContainer.Register<SettingsViewModel>(() =>
    new SettingsViewModel());

On iOS, you can place this registration code in either your static void Main() method or in the FinishedLaunching method of your AppDelegate class. These methods are always called before the application is started. On Android, it is a little more complicated. You cannot put this code in the OnCreate method of your activity that acts as the main launcher. In some situations, the Android OS can close your application but restart it later in another activity. This situation is likely to cause an exception somewhere. The guaranteed safe place to put this is in a custom Android Application class which has an OnCreate method that is called prior to any activities being created in your application. The following lines of code show you the use of the Application class:

[Application]
public class Application : Android.App.Application
{
    //This constructor is required
    public Application(IntPtr javaReference, JniHandleOwnership transfer)
        : base(javaReference, transfer)
    {
    }

    public override void OnCreate()
    {
        base.OnCreate();
        //IoC Registration here
    }
}

To pull a service out of the ServiceContainer class, we can rewrite the constructor of the SettingsViewModel class so that it is similar to the following lines of code:

public SettingsViewModel()
{
    this.settings = ServiceContainer.Resolve<ISettings>();
}

Likewise, you will use the generic Resolve method to pull out any ViewModel classes you would need to call from within controllers on iOS or activities on Android. This is a great, simple way to manage dependencies within your application. There are, of course, some great open source libraries out there that implement IoC for C# applications. You might consider switching to one of them if you need more advanced features for service location or just want to graduate to a more complicated IoC container.
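A quick aside that is not part of the original article: one practical payoff of resolving ISettings through ServiceContainer is that unit tests can register a fake implementation and exercise the ViewModel without touching NSUserDefaults or SharedPreferences. The following is a minimal, hypothetical sketch; the FakeSettings class, the test name, and the use of NUnit are my own assumptions, not from the book:

using NUnit.Framework;

//A fake ISettings that simply keeps the value in memory for tests
public class FakeSettings : ISettings
{
    public bool IsSoundOn { get; set; }
}

[TestFixture]
public class SettingsViewModelTests
{
    [Test]
    public void Save_WritesSoundFlagToSettings()
    {
        //Register the fake before the ViewModel resolves ISettings
        var fake = new FakeSettings();
        ServiceContainer.Register<ISettings>(() => fake);

        var viewModel = new SettingsViewModel();
        viewModel.IsSoundOn = true;
        viewModel.Save();

        Assert.IsTrue(fake.IsSoundOn);
    }
}

The same swap works with any of the ready-made IoC containers listed next; only the registration call changes.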
Here are a few libraries that have been used with Xamarin projects:

TinyIoC: https://github.com/grumpydev/TinyIoC
Ninject: http://www.ninject.org/
MvvmCross: https://github.com/slodge/MvvmCross includes a full MVVM framework as well as IoC
Simple Injector: http://simpleinjector.codeplex.com
OpenNETCF.IoC: http://ioc.codeplex.com

Summary

In this article, we learned about the MVVM design pattern and how it can be used to better architect cross-platform applications. We compared several project organization strategies for managing a Xamarin Studio solution that contains both iOS and Android projects. We went over portable class libraries as the preferred option for sharing code and how to use preprocessor statements as a quick and dirty way to implement platform-specific code. After completing this article, you should be up to speed with several techniques for sharing code between iOS and Android applications using Xamarin Studio. Using the MVVM design pattern will help you divide your shared code from the code that is platform specific. We also covered several options for setting up cross-platform Xamarin solutions. You should also have a firm understanding of using dependency injection and Inversion of Control to give your shared code access to the native APIs on each platform.

Resources for Article:

Further resources on this subject: XamChat – a Cross-platform App [article] Configuring Your Operating System [article] Updating data in the background [article]
Benchmarking and Optimizing Go Code, Part 1

Alex Browne
16 Mar 2015
6 min read
In this two part post, you'll learn the basics of how to benchmark and optimize Go code. You'll do this by trying out a few different implementations for calculating a specific element in Pascal's Triangle (a.k.a. the binomial coefficient). I'll assume that you are already familiar with Go, but if you aren't, I recommend the interactive tutorial. All of the code for these two posts is available on github.

Installation & Set Up

To follow along, you will need to install Go version 1.2 or later. Also, make sure that you follow these instructions for setting up your Go work environment. In particular, you will need to have the GOPATH environment variable pointing to a directory where all your Go code will reside.

The Goal

Our goal is to write a function to calculate a specific element of Pascal's Triangle. In case you aren't familiar with Pascal's Triangle, it looks something like this:

Pascal's Triangle follows these simple rules:
The first row contains a single element (1).
Every subsequent element is the sum of the two elements directly above it (one above and to the left, and the other above and to the right).
If either the element above and to the left or the one above and to the right is absent, consider it to be equal to zero.

For convenience, we'll index all of the rows and columns in Pascal's Triangle, starting with 0. So the element at the very top of the triangle is at index (0, 0). The element at index (4, 1) would be at row 4 and column 1, which is 1 + 3 = 4. The function we'll be writing will return the element at row n, column m of Pascal's Triangle and has the following signature:

func Pascal(n, m int) int

So Pascal(0, 0) should return 1 and Pascal(5, 2) would return 10. In these posts, we'll write several different implementations of the Pascal function, test each one for correctness, and benchmark them to see which is the fastest.

Project Structure

Now's a good time to set up the directory where your code for this project will reside. Somewhere in $GOPATH/src, create a new directory and call it whatever you want. I recommend $GOPATH/src/github.com/your-github-username/go-benchmark-example. Our basic project structure is going to look like this:

go-benchmark-example/
    common/
        common.go
    implementations/
        builtin.go
        naive.go
        recursive.go
    test/
        benchmark_test.go
        pascal_test.go

The implementations package will hold a few different implementations of the Pascal function, each in their own file. The test package is where we will test the implementations for correctness (in pascal_test.go) and then benchmark their performance (in benchmark_test.go). Without further ado, let's start writing code!

The Pascaler Interface

To make comparing different implementations easier, we'll create an interface that all of the implementations should implement. (You'll see why this is handy when we write the tests and benchmarks). Add the following to common/common.go:

package common

type Pascaler interface {
    Pascal(int, int) int
}

That's it! All we've done is declare an interface that consists of one method, Pascal, which takes two ints as arguments (the row and column) and returns the value of the specified element in Pascal's Triangle. The seemingly odd name "Pascaler" is just the convention in Go for an interface with only one method.

Naive Implementation

The first implementation we'll write is a naive iterative one. The basic idea is to generate the triangle from top to bottom until we reach row n and column m.
We call this implementation "naive" because it does not attempt to do anything clever and will probably not perform the best. Add the following to implementations/naive.go:

package implementations

type naiveType struct{}

var Naive = naiveType{}

func (p naiveType) Pascal(n, m int) int {
    // Instantiate a slice to hold n+1 rows
    rows := make([][]int, n+1)
    // Start by hard-coding the first two rows
    if n < 2 {
        rows = make([][]int, 2)
    }
    rows[0] = []int{1}
    rows[1] = []int{1, 1}
    // Iterate from top to bottom until we reach row n
    for i := 2; i <= n; i++ {
        numColumns := i + 1
        rows[i] = make([]int, numColumns)
        rows[i][0] = 1
        rows[i][numColumns-1] = 1
        for j := 1; j < numColumns-1; j++ {
            // Element (i, j) is equal to the sum of the two elements
            // directly above it
            rows[i][j] = rows[i-1][j-1] + rows[i-1][j]
        }
    }
    return rows[n][m]
}

// Implement the Stringer interface so we can print the Naive
// object directly with fmt.Print and friends
func (p naiveType) String() string {
    return "Naive Implementation"
}

Testing for Correctness

The next thing we'll do is test our implementation to make sure it is correct. Since we want to test each implementation the same way, we'll write a small utility function called testPascaler, which takes a Pascaler as an argument. The function will iterate through an array of test cases and run each case against the given implementation. Add the following to test/pascal_test.go:

package test

import (
    "github.com/albrow/go-benchmark-example/common"
    "github.com/albrow/go-benchmark-example/implementations"
    "reflect"
    "testing"
)

func TestNaive(t *testing.T) {
    testPascaler(t, implementations.Naive)
}

// testPascaler can be used to test any implementation. It uses a
// series of test cases and reports any errors using t.Error.
func testPascaler(t *testing.T, p common.Pascaler) {
    // cases is an array of test cases, each consisting of two inputs
    // n and m and the expected output.
    cases := []struct {
        n        int
        m        int
        expected int
    }{
        {0, 0, 1},
        {1, 0, 1},
        {1, 1, 1},
        {4, 1, 4},
        {6, 4, 15},
        {7, 3, 35},
        {7, 0, 1},
        {7, 7, 1},
        {32, 16, 601080390},
        {64, 32, 1832624140942590534},
    }
    // Iterate through each test case and check the result
    for _, c := range cases {
        if reflect.TypeOf(p) == reflect.TypeOf(implementations.Recursive) && c.n > 30 {
            // Skip cases where n is too large for the recursive implementation.
            // It takes too long and might even timeout.
            continue
        }
        got := p.Pascal(c.n, c.m)
        if got != c.expected {
            t.Errorf("Incorrect result for %s with inputs (%d, %d).\nExpected %d but got %d.", p, c.n, c.m, c.expected, got)
        }
    }
}

To run the test, just run the following command from your project directory:

go test ./...

If everything works as expected and the test passes, you should see the following output:

?    github.com/albrow/go-benchmark-example/common [no test files]
?    github.com/albrow/go-benchmark-example/implementations [no test files]
ok   github.com/albrow/go-benchmark-example/test 0.005s

Conclusion

Follow along in Part 2 where we will cover benchmarking for performance, a recursive implementation, and a builtin implementation.

About the Author

Alex Browne is a programmer and entrepreneur with about 4 years of product development experience and 3 years experience working on small startups. He has worked with a wide variety of technologies and has single-handedly built many production-grade applications. He is currently working with two co-founders on an early stage startup exploring ways of applying machine learning and computer vision to the manufacturing industry. His favorite programming language is Go.
Text Mining with R: Part 1

Robi Sen
16 Mar 2015
7 min read
R is rapidly becoming the platform of choice for programmers, scientists, and others who need to perform statistical analysis and data mining. In part this is because R is incredibly easy to learn, and with just a few commands you can perform data mining and analysis functions that would be very hard in more general purpose languages like Ruby, .Net, Java, or C++. To demonstrate R's ease, flexibility, and power we will look at how to use R to examine a collection of tweets from the 2014 Super Bowl, clean up the data via R, turn that data into a document matrix so we can analyze the data, and then create a "word cloud" so we can visualize our analysis to look for interesting words.

Getting Started

To get started you need to download both R and RStudio. R can be found here and RStudio can be found here. R and RStudio are available for most major operating systems and you should follow the up to date installation guides on their respective websites. For this example we are going to be using a data set from Techtunk which is rather large. For this article I have taken a small excerpt of Techtunk's SuperBowl 2014 data, over 8 million tweets, and cleaned it up for the article. You can download it from the original data source here. Finally, you will need to install the R text mining package (tm) and word cloud package (wordcloud). You can use the standard library method to install the packages or just use RStudio to install the packages.

Preparing our Data

As already stated, you can find the total SuperBowl 2014 dataset. That being said, it's very large and broken up into many sets of pipe delimited files, and they have the .csv file extension but are not .csv, which can be somewhat awkward to work with. This though is a common problem when working with large data sets. Luckily the data set is broken up into fragments; usually when you are working with large datasets you do not want to try to start developing against the whole data set, but rather a small and workable dataset that will let you quickly develop your scripts without being so large and unwieldy that it delays development. Indeed you will find that working with the large files provided by Techtunk can take tens of minutes to process as is. In cases like this it is good to look at the data, figure out what data you want, take a sample set of data, massage it as needed, and then work on it from there until you have your code working exactly how you want. In our case I took a subset of 4600 tweets from one of the pipe delimited files, converted the file format to Comma Separated Values, .csv, and saved it as a sample file to work from. You can do the same thing; you should consider using files smaller than 5000 records, however you would like, or use the file created for this post here.

Visualizing our Data

For this post all we want to do is get a general sense of what the more common words are that are being tweeted during the Super Bowl. A common way to visualize this is with a word cloud, which will show us the frequency of a term by representing it in a greater size compared to other words, in proportion to how many times it is mentioned in the body of tweets being analyzed. To do this we need to do a few things first with our data. First we need to read in our file and turn our collection of tweets into a corpus. In general a corpus is a large body of text documents. In R's text mining (tm) package it's an object that will be used to hold our tweets in memory.
So to load our tweets as a corpus into R you can do as shown here: # change this file location to suit your machine file_loc <- "yourfilelocation/largerset11.csv" # change TRUE to FALSE if you have no column headings in the CSV myfile <- read.csv(file_loc, header = TRUE, stringsAsFactors=FALSE) require(tm) mycorpus <- Corpus(DataframeSource(myfile[c("username","tweet")])) You can now simply print your Corpus to get a sense of it. > print(mycorpus) <<VCorpus (documents: 4688, metadata (corpus/indexed): 0/0)>> In this case, VCorpus is an automatic assignment meaning that the Corpus is a Volatile object stored in memory. If you want you can make the Corpus permanent using PCorpus. You might do this if you were doing analysis on actual documents such as PDF’s or even databases and in this case R makes pointers to the documents instead of full document structures in memory. Another method you can use to look at your corpus is inspect() which provides a variety of ways to look at the documents in your corpus. For example using: inspect(mycorpus[1,2]) This might give you a result like: > inspect(mycorpus[1:2]) <<VCorpus (documents: 2, metadata (corpus/indexed): 0/0)>> [[1]] <<PlainTextDocument (metadata: 7)>> sstanley84 wow rt thestanchion news fleury just made fraudulent investment karlsson httptco5oi6iwashg [[2]] <<PlainTextDocument (metadata: 7)>> judemgreen 2 hour wait train superbowl time traffic problemsnice job chris christie As such inspect can be very useful in quickly getting a sense of data in your corpus without having to try to print the whole corpus. Now that we have our corpus in memory let's clean it up a little before we do our analysis. Usually you want to do this on documents that you are analyzing to remove words that are not relevant to your analysis such as “stopwords” or words such as and, like, but, if, the, and the like which you don’t care about. To do this with the textmining package you want to use transforms. Transforms essentially various functions to all documents in a corpus and that the form of tm_map(your corpus, some function). For example we can use tm_map like this: mycorpus <- tm_map(mycorpus, removePunctuation) Which will now remove all the punctuation marks from our tweets. We can do some other transforms to clean up our data by converting all the text to lower case, removing stop words, extra whitespace, and the like. mycorpus <- tm_map(mycorpus, removePunctuation) mycorpus <- tm_map(mycorpus, content_transformer(tolower)) mycorpus <- tm_map(mycorpus, stripWhitespace) mycorpus <- tm_map(mycorpus, removeWords, c(stopwords("english"), "news")) Note the last line. In that case we are using the stopwords() method but also adding our own word to it; news. You could append your own list of stopwords in this manner. Summary In this post we have looked at the basics of doing text mining in R by selecting data, preparing it, cleaning, then performing various operations on it to visualize that data. In the next post, Part 2, we look at a simple use case showing how we can derive real meaning and value from a visualization by seeing how a simple word cloud and help you understand the impact of an advertisement. About the author Robi Sen, CSO at Department 13, is an experienced inventor, serial entrepreneur, and futurist whose dynamic twenty-plus year career in technology, engineering, and research has led him to work on cutting edge projects for DARPA, TSWG, SOCOM, RRTO, NASA, DOE, and the DOD. 
Robi also has extensive experience in the commercial space, including the co-creation of several successful start-up companies. He has worked with companies such as UnderArmour, Sony, CISCO, IBM, and many others to help build out new products and services. Robi specializes in bringing his unique vision and thought process to difficult and complex problems allowing companies and organizations to find innovative solutions that they can rapidly operationalize or go to market with.
Zabbix Configuration

Packt
16 Mar 2015
18 min read
In this article by Patrik Uytterhoeven, author of the book Zabbix Cookbook, we will see the following topics: Server installation and configuration Agent installation and configuration Frontend installation and configuration (For more resources related to this topic, see here.) We will begin with the installation and configuration of a Zabbix server, Zabbix agent, and web interface. We will make use of our package manager for the installation. Not only will we show you how to install and configure Zabbix, we will also show you how to compile everything from source. We will also cover the installation of the Zabbix server in a distributed way. Server installation and configuration Here we will explain how to install and configure the Zabbix server, along with the prerequisites. Getting ready To get started, we need a properly configured server, with a Red Hat 6.x or 7.x 64-bit OS installed or a derivate such as CentOS. It is possible to get the installation working on other distributions such as SUSE, Debian, Ubuntu, or another Linux distribution, but I will be focusing on Red Hat based systems. I feel that it's the best choice as the OS is not only available for big companies willing to pay Red Hat for support, but also for those smaller companies that cannot afford to pay for it, or for those just willing to test it or run it with community support. Other distros like Debian, Ubuntu, SUSE, OpenBSD will work fine too. It is possible to run Zabbix on 32-bit systems, but I will only focus on 64-bit installations as 64-bit is probably what you will run in a production setup. However if you want to try it on 32-bit system, it is perfectly possible with the use of the Zabbix 32-bit binaries. How to do it... The following steps will guide you through the server installation process: The first thing we need to do is add the Zabbix repository to our package manager on our server so that we are able to download the Zabbix packages to set up our server. To find the latest repository, go to the Zabbix webpage www.zabbix.com and click on Product | Documentation then select the latest version. At the time of this writing, it is version 2.4. From the manual, select option 3 Installation, then go to option 3 Installation from packages and follow instructions to install the Zabbix repository. Now that our Zabbix repository is installed, we can continue with our installation. For our Zabbix server to work, we will also need a database for example: MySQL, PostgreSQL, Oracle and a web server for the frontend such as Apache, Nginx, and so on. In our setup, we will install Apache and MySQL as they are better known and easiest to set up. There is a bit of a controversy around MySQL that was acquired by Oracle some time ago. Since then, most of the original developers left and forked the project. Those forks have also made major improvements over MySQL. It could be a good alternative to make use of MariaDB or Percona. In Red Hat Enterprise Linux (RHEL) 7.x, MySQL has been replace already by MariaDB. http://www.percona.com/. https://mariadb.com/. http://www.zdnet.com/article/stallman-admits-gpl-flawed-proprietary-licensing-needed-to-pay-for-mysql-development/. 
The following steps will show you how to install the MySQL server and the Zabbix server with a MySQL connection:

# yum install mysql-server zabbix-server-mysql
# yum install mariadb-server zabbix-server-mysql (for RHEL 7)
# service mysqld start
# systemctl start mariadb.service (for RHEL 7)
# /usr/bin/mysql_secure_installation

We make use of MySQL because it is what most people know best and use most of the time. It is also easier to set up than PostgreSQL for most people. However, a MySQL DB will not shrink in size. It's probably wise to use PostgreSQL instead, as PostgreSQL has a housekeeper process that cleans up the database. However, in very large setups this housekeeper process of PostgreSQL can at times also be the cause of slowness. When this happens, a deeper understanding of how the housekeeper works is needed. MySQL will come and ask us some questions here so make sure you read the next lines before you continue: For the MySQL secure installation, we are being asked to give the current root password or press Enter if you don't have one. This is the root password for MySQL and we don't have one yet as we did a clean installation of MySQL. So you can just press Enter here. The next question will be to set a root password; the best thing is, of course, to set a MySQL root password. Make it a complex one and store it safely in a program such as KeePass or Password Safe. After the root password is set, MySQL will prompt you to remove anonymous users. You can select Yes and let MySQL remove them. We also don't need any remote login of root users, so it is best to disallow remote login for the root user as well. For our production environment, we don't need any test databases left on our server. So those can also be removed from our machine and finally we do a reload of the privileges. You can now continue with the rest of the configuration by configuring our database and starting all the services. This way we make sure they will come up again when we restart our server:

# mysql -u root -p
mysql> create database zabbix character set utf8 collate utf8_bin;
mysql> grant all privileges on zabbix.* to zabbix@localhost identified by '<some-safe-password>';
mysql> exit;
# cd /usr/share/doc/zabbix-server-mysql-2.4.x/create
# mysql -u zabbix -p zabbix < schema.sql
# mysql -u zabbix -p zabbix < images.sql
# mysql -u zabbix -p zabbix < data.sql

Depending on the speed of your machine, importing the schema could take some time (a few minutes). It's important to not mix up the order of the import of the SQL files! Now let's edit the Zabbix server configuration file and add our database settings in it:

# vi /etc/zabbix/zabbix_server.conf
DBHost=localhost
DBName=zabbix
DBUser=zabbix
DBPassword=<some-safe-password>

Let's start our Zabbix server and make sure it will come online together with the MySQL database after reboot:

# service zabbix-server start
# chkconfig zabbix-server on
# chkconfig mysqld on

On RHEL 7 this will be:

# systemctl start zabbix-server
# systemctl enable zabbix-server
# systemctl enable mariadb

Check now if our server was started correctly:

# tail /var/log/zabbix/zabbix_server.log

The output would look something like this:

1788:20140620:231430.274 server #7 started [poller #5]
1804:20140620:231430.276 server #19 started [discoverer #1]

If no errors were displayed in the log, your zabbix-server is online.
In case you have errors, they will probably look like this:

1589:20150106:211530.180 [Z3001] connection to database 'zabbix' failed: [1045] Access denied for user 'zabbix'@'localhost' (using password: YES)
1589:20150106:211530.180 database is down: reconnecting in 10 seconds

In this case, go back to the zabbix_server.conf file and check the DBHost, DBName, DBUser, and DBPassword parameters again to see if they are correct. The only thing that still needs to be done is editing the firewall. Add the following line in the /etc/sysconfig/iptables file under the line with dport 22. This can be done with vi, Emacs, or another editor, for example: vi /etc/sysconfig/iptables. If you would like to know more about iptables have a look at the CentOS wiki.

# -A INPUT -m state --state NEW -m tcp -p tcp --dport 10051 -j ACCEPT

People making use of RHEL 7 have firewalld and need to run the following command instead:

# firewall-cmd --permanent --add-port=10051/tcp

Now that this is done, you can reload the firewall. The Zabbix server is installed and we are ready to continue to the installation of the agent and the frontend:

# service iptables restart
# firewall-cmd --reload (For users of RHEL 7)

Always check that the ports 10051 and 10050 are also in your /etc/services file; both the server and agent ports are IANA registered.

How it works...

The installation we have done here is just for the Zabbix server and the database. We still need to add an agent and a frontend with a web server. The Zabbix server will communicate through the local socket with the MySQL database. Later, we will see how we can change this if we want to install MySQL on another server than our Zabbix server. The Zabbix server needs a database to store its configuration and the received data, for which we have installed a MySQL database. Remember we did a create database and named it zabbix? Then we did a grant on the zabbix database and we gave all privileges on this database to a user with the name zabbix with some free to choose password <some-safe-password>. After the creation of the database we had to upload three files, namely schema.sql, images.sql, and data.sql. Those files contain the database structure and data for the Zabbix server to work. It is very important that you keep the correct order when you upload them to your database. The next thing we did was adjusting the zabbix_server.conf file; this is needed to let our Zabbix server know what database we have used with what credentials and where the location is. The next thing we did was starting the Zabbix server and making sure that with a reboot, both MySQL and the Zabbix server would start up again. Our final step was to check the log file to see if the Zabbix server was started without any errors and the opening of TCP port 10051 in the firewall. Port 10051 is the port being used by Zabbix active agents to communicate with the server.
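As a quick optional check that is not part of the original recipe, you can verify from the shell that the server process is listening on port 10051 and that the firewall rule is in place (this assumes the standard net-tools and iptables utilities are available on your system):

# netstat -tulpn | grep zabbix_server
# iptables -L INPUT -n | grep 10051
# tail -n 20 /var/log/zabbix/zabbix_server.log

On RHEL 7 you can use ss -tnlp | grep 10051 and firewall-cmd --list-ports instead. If the port shows up as listening and the log stays free of errors, the server side is ready for the agent and frontend that we will install next.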
So let's have a look at which are the other options that we can change. The following URL gives us an overview of all supported parameters in the zabbix_server.conf file: https://www.zabbix.com/documentation/2.4/manual/appendix/config/zabbix_server. You can start the server with another configuration file so you can experiment with multiple configuration settings. This can be useful if you like to experiment with certain options. To do this, run the following command where the <config file> file is another zabbix_server.conf file than the original file: zabbix_server -c <config file> See also http://www.greensql.com/content/mysql-security-best-practices-hardening-mysql-tips http://passwordsafe.sourceforge.net/ http://keepass.info/ http://www.fpx.de/fp/Software/Gorilla/ http://wiki.centos.org/HowTos/Network/IPTables Agent installation and configuration In this section, we will explain you the installation and configuration of the Zabbix agent. The Zabbix agent is a small piece of software about 700 KB in size. You will need to install this agent on all your servers to be able to monitor the local resources with Zabbix. Getting ready In this recipe, to get our Zabbix agent installed, we need to have our server with the Zabbix server up and running. In this setup, we will install our agent first on the Zabbix server. We just make use of the Zabbix server in this setup to install a server that can monitor itself. If you monitor another server then there is no need to install a Zabbix server, only the agent is enough. How to do it... Installing the Zabbix agent is quite easy once our server has been set up. The first thing we need to do is install the agent package. Installing the agent packages can be done by running yum. In case you have skipped it, then go back and add the Zabbix repository to your package manager. Install the Zabbix agent from the package manager: # yum install zabbix-agent Open the correct port in your firewall. The Zabbix server communicates to the agent if the agent is passive. So, if your agent is on a server other than Zabbix, then we need to open the firewall on port 10050. Edit the firewall, open the file /etc/sysconfig/iptables and add the following after the line with dport 22 in the next line: # -A INPUT -m state --state NEW -m tcp -p tcp --dport 10050 -j ACCEPT Users of RHEL 7 can run: # firewall-cmd --permanent --add-port=10050/tcp Now that the firewall is adjusted, you can restart the same: # service iptables restart # firewall-cmd --reload (if you use RHEL 7) The only thing left to do is edit the zabbix_agentd.conf file, start the agent, and make sure it starts after a reboot. Edit the Zabbix agent configuration file and add or change the following settings: # vi /etc/zabbix/zabbix_agentd.conf Server=<ip of the zabbix server> ServerActive=<ip of the zabbix server> That's all for now in order to edit in the zabbix_agentd.conf file. Now, let's start the Zabbix agent: # service zabbix-agent start # systemctl start zabbix-agent (if you use RHEL 7) And finally make sure that our agent will come online after a reboot: # chkconfig zabbix-agent on # systemctl enable zabbix-agent (for RHEL 7 users) Check again that there are no errors in the log file from the agent: # tail /var/log/zabbix/zabbix_agentd.log How it works... The agent we have installed is installed from the Zabbix repository on the Zabbix server, and communicates to the server on port 10051 if we make use of an active agent. 
If we make use of a passive agent, then our Zabbix server will talk to the Zabbix agent on port 10050. Remember that our agent is installed locally on our host, so all communication stays on our server. This is not the case if our agent is installed on another server instead of our Zabbix server. We have edited the configuration file from the agent and changed the Server and ServerActive options. Our Zabbix agent is now ready to communicate with our Zabbix server. Based on the two parameters we have changed, the agent knows what the IP of the Zabbix server is. The difference between passive and active modes is that the client in passive mode will wait for the Zabbix server to ask for data from the Zabbix agent. The agent in active mode will ask the server first what it needs to monitor and pull this data from the Zabbix server. From that moment on, the Zabbix agent will send the values by itself to the server at regular intervals. So when we use a passive agent the Zabbix server pulls the data from the agent, whereas an active agent pushes the data to the server. We did not change the hostname item in the zabbix_agentd.conf file, a parameter we normally need to change to give the host a unique name. In our case the name in the agent will already be in the Zabbix server that we have installed, so there is no need to change it this time.

There's more...

Just like our server, the agent has plenty more options to set in its configuration file. So open the file again and have a look at what else we can adjust. In the following URLs you will find all options that can be changed in the Zabbix agent configuration file for Unix and Windows:

https://www.zabbix.com/documentation/2.4/manual/appendix/config/zabbix_agentd
https://www.zabbix.com/documentation/2.4/manual/appendix/config/zabbix_agentd_win

Frontend installation and configuration

In this recipe, we will finalize our setup with the installation and configuration of the Zabbix web interface. Our Zabbix configuration is different from other monitoring tools such as Nagios in the way that the complete configuration is stored in a database. This means that we need a web interface to be able to configure and work with the Zabbix server. It is not possible to work without the web interface and just make use of some text files to do the configuration.

Getting ready

To be successful with this installation, you need to have installed the Zabbix server. It's not necessary to have the Zabbix client installed but it is recommended. This way, we can monitor our Zabbix server because we have a Zabbix agent running on our Zabbix server. This can be useful in monitoring your own Zabbix server's health status.

How to do it...

The first thing we need to do is go back to our prompt and install the Zabbix web frontend packages.

# yum install zabbix-web zabbix-web-mysql

With the installation of our Zabbix-web package, Apache was installed too, so we need to start Apache first and make sure it will come online after a reboot:

# chkconfig httpd on; service httpd start
# systemctl start httpd; systemctl enable httpd (for RHEL 7)

Remember we have a firewall, so the same rule applies here. We need to open the port for the web server to be able to see our Zabbix frontend. Edit the /etc/sysconfig/iptables firewall file and add the following after the line with dport 22:

# -A INPUT -m state --state NEW -m tcp -p tcp --dport 80 -j ACCEPT

If iptables is too intimidating for you, then an alternative could be to make use of Shorewall.
http://www.cyberciti.biz/faq/centos-rhel-shorewall-firewall-configuration-setup-howto-tutorial/. Users of RHEL 7 can run the following lines: # firewall-cmd --permanent --add-service=http The following screenshot shows the firewall configuration: Now that the firewall is adjusted, you can save and restart the firewall: # iptables-save # service iptables restart # firewall-cmd --reload (If you run RHEL 7) Now edit the Zabbix configuration file with the PHP setting. Uncomment the option for the timezone and fill in the correct timezone: # vi /etc/httpd/conf.d/zabbix.conf php_value date.timezone Europe/Brussels It is now time to reboot our server and see if everything comes back online with our Zabbix server configured like we intended it to. The reboot here is not necessary but it's a good test to see if we did a correct configuration of our server: # reboot Now let's see if we get to see our Zabbix server. Go to the URL of our Zabbix server that we just have installed: # http://<ip of the Zabbix server>/zabbix On the first page, we see our welcome screen. Here, we can just click Next: The standard Zabbix installation will run on port 80, although It isn't really a safe solution. It would be better to make use of HTTPS. However, this is a bit out of the scope but could be done with not too much extra work and would make Zabbix more safe. http://wiki.centos.org/HowTos/Https. Next screen, Zabbix will do a check of the PHP settings. Normally they should be fine as Zabbix provides a file with all correct settings. We only had to change the timezone parameter, remember? In case something goes wrong, go back to the zabbix.conf file and check the parameters: Next, we can fill in our connection details to connect to the database. If you remember, we did this already when we installed the server. Don't panic, it's completely normal. Zabbix, as we will see later, can be setup in a modular way so the frontend and the server both need to know where the database is and what the login credentials are. Press Test connection and when you get an OK just press Next again: Next screen, we have to fill in some Zabbix server details. Host and port should already be filled in; if not, put the correct IP and port in the fields. The field Name is not really important for the working of our Zabbix server but it's probably better to fill in a meaningful name here for your Zabbix installation: Now our setup is finished, and we can just click Next till we get our login screen. The Username and Password are standard the first time we set up the Zabbix server and are Admin for Username and zabbix for the Password: Summary In this article we saw how to install and configure Zabbix server, Zabbix agent, and web interface. We also learnt the commands for MySQL server and the Zabbix server with a MySQL connection. Then we went through the installation and configuration of Zabbix agent. Further we learnt to install the Zabbix web frontend packages and installing the firewall packages. Finally the we saw the steps for installation and configuration of Zabbix through screenshots. By learning the basics of Zabbix, we can now proceed with this technology. Resources for Article: Further resources on this subject: Going beyond Zabbix agents [article] Using Proxies to Monitor Remote Locations with Zabbix 1.8 [article] Triggers in Zabbix 1.8 [article]

Add a Twitter Sign In To Your iOS App with TwitterKit

Doron Katz
15 Mar 2015
What is TwitterKit & Digits?
In this post we take a look at Twitter's new sign-in API, TwitterKit and Digits, bundled as part of its new Fabric suite of tools announced by Twitter earlier this year, and provide you with two quick snippets on how to integrate Twitter's sign-in mechanism into your iOS app.
Facebook, and to a lesser extent Google, have for a while dominated the single sign-on paradigm, through their SDKs or the Accounts.framework on iOS, encouraging developers to provide a consolidated form of login for their users. Twitter has decided to finally get on the bandwagon and work toward improving its brand by increasing sign-on participation and providing a concise way for users to log in to their favorite apps without needing to remember individual passwords.
By providing a Login via Twitter button, developers gain the user's Twitter identity and, subsequently, their associated tweets and connections. That is, once the Twitter account is identified, the app can engage followers of that account (friends), or access the user's history of tweets to data-mine for a specific keyword or hashtag. In addition to offering single sign-on, Twitter is also offering Digits, the ability for users to sign in anonymously using their phone number, analogous to Facebook's new Anonymous Login API.
The benefits of Digits
The rationale behind Digits is to give users a choice: those who trust the app or website can provide their Twitter identity in order to log in, while the more hesitant ones who want to protect their social graph history can provide only a unique number, which happens to be their mobile number, as a means of identification and authentication. Another benefit for users is that logging in is dead simple: rather than going through a deterring form of identification questions, you just ask them for their number, to which they receive an authentication confirmation SMS that lets them log in.
With this brief introduction to TwitterKit and Digits, let's show you how simple it is to implement each one.
Logging in with TwitterKit
Twitter wanted to make implementing its authentication mechanism a simpler and more attractive process for developers, and it did. By using the SDK as part of Twitter's Fabric suite, you already get your Twitter app set up, ready, and registered for use with the company's SDK. TwitterKit aims to leverage the existing Twitter account on iOS, using the Accounts.framework, which is the preferred and most rudimentary approach, with a fallback to the OAuth mechanism. The easiest way to implement Twitter authentication is through the generated button, TWTRLogInButton, as we will demonstrate using Swift:
let authenticationButton = TWTRLogInButton(logInCompletion: { (session, error) in
    if (session != nil) {
        //We signed in, storing session in session object.
    } else {
        //we get an error, accessible from error object
    }
})
It's quite simple, leaving you with a TWTRLogInButton subclass instance that you can add to your view hierarchy and have users interact with.
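As a rough sketch of how that button might be wired up in practice (the view controller, the button placement, and the use of the session's userName property are illustrative assumptions, not part of the original snippet), you could create it in viewDidLoad and add it to the view hierarchy:
import UIKit
import TwitterKit

class LoginViewController: UIViewController {
    override func viewDidLoad() {
        super.viewDidLoad()
        // Pre-configured sign-in button supplied by TwitterKit
        let authenticationButton = TWTRLogInButton(logInCompletion: { (session, error) in
            if (session != nil) {
                // Signed in; the session object carries the user's Twitter identity
                println("Signed in as \(session.userName)")
            } else {
                // Something went wrong; inspect the error object
                println("Login error: \(error)")
            }
        })
        // Place the button and add it to the view hierarchy
        authenticationButton.center = self.view.center
        self.view.addSubview(authenticationButton)
    }
}
Tapping the button walks the user through the system Twitter account or the OAuth fallback and hands the resulting session, or an error, to the completion block.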
Logging in with Digits
Having created a login button using TwitterKit, we will now create the same feature using Digits. The simplicity of implementation is maintained with Digits: the simplest process is, once again, to create a pre-configured button, DGTAuthenticateButton, and hand it an authentication completion block:
let authenticationButton = DGTAuthenticateButton(authenticationCompletion: { (session, error) in
    if (session != nil) {
        //We signed in, storing session in session object.
    } else {
        //we get an error, accessible from error object
    }
})
Summary
Implementing TwitterKit and Digits is quite straightforward in iOS, and the two serve different intentions. Whereas TwitterKit gives you full access to the authenticated user's social history, the latter allows for a more abbreviated, privacy-protected approach to authenticating. If at some stage the user decides to trust the app and feels more comfortable providing full access to her or his social history, you can cater to that later in the app's usage.
The complete iOS reference for TwitterKit and Digits can be found in Twitter's Fabric documentation. The popularity and uptake of TwitterKit remains to be seen, but as an extra option for developers alongside Facebook and Google+ login, users will have the option to pick their most trusted social media tool as their choice of authentication. Providing an anonymous mode of login also falls in line with a more privacy-conscious world, and Digits certainly provides a seamless way of implementing it, and an impressively straightforward way for users to authenticate using their phone number.
We have briefly demonstrated how to interact with Twitter's SDK using iOS and Swift, but there is also an Android SDK version, with a web version in the pipeline very soon, according to Twitter. This is certainly worth exploring, along with the rest of the tools offered in the Fabric suite, including analytics and beta-distribution tools, and more.
About the author
Doron Katz is an established mobile project manager, architect, and developer, and a pupil of the methodology of Agile project management, such as applying Kanban principles. Doron also believes in Behaviour-Driven Development (BDD), anticipating user interaction prior to design. Doron is also heavily involved in various technical user groups, such as CocoaHeads Sydney and the Adobe User Group.

A Sample LEMP Stack

Packt
12 Mar 2015
This article is written by Michael Peacock, the author of Creating Development Environments with Vagrant (Second Edition). Now that we have a good knowledge of using Vagrant to manage software development projects and how to use the Puppet provisioning tool, let's take a look at how to use these tools to build a Linux, Nginx, MySQL, and PHP (LEMP) development environment with Vagrant. In this article, you will learn the following topics: How to update the package manager How to create a LEMP-based development environment in Vagrant, including the following: How to install the Nginx web server How to customize the Nginx configuration file How to install PHP How to install and configure MySQL How to install e-mail sending services With the exception of MySQL, we will create simple Puppet modules to install and manage the software required. For MySQL, we will use the official Puppet module from Puppet Labs; this module makes it very easy for us to install and configure all aspects of MySQL. (For more resources related to this topic, see here.) Creating the Vagrant project First, we want to create a new project, so let's create a new folder called lemp-stack and initialize a new ubuntu/trusty64 Vagrant project within it by executing the following commands: mkdir lemp-stack cd lemp-stack vagrant init ubuntu/trusty64 ub The easiest way for us to pull in the MySQL Puppet module is to simply add it as a git submodule to our project. In order to add a git submodule, our project needs to be a git repository, so let's initialize it as a git repository now to save time later: git init To make the virtual machine reflective of a real-world production server, instead of forwarding the web server port on the virtual machine to another port on our host machine, we will instead network the virtual machine. This means that we would be able to access the web server via port 80 (which is typical on a production web server) by connecting directly to the virtual machine. In order to ensure a fixed IP address to which we can allocate a hostname on our network, we need to uncomment the following line from our Vagrantfile by removing the # from the start of the line: # config.vm.network "private_network", ip: "192.168.33.10" The IP address can be changed depending on the needs of our project. As this is a sample LEMP stack designed for web-based projects, let's configure our projects directory to a relevant web folder on the virtual machine: config.vm.synced_folder ".", "/var/www/project", type: "nfs" We will still need to configure our web server to point to this folder; however, it is more appropriate than the default mapping location of /vagrant. Before we run our Puppet provisioner to install our LEMP stack, we should instruct Vagrant to run the apt-get update command on the virtual machine. Without this, it isn't always possible to install new packages. So, let's add the following line to our Vagrant file within the |config| block: config.vm.provision "shell", inline: "apt-get update" As we will put our Puppet modules and manifests in a provision folder, we need to configure Vagrant to use the correct folders for our Puppet manifests and modules as well as the default manifest file. 
Adding the following code to our Vagrantfile will do this for us: config.vm.provision :puppet do |puppet|    puppet.manifests_path = "provision/manifests"    puppet.module_path = "provision/modules"    puppet.manifest_file = "vagrant.pp" end Creating the Puppet manifests Let's start by creating some folders for our Puppet modules and manifests by executing the following commands: mkdir provision cd provision mkdir modules mkdir manifests For each of the modules we want to create, we need to create a folder within the provision/modules folder for the module. Within this folder, we need to create a manifests folder, and within this, our Puppet manifest file, init.pp. Structurally, this looks something like the following: |-- provision |   |-- manifests |   |   `-- vagrant.pp |   `-- modules |       |-- our module |           |-- manifests |               `-- init.pp `-- Vagrantfile Installing Nginx Let's take a look at what is involved to install Nginx through a module and manifest file provision/modules/nginx/manifests/init.pp. First, we define our class, passing in a variable so that we can change the configuration file we use for Nginx (useful for using the same module for different projects or different environments such as staging and production environments), then we need to ensure that the nginx package is installed: class nginx ($file = 'default') {   package {"nginx":    ensure => present } Note that we have not closed the curly bracket for the nginx class. That is because this is just the first snippet of the file; we will close it at the end. Because we want to change our default Nginx configuration file, we should update the contents of the Nginx configuration file with one of our own (this will need to be placed in the provision/modules/nginx/files folder; unless the file parameter is passed to the class, the file default will be used): file { '/etc/nginx/sites-available/default':      source => "puppet:///modules/nginx/${file}",      owner => 'root',      group => 'root',      notify => Service['nginx'],      require => Package['nginx'] } Finally, we need to ensure that the nginx service is actually running once it has been installed: service { "nginx":    ensure => running,    require => Package["nginx"] } } This completes the manifest. We do still, however, need to create a default configuration file for Nginx, which is saved as provision/modules/nginx/files/default. This will be used unless we pass a file parameter to the nginx class when using the module. The sample file here is a basic configuration file, pointing to the public folder within our synced folder. 
The server name of lemp-stack.local means that Nginx will listen for requests on that hostname and will serve content from our projects folder: server {    listen   80;      root /var/www/project/public;    index index.php index.html index.htm;      server_name lemp-stack.local;      location / {        try_files $uri $uri/ /index.php?$query_string;    }      location ~ .php$ {        try_files $uri =404;        fastcgi_split_path_info ^(.+.php)(/.+)$;        #fastcgi_pass 127.0.0.1:9000;        fastcgi_param SERVER_NAME $host;        fastcgi_pass unix:/var/run/php5-fpm.sock;        fastcgi_index index.php;        fastcgi_intercept_errors on;        include fastcgi_params;    }      location ~ /.ht {        deny all;    }      location ~* .(jpg|jpeg|gif|css|png|js|ico|html)$ {        access_log off;        expires max;    }      location ~* .svgz {        add_header Content-Encoding "gzip";    } } Because this configuration file listens for requests on lemp-stack.local, we need to add a record to the hosts file on our host machine, which will redirect traffic from lemp-stack.local to the IP address of our virtual machine. Installing PHP To install PHP, we need to install a range of related packages, including the Nginx PHP module. This would be in the file provision/modules/php/manifests/init.pp. On more recent (within the past few years) Linux and PHP installations, PHP uses a handler called php-fpm as a bridge between PHP and the web server being used. This means that when new PHP modules are installed or PHP configurations are changed, we need to restart the php-fpm service for these changes to take effect, whereas in the past, it was often the web servers that needed to be restarted or reloaded. To make our simple PHP Puppet module flexible, we need to install the php5-fpm package and restart it when other modules are installed, but only when we use Nginx on our server. To achieve this, we can use a class parameter, which defaults to true. This lets us use the same module in servers that don't have a web server, and where we don't want to have the overhead of the FPM service, such as a server that runs background jobs or processing: class php ($nginx = true) { If the nginx parameter is true, then we need to install php5-fpm. 
Since this package is only installed when the flag is set to true, we cannot have PHP and its modules requiring or notifying the php-fpm package, as it may not be installed; so instead we need to have the php5-fpm package subscribe to these packages:    if ($nginx) {        package { "php5-fpm":          ensure => present,          subscribe => [Package['php5-dev'], Package['php5-curl'], Package['php5-gd'], Package['php5-imagick'], Package['php5-mcrypt'], Package['php5-mhash'], Package['php5-pspell'], Package['php5-json'], Package['php5-xmlrpc'], Package['php5-xsl'], Package['php5-mysql']]        }    } The rest of the manifest can then simply be the installation of the various PHP modules that are required for a typical LEMP setup:    package { "php5-dev":        ensure => present    }      package { "php5-curl":        ensure => present    }      package { "php5-gd":        ensure => present    }      package { "php5-imagick":        ensure => present    }      package { "php5-mcrypt":        ensure => present    }      package { "php5-mhash":        ensure => present    }      package { "php5-pspell":        ensure => present    }      package { "php5-xmlrpc":        ensure => present    }      package { "php5-xsl":        ensure => present    }      package { "php5-cli":        ensure => present    }      package { "php5-json":        ensure => present    } } Installing the MySQL module Because we are going to use the Puppet module for MySQL provided by Puppet Labs, installing the module is very straightforward; we simply add it as a git submodule to our project with the following command: git submodule add https://github.com/puppetlabs/puppetlabs-mysql.git provision/modules/mysql You might want to use a specific release for this module, as the code changes on a semi-regular basis. A stable release is available at https://github.com/puppetlabs/puppetlabs-mysql/releases/tag/3.1.0. Default manifest Finally, we need to pull these modules together, and install them when our machine is provisioned. To do this, we simply add the following modules to our vagrant.pp manifest file in the provision/manifests folder. Installing Nginx and PHP We need to include our nginx class and optionally provide a filename for the configuration file; if we don't provide one, the default will be used: class {    'nginx':        file => 'default' } Similarly for PHP, we need to include the class and in this case, pass an nginx parameter to ensure that it installs PHP5-FPM too: class {    'php':        nginx => true } Hostname configuration We should tell our Vagrant virtual machine what its hostname is by adding a host resource to our manifest: host { 'lemp-stack.local':    ip => '127.0.0.1',    host_aliases => 'localhost', } E-mail sending services Because some of our projects might involve sending e-mails, we should install e-mail sending services on our virtual machine. As these are simply two packages, it makes more sense to include them in our Vagrant manifest, as opposed to their own modules: package { "postfix":    ensure => present }   package { "mailutils":    ensure => present } MySQL configuration Because the MySQL module is very flexible and manages all aspects of MySQL, there is quite a bit for us to configure. We need to perform the following steps: Create a database. Create a user. Give the user permission to use the database (grants). Configure the MySQL root password. Install the MySQL client. Install the MySQL client bindings for PHP. 
The MySQL server class has a range of parameters that can be passed to configure it, including databases, users, and grants. So, first, we need to define what the databases, users, and grants are that we want to be configured: $databases = { 'lemp' => {    ensure => 'present',    charset => 'utf8' }, }   $users = { 'lemp@localhost' => {    ensure                   => 'present',    max_connections_per_hour => '0',    max_queries_per_hour     => '0',    max_updates_per_hour     => '0',    max_user_connections     => '0',    password_hash           => 'MySQL-Password-Hash', }, } The password_hash parameter here is for a hash generated by MySQL. You can generate a password hash by connecting to an existing MySQL instance and running a query such as SELECT PASSWORD('password'). The grant maps our user and database and specifies what permissions the user can perform on that database when connecting from a particular host (in this case, localhost—so from the virtual machine itself): $grants = { 'lemp@localhost/lemp.*' => {    ensure     => 'present',    options   => ['GRANT'],    privileges => ['ALL'],    table      => 'lemp.*',    user       => 'lemp@localhost', }, } We then pass these values to the MySQL server class. We also provide a root password for MySQL (unlike earlier, this is provided in plain text), and we can override the options from the MySQL configuration file. This is unlike our own Nginx module that provides a full file—in this instance, the MySQL module provides a template configuration file and the changes are replaced in that template to create a configuration file: class { '::mysql::server': root_password   => 'lemp-root-password', override_options => { 'mysqld' => { 'max_connections' => '1024' } }, databases => $databases, users => $users, grants => $grants, restart => true } As we will have a web server running on this machine, which needs to connect to this database server, we also need the client library and the client bindings for PHP, so that we can include them too: include '::mysql::client'   class { '::mysql::bindings': php_enable => true } Launching the virtual machine In order to launch our new virtual machine, we simply need to run the following command: Vagrant up We should now see our VM boot and the various Puppet phases execute. If all goes well, we should see no errors in this process. Summary In this article, we learned about the steps involved in creating a brand new Vagrant project, configuring it to integrate with our host machine, and setting up a standard LEMP stack using the Puppet provisioning tool. Now you should have a basic understanding of Vagrant and how to use it to ensure that your software projects are managed more effectively! Resources for Article: Further resources on this subject: Android Virtual Device Manager [article] Speeding Vagrant Development With Docker [article] Hyper-V Basics [article]
Sharing Your Story

Packt
10 Mar 2015
In this article by Ashley Chiasson, author of the book Articulate Storyline Essentials, we will see how to preview your story. (For more resources related to this topic, see here.) Previewing your story Previewing a story might sound like a straightforward concept, and it is, but Storyline gives you a ton of different previewing options, and you can pick and choose what works best for you! There are two main ways for you to preview an entire story. The most straightforward way of previewing a story is to select the Preview button from the Home tab. The other way to preview an entire story is to select the Preview icon on the bottom pane of the Storyline interface. You can also use the shortcut key F12 to preview an entire story. Once you choose to preview the full story, the Preview menu will appear. Here you can go through the story as your audience would and make any necessary adjustments prior to publishing the story. Within the Preview menu, you can close the preview; select individual slides; replay a particular slide, scene, or the entire project; and edit an individual slide. Maybe you only want to preview a particular slide or scene. In this instance, you'll want to select the drop-down icon on the Preview button on the Home tab, and then select whether you want to preview This Slide or This Scene. To preview the selected slide, you can use the shortcut key Ctrl + F12. To preview the selected scene, you can use the shortcut key Shift + F12. These options are fantastic and will save you a lot of preview-generating time, particularly when you have a slide- or scene-heavy story and don't want to go through the motions of previewing the entire story each and every time you wish to see a certain piece of the story. It is important to note that not all content within Storyline is available during preview. These items include hyperlinks, imported interactions (for example, from Articulate Engage), web objects, videos from external websites, and course completion/tracking status. Once you have selected Preview, you will be provided with the Preview menu. This menu allows you to do several things: Close the preview Select a different slide (if previewing the entire story or a scene) Replay the slide, scene, or entire course Edit the selected slide within Slide View Once you have previewed your story and have determined that everything is as you want it to be, you're ready to customize your Storyline player and publish! Summary This article explained how to preview your story. Storyline makes it easy to customize your learners' experience and share your story. Previewing your story allows you to streamline your development; without a preview feature, you would have to publish every single time you wanted to see a slide—no one has time for that! You should now feel comfortable working with the player customization options, so let your imagination flow and create a custom player for your story! If you're looking to dig a bit deeper into Articulate Storyline's capabilities, please check out Learning Articulate Storyline by Stephanie Harnett, and stay tuned for Mastering Articulate Storyline by Ashley Chiasson (slated for release in mid 2015), where you'll learn all about pushing Storyline's features and functionality to the absolute limits! Resources for Article: Further resources on this subject: Creating Your Course with Presenter [article] Rapid Development [article] Moodle for Online Communities [article]

Pricing the Double-no-touch option

Packt
10 Mar 2015
In this article by Balázs Márkus, coauthor of the book Mastering R for Quantitative Finance, you will learn about pricing and life of Double-no-touch (DNT) option. (For more resources related to this topic, see here.) A Double-no-touch (DNT) option is a binary option that pays a fixed amount of cash at expiry. Unfortunately, the fExoticOptions package does not contain a formula for this option at present. We will show two different ways to price DNTs that incorporate two different pricing approaches. In this section, we will call the function dnt1, and for the second approach, we will use dnt2 as the name for the function. Hui (1996) showed how a one-touch double barrier binary option can be priced. In his terminology, "one-touch" means that a single trade is enough to trigger the knock-out event, and "double barrier" binary means that there are two barriers and this is a binary option. We call this DNT as it is commonly used on the FX markets. This is a good example for the fact that many popular exotic options are running under more than one name. In Haug (2007a), the Hui-formula is already translated into the generalized framework. S, r, b, s, and T have the same meaning. K means the payout (dollar amount) while L and U are the lower and upper barriers. Where Implementing the Hui (1996) function to R starts with a big question mark: what should we do with an infinite sum? How high a number should we substitute as infinity? Interestingly, for practical purposes, small number like 5 or 10 could often play the role of infinity rather well. Hui (1996) states that convergence is fast most of the time. We are a bit skeptical about this since a will be used as an exponent. If b is negative and sigma is small enough, the (S/L)a part in the formula could turn out to be a problem. First, we will try with normal parameters and see how quick the convergence is: dnt1 <- function(S, K, U, L, sigma, T, r, b, N = 20, ploterror = FALSE){    if ( L > S | S > U) return(0)    Z <- log(U/L)    alpha <- -1/2*(2*b/sigma^2 - 1)    beta <- -1/4*(2*b/sigma^2 - 1)^2 - 2*r/sigma^2    v <- rep(0, N)    for (i in 1:N)        v[i] <- 2*pi*i*K/(Z^2) * (((S/L)^alpha - (-1)^i*(S/U)^alpha ) /            (alpha^2+(i*pi/Z)^2)) * sin(i*pi/Z*log(S/L)) *              exp(-1/2 * ((i*pi/Z)^2-beta) * sigma^2*T)    if (ploterror) barplot(v, main = "Formula Error");    sum(v) } print(dnt1(100, 10, 120, 80, 0.1, 0.25, 0.05, 0.03, 20, TRUE)) The following screenshot shows the result of the preceding code: The Formula Error chart shows that after the seventh step, additional steps were not influencing the result. This means that for practical purposes, the infinite sum can be quickly estimated by calculating only the first seven steps. This looks like a very quick convergence indeed. However, this could be pure luck or coincidence. What about decreasing the volatility down to 3 percent? We have to set N as 50 to see the convergence: print(dnt1(100, 10, 120, 80, 0.03, 0.25, 0.05, 0.03, 50, TRUE)) The preceding command gives the following output: Not so impressive? 50 steps are still not that bad. What about decreasing the volatility even lower? At 1 percent, the formula with these parameters simply blows up. First, this looks catastrophic; however, the price of a DNT was already 98.75 percent of the payout when we used 3 percent volatility. 
Logic says that the DNT price should be a monotone-decreasing function of volatility, so we already know that the price of the DNT should be worth at least 98.75 percent if volatility is below 3 percent. Another issue is that if we choose an extreme high U or extreme low L, calculation errors emerge. However, similar to the problem with volatility, common sense helps here too; the price of a DNT should increase if we make U higher or L lower. There is still another trick. Since all the problem comes from the a parameter, we can try setting b as 0, which will make a equal to 0.5. If we also set r to 0, the price of a DNT converges into 100 percent as the volatility drops. Anyway, whenever we substitute an infinite sum by a finite sum, it is always good to know when it will work and when it will not. We made a new code that takes into consideration that convergence is not always quick. The trick is that the function calculates the next step as long as the last step made any significant change. This is still not good for all the parameters as there is no cure for very low volatility, except that we accept the fact that if implied volatilities are below 1 percent, than this is an extreme market situation in which case DNT options should not be priced by this formula: dnt1 <- function(S, K, U, L, sigma, Time, r, b) { if ( L > S | S > U) return(0) Z <- log(U/L) alpha <- -1/2*(2*b/sigma^2 - 1) beta <- -1/4*(2*b/sigma^2 - 1)^2 - 2*r/sigma^2 p <- 0 i <- a <- 1 while (abs(a) > 0.0001){    a <- 2*pi*i*K/(Z^2) * (((S/L)^alpha - (-1)^i*(S/U)^alpha ) /      (alpha^2 + (i *pi / Z)^2) ) * sin(i * pi / Z * log(S/L)) *        exp(-1/2*((i*pi/Z)^2-beta) * sigma^2 * Time)    p <- p + a    i <- i + 1 } p } Now that we have a nice formula, it is possible to draw some DNT-related charts to get more familiar with this option. Later, we will use a particular AUDUSD DNT option with the following parameters: L equal to 0.9200, U equal to 0.9600, K (payout) equal to USD 1 million, T equal to 0.25 years, volatility equal to 6 percent, r_AUD equal to 2.75 percent, r_USD equal to 0.25 percent, and b equal to -2.5 percent. We will calculate and plot all the possible values of this DNT from 0.9200 to 0.9600; each step will be one pip (0.0001), so we will use 2,000 steps. The following code plots a graph of price of underlying: x <- seq(0.92, 0.96, length = 2000) y <- z <- rep(0, 2000) for (i in 1:2000){    y[i] <- dnt1(x[i], 1e6, 0.96, 0.92, 0.06, 0.25, 0.0025, -0.0250)    z[i] <- dnt1(x[i], 1e6, 0.96, 0.92, 0.065, 0.25, 0.0025, -0.0250) } matplot(x, cbind(y,z), type = "l", lwd = 2, lty = 1,    main = "Price of a DNT with volatility 6% and 6.5% ", cex.main = 0.8, xlab = "Price of underlying" ) The following output is the result of the preceding code: It can be clearly seen that even a small change in volatility can have a huge impact on the price of a DNT. Looking at this chart is an intuitive way to find that vega must be negative. Interestingly enough even just taking a quick look at this chart can convince us that the absolute value of vega is decreasing if we are getting closer to the barriers. Most end users think that the biggest risk is when the spot is getting close to the trigger. This is because end users really think about binary options in a binary way. As long as the DNT is alive, they focus on the positive outcome. However, for a dynamic hedger, the risk of a DNT is not that interesting when the value of the DNT is already small. 
It is also very interesting that since the T-Bill price is independent of the volatility and since the DNT + DOT = T-Bill equation holds, an increasing volatility will decrease the price of the DNT by the exact same amount just like it will increase the price of the DOT. It is not surprising that the vega of the DOT should be the exact mirror of the DNT. We can use the GetGreeks function to estimate vega, gamma, delta, and theta. For gamma we can use the GetGreeks function in the following way: GetGreeks <- function(FUN, arg, epsilon,...) {    all_args1 <- all_args2 <- list(...)    all_args1[[arg]] <- as.numeric(all_args1[[arg]] + epsilon)    all_args2[[arg]] <- as.numeric(all_args2[[arg]] - epsilon)    (do.call(FUN, all_args1) -        do.call(FUN, all_args2)) / (2 * epsilon) } Gamma <- function(FUN, epsilon, S, ...) {    arg1 <- list(S, ...)    arg2 <- list(S + 2 * epsilon, ...)    arg3 <- list(S - 2 * epsilon, ...)    y1 <- (do.call(FUN, arg2) - do.call(FUN, arg1)) / (2 * epsilon)    y2 <- (do.call(FUN, arg1) - do.call(FUN, arg3)) / (2 * epsilon)  (y1 - y2) / (2 * epsilon) } x = seq(0.9202, 0.9598, length = 200) delta <- vega <- theta <- gamma <- rep(0, 200)   for(i in 1:200){ delta[i] <- GetGreeks(FUN = dnt1, arg = 1, epsilon = 0.0001,    x[i], 1000000, 0.96, 0.92, 0.06, 0.5, 0.02, -0.02) vega[i] <-   GetGreeks(FUN = dnt1, arg = 5, epsilon = 0.0005,    x[i], 1000000, 0.96, 0.92, 0.06, 0.5, 0.0025, -0.025) theta[i] <- - GetGreeks(FUN = dnt1, arg = 6, epsilon = 1/365,    x[i], 1000000, 0.96, 0.92, 0.06, 0.5, 0.0025, -0.025) gamma[i] <- Gamma(FUN = dnt1, epsilon = 0.0001, S = x[i], K =    1e6, U = 0.96, L = 0.92, sigma = 0.06, Time = 0.5, r = 0.02, b = -0.02) }   windows() plot(x, vega, type = "l", xlab = "S",ylab = "", main = "Vega") The following chart is the result of the preceding code: After having a look at the value chart, the delta of a DNT is also very close to intuitions; if we are coming close to the higher barrier, our delta gets negative, and if we are coming closer to the lower barrier, the delta gets positive as follows: windows() plot(x, delta, type = "l", xlab = "S",ylab = "", main = "Delta") This is really a non-convex situation; if we would like to do a dynamic delta hedge, we will lose money for sure. If the spot price goes up, the delta of the DNT decreases, so we should buy some AUDUSD as a hedge. However, if the spot price goes down, we should sell some AUDUSD. Imagine a scenario where AUDUSD goes up 20 pips in the morning and then goes down 20 pips in the afternoon. For a dynamic hedger, this means buying some AUDUSD after the price moved up and selling this very same amount after the price comes down. The changing of the delta can be described by the gamma as follows: windows() plot(x, gamma, type = "l", xlab = "S",ylab = "", main = "Gamma") Negative gamma means that if the spot goes up, our delta is decreasing, but if the spot goes down, our delta is increasing. This doesn't sound great. For this inconvenient non-convex situation, there is some compensation, that is, the value of theta is positive. If nothing happens, but one day passes, the DNT will automatically worth more. Here, we use theta as minus 1 times the partial derivative, since if (T-t) is the time left, we check how the value changes as t increases by one day: windows() plot(x, theta, type = "l", xlab = "S",ylab = "", main = "Theta") The more negative the gamma, the more positive our theta. This is how time compensates for the potential losses generated by the negative gamma. 
Risk-neutral pricing also implicates that negative gamma should be compensated by a positive theta. This is the main message of the Black-Scholes framework for vanilla options, but this is also true for exotics. See Taleb (1997) and Wilmott (2006). We already introduced the Black-Scholes surface before; now, we can go into more detail. This surface is also a nice interpretation of how theta and delta work. It shows the price of an option for different spot prices and times to maturity, so the slope of this surface is the theta for one direction and delta for the other. The code for this is as follows: BS_surf <- function(S, Time, FUN, ...) { n <- length(S) k <- length(Time) m <- matrix(0, n, k) for (i in 1:n) {    for (j in 1:k) {      l <- list(S = S[i], Time = Time[j], ...)      m[i,j] <- do.call(FUN, l)      } } persp3D(z = m, xlab = "underlying", ylab = "Time",    zlab = "option price", phi = 30, theta = 30, bty = "b2") } BS_surf(seq(0.92,0.96,length = 200), seq(1/365, 1/48, length = 200), dnt1, K = 1000000, U = 0.96, L = 0.92, r = 0.0025, b = -0.0250,    sigma = 0.2) The preceding code gives the following output: We can see what was already suspected; DNT likes when time is passing and the spot is moving to the middle of the (L,U) interval. Another way to price the Double-no-touch option Static replication is always the most elegant way of pricing. The no-arbitrage argument will let us say that if, at some time in the future, two portfolios have the same value for sure, then their price should be equal any time before this. We will show how double-knock-out (DKO) options could be used to build a DNT. We will need to use a trick; the strike price could be the same as one of the barriers. For a DKO call, the strike price should be lower than the upper barrier because if the strike price is not lower than the upper barrier, the DKO call would be knocked out before it could become in-the-money, so in this case, the option would be worthless as nobody can ever exercise it in-the-money. However, we can choose the strike price to be equal to the lower barrier. For a put, the strike price should be higher than the lower barrier, so why not make it equal to the upper barrier. This way, the DKO call and DKO put option will have a very convenient feature; if they are still alive, they will both expiry in-the-money. Now, we are almost done. We just have to add the DKO prices, and we will get a DNT that has a payout of (U-L) dollars. Since DNT prices are linear in the payout, we only have to multiply the result by K*(U-L): dnt2 <- function(S, K, U, L, sigma, T, r, b) {      a <- DoubleBarrierOption("co", S, L, L, U, T, r, b, sigma, 0,        0,title = NULL, description = NULL)    z <- a@price    b <- DoubleBarrierOption("po", S, U, L, U, T, r, b, sigma, 0,        0,title = NULL, description = NULL)    y <- b@price    (z + y) / (U - L) * K } Now, we have two functions for which we can compare the results: dnt1(0.9266, 1000000, 0.9600, 0.9200, 0.06, 0.25, 0.0025, -0.025) [1] 48564.59   dnt2(0.9266, 1000000, 0.9600, 0.9200, 0.06, 0.25, 0.0025, -0.025) [1] 48564.45 For a DNT with a USD 1 million contingent payout and an initial market value of over 48,000 dollars, it is very nice to see that the difference in the prices is only 14 cents. Technically, however, having a second pricing function is not a big help since low volatility is also an issue for dnt2. We will use dnt1 for the rest of the article. 
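Since both pricing functions are now available, a quick consistency check over a grid of spot prices is cheap to run. The following is only a sketch of such a check: it assumes dnt1 and dnt2 are defined as above and that the fExoticOptions package, which supplies the DoubleBarrierOption function used inside dnt2, is loaded:
library(fExoticOptions)   # dnt2 relies on its DoubleBarrierOption function
spots <- seq(0.921, 0.959, by = 0.002)   # spot grid strictly inside (L, U)
p1 <- sapply(spots, function(s) dnt1(s, 1e6, 0.96, 0.92, 0.06, 0.25, 0.0025, -0.025))
p2 <- sapply(spots, function(s) dnt2(s, 1e6, 0.96, 0.92, 0.06, 0.25, 0.0025, -0.025))
max(abs(p1 - p2))   # largest disagreement in dollars; it should stay within cents
If the two curves ever drift apart by more than a few cents, that is a hint that the truncated series in dnt1 is struggling for the chosen parameters.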
The life of a Double-no-touch option – a simulation How has the DNT price been evolving during the second quarter of 2014? We have the open-high-low-close type time series with five minute frequency for AUDUSD, so we know all the extreme prices: d <- read.table("audusd.csv", colClasses = c("character", rep("numeric",5)), sep = ";", header = TRUE) underlying <- as.vector(t(d[, 2:5])) t <- rep( d[,6], each = 4) n <- length(t) option_price <- rep(0, n)   for (i in 1:n) { option_price[i] <- dnt1(S = underlying[i], K = 1000000,    U = 0.9600, L = 0.9200, sigma = 0.06, T = t[i]/(60*24*365),      r = 0.0025, b = -0.0250) } a <- min(option_price) b <- max(option_price) option_price_transformed = (option_price - a) * 0.03 / (b - a) + 0.92   par(mar = c(6, 3, 3, 5)) matplot(cbind(underlying,option_price_transformed), type = "l",    lty = 1, col = c("grey", "red"),    main = "Price of underlying and DNT",    xaxt = "n", yaxt = "n", ylim = c(0.91,0.97),    ylab = "", xlab = "Remaining time") abline(h = c(0.92, 0.96), col = "green") axis(side = 2, at = pretty(option_price_transformed),    col.axis = "grey", col = "grey") axis(side = 4, at = pretty(option_price_transformed),    labels = round(seq(a/1000,1000,length = 7)), las = 2,    col = "red", col.axis = "red") axis(side = 1, at = seq(1,n, length=6),    labels = round(t[round(seq(1,n, length=6))]/60/24)) The following is the output for the preceding code: The price of a DNT is shown in red on the right axis (divided by 1000), and the actual AUDUSD price is shown in grey on the left axis. The green lines are the barriers of 0.9200 and 0.9600. The chart shows that in 2014 Q2, the AUDUSD currency pair was traded inside the (0.9200; 0.9600) interval; thus, the payout of the DNT would have been USD 1 million. This DNT looks like a very good investment; however, reality is just one trajectory out of an a priori almost infinite set. It could have happened differently. For example, on May 02, 2014, there were still 59 days left until expiry, and AUDUSD was traded at 0.9203, just three pips away from the lower barrier. At this point, the price of this DNT was only USD 5,302 dollars which is shown in the following code: dnt1(0.9203, 1000000, 0.9600, 0.9200, 0.06, 59/365, 0.0025, -0.025) [1] 5302.213 Compare this USD 5,302 to the initial USD 48,564 option price! In the following simulation, we will show some different trajectories. All of them start from the same 0.9266 AUDUSD spot price as it was on the dawn of April 01, and we will see how many of them stayed inside the (0.9200; 0.9600) interval. 
To make it simple, we will simulate geometric Brown motions by using the same 6 percent volatility as we used to price the DNT: library(matrixStats) DNT_sim <- function(S0 = 0.9266, mu = 0, sigma = 0.06, U = 0.96, L = 0.92, N = 5) {    dt <- 5 / (365 * 24 * 60)    t <- seq(0, 0.25, by = dt)    Time <- length(t)      W <- matrix(rnorm((Time - 1) * N), Time - 1, N)    W <- apply(W, 2, cumsum)    W <- sqrt(dt) * rbind(rep(0, N), W)    S <- S0 * exp((mu - sigma^2 / 2) * t + sigma * W )    option_price <- matrix(0, Time, N)      for (i in 1:N)        for (j in 1:Time)          option_price[j,i] <- dnt1(S[j,i], K = 1000000, U, L, sigma,              0.25-t[j], r = 0.0025,                b = -0.0250)*(min(S[1:j,i]) > L & max(S[1:j,i]) < U)      survivals <- sum(option_price[Time,] > 0)    dev.new(width = 19, height = 10)      par(mfrow = c(1,2))    matplot(t,S, type = "l", main = "Underlying price",        xlab = paste("Survived", survivals, "from", N), ylab = "")    abline( h = c(U,L), col = "blue")    matplot(t, option_price, type = "l", main = "DNT price",        xlab = "", ylab = "")} set.seed(214) system.time(DNT_sim()) The following is the output for the preceding code: Here, the only surviving trajectory is the red one; in all other cases, the DNT hits either the higher or the lower barrier. The line set.seed(214) grants that this simulation will look the same anytime we run this. One out of five is still not that bad; it would suggest that for an end user or gambler who does no dynamic hedging, this option has an approximate value of 20 percent of the payout (especially since the interest rates are low, the time value of money is not important). However, five trajectories are still too few to jump to such conclusions. We should check the DNT survivorship ratio for a much higher number of trajectories. The ratio of the surviving trajectories could be a good estimator of the a priori real-world survivorship probability of this DNT; thus, the end user value of it. Before increasing N rapidly, we should keep in mind how much time this simulation took. For my computer, it took 50.75 seconds for N = 5, and 153.11 seconds for N = 15. The following is the output for N = 15: Now, 3 out of 15 survived, so the estimated survivorship ratio is still 3/15, which is equal to 20 percent. Looks like this is a very nice product; the price is around 5 percent of the payout, while 20 percent is the estimated survivorship ratio. Just out of curiosity, run the simulation for N equal to 200. This should take about 30 minutes. The following is the output for N = 200: The results are shocking; now, only 12 out of 200 survive, and the ratio is only 6 percent! So to get a better picture, we should run the simulation for a larger N. The movie Whatever Works by Woody Allen (starring Larry David) is 92 minutes long; in simulation time, that is N = 541. For this N = 541, there are only 38 surviving trajectories, resulting in a survivorship ratio of 7 percent. What is the real expected survivorship ratio? Is it 20 percent, 6 percent, or 7 percent? We simply don't know at this point. Mathematicians warn us that the law of large numbers requires large numbers, where large is much more than 541, so it would be advisable to run this simulation for as large an N as time allows. Of course, getting a better computer also helps to do more N during the same time. Anyway, from this point of view, Hui's (1996) relatively fast converging DNT pricing formula gets some respect. 
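One way to push N higher without waiting hours is to estimate the survivorship ratio alone and skip the per-step option pricing, since pricing the DNT along every path is what makes DNT_sim slow. The following is only a rough sketch along those lines, using the same GBM parameters and five-minute steps as above:
survival_ratio <- function(N = 5000, S0 = 0.9266, mu = 0, sigma = 0.06,
                           U = 0.96, L = 0.92, Time = 0.25) {
  dt <- 5 / (365 * 24 * 60)              # five-minute steps, as in DNT_sim
  steps <- round(Time / dt)
  alive <- logical(N)
  for (i in 1:N) {
    # log-returns of a geometric Brownian motion, cumulated into one path
    S <- S0 * exp(cumsum((mu - sigma^2 / 2) * dt + sigma * sqrt(dt) * rnorm(steps)))
    alive[i] <- min(S) > L & max(S) < U
  }
  mean(alive)                            # estimated probability of surviving to expiry
}
set.seed(214)
survival_ratio()
Because no option prices are computed, thousands of paths take a small fraction of the time of the full simulation, which makes the law-of-large-numbers complaint above much easier to address.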
Summary We started this article by introducing exotic options. In a brief theoretical summary, we explained how exotics are linked together. There are many types of exotics. We showed one possible way of classification that is consistent with the fExoticOptions package. We showed how the Black-Scholes surface (a 3D chart that contains the price of a derivative dependent on time and the underlying price) can be constructed for any pricing function. Resources for Article: Further resources on this subject: What is Quantitative Finance? [article] Learning Option Pricing [article] Derivatives Pricing [article]

Evidence Acquisition and Analysis from iCloud

Packt
09 Mar 2015
This article by Mattia Epifani and Pasquale Stirparo, the authors of the book, Learning iOS Forensics, introduces the cloud system provided by Apple to all its users through which they can save their backups and other files on remote servers. In the first part of this article, we will show you the main characteristics of such a service and then the techniques to create and recover a backup from iCloud. (For more resources related to this topic, see here.) iCloud iCloud is a free cloud storage and cloud computing service designed by Apple to replace MobileMe. The service allows users to store data (music, pictures, videos, and applications) on remote servers and share them on devices with iOS 5 or later operating systems, on Apple computers running OS X Lion or later, or on a PC with Windows Vista or later. Similar to its predecessor, MobileMe, iCloud allows users to synchronize data between devices (e-mail, contacts, calendars, bookmarks, notes, reminders, iWork documents, and so on), or to make a backup of an iOS device (iPhone, iPad, or iPod touch) on remote servers rather than using iTunes and your local computer. The iCloud service was announced on June 6, 2011 during the Apple Worldwide Developers Conference but became operational to the public from October 12, 2011. The MobileMe service was disabled as a result on June 30, 2012 and all users were transferred to the new environment. In July 2013, iCloud had more than 320 million users. Each iCloud account has 5 GB of free storage for the owners of iDevice with iOS 5 or later and Mac users with Lion or later. Purchases made through iTunes (music, apps, videos, movies, and so on) are not calculated in the count of the occupied space and can be stored in iCloud and downloaded on all devices associated with the Apple ID of the user. Moreover, the user has the option to purchase additional storage in denominations of 20, 200, 500, or 1,000 GB. Access to the iCloud service can be made through integrated applications on devices such as iDevice and Mac computers. Also, to synchronize data on a PC, you need to install the iCloud Control Panel application, which can be downloaded for free from the Apple website. To synchronize contacts, e-mails, and appointments in the calendar on the PC, the user must have Microsoft Outlook 2007 or 2010, while for the synchronization of bookmarks they need Internet Explorer 9 or Safari. iDevice backup on iCloud iCloud allows users to make online backups of iDevices so that they will be able to restore their data even on a different iDevice (for example, in case of replacement of devices). The choice of which backup mode to use can be done directly in the settings of the device or through iTunes when the device is connected to the PC or Mac, as follows: Once the user has activated the service, the device automatically backs up every time the following scenarios occur: It is connected to the power cable It is connected to a Wi-Fi network Its screen is locked iCloud online backups are incremental through subsequent snapshots and each snapshot is the current status of the device at the time of its creation. The structure of the backup stored on iCloud is entirely analogous to that of the backup made with iTunes. iDevice backup acquisition Backups that are made online are, to all intents and purposes, not encrypted. Technically, they are encrypted, but the encryption key is stored with the encrypted files. 
This choice was made by Apple in order for users to be able to restore the backup on a different device than the one that created it. Currently, the acquisition of the iCloud backup is supported by two types of commercial software (Elcomsoft Phone Password Breaker (EPPB) and Wondershare Dr.Fone) and one open source tool (iLoot, which is available at https://github.com/hackappcom/iloot). The interesting aspect is that the same technique was used in the iCloud hack performed in 2014, when personal photos and videos were hacked from the respective iCloud services and released over the Internet (more information is available at http://en.wikipedia.org/wiki/2014_celebrity_photo_hack). Though there is no such strong evidence yet that describes how the hack was made, it is believed that Apple's Find my iPhone service was responsible for this and Apple did not implement any security measure to lockdown account after a particular number of wrong login attempts, which directly arises the possibility of exploitation (brute force, in this case). The tool used to brute force the iCloud password, named iBrute, is still available at https://github.com/hackappcom/ibrute, but has not been working since January 2015. Case study – iDevice backup acquisition and EPPB with usernames and passwords As reported on the software manufacturer's website, EPPB allows the acquisition of data stored on a backup online. Moreover, online backups can be acquired without having the original iOS device in hand. All that's needed to access online backups stored in the cloud service are the original user's credentials, including their Apple ID, accompanied with the corresponding password. The login credentials in iCloud can be retrieved as follows: Using social engineering techniques From a PC (or a Mac) on which they are stored: iTunes Password Decryptor (http://securityxploded.com/) WebBrowserPassView (http://www.nirsoft.net/) Directly from the device (iPhone/iPad/iPod touch) by extracting the credentials stored in the keychain Once credentials have been extracted, the download of the backup is very simple. Follow the step-by-step instructions provided in the program by entering username and password in Download backup from iCloud dialog by going to Tools | Apple | Download backup from iCloud | Password and clicking on Sign in, as shown in the following screenshot: At this point, the software displays a screen that shows all the backups present in the user account and allows you to download data. It is important to notice the possibility of using the following two options: Restore original file names: If enabled, this option interprets the contents of the Manifest.mbdb file, rebuilding the backup with the same tree structure into domains and sub-domains. If the investigator intends to carry out the analysis with traditional software for data extraction from backups, it is recommended that you disable this option because, if enabled, that software will no longer be able to parse the backup. Download only specific data: This option is very useful when the investigator needs to download only some specific information. Currently, the software supports Call history, Messages, Attachments, Contacts, Safari data, Google data, Calendar, Notes, Info & Settings, Camera Roll, Social Communications, and so on. In this case, the Restore original file names option is automatically activated and it cannot be disabled. Once you have chosen the destination folder for the download, the backup starts. 
The time required to download depends on the size of the storage space available to the user and the number of snapshots stored within that space.
Case study – iDevice backup acquisition and EPPB with authentication token
The Forensic edition of Phone Password Breaker from Elcomsoft is a tool that gives a digital forensics examiner the power to obtain iCloud data without having the original Apple ID and password. This kind of access is made possible via the use of an authentication token extracted from the user's computer. These tokens can be obtained from any suspect's computer where iCloud Control Panel is installed. In order to obtain the token, the user must have been logged in to iCloud Control Panel on that PC at the time of acquisition, which means that the acquisition can be performed only in a live environment or in a virtualized image of the suspect's computer connected to the Internet. More information about this tool is available at http://www.elcomsoft.com/eppb.html.
To extract the authentication token from the iCloud Control Panel, the analyst needs to use a small executable file on the machine called atex.exe. The executable file can be launched from an external pen drive during a live forensics activity.
Open Command Prompt and launch the atex -l command to list all the local iCloud users as follows:
Then, launch atex.exe again with the getToken parameter (-t) and enter the username of the specific local Windows user and the password for this user's Windows account. A file called icloud_token_<timestamp>.txt will be created in the directory from which atex.exe was launched. The file contains the Apple ID of the current iCloud Control Panel user and its authentication token.
Now that the analyst has the authentication token, they can start the EPPB software, navigate to Tools | Apple | Download backup from iCloud | Token, copy and paste the token (be careful to copy the entire second row from the .txt file created by the atex.exe tool) into the software, and click on Sign in, as shown in the following screenshot.
At this point, the software shows the screen for downloading the iCloud backups stored within the iCloud space of the user, just as when you provide a username and password.
The procedure for the Mac OS X version is exactly the same. Just launch the atex Mac version from a shell and follow the steps shown previously for the Windows environment:
sudo atex -l: This command is used to get the list of all iCloud users.
sudo atex -t -u <username>: This command is used to get the authentication token for a specific user. You will need to enter the user's system password when prompted.
Case study – iDevice backup acquisition with iLoot
The same activity can be performed using the open source tool called iLoot (available at https://github.com/hackappcom/iloot). It requires Python and some dependencies; we suggest checking the website for the latest version and requirements. By accessing the help (iloot.py -h), we can see the various available options.
We can choose the output folder if we want to download one specified snapshot, if we want the backup being downloaded in original iTunes format or with the Domain-style directories, if we want to download only specific information (for example, call history, SMS, photos, and so on), or only a specific domain, as follows: To download the backup, you just only need to insert the account credentials, as shown in the following screenshot: At the end of the process, you will find the backup in the output folder (the default folder's name is /output). Summary In this article, we introduced the iCloud service provided by Apple to store files on remote servers and backup their iDevice devices. In particular, we showed the techniques to download the backups stored on iCloud when you know the user credentials (Apple ID and password) and when you have access to a computer where it is installed and use the iCloud Control Panel software. Resources for Article: Further resources on this subject: Introduction to Mobile Forensics [article] Processing the Case [article] BackTrack Forensics [article]
Advanced Cypher tricks

Packt
05 Mar 2015
Cypher is a highly efficient language that not only makes querying simpler but also strives to optimize the result-generation process to the maximum. A lot more performance optimization can be achieved with knowledge of the application's data domain, used to restructure queries. This article by Sonal Raj, the author of Neo4j High Performance, covers a few tricks that you can implement with Cypher for optimization. (For more resources related to this topic, see here.)
Query optimizations
There are certain techniques you can adopt in order to get the maximum performance out of your Cypher queries. Some of them are:
Avoid global data scans: Manually optimizing the performance of queries depends on the developer's effort to reduce the traversal domain and to make sure that only the essential data is obtained in the results. A global scan searches the entire graph, which is fine for smaller graphs but not for large datasets. For example:
START n = node(*)
MATCH (n)-[:KNOWS]-(m)
WHERE n.identity = "Batman"
RETURN m
Since Cypher is a greedy pattern-matching language, it avoids discrimination unless explicitly told to. Filtering data with a start point should be undertaken at the initial stages of execution to speed up the result-generation process. In Neo4j versions greater than 2.0, the START statement in the preceding query is not required; unless otherwise specified, the entire graph is searched. The use of labels in the graph and in queries can help to optimize the search process for the pattern. For example:
START n = node(*)
MATCH (n:superheroes)-[:KNOWS]-(m)
WHERE n.identity = "Batman"
RETURN m
Using the superheroes label in the preceding query helps to shrink the domain, thereby making the operation faster. This is referred to as a label-based scan.
Indexing and constraints for faster search: Searches in the graph space can be optimized and made faster if the data is indexed, or if we apply some sort of constraint on it. In this way, the traversal avoids redundant matches and goes straight to the desired index location. To apply an index on a label, you can use the following:
CREATE INDEX ON :superheroes(identity)
Otherwise, to create a constraint on a particular property, such as making the value of the property unique so that it can be directly referenced, we can use the following:
CREATE CONSTRAINT ON (n:superheroes) ASSERT n.identity IS UNIQUE
We will learn more about indexing, its types, and its utilities in making Neo4j more efficient for large dataset-based operations in the next sections.
Avoid generating Cartesian products: When creating queries, we should include entities that are connected in some way. The use of unspecific or unrelated entities can end up generating a lot of unused or unintended results. For example:
MATCH (m:Game), (p:Player)
This will end up mapping all possible games with all possible players, which can lead to undesired results. Let's use an example to see how to avoid Cartesian products in queries:
MATCH (a:Actor), (m:Movie), (s:Series)
RETURN COUNT(DISTINCT a), COUNT(DISTINCT m), COUNT(DISTINCT s)
This statement will find all possible triplets of the Actor, Movie, and Series labels and then filter the results.
An optimized form of the query uses successive counting to get the final result, as follows:

MATCH (a:Actor)
WITH COUNT(a) as actors
MATCH (m:Movie)
WITH COUNT(m) as movies, actors
MATCH (s:Series)
RETURN COUNT(s) as series, movies, actors

This gives roughly a 10x improvement in the execution time of this query on the same dataset.

Use more patterns in MATCH rather than WHERE: It is advisable to keep most of the patterns in the MATCH clause. The WHERE clause is not really meant for pattern matching; rather, it is used to filter the results when used with START and WITH. However, when used with MATCH, it applies constraints to the patterns described. Thus, pattern matching is faster when you put the pattern in the MATCH section. After finding starting points (by using scans, indexes, or already-bound points), the execution engine will use pattern matching to find matching subgraphs. As Cypher is declarative, it can change the order of these operations. Predicates in WHERE clauses can be evaluated before, during, or after pattern matching.

Split MATCH patterns further: Rather than having multiple match patterns in the same MATCH statement in a comma-separated fashion, you can split the patterns into several distinct MATCH statements. This considerably decreases the query time, since the engine can now search on smaller, reduced datasets at each successive match stage. When splitting the MATCH statements, keep in mind that best practice is to put the pattern with the labels of smallest cardinality at the head of the statement. You should also try to place the patterns that generate smaller intermediate result sets at the beginning of the block of MATCH statements.

Profiling of queries: You can monitor your queries' processing details in the profile of the response, which you can obtain with the PROFILE keyword, or by setting the profile parameter to True while making the request. Some useful information comes in the form of _db_hits, which show you how many times an entity (node, relationship, or property) has been encountered. Returning data in a Cypher response has substantial overhead, so you should strive to avoid returning complete nodes or relationships wherever possible and instead simply return the desired properties or values computed from the properties.

Parameters in queries: The execution engine of Cypher tries to optimize and transform queries into relevant execution plans. In order to optimize the amount of resources dedicated to this task, the use of parameters, as compared to literals, is preferred. With this technique, Cypher can reuse existing queries rather than parsing or compiling literal-based queries to build fresh execution plans (see the short sketch at the end of this list):

MATCH (p:Player)-[:PLAYED]-(game)
WHERE p.id = {pid}
RETURN game

When Cypher is building execution plans, it looks at the schema to see whether it can find useful indexes. These index decisions are only valid until the schema changes, so adding or removing indexes leads to the execution plan cache being flushed.

Add the direction arrowhead in cases where the graph is to be queried in a directed manner. This will reduce a lot of redundant operations.
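To illustrate the parameters trick, here is a minimal, hypothetical Python sketch that sends the parameterized statement above to Neo4j's transactional HTTP endpoint (available in Neo4j 2.x, where the {pid} placeholder syntax is used); the server URL, credentials, and the pid value are assumptions for the example, not part of the original article:

import requests

# Hypothetical local server and credentials; adjust for your installation.
URL = "http://localhost:7474/db/data/transaction/commit"
AUTH = ("neo4j", "neo4j")

# The Cypher statement stays constant; only the parameter map changes,
# so the cached execution plan can be reused across calls.
payload = {
    "statements": [{
        "statement": "MATCH (p:Player)-[:PLAYED]-(game) WHERE p.id = {pid} RETURN game",
        "parameters": {"pid": 1001}
    }]
}

response = requests.post(URL, json=payload, auth=AUTH)
print(response.json())

Because the statement text never changes between calls, Cypher only has to plan it once, which is exactly the benefit described above.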
Graph model optimizations

Sometimes, query optimizations are a great way to improve the performance of an application using Neo4j, but you can also incorporate some fundamental practices while defining your database so that things become easier and faster to use:

Explicit definition: If the graph model we are working on contains implicit relationships between components, higher query efficiency can be achieved by defining these relations explicitly. This leads to faster comparisons, but it comes with the drawback that the graph now requires more storage space for an additional entity for all occurrences of the data. Let's see this in action with the help of an example. In the following diagram, we see that when two players have played in the same game, they are most likely to know each other. So, instead of going through the game entity for every pair of connected players, we can define the KNOWS relationship explicitly between the players (a short sketch of this step follows at the end of this article).

Property refactoring: This refers to the situation where complex, time-consuming operations in the WHERE or MATCH clause can be included directly as properties in the nodes of the graph. This not only saves computation time, resulting in much faster queries, but also leads to more organized data storage practices in the graph database. For example:

MATCH (m:Movie)
WHERE m.releaseDate > 1343779201 AND m.releaseDate < 1369094401
RETURN m

This query checks whether a movie was released in a particular period; it can be optimized if the release period of the movie is inherently stored in the graph as the year range 2012-2013. So, for the new format of the data, the query will now change to this:

MATCH (m:Movie)-[:CONTAINS]->(d)
WHERE d.name = "2012-2013"
RETURN m

This gives a marked improvement in the performance of the query in terms of its execution time.

Summary

These are the various tricks that can be implemented in Cypher for optimization.

Resources for Article:

Further resources on this subject:
Recommender systems dissected [Article]
Working with a Neo4j Embedded Database [Article]
Adding Graphics to the Map [Article]
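Tying back to the explicit-definition tip above, the following hypothetical Python sketch sends a Cypher statement, via the same transactional HTTP endpoint used in the earlier sketch, that materializes KNOWS relationships between players who appeared in the same game; the Player and Game labels, the PLAYED relationship, and the server details are assumptions for illustration only:

import requests

# Hypothetical server and credentials, as in the earlier sketch.
URL = "http://localhost:7474/db/data/transaction/commit"
AUTH = ("neo4j", "neo4j")

# Turn the implicit "played the same game" relation into an explicit KNOWS edge.
# id(p1) < id(p2) avoids processing each pair twice.
statement = (
    "MATCH (p1:Player)-[:PLAYED]->(g:Game)<-[:PLAYED]-(p2:Player) "
    "WHERE id(p1) < id(p2) "
    "MERGE (p1)-[:KNOWS]-(p2)"
)

payload = {"statements": [{"statement": statement, "parameters": {}}]}
response = requests.post(URL, json=payload, auth=AUTH)
print(response.json().get("errors", []))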
Learning Random Forest Using Mahout

Packt
05 Mar 2015
11 min read
In this article by Ashish Gupta, author of the book Learning Apache Mahout Classification, we will learn about Random forest, which is one of the most popular techniques in classification. It starts with a machine learning technique called a decision tree. In this article, we will explore the following topics:

Decision tree
Random forest
Using Mahout for Random forest

(For more resources related to this topic, see here.)

Decision tree

A decision tree is used for classification and regression problems. In simple terms, it is a predictive model that uses binary rules to calculate the target variable. In a decision tree, we use an iterative process of splitting the data into partitions, and then we split it further on branches. As in other classification model creation processes, we start with a training dataset in which the target variables or class labels are defined. The algorithm tries to break all the records in the training dataset into two parts based on one of the explanatory variables. The partitioning is then applied to each new partition, and this process is continued until no more partitioning can be done. The core of the algorithm is to find the rule that determines the initial split. There are algorithms to create decision trees, such as Iterative Dichotomiser 3 (ID3), Classification and Regression Tree (CART), Chi-squared Automatic Interaction Detector (CHAID), and so on. A good explanation of ID3 can be found at http://www.cse.unsw.edu.au/~billw/cs9414/notes/ml/06prop/id3/id3.html.

To choose the best splitter at a node from the explanatory variables, the algorithm considers each variable in turn. Every possible split is considered and tried, and the best split is the one that produces the largest decrease in the diversity of the classification label within each partition. This is repeated for all variables, and the winner is chosen as the best splitter for that node. The process is continued at the next node until we reach a node where we can make the decision.

We create a decision tree from a training dataset, so it can suffer from the overfitting problem. This behavior creates a problem with real datasets. To improve this situation, a process called pruning is used. In this process, we remove branches and leaves of the tree to improve the performance. Algorithms used to build the tree work best at the starting or root node, since all the information is available there. Later on, with each split there is less data, and towards the end of the tree a particular node can show patterns that are related only to the subset of data used for that split. These patterns create problems when we use them to predict on the real dataset. Pruning methods let the tree grow and then remove the smaller branches that fail to generalize.

Now let's take an example to understand the decision tree. Consider that we have an iris flower dataset. This dataset is hugely popular in the machine learning field. It was introduced by Sir Ronald Fisher. It contains 50 samples from each of three species of iris flower (Iris setosa, Iris virginica, and Iris versicolor). The four explanatory variables are the length and width of the sepals and petals in centimeters, and the target variable is the class to which the flower belongs.

As you can see in the preceding diagram, all the groups were initially considered to be of the Setosa species, and then the explanatory variable petal length was used to divide the groups further. At each step, the count of misclassified items was also calculated, which shows how many items were wrongly classified. Moreover, the petal width variable was then taken into account. Usually, items at leaf nodes are correctly classified.

Random forest

The Random forest algorithm was developed by Leo Breiman and Adele Cutler. Random forests grow many classification trees. They are an ensemble learning method for classification and regression that constructs a number of decision trees at training time and outputs the class that is the mode of the classes output by the individual trees. Single decision trees show the bias-variance tradeoff, so they usually have high variance or high bias. The following terms are relevant here:

Bias: This is an error caused by an erroneous assumption in the learning algorithm
Variance: This is an error caused by sensitivity to small fluctuations in the training set

Random forests attempt to mitigate this problem by averaging to find a natural balance between the two extremes. A Random forest works on the idea of bagging, which is to average noisy and unbiased models to create a model with low variance. A Random forest algorithm works as a large collection of decorrelated decision trees. To understand the idea of a Random forest algorithm, let's work with an example. Consider that we have a training dataset that has lots of features (explanatory variables) and target variables or classes:

We create a sample set from the given dataset:

A different set of random features is taken into account to create each random sub-dataset. Now, from these sub-datasets, different decision trees will be created. So we have actually created a forest of different decision trees. Using these different trees, we will create a ranking system for all the classifiers. To predict the class of a new, unknown item, we will use all the decision trees and separately find out which class each of these trees predicts. See the following diagram for a better understanding of this concept:

Different decision trees to predict the class of an unknown item

In this particular case, we have four different decision trees. We predict the class of an unknown dataset with each of the trees. As per the preceding figure, the first decision tree predicts class 2, the second decision tree predicts class 5, the third decision tree predicts class 5, and the fourth decision tree predicts class 3. Now, the Random forest votes for each class. So we have one vote each for class 2 and class 3, and two votes for class 5. Therefore, it decides that for the new, unknown dataset, the predicted class is class 5. So the class that gets the most votes is chosen for the new dataset (a short Python sketch of this voting step follows at the end of this section).

A Random forest has a lot of benefits in classification, and a few of them are mentioned in the following list:

The combination of learning models increases the accuracy of the classification
It runs effectively on large datasets as well
The generated forest can be saved and used for other datasets as well
It can handle a large number of explanatory variables

Now that we have understood the Random forest theoretically, let's move on to Mahout and use the Random forest algorithm, which is available in Apache Mahout.
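Before switching to Mahout, here is a minimal Python sketch of the majority-voting step just described; the per-tree predictions are taken from the four-tree example above and are purely illustrative:

from collections import Counter

# Hypothetical predictions from the four decision trees in the example above.
tree_predictions = ["class 2", "class 5", "class 5", "class 3"]

# The forest's prediction is the class with the most votes.
votes = Counter(tree_predictions)
predicted_class, vote_count = votes.most_common(1)[0]
print(predicted_class, vote_count)  # prints: class 5 2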
Using Mahout for Random forest

Mahout has an implementation of the Random forest algorithm. It is very easy to understand and use, so let's get started.

Dataset

We will use the NSL-KDD dataset. Since 1999, KDD'99 has been the most widely used dataset for the evaluation of anomaly detection methods. This dataset was prepared by S. J. Stolfo and is built based on the data captured in the DARPA'98 IDS evaluation program (R. P. Lippmann, D. J. Fried, I. Graf, J. W. Haines, K. R. Kendall, D. McClung, D. Weber, S. E. Webster, D. Wyschogrod, R. K. Cunningham, and M. A. Zissman, "Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion detection evaluation," DISCEX, vol. 02, p. 1012, 2000). DARPA'98 is about 4 GB of compressed raw (binary) TCP dump data from 7 weeks of network traffic, which can be processed into about 5 million connection records, each with about 100 bytes. The two weeks of test data have around 2 million connection records. The KDD training dataset consists of approximately 4,900,000 single connection vectors, each of which contains 41 features and is labeled as either normal or an attack, with exactly one specific attack type. NSL-KDD is a dataset suggested to solve some of the inherent problems of the KDD'99 dataset. You can download this dataset from http://nsl.cs.unb.ca/NSL-KDD/. We will download the KDDTrain+_20Percent.ARFF and KDDTest+.ARFF datasets. In KDDTrain+_20Percent.ARFF and KDDTest+.ARFF, remove the first 44 lines (that is, all lines starting with @attribute). If this is not done, we will not be able to generate a descriptor file.

Steps to use the Random forest algorithm in Mahout

The steps to implement the Random forest algorithm in Apache Mahout are as follows:

1. Transfer the test and training datasets to HDFS using the following commands:

hadoop fs -mkdir /user/hue/KDDTrain
hadoop fs -mkdir /user/hue/KDDTest
hadoop fs -put /tmp/KDDTrain+_20Percent.arff /user/hue/KDDTrain
hadoop fs -put /tmp/KDDTest+.arff /user/hue/KDDTest

2. Generate the descriptor file. Before you build a Random forest model based on the training data in KDDTrain+.arff, a descriptor file is required. This is because all the information in the training dataset needs to be labeled. From the labeled dataset, the algorithm can understand which attributes are numerical and which are categorical. Use the following command to generate the descriptor file:

hadoop jar $MAHOUT_HOME/core/target/mahout-core-xyz.job.jar org.apache.mahout.classifier.df.tools.Describe -p /user/hue/KDDTrain/KDDTrain+_20Percent.arff -f /user/hue/KDDTrain/KDDTrain+.info -d N 3 C 2 N C 4 N C 8 N 2 C 19 N L

Jar: the Mahout core jar (xyz stands for the version). If you have installed Mahout directly, it can be found under the /usr/lib/mahout folder. The main class, Describe, is used here and it takes three parameters:
The -p path for the data to be described.
The -f location for the generated descriptor file.
-d is the information for the attributes of the data. N 3 C 2 N C 4 N C 8 N 2 C 19 N L defines that the dataset starts with a numeric attribute (N), followed by three categorical attributes (C), and so on. At the end, L defines the label.

The output of the previous command is shown in the following screenshot:

3. Build the Random forest using the following command:

hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-xyz-job.jar org.apache.mahout.classifier.df.mapreduce.BuildForest -Dmapred.max.split.size=1874231 -d /user/hue/KDDTrain/KDDTrain+_20Percent.arff -ds /user/hue/KDDTrain/KDDTrain+.info -sl 5 -p -t 100 -o /user/hue/nsl-forest

Jar: the Mahout examples jar (xyz stands for the version). If you have installed Mahout directly, it can be found under the /usr/lib/mahout folder. The main class, BuildForest, is used to build the forest with the other arguments, which are shown in the following list:
-Dmapred.max.split.size indicates to Hadoop the maximum size of each partition.
-d stands for the data path.
-ds stands for the location of the descriptor file.
-sl is the number of variables randomly selected at each tree node. Here, each tree is built using five randomly selected attributes per node.
-p uses the partial data implementation.
-t stands for the number of trees to grow. Here, the command builds 100 trees using the partial implementation.
-o stands for the output path that will contain the decision forest.

In the end, the process will show the following result:

4. Use this model to classify the new dataset:

hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-xyz-job.jar org.apache.mahout.classifier.df.mapreduce.TestForest -i /user/hue/KDDTest/KDDTest+.arff -ds /user/hue/KDDTrain/KDDTrain+.info -m /user/hue/nsl-forest -a -mr -o /user/hue/predictions

Jar: the Mahout examples jar (xyz stands for the version). If you have installed Mahout directly, it can be found under the /usr/lib/mahout folder. The class to test the forest takes the following parameters:
-i indicates the path for the test data
-ds stands for the location of the descriptor file
-m stands for the location of the forest generated by the previous command
-a tells it to run the analyzer to compute the confusion matrix
-mr tells Hadoop to distribute the classification
-o stands for the location to store the predictions in

The job provides the following confusion matrix:

So, from the confusion matrix, it is clear that 9,396 instances were correctly classified and 315 normal instances were incorrectly classified as anomalies. The accuracy percentage is 77.7635 (instances correctly classified by the model / classified instances). The output file in the predictions folder contains a list of 0s and 1s, where 0 denotes the normal dataset and 1 denotes an anomaly (a short tallying sketch follows at the end of this article).

Summary

In this article, we discussed the Random forest algorithm. We started our discussion by understanding the decision tree and continued with an understanding of the Random forest. We took up the NSL-KDD dataset, which is used to build predictive systems for cyber security. We used Mahout to build the Random forest, used it with the test dataset, and generated the confusion matrix and other statistics for the output.

Resources for Article:

Further resources on this subject:
Implementing the Naïve Bayes classifier in Mahout [article]
About Cassandra [article]
Tuning Solr JVM and Container [article]
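As a final aside, here is a minimal, hypothetical Python sketch for tallying the 0/1 labels in the predictions output mentioned above; the local file name and the assumption that the labels sit one per line after copying the output out of HDFS are illustrative only:

from collections import Counter

# Hypothetical local copy of Mahout's output, e.g. fetched with:
# hadoop fs -get /user/hue/predictions/<part-file> predictions.txt
with open("predictions.txt") as f:
    labels = [line.strip() for line in f if line.strip()]

# 0 denotes normal traffic, 1 denotes an anomaly, as described above.
counts = Counter(labels)
print("normal (0):", counts.get("0", 0))
print("anomaly (1):", counts.get("1", 0))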