
How-To Tutorials


The Heart of It All

Packt
27 Jan 2016
15 min read
In this article by Thomas Hamilton, the author of Building a Media Center with Raspberry Pi, you will learn how to find the operating system that you will use on the system you chose. Just like with hardware, there is a plethora of options for Raspberry Pi operating systems. For this book, we are going to focus on transforming the Raspberry Pi into a media center. At the time of writing, there are two operating systems that are well known for being geared specifically toward this. The first is called Open Embedded Linux Entertainment Center (OpenELEC) and is a slimmed-down operating system that has been optimized to be a media center and nothing else. The second option, and the one that we will be using for this project, is called Open Source Media Center (OSMC). The main advantage of OSMC is that a full operating system runs in the background, which will be important for some of the add-ons to work correctly. Once you have done this, if you want to try OpenELEC, you will be fully prepared to do it on your own. In fact, the information in this article will enable you to install practically any operating system that's designed for a Raspberry Pi onto an SD card for you to use and experiment with as you see fit.

In this article, we will cover the following topics:
Downloading an operating system
Installing an operating system to an SD card using Windows
Installing an operating system to an SD card using Linux

The Operating System

It is now time to find the correct version of OSMC so that we can download and install it. If you are primarily a Windows or a Mac user, it may feel strange to think that you can search online for operating systems and just download them to your computer. In the Linux world, the world in which the Raspberry Pi resides, this is very normal and one of the great things about open source. The Raspberry Pi was built as a learning tool. It was designed in such a way that it allows you to modify and add to it; in this way, the community can develop it and make it better. Open source software does the same thing. If you know programming, you can contribute to and change software that someone else developed, and this is encouraged! More eyes on the code means fewer bugs and vulnerabilities. Most versions of Linux follow this open source principle.

Versions of Linux? Yes. This is another point of confusion for Windows and Mac users. For the computers that you buy in a normal retail or computer store, you do not have many choices for the OS that comes pre-installed: you can either buy an Apple product with the newest version of its OS, or a Windows-based computer with Windows 7, 8, or 10 pre-installed. Here, Windows 7, 8, and 10 are simply newer and older versions of the same operating system. Linux works on a different principle. Linux itself is not an operating system; think of it more as a type of operating system, or as a brand such as Microsoft or Apple. Because it is open source and free, developers can take it and turn it into whatever they need it to be. The most popular versions of Linux are Ubuntu, Fedora, SUSE, Mint, and CentOS. They each have a different look and feel and can have different functions, and they are operating systems that can be used daily for your normal computing needs. This article is based on a combination of the Ubuntu and Fedora operating systems.
The world of Linux and open source software can be confusing at first. Don't be scared! After you get past the shock, you will find that this openness is very exciting and helpful and can actually make your life much easier. Now, let's download OSMC.

Raspberrypi.org

If you haven't come across it already, this is the official website for the Raspberry Pi. From this website, you can find information about the Raspberry Pi, instructional how-tos, and forums where you can talk with other Raspberry Pi users. The site points you to the official retailers for the versions of the Raspberry Pi that are currently in production and, for the purpose of this article, it points us to the most popular operating systems for the Raspberry Pi (though not nearly all the ones that can work on it).

From the main page, click on the link that says DOWNLOADS near the top of the page. This brings you to the page that lists the most popular operating systems. Raspbian is the official OS of the Raspberry Pi and what OSMC is based on. NOOBS is worth looking at for your next project. It isn't an OS itself, but it gives you the ability to choose from a list of operating systems and install them with a single click. If you want to see what the Raspberry Pi is capable of, start with NOOBS. Under these options, you will find a list of third-party operating systems. The names may sound familiar at this point, as we have mentioned most of them already. This list is where you will find OSMC. Click on its link to go to their website. We could have gone straight to this website to download OSMC, but this route showed you what other options are available and the easiest place to find them.

OSMC gives you a few different ways to install the OS onto different types of computers. If you want to use their automated way of installing OSMC to an SD card for the Raspberry Pi, you are welcome to do so; just follow their instructions for the operating system that you are using on your main computer. For learning purposes, I am going to explain how to download a disk image and do the installation ourselves, as this is how most operating systems are installed to the Raspberry Pi. Under the heading named Get Started, where you can choose the automated installation methods, there is a line just under it that lets you download disk images. This is what we are going to do, so click on that link. Now, we are presented with two choices, namely Raspberry Pi 1 and Raspberry Pi 2. Raspberry Pi 1 refers to any of the single-core Raspberry Pi devices, while Raspberry Pi 2 refers to the newest Pi with a quad-core processor and more RAM. Click on the link under whichever heading applies to the type of Pi that you will be using and select the newest release that is available.

Verifying the download

While OSMC is downloading, let's take a minute to understand what an MD5 checksum is. An MD5 checksum is used to verify a file's integrity. The number that you see beside the download is the checksum that was created when the file you are downloading was created. After the image has finished downloading, we will compute the MD5 checksum of the file on your computer as well. These numbers should be identical; if they are not, it indicates that the image is corrupt and you will need to download it again. From a security standpoint, a checksum can also be used to ensure that data hasn't been tampered with between the time it was created and the time it was given to you, which could indicate malicious software or a data breach.
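The next section walks through the exact commands on Linux and Windows; as a compact illustration of the comparison itself, here is a minimal shell sketch. The file name and the expected value are placeholders, not values from the OSMC site:

    expected="0123456789abcdef0123456789abcdef"        # paste the checksum shown on the download page
    actual=$(md5sum name-of-file.img.gz | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "Checksum matches - the download is intact."
    else
        echo "Checksum mismatch - download the image again."
    fi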
Now that OSMC has been downloaded, we can verify its integrity. In Linux, this is easy. Open a terminal, navigate to the Downloads folder (or wherever you downloaded the file), and type in the following command:

md5sum name-of-file

The output should match the MD5 checksum that was shown beside the file you clicked on to download. If it doesn't, delete the file and try again. To verify the file's integrity using Windows, you will need to install a program that can do this. Search online for "MD5 checksum Windows", and you will see that Microsoft has a program that can be downloaded from their website. Once you download and install it, it works in a similar fashion to the Linux method, using the Windows command prompt, and it comes with a readme file explaining how to use it. If you are unable to find a program to verify the checksum, do not worry. This step isn't required, but it helps you troubleshoot if the Raspberry Pi does not boot after you install the OS onto the SD card.

Installing OSMC - for Windows users

For Windows, you need to install two more applications to successfully write OSMC to an SD card. Because the OSMC file that you downloaded is compressed using gzip, you need a program that can unzip it. The recommended program for all of your compression needs in Windows is WinRAR. It is free and can be found at www.filehippo.com along with the next program that you will need. After you unzip the OSMC file, you will need a program that can write (burn) it to your SD card. There are many options to choose from, and they can be found under the CD/DVD option of Categories on the homepage. ImgBurn and DeepBurner appear to be the most popular image burning software at the time of writing this article.

Preparing everything

Ensure that you have the appropriate type of SD card for the Raspberry Pi that you own. The original Raspberry Pi Model A and B use full-size SD cards, so if you purchased a miniSD by mistake, do not worry: the miniSD probably came with an adapter that turns it into a full-size SD. If it did not, they are easy to acquire. You will need to insert your SD card into your computer so that you can write the operating system to it. If your computer has a built-in SD card reader, that is ideal. If it does not, there are card readers available that plug in through your USB port and can accomplish this goal as well. Once you have inserted your SD card into your computer using either method, ensure that you have taken all the information that you want to keep off the card. Anything that's currently on the card will be erased in the following steps! Install WinRAR and your image burning program if you have not already done so. When the installation is complete, you should be able to right-click on the OSMC file that you downloaded and select the option to uncompress or extract the files in the gzip file.

Burn It!

Now that we have an OSMC file that ends with .img, we can open the image burning program. Each program works differently, but you want to set the destination (where the image will be burned) as your SD card and the source (or input file) as the OSMC image. Once these settings are correct, click on Burn ISO (or your program's equivalent) to begin burning the image. Once this is done, congratulations!

Installing OSMC - for Linux users

As you have seen several times already, Linux comes with nearly everything that you need already installed. The software used to install the operating system to the SD card is no different.
Ensure that you have the appropriate type of SD card for the Raspberry Pi that you own. The original Raspberry Pi Model A and B use full-size SD cards, so if you purchased a miniSD by mistake, do not worry: the miniSD probably came with an adapter that turns it into a full-size SD. If it did not, they are easy to acquire.

Preparing the SD card

You will need to insert your SD card into your computer so that you can write the operating system to it. If your computer has a built-in SD card reader, that is ideal. If it does not, there are card readers available that plug in through your USB port and can accomplish this goal as well. Once you have inserted your SD card into your computer using either method, ensure that you have taken all the information that you want to keep off the card. Anything that's currently on the card will be erased in the next step!

If the SD card was already formatted with a filesystem, it probably automounted itself somewhere so that you can access it. We need to unmount it so that the system is not actually using it while it is still inserted into the computer. To do this, type the following command into your command line:

lsblk

This command lists the block devices that are currently on your computer. In other words, it shows the storage devices and the partitions on them. sda is most likely your hard drive; you can tell by the size of the device in the right-hand columns. sda1 and sda2 are the partitions on the sda device. Look for your device by its size. If you have a 4 GB SD card, you will see something like this:

NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda        8:0    0 238.5G  0 disk
├─sda1     8:1    0   476M  0 part /boot
└─sda2     8:2    0 186.3G  0 part /
sdb        8:16   1   3.8G  0 disk
├─sdb1     8:17   1   2.5G  0 part
└─sdb2     8:18   1   1.3G  0 part /run/media/username/mountpoint

In this case, my SD card is sdb and the second partition is mounted. To unmount it, we issue the following command in the terminal:

sudo umount /dev/sdb*

It will ask you for your sudo (administrator) password and then unmount all the partitions of the sdb device. You could have replaced sdb* with a specific partition number (sdb2) if you only wanted to unmount one partition and not the entire device. In this example, we will erase everything on the device, so we unmount everything. Now, we can write the operating system to the SD card.

Burn It!

The process of installing OSMC to the SD card is called burning an image. It is done with a program called dd, and it is done via the terminal. dd is a very useful tool that's used to copy disks and partitions to other disks or partitions or to images, and vice versa. In this instance, we will take an image and copy it to a disk. In the terminal, navigate to the directory where you downloaded OSMC. The file that you downloaded is compressed using gzip, so before we can burn it to the disk, we need to unzip it. To do so, type in the following command:

gunzip name-of-file.img.gz

This leaves you with a new file that has the same name but without the .gz extension at the end. This file is also much bigger than the gzipped version. This .img (image) file is what we will burn to the SD card.
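If you prefer to see the whole preparation in one place, here is a minimal consolidated sketch, assuming the card shows up as /dev/sdb and using the same placeholder file name as above; confirm the device name with lsblk before running anything, because dd will overwrite whatever disk you point it at. The dd command itself is explained in the next section.

    cd ~/Downloads
    lsblk                                     # identify the SD card (sdb in this example)
    sudo umount /dev/sdb*                     # unmount any auto-mounted partitions
    gunzip name-of-file.img.gz                # the OSMC image you downloaded
    sudo dd if=name-of-file.img of=/dev/sdb   # explained next; takes several minutes
    sync                                      # flush write buffers before removing the card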
In the previous step, we found out which device our SD card was listed under (it was sdb in the preceding example) and unmounted it. Now, we are going to use the following command to burn the image (change /dev/sdb to whatever it is on your computer):

sudo dd if=name-of-file.img of=/dev/sdb

And that's it! This will take several minutes to complete, and the terminal will look like it has frozen, but that is because it is working. When it is done, the prompt will come back and you can remove the SD card.

Summary

If your computer already uses Linux, these steps will be a little faster because you already have the needed software. For Windows users, hunting for the right software and installing it will take some time. Just have patience and know that the exciting part is just around the corner. Now that we have downloaded OSMC, verified the download, prepared the SD card, and burned OSMC onto it, the hardest part is over.

Resources for Article:

Further resources on this subject: Raspberry Pi LED Blueprints [article], Raspberry Pi and 1-Wire [article], Raspberry Pi Gaming Operating Systems [article]


Configuring Extra Features

Packt
27 Jan 2016
10 min read
In this article by Piotr J Kula, the author of the book Raspberry Pi 2 Server Essentials, you will learn how to keep the Pi up-to-date and use the extra features of the GPU. There are some extra features on the Broadcom chip that can be used out of the box or activated using extra licenses that can be purchased. Many of these features are undocumented and were found by developers or hobbyists working on various projects for the Pi.

Updating the Raspberry Pi

The Pi essentially has three software layers: the closed source GPU boot process, the boot loader (also known as the firmware), and the operating system. As of writing this book, we cannot update the GPU code, but maybe one day Broadcom or hardware hackers will tell us how to do this. This leaves us with the firmware and the operating system packages. Broadcom releases regular firmware updates as precompiled binaries to the Raspberry Pi Foundation, which then releases them to the public. The Foundation and other community members work on Raspbian and release updates via the aptitude repository; this is where we get all our wonderful applications from. It is essential to keep both the firmware and the packages up-to-date so that you can benefit from bug fixes and new or improved functionality of the Broadcom chip. The Raspberry Pi 2 uses ARMv7 as opposed to the Pi 1, which uses ARMv6. It is recommended to use the latest Raspbian release to benefit from the speed increase. Thanks to the ARMv7 upgrade, it now supports standard Debian hard-float packages and other ARMv7 operating systems, such as Windows IoT Core.

Updating firmware

Updating the firmware used to be quite an involved process, but thanks to a GitHub user who goes by the alias Hexxeh, there is now code that does this for us automatically. You don't need to run this as often as apt-get update, but if you constantly upgrade the operating system, you may need to run it if advised, or when you are experiencing problems with new features or instability. rpi-update is now included as standard in the Raspbian image, and we can simply run the following:

sudo rpi-update

After the process is complete, you will need to restart the Pi in order to load the new firmware.

Updating packages

Keeping the Raspbian packages up-to-date is also very important, as many changes work together with fixes published in the firmware. First, we update the source list, which downloads a list of packages and their versions to the aptitude cache. Then, we run the upgrade command, which compares the packages that are already installed, compares their dependencies, and then downloads and updates them accordingly:

sudo apt-get update
sudo apt-get upgrade

If there are major changes in the libraries, updating some packages might break your existing custom code or applications. You should always check the release notes to see whether you need to change anything in your code before updating.

Updating distribution

We may find that running the firmware update process and package updates does not always solve a particular problem. If you use a release such as debian-armhf, you can use the following commands without the need to set everything up again:

sudo apt-get dist-upgrade
sudo apt-get install raspberrypi-ui-mods
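If you want to run these steps as one routine, a minimal sketch might look like the following; it simply chains the commands described above and is not an official update mechanism:

    #!/bin/bash
    # Refresh the package lists and upgrade the installed packages.
    sudo apt-get update
    sudo apt-get -y upgrade
    # Optionally pull the latest firmware, then reboot so it is loaded.
    sudo rpi-update
    sudo reboot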
Outcomes

If you have a long-term or production project that will be running independently, it is not a good idea to log in from time to time to update the packages. With Linux, it is acceptable to configure your system and let it run for long periods of time without any software maintenance. You should be aware of critical updates and evaluate whether you need to install them. For example, consider the Heartbleed vulnerability in OpenSSL: if you had a Pi directly connected to the public internet, this would require instant action. Windows users are conditioned to update frequently, and it is very rare that something will go wrong. On Linux, however, running updates will update your software and operating system components, which could cause incompatibilities with other custom software. For example, say you used an open source CMS web application to host some of your articles. It was specifically designed for PHP version x, but upgrading to version y also requires the entire CMS system to be upgraded. Sometimes, less popular open source projects may take several months before the code gets refactored to work with the latest PHP version, and consequently, unknowingly upgrading to the latest PHP may completely or partially break your CMS. One way to work around this is to clone your SD card and perform the updates on one card; if any issues are encountered, you can easily go back and use the other SD card. A distribution called CentOS tries to deal with this problem by releasing updates once a year. This is deliberate, to make sure that everybody has enough time and has tested their software before doing a full update with minimal or even no breaking changes. Unfortunately, CentOS has no ARM support, but you can follow this guideline by updating packages only when you need them.

Hardware watchdog

A hardware watchdog is a digital clock that needs to be regularly restarted before it reaches a certain time. Just as in the TV series LOST, there is a dead man's switch hidden on the island that needs to be pressed at regular intervals; otherwise, an unknown event will begin. In terms of the Broadcom GPU, if the switch is not pressed, it means that the system has stopped responding, and the reaction event is used to restart the Raspberry Pi and reload the operating system with the expectation that it will, at least temporarily, resolve the issue. Raspbian includes a kernel module, disabled by default, that deals with the watchdog hardware. A configurable daemon runs on the software layer and sends regular events (like pressing the button), referred to as a heartbeat, to the watchdog via the kernel module.

Enabling the watchdog and daemon

To get everything up and running, we need to do a few things. First, load the kernel module and open the modules file in the console:

sudo modprobe bcm2708_wdog
sudo vi /etc/modules

Add the line bcm2708_wdog to the file, then save and exit by pressing ESC and typing :wq. Next, we need to install the daemon that will send the heartbeat signals every 10 seconds. We use chkconfig to add it to the startup process and then enable it as follows:

sudo apt-get install watchdog chkconfig
sudo chkconfig --add watchdog
chkconfig watchdog on

We can now configure the daemon to do simple checks. Edit the following file:

sudo vi /etc/watchdog.conf

Uncomment the max-load-1 = 24 and watchdog-device lines by removing the hash (#) character. A max load of 24 means a one-minute load average so high that it would take 24 Pis to complete the work; in normal usage this will never happen and would only really occur when the Pi is hung. You can now start the watchdog with that configuration.
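For reference, after removing the hash characters the two relevant lines in /etc/watchdog.conf should end up looking roughly like this; the device path shown is the usual default on Raspbian, so check your own file rather than copying this verbatim:

    max-load-1      = 24
    watchdog-device = /dev/watchdog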
Each time you change something, you need to restart the watchdog:

sudo /etc/init.d/watchdog start

There are some other examples in the configuration file that you may find of interest.

Testing the watchdog

In Linux, you can easily place a function into a separate thread, which runs as a new process, by using the & character on the command line. By exploiting this feature together with some anonymous functions, we can issue a very crude but effective system halt. This is a quick way to test whether the watchdog daemon is working correctly, and it should not be used as a way to halt the Pi. It is known as a fork bomb, and many operating systems are susceptible to it. The random-looking series of characters is actually a function that endlessly and uncontrollably creates new copies of itself. It most likely took on the name "bomb" because once it starts, it cannot be stopped: even if you try to kill the original thread, it has already created several new threads that need to be killed. It is just impossible to stop, and eventually it bombs the system into a critical, unresponsive state. Type these characters into the command line and press Enter:

:(){ :|:& };:

After you press Enter, the Pi will restart after about 30 seconds, but it might take up to a minute.

Enabling extra decoders

The Broadcom chip actually has extra hardware for encoding and decoding a few other well-known formats. The Raspberry Pi Foundation did not include these licenses because they wanted to keep costs down to a minimum, but they have included the H.264 license. This allows you to watch HD media on your TV, use the webcam module, or transcode media files. For those who would like to use these extra encoders/decoders, they provide a way to buy separate licenses. At the time of writing this book, the only project to use these hardware codecs was the OMXPlayer project maintained by XBMC. The latest Raspbian package has the OMX package included.

Buying licenses

You can go to http://www.raspberrypi.com/license-keys/ to buy licenses that can be used once per device. Follow the instructions on the website to get your license key.

MPEG-2

This is also known as H.222/H.262. It is a standard for video and audio encoding that is widely used by digital television, cable, and satellite TV, and it is also the format used to store video and audio data on DVDs. This means that watching DVDs from a USB DVD-ROM drive should be possible without any CPU overhead whatsoever. Unfortunately, there is no package that uses this hardware directly, but hopefully, in the near future, it will be as simple as buying this license to watch DVDs or video streams in this format with ease.

VC-1

VC-1 is formally known as SMPTE 421M and was developed by Microsoft. Today, it is the official video format used by the Xbox and Silverlight frameworks, and the format is supported by HD DVD and Blu-ray players. The main use for this codec is to watch Silverlight-packaged media; its popularity has grown over the years but it is still not widespread. This codec may need to be purchased if you would like to stream video using the Windows 10 IoT API.
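Once you receive a key by email, it is normally added to /boot/config.txt and takes effect after a reboot; the lines look roughly like the following. The hexadecimal values are made-up placeholders, not working keys:

    decode_MPG2=0x12345678
    decode_WVC1=0x87654321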
Hardware monitoring

The Raspberry Pi Foundation provides a tool called vcgencmd, which gives you detailed data about the various hardware components in the Pi. This tool is updated from time to time and can be used to log the temperature of the GPU, voltage levels, processor frequencies, and so on. To see a list of supported commands, we type this in the console:

vcgencmd commands

As newer versions are released, more commands will become available here. To check the current GPU temperature, we use the following command:

vcgencmd measure_temp

We can use the following command to check how the RAM is split between the CPU and GPU (replace arm with gpu to see the GPU share):

vcgencmd get_mem arm

To check the firmware version, we can use the following command:

vcgencmd version

The output of all these commands is simple text that can be parsed and displayed on a website or stored in a database.

Summary

This article's intention was to teach you how hardware relies on good software, but most importantly, its intention was to show you how to leverage that hardware using ready-made software packages. For reference, you can go to the following link: http://www.elinux.org/RPI_vcgencmd_usage

Resources for Article:

Further resources on this subject: Creating a Supercomputer [article], Develop a Digital Clock [article], Raspberry Pi and 1-Wire [article]


Customizing and Automating Google Applications

Packt
27 Jan 2016
7 min read
In this article by the author, Ramalingam Ganapathy, of the book Learning Google Apps Script, we will see how to create new projects in Sheets and send an email with an inline image and attachments. You will also learn to create clickable buttons, a custom menu, and a sidebar.

Creating new projects in Sheets

Open any newly created Google spreadsheet (Sheets). You will see a number of menu items at the top of the window. Point your mouse to the menu bar, click on Tools, and then click on Script editor as shown in the following screenshot. A new browser tab or window with a new project selection dialog will open. Click on Blank Project or close the dialog. Now, you have created a new untitled project with one script file (Code.gs), which has one default empty function (myFunction). To rename the project, click on the project title (at the top left-hand side of the window), and a rename dialog will open. Enter your preferred project name, and then click on the OK button.

Creating clickable buttons

Open the script editor in a newly created or any existing Google sheet. Select cell B3 or any other cell. Click on Insert and Drawing as shown in the following screenshot. A drawing editor window will open. Click on the Textbox icon and click anywhere on the canvas area. Type Click Me. Resize the object so that it only encloses the text, as shown in the screenshot here. Click on Save & Close to exit the drawing editor. Now, the Click Me image will be inserted at the top of the active cell (B3) as shown in the following screenshot. You can drag this image anywhere around the spreadsheet. In Google Sheets, images are not anchored to a particular cell; they can be dragged or moved around. If you right-click on the image, a drop-down arrow at the top right corner of the image becomes visible. Click on the Assign script menu item. A script assignment window will open as shown here. Type greeting or any other name you like, but remember the name (so that you can create a function with the same name in the next steps). Click on the OK button. Now, open the script editor in the same spreadsheet. When you open the script editor, the project selector dialog will open; close it or select Blank Project. A default function called myFunction will be there in the editor. Delete everything in the editor and insert the following code:

function greeting() {
  Browser.msgBox("Greeting", "Hello World!", Browser.Buttons.OK);
}

Click on the save icon and enter a project name if asked. You have completed coding your greeting function. Activate the spreadsheet tab/window, and click on your button called Click Me. An authorization window will open; click on Continue. In the subsequent Request for Permission window, click on the Allow button. As soon as you click on Allow and the permission dialog is disposed of, your actual greeting message box will open as shown here. Click on OK to dispose of the message box. Whenever you click on your button, this message box will open.

Creating a custom menu

Can you execute the greeting function without the help of the button? Yes: in the script editor, there is a Run menu. If you click on Run and then greeting, the greeting function will be executed and the message box will open. However, creating a button for every function may not be feasible. Although you cannot alter or add items to the application's standard menus (except the Add-ons menu), such as File, Edit, and View, you can add a custom menu and its items.
For this task, create a new Google Docs document or open any existing document. Open the script editor and type these two functions:

function createMenu() {
  DocumentApp.getUi()
      .createMenu("PACKT")
      .addItem("Greeting", "greeting")
      .addToUi();
}

function greeting() {
  var ui = DocumentApp.getUi();
  ui.alert("Greeting", "Hello World!", ui.ButtonSet.OK);
}

In the first function, you use the DocumentApp class, invoke the getUi method, and then consecutively invoke the createMenu, addItem, and addToUi methods by method chaining. The second function is the one you created in the previous task, but this time it uses the DocumentApp class and its associated methods. Now, run the createMenu function and flip to the document window/tab. You will notice a new menu item called PACKT added next to the Help menu. You can see the custom menu PACKT with an item Greeting as shown next; the item label Greeting is associated with the function called greeting. The menu item Greeting works the same way as the button created in the previous task. The drawback with this method of inserting a custom menu is that you need to run createMenu from within the script editor every time you want the menu to show up. Imagine how your user could use this greeting function if he/she doesn't know about GAS and the script editor; your user might not be a programmer like you. To enable your users to execute the selected GAS functions, you should create a custom menu and make it visible as soon as the application is opened. To do so, rename the createMenu function to onOpen, and that's it.

Creating a sidebar

A sidebar is a static dialog box that is included on the right-hand side of the document editor window. To create a sidebar, type the following code in your editor:

function onOpen() {
  var htmlOutput = HtmlService
      .createHtmlOutput('<button onclick="alert(\'Hello World!\');">Click Me</button>')
      .setTitle('My Sidebar');
  DocumentApp.getUi()
      .showSidebar(htmlOutput);
}

In the previous code, you use HtmlService, invoke its createHtmlOutput method, and then consecutively invoke the setTitle method. To test this code, run the onOpen function or reload the document. The sidebar will open on the right-hand side of the document window as shown in the following screenshot. The sidebar layout size is fixed, which means you cannot change, alter, or resize it. The button in the sidebar is an HTML element, not a GAS element, and if clicked, it opens the browser's alert box.

Sending an email with inline image and attachments

To embed images, such as a logo, in your email message, you can use HTML code instead of plain text. Upload your image to Google Drive, then get and use the file ID in the code:

function sendEmail(){
  var file = SpreadsheetApp.getActiveSpreadsheet()
      .getAs(MimeType.PDF);
  var image = DriveApp.getFileById("[[image file's id in Drive ]]").getBlob();
  var to = "[[receiving email id]]";
  var message = '<p><img src="cid:logo" /> Embedding inline image example.</p>';
  MailApp.sendEmail(
    to,
    "Email with inline image and attachment",
    "",
    {
      htmlBody: message,
      inlineImages: {logo: image},
      attachments: [file]
    }
  );
}

Summary

In this article, you learned how to customize and automate Google applications with a few examples. Many more useful and interesting applications are described in the actual book.
Resources for Article:

Further resources on this subject: How to Expand Your Knowledge [article], Google Apps: Surfing the Web [article], Developing Apps with the Google Speech APIs [article]


How to structure your Sass for scalability using ITCSS

Cameron
25 Jan 2016
5 min read
When approaching a large project with Sass, it can be tempting to dive right into code and start adding partials to a Sass folder, styling parts of your website or app, and completely forgetting to take a moment to consider how you might structure your code and implement a strategy for expanding your codebase. When designers or developers lose sight of this important concept during a project, it usually ends up in a messy codebase where a ton of arbitrary partials are imported into a big style.scss, which not only makes it difficult for other developers to follow and understand, but is by no means scalable.

CSS has faults

While Sass has powerful features like functions, loops, and variables, it still doesn't solve some of the fundamental problems that exist within CSS. There are two main problems that come up when writing CSS at scale that make it difficult to work in a straightforward way. The first problem exists with the CSS cascade. The issue with the cascade is that it makes the entire codebase highly dependent on source order and exposes a global namespace where selectors can inherit from other selectors, making it hard to fully encapsulate styles. Because of this design flaw, any new styles we add will always be subject to previous dependencies and, without careful consideration, can quickly become overridden in an undesirable manner. The second and biggest problem comes from specificity. When writing highly specific selectors, such as an ID or nested descendant selectors, these styles will problematically bypass the cascade, making it challenging to add additional styles that may be less specific. These problems need to be addressed in the early stages of a project in order for designers and developers to understand the codebase, to keep new code DRY (Don't Repeat Yourself), and to allow for scalability.

Harry Roberts' ITCSS

ITCSS (Inverted Triangle CSS) is an architecture methodology by Harry Roberts for creating scalable, managed CSS. It is primarily a way of thinking about your codebase and a methodology that designers and developers can follow to allow for project clarity and scalability. It's also not tied to CSS specifically and can therefore be used in projects with preprocessors like Sass. The primary idea behind ITCSS is that you should structure your code in order of specificity. This means your generic styles, like global resets and tag selectors (less specific), go at the top, and you gradually put more explicit styles further down the stylesheet. This creates an "inverted triangle" shape from the order of specificity. With this methodology, we can begin to structure our Sass in an organized way and follow a strategy when approaching new styles.

Creating layers

The fundamental key to using ITCSS is to divide our styles into layers. These layers are directories that contain specific aspects of our code, with related partials that we can build upon. In a similar fashion to MVC (Model-View-Controller), where you know where to look for certain things, let's examine each layer and look at what it can be used for.

Settings: These are your global variables and configuration settings. This is where you would put your Sass variables containing all your fonts, typography sizes, colors, paddings, margins, and breakpoints.

Tools: These are your Sass mixins and functions. They could be utility functions or layout or theme mixins.

Generic: These are ground-zero styles. This means things like global resets, box-sizing, or print styles.
Base: This layer contains any unclassed selectors, which means things like h1 tags and p tags. In essence, what does an h1 look like without a class? These partials should be adjustments to base elements.

Objects: With objects, we're really talking about design patterns like the media object. This is the first layer where you'd begin to use classes. Here you'd want to choose agnostic names that aren't specific to the type of object. For example, you may have a .slider-list, but not a .product-slider-list. The idea is to keep these cosmetic-free in order to keep them reusable across component instances.

Components: These are more explicit about the type of object. In this case, a .product-slider-list would live in a components/_product-slider.scss partial within this layer.

Trumps: Lastly, the trumps, or "override", layer should contain high-specificity selectors. These are things like utility classes such as .hide, which may use a rule like display: none !important.

Conclusion

It's important to remember that when you're styling a new project, you should consider a structural approach early on and have a strategy like ITCSS that allows for scalability. With a sane environment set up that keeps a clear, contextual separation of styles, you'll be able to tame and manage the source order, abstract design patterns, and scale your code while leveraging the features within Sass.

About the author

Cameron is a freelance web designer, developer, and consultant based in Brooklyn, NY. Whether he's shipping a new MVP feature for an early-stage startup or harnessing the power of cutting-edge technologies with a digital agency, his specialities in UX, Agile, and front-end development unlock the possibilities that help his clients thrive. He blogs about design, development, and entrepreneurship and is often tweeting something clever at @cameronjroe.


Accessing Data with Spring

Packt
25 Jan 2016
8 min read
In this article written by Shameer Kunjumohamed and Hamidreza Sattari, authors of the book Spring Essentials, we will learn how to access data with Spring.

Data access or persistence is a major technical feature of data-driven applications. It is a critical area where careful design and expertise are required. Modern enterprise systems use a wide variety of data storage mechanisms, ranging from traditional relational databases such as Oracle, SQL Server, and Sybase to more flexible, schema-less NoSQL databases such as MongoDB, Cassandra, and Couchbase. Spring Framework provides comprehensive support for data persistence in multiple flavors, ranging from convenient template components to smart abstractions over popular Object Relational Mapping (ORM) tools and libraries, making them much easier to use. Spring's data access support is another great reason for choosing it to develop Java applications.

Spring Framework offers developers the following primary approaches to data persistence to choose from: Spring JDBC, ORM data access, and Spring Data. Furthermore, Spring standardizes these approaches under a unified Data Access Object (DAO) notation called @Repository. Another compelling reason for using Spring is its first-class transaction support. Spring provides consistent transaction management, abstracting different transaction APIs such as JTA, JDBC, JPA, Hibernate, JDO, and other container-specific transaction implementations. In order to make development and prototyping easier, Spring provides embedded database support, smart DataSource abstractions, and excellent test integration. This article explores the various data access mechanisms provided by Spring Framework and its comprehensive support for transaction management in both standalone and web environments, with relevant examples.

Why use Spring Data Access when we have JDBC?

JDBC (short for Java Database Connectivity), the Java Standard Edition API for data connectivity from Java to relational databases, is a very low-level framework. Data access via JDBC is often cumbersome; the boilerplate code that the developer needs to write makes it error-prone. Moreover, JDBC exception handling is not sufficient for most use cases; there is a real need for simplified but extensive and configurable exception handling for data access. Spring JDBC encapsulates the often-repeated code, simplifying the developer's code tremendously and letting them focus entirely on the business logic. Spring Data Access components abstract away the technical details, including the lookup and management of persistence resources such as the connection, statement, and result set, and accept the specific SQL statements and relevant parameters to perform the operation. They use the same JDBC API under the hood while exposing simplified, straightforward interfaces for the client's use. This approach helps build a much cleaner and hence more maintainable data access layer for Spring applications.

DataSource

The first step of connecting to a database from any Java application is obtaining a connection object specified by JDBC. DataSource, part of Java SE, is a generalized factory of java.sql.Connection objects that represents the physical connection to the database and is the preferred means of producing a connection. DataSource handles transaction management, connection lookup, and pooling functionalities, relieving the developer from these infrastructural issues.
DataSource objects are often implemented by database driver vendors and are typically looked up via JNDI. Application servers and servlet engines provide their own implementations of DataSource, a connector to the one provided by the database vendor, or both. Typically configured inside XML-based server descriptor files, server-supplied DataSource objects generally provide built-in connection pooling and transaction support. As a developer, you just configure your DataSource objects inside the server (configuration files) declaratively in XML and look them up from your application via JNDI. In a Spring application, you configure your DataSource reference as a Spring bean and inject it as a dependency into your DAOs or other persistence resources.

The Spring <jee:jndi-lookup/> tag (of the http://www.springframework.org/schema/jee namespace) shown here allows you to easily look up and construct JNDI resources, including a DataSource object defined inside an application server. For applications deployed on a J2EE application server, a JNDI DataSource object provided by the container is recommended.

<jee:jndi-lookup id="taskifyDS" jndi-name="java:jboss/datasources/taskify"/>

For standalone applications, you need to create your own DataSource implementation or use a third-party implementation, such as Apache Commons DBCP, C3P0, or BoneCP. The following is a sample DataSource configuration using Apache Commons DBCP2:

<bean id="taskifyDS" class="org.apache.commons.dbcp2.BasicDataSource" destroy-method="close">
  <property name="driverClassName" value="${driverClassName}" />
  <property name="url" value="${url}" />
  <property name="username" value="${username}" />
  <property name="password" value="${password}" />
  . . .
</bean>

Make sure you add the corresponding dependency (of your DataSource implementation) to your build file. The following is the one for DBCP2:

<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-dbcp2</artifactId>
  <version>2.1.1</version>
</dependency>

Spring provides a simple implementation of DataSource called DriverManagerDataSource, which is only for testing and development purposes, not for production use. Note that it does not provide connection pooling. Here is how you configure it inside your application:

<bean id="taskifyDS" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
  <property name="driverClassName" value="${driverClassName}" />
  <property name="url" value="${url}" />
  <property name="username" value="${username}" />
  <property name="password" value="${password}" />
</bean>

It can also be configured in a pure JavaConfig model, as shown in the following code:

@Bean
DataSource getDatasource() {
  DriverManagerDataSource dataSource = new DriverManagerDataSource(pgDsProps.getProperty("url"));
  dataSource.setDriverClassName(pgDsProps.getProperty("driverClassName"));
  dataSource.setUsername(pgDsProps.getProperty("username"));
  dataSource.setPassword(pgDsProps.getProperty("password"));
  return dataSource;
}

Never use DriverManagerDataSource in production environments. Use third-party DataSources such as DBCP, C3P0, and BoneCP for standalone applications, and the JNDI DataSource provided by the container for J2EE containers instead.

Using embedded databases

For prototyping and test environments, it is a good idea to use Java-based embedded databases to quickly ramp up the project. Spring natively supports the HSQL, H2, and Derby database engines for this purpose.
Here is a sample DataSource configuration for an embedded HSQL database:

@Bean
DataSource getHsqlDatasource() {
  return new EmbeddedDatabaseBuilder().setType(EmbeddedDatabaseType.HSQL)
      .addScript("db-scripts/hsql/db-schema.sql")
      .addScript("db-scripts/hsql/data.sql")
      .addScript("db-scripts/hsql/storedprocs.sql")
      .addScript("db-scripts/hsql/functions.sql")
      .setSeparator("/").build();
}

The XML version of the same would look as shown in the following code:

<jdbc:embedded-database id="dataSource" type="HSQL">
  <jdbc:script location="classpath:db-scripts/hsql/db-schema.sql" />
  . . .
</jdbc:embedded-database>

Handling exceptions in the Spring data layer

With traditional JDBC-based applications, exception handling is based on java.sql.SQLException, which is a checked exception. It forces the developer to write catch and finally blocks carefully for proper handling and to avoid resource leaks. Spring, with its smart exception hierarchy based on runtime exceptions, saves the developer from this nightmare. With DataAccessException as the root, Spring bundles a big set of meaningful exceptions that translate the traditional JDBC exceptions. Besides JDBC, Spring covers the Hibernate, JPA, and JDO exceptions in a consistent manner. Spring uses SQLErrorCodeExceptionTranslator, which inherits SQLExceptionTranslator, in order to translate SQLExceptions into DataAccessExceptions. We can extend this class to customize the default translations. We can replace the default translator with our custom implementation by injecting it into the persistence resources (such as JdbcTemplate, which we will cover soon).

DAO support and the @Repository annotation

The standard way of accessing data is via specialized DAOs that perform persistence functions under the data access layer. Spring follows the same pattern by providing DAO components and allowing developers to mark their data access components as DAOs using an annotation called @Repository. This approach ensures consistency over various data access technologies, such as JDBC, Hibernate, JPA, and JDO, as well as project-specific repositories. Spring applies SQLExceptionTranslator across all these methods consistently. Spring recommends that your data access components be annotated with the stereotype @Repository. The term "repository" was originally defined in Domain-Driven Design, Eric Evans, Addison-Wesley, as "a mechanism for encapsulating storage, retrieval, and search behavior which emulates a collection of objects." This annotation makes the class eligible for DataAccessException translation by Spring Framework. Spring Data, another standard data access mechanism provided by Spring, revolves around @Repository components.

Summary

We have so far explored Spring Framework's comprehensive coverage of all the technical aspects of data access and transactions. Spring provides multiple convenient data access methods, which take away much of the hard work involved in building the data layer from the developer and also standardize business components. Correct usage of Spring data access components will ensure that our data layer is clean and highly maintainable.

Resources for Article:

Further resources on this subject: So, what is Spring for Android? [article], Getting Started with Spring Security [article], Creating a Spring Application [article]


Configuring HBase

Packt
25 Jan 2016
14 min read
In this article by Ruchir Choudhry, the author of the book HBase High Performance Cookbook, we will cover the configuration and deployment of HBase.

Introduction

HBase is an open source, nonrelational, column-oriented distributed database modeled after Google's BigTable and written in Java. It is developed as part of the Apache Software Foundation's Apache Hadoop project, and it runs on top of the Hadoop Distributed File System (HDFS), providing BigTable-like capabilities for Hadoop. It's a column-oriented database that is empowered by a fault-tolerant distributed file structure known as HDFS. In addition to this, it provides advanced features such as auto sharding, load balancing, in-memory caching, replication, compression, near real-time lookups, strong consistency (using multiversioning), block caches and bloom filters for real-time queries, and an array of client APIs.

Throughout this article, we will discuss how to effectively set up mid- and large-size HBase clusters on top of the Hadoop and HDFS framework. This article will help you set up HBase on a fully distributed cluster. For the cluster setup, we will consider redhat-6.2 Linux 2.6.32-220.el6.x86_64 #1 SMP Wed Nov 9 08:03:13 EST 2011 x86_64 x86_64 GNU/Linux, and the cluster will have six nodes.

Configuration and Deployment

Before we start HBase in a fully distributed mode, we will first set up Hadoop-2.4.0 in a distributed mode, and then, on top of the Hadoop cluster, we will set up HBase, because HBase stores its data in HDFS. Check the permissions of the users; HBase must have the ability to create a directory. Let's create two directories in which the data for NameNode and DataNode will reside:

drwxrwxr-x 2 app app 4096 Jun 19 22:22 NameNodeData
drwxrwxr-x 2 app app 4096 Jun 19 22:22 DataNodeData
-bash-4.1$ pwd
/u/HbaseB/hadoop-2.4.0
-bash-4.1$ ls -lh
total 60K
drwxr-xr-x 2 app app 4.0K Mar 31 08:49 bin
drwxrwxr-x 2 app app 4.0K Jun 19 22:22 DataNodeData
drwxr-xr-x 3 app app 4.0K Mar 31 08:49 etc

Getting Ready

Following are the steps to install and configure HBase: first, choose a Hadoop cluster; then get the hardware details required for it, get the software required to perform the setup, get the OS required for the setup, and perform the configuration steps.

We will require the following components for NameNode:

Operating system: redhat-6.2 Linux 2.6.32-220.el6.x86_64 #1 SMP Wed Nov 9 08:03:13 EST 2011 x86_64 x86_64 GNU/Linux, or another standard Linux kernel.
Hardware/CPUs: 16 to 24 CPU cores (NameNode/Secondary NameNode).
Hardware/RAM: 64 to 128 GB; in special cases, 128 GB to 512 GB RAM (NameNode/Secondary NameNode).
Hardware/storage: Both NameNode servers should have highly reliable storage for their namespace storage and edit log journaling. Typically, hardware RAID and/or reliable network storage are justifiable options. Consider including an onsite disk replacement option in your support contract so that a failed RAID disk can be replaced quickly (NameNode/Secondary NameNode).

RAID: RAID stands for Redundant Array of Inexpensive (or Independent) Disks; there are many RAID levels, but for the Master or NameNode, RAID-1 will be enough.

JBOD: This stands for Just a Bunch of Disks. The design is to have multiple hard drives stacked over each other with no redundancy; the calling software needs to take care of failure and redundancy.
In essence, JBOD works as a single logical volume. The following screenshot shows the working mechanism of RAID and JBOD. Before we start the cluster setup, a quick recap of the Hadoop setup is essential, with brief descriptions.

How to do it…

Let's create a directory where you will keep all the software components to be downloaded; for simplicity, let's take this as /u/HbaseB. Create different users for different purposes, in the user/group format; this is essentially required to differentiate the various roles for specific purposes:

HDFS/Hadoop: for handling Hadoop-related setups
Yarn/Hadoop: for YARN-related setups
HBase/Hadoop
Pig/Hadoop
Hive/Hadoop
Zookeeper/Hadoop
HCat/Hadoop

Set up directories for the Hadoop cluster. Let's assume /u is a shared mount point; we can create specific directories that will be used for specific purposes:

-bash-4.1$ ls -ltr
total 32
drwxr-xr-x 9 app app 4096 Oct 7 2013 hadoop-2.2.0
drwxr-xr-x 10 app app 4096 Feb 20 10:58 zookeeper-3.4.6
drwxr-xr-x 15 app app 4096 Apr 5 08:44 pig-0.12.1
drwxrwxr-x 7 app app 4096 Jun 30 00:57 hbase-0.98.3-hadoop2
drwxrwxr-x 8 app app 4096 Jun 30 00:59 apache-hive-0.13.1-bin
drwxrwxr-x 7 app app 4096 Jun 30 01:04 mahout-distribution-0.9

Make sure that you have adequate privileges in the folder to add, edit, and execute commands. Also, you must set up password-less communication between the different machines, such as from the NameNode to the DataNodes and from the HBase Master to all the region server nodes. Refer to this webpage to learn how to do this: http://www.debian-administration.org/article/152/Password-less_logins_with_OpenSSH (a minimal sketch is also shown at the end of this step).

Here is the procedure to achieve the end result of this recipe. Let's assume you have downloaded the entire software stack into the /u directory. Go to /u/HbaseB/hadoop-2.2.0/etc/hadoop/ and look for the core-site.xml file. Place the following lines in this file:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://mynamenode-hadoop:9001</value>
    <description>The name of the default file system.</description>
  </property>
</configuration>

You can specify any port that you want to use; it should not clash with ports that are already in use by the system for various purposes. A quick look at http://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers can provide more specific details about this; complete detail on this topic is out of the scope of this book. Save the file. This helps us set up the master/NameNode directory.

Now let's move on to setting up the secondary nodes. Edit /u/HbaseB/hadoop-2.4.0/etc/hadoop/ and look for the core-site.xml file:

<configuration>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>/u/dn001/hadoop/hdf/secdn,/u/dn002/hadoop/hdfs/secdn</value>
    <description>A comma separated list of paths, for example /u/dn001/hadoop/hdf/secdn,/u/dn002/hadoop/hdfs/secdn</description>
  </property>
</configuration>

The separation of the directory structure is for the purpose of cleanly separating the HDFS blocks and keeping the configuration as simple as possible. This also allows us to do proper maintenance.
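Before moving on to the HDFS settings, here is a minimal sketch of the password-less SSH setup mentioned earlier; the user and host names are examples, not values from the original recipe:

    # Run on the master node as the Hadoop user.
    ssh-keygen -t rsa -b 4096          # accept the defaults; leave the passphrase empty
    ssh-copy-id app@datanode01         # repeat for every DataNode and region server host
    ssh app@datanode01 hostname        # should log in without prompting for a password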
Now let's move toward changing the setup for HDFS; the file location will be /u/HbaseB/hadoop-2.4.0/etc/hadoop/hdfs-site.xml.

For NameNode:
<property>
<name>dfs.name.dir</name>
<value>/u/nn01/hadoop/hdfs/nn,/u/nn02/hadoop/hdfs/nn</value>
<description>Comma separated list of paths. Use the list of directories.</description>
</property>

For DataNode:
<property>
<name>dfs.data.dir</name>
<value>/u/dnn01/hadoop/hdfs/dn,/u/dnn02/hadoop/hdfs/dn</value>
<description>Comma separated list of paths. Use the list of directories.</description>
</property>

Now let's set the HTTP address for NameNode, that is, access to NameNode using the HTTP protocol:
<property>
<name>dfs.http.address</name>
<value>namenode.full.hostname:50070</value>
<description>Enter your NameNode hostname for http access.</description>
</property>

The HTTP address for the secondary NameNode is as follows:
<property>
<name>dfs.secondary.http.address</name>
<value>secondary.namenode.full.hostname:50090</value>
<description>Enter your Secondary NameNode hostname.</description>
</property>

We can go for an HTTPS setup for NameNode as well, but let's keep this optional for now.

Now let's look at the Yarn setup in the /u/HbaseB/hadoop-2.2.0/etc/hadoop/yarn-site.xml file.

For the resource tracker that's a part of the Yarn resource manager, execute the following code:
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>yarnresourcemanager.full.hostname:8025</value>
<description>Enter your yarn Resource Manager hostname.</description>
</property>

For the scheduler that's part of the Yarn resource manager, execute the following code:
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>resourcemanager.full.hostname:8030</value>
<description>Enter your ResourceManager hostname</description>
</property>

For the resource manager address, execute the following code:
<property>
<name>yarn.resourcemanager.address</name>
<value>resourcemanager.full.hostname:8050</value>
<description>Enter your ResourceManager hostname.</description>
</property>

For the resource manager admin address, execute the following code:
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>resourcemanager.full.hostname:8041</value>
<description>Enter your ResourceManager hostname.</description>
</property>

To set up the local directories, execute the following code:
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/u/dnn01/hadoop/hdfs/yarn,/u/dnn02/hadoop/hdfs/yarn</value>
<description>Comma separated list of paths. Use the list of directories.</description>
</property>

To set up the log location, execute the following code:
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/u/var/log/hadoop/yarn</value>
<description>Use the list of directories from $YARN_LOG_DIR.</description>
</property>

This completes the configuration changes required for Yarn.

Now let's make the changes for MapReduce. Open /u/HbaseB/hadoop-2.2.0/etc/hadoop/mapred-site.xml and place the following configuration between the <configuration></configuration> tags:
<property>
<name>mapreduce.jobhistory.address</name>
<value>jobhistoryserver.full.hostname:10020</value>
<description>Enter your JobHistoryServer hostname.</description>
</property>

Once we have configured MapReduce, we can move on to configuring HBase. Let's go to the /u/HbaseB/hbase-0.98.3-hadoop2/conf path and open the hbase-site.xml file. You will see a template that has <configuration></configuration>.
We need to add the following lines between the starting and ending tags:
<property>
<name>hbase.rootdir</name>
<value>hdfs://hbase.namenode.full.hostname:8020/apps/hbase/data</value>
<description>Enter the HBase NameNode server hostname</description>
</property>
<property>
<!-- this is for the binding address -->
<name>hbase.master.info.bindAddress</name>
<value>hbase.master.full.hostname</value>
<description>Enter the HBase Master server hostname</description>
</property>

This completes the HBase changes.

ZooKeeper: Now let's focus on the setup of ZooKeeper. In a distributed environment, let's go to the /u/HbaseB/zookeeper-3.4.6/conf location, rename zoo_sample.cfg to zoo.cfg, and place the details as follows:
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
If you want to test this setup locally, use different port combinations. Atomic broadcasting is an atomic messaging system that keeps all the servers in sync and provides reliable delivery, total order, causal order, and so on.

Region servers: Before concluding, let's go to the region server setup process. Go to the /u/HbaseB/hbase-0.98.3-hadoop2/conf folder and edit the regionservers file. Specify the region servers accordingly:
RegionServer1
RegionServer2
RegionServer3
RegionServer4

Copy all the configuration files of HBase and ZooKeeper to the respective hosts dedicated to HBase and ZooKeeper.

Let's quickly validate the setup that we worked on:
sudo su $HDFS_USER
/u/HbaseB/hadoop-2.2.0/bin/hadoop namenode -format
/u/HbaseB/hadoop-2.4.0/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode

Now let's go to the secondary nodes:
sudo su $HDFS_USER
/u/HbaseB/hadoop-2.2.0/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start secondarynamenode

Now let's perform all the steps for DataNode:
sudo su $HDFS_USER
/u/HbaseB/hadoop-2.2.0/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start datanode

Test 01> See if you can reach http://namenode.full.hostname:50070 from your browser.
Test 02>
sudo su $HDFS_USER
/u/HbaseB/hadoop-2.2.0/sbin/hadoop dfs -copyFromLocal /tmp/hello.txt
/u/HbaseB/hadoop-2.2.0/sbin/hadoop dfs -ls
You must see hello.txt once the command executes.
Test 03> Browse http://datanode.full.hostname:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=/&nnaddr=$datanode.full.hostname:8020 and you should see the details on the DataNode.
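If you want an additional sanity check at this point, the HDFS admin tools can confirm that every DataNode has registered and that the filesystem is healthy. This is an optional, assumed addition to the tests above, and the path simply follows the layout used in this article, so adjust it if your installation differs:

sudo su $HDFS_USER
# List the live DataNodes together with their capacity and usage.
/u/HbaseB/hadoop-2.2.0/bin/hdfs dfsadmin -report
# Run a filesystem check; the final line should report the status as HEALTHY.
/u/HbaseB/hadoop-2.2.0/bin/hdfs fsck /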
Validate the Yarn and MapReduce setup by following these steps:

Execute the command from the Resource Manager:
<login as $YARN_USER and source the directories.sh companion script>
/u/HbaseB/hadoop-2.2.0/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager

Execute the command from the Node Manager:
<login as $YARN_USER and source the directories.sh companion script>
/usr/lib/hadoop-yarn/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager

Execute the following commands:
hadoop fs -mkdir /app-logs
hadoop fs -chown $YARN_USER /app-logs
hadoop fs -chmod 1777 /app-logs

Execute MapReduce:
sudo su $HDFS_USER
/u/HbaseB/hadoop-2.2.0/sbin/hadoop fs -mkdir -p /mapred/history/done_intermediate
/u/HbaseB/hadoop-2.2.0/sbin/hadoop fs -chmod -R 1777 /mapred/history/done_intermediate
/u/HbaseB/hadoop-2.2.0/sbin/hadoop fs -mkdir -p /mapred/history/done
/u/HbaseB/hadoop-2.2.0/sbin/hadoop fs -chmod -R 1777 /mapred/history/done
/u/HbaseB/hadoop-2.2.0/sbin/hadoop fs -chown -R mapred /mapred
export HADOOP_LIBEXEC_DIR=/u/HbaseB/hadoop-2.2.0/libexec/
export HADOOP_MAPRED_HOME=/u/HbaseB/hadoop-2.2.0/hadoop-mapreduce
export HADOOP_MAPRED_LOG_DIR=/u/HbaseB/hadoop-2.2.0/mapred

Start the job history server:
<login as $MAPRED_USER and source the directories.sh companion script>
/u/HbaseB/hadoop-2.2.0/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR

Test 01: From the browser, or using curl, browse to http://resourcemanager.full.hostname:8088/.
Test 02:
sudo su $HDFS_USER
/u/HbaseB/hadoop-2.2.0/bin/hadoop jar /u/HbaseB/hadoop-2.2.0/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.2.1-alpha.jar teragen 100 /test/10gsort/input
/u/HbaseB/hadoop-2.2.0/bin/hadoop jar /u/HbaseB/hadoop-2.2.0/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.2.1-alpha.jar

Validate the HBase setup:
Log in as $HDFS_USER:
/u/HbaseB/hadoop-2.2.0/bin/hadoop fs -mkdir /apps/hbase
/u/HbaseB/hadoop-2.2.0/bin/hadoop fs -chown -R $HBASE_USER /apps/hbase
Now log in as $HBASE_USER:
/u/HbaseB/hbase-0.98.3-hadoop2/bin/hbase-daemon.sh --config $HBASE_CONF_DIR start master
This will start the master node.
Now let's move to the HBase region server nodes:
/u/HbaseB/hbase-0.98.3-hadoop2/bin/hbase-daemon.sh --config $HBASE_CONF_DIR start regionserver
This will start the region servers.
For a single machine, sudo ./hbase master start can also be used directly. Please check the logs in case of any errors.
Now let's log in using sudo su - $HBASE_USER; running ./hbase shell will connect us to the HBase master.

Validate the ZooKeeper setup:
-bash-4.1$ sudo ./zkServer.sh start
JMX enabled by default
Using config: /u/HbaseB/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
You can also pipe the output to the ZooKeeper log, /u/HbaseB/zookeeper-3.4.6/zoo.out 2>&1.

Summary
In this article, we learned how to configure and set up HBase. We set up HBase to store data in the Hadoop Distributed File System. We also explored the working structure of RAID and JBOD and the differences between the two storage approaches.

Resources for Article: Further resources on this subject: Understanding the HBase Ecosystem [article] The HBase's Data Storage [article] HBase Administration, Performance Tuning [article]

Creating Simple Maps with OpenLayers 3

Packt
22 Jan 2016
14 min read
In this article by Gábor Farkas, the author of the book Mastering OpenLayers 3, you will learn about OpenLayers 3 which is the most robust open source web mapping library out there, highly capable of handling the client side of a WebGIS environment. Whether you know how to use OpenLayers 3 or you are new to it, this article will help you to create a simple map and either refresh some concepts or get introduced to them. As this is a mastering book, we will mainly discuss the library's structure and capabilities in greater depth. In this article we will create a simple map with the library, and revise the basic terms related to it. In this article we will cover the following topics: Structure of OpenLayers 3 Architectural considerations Creating a simple map Using the API documentation effectively Debugging the code (For more resources related to this topic, see here.) Before getting started Take a look at the code provided with the book. You should see a js folder in which the required libraries are stored. For this article, ol.js, and ol.css in the ol3-3.11.0 folder will be sufficient. The code is also available on GitHub. You can download a copy from the following URL: https://github.com/GaborFarkas/mastering_openlayers3/releases. You can download the latest release of OpenLayers 3 from its GitHub repository at https://github.com/openlayers/ol3/releases. For now, grabbing the distribution version (v3.11.0-dist.zip) should be enough. Creating a working environment There is a security restriction in front end development, called CORS (Cross Origin Resource Sharing). By default, this restriction prevents the application from grabbing content from a different domain. On top of that, some browsers disallow reaching content from the hard drive when a web page is opened from the file system. To prevent this behavior, please make sure you possess one of the following: A running web server (highly recommended) Firefox web browser with security.fileuri.strict_origin_policy set to false (you can reach flags in Firefox by opening about:config from the address bar) Google Chrome web browser started with the --disable-web-security parameter (make sure you have closed every other instance of Chrome before) Safari web browser with Disable Local File Restrictions (in the Develop menu, which can be enabled in the Advanced tab of Preferences) You can easily create a web server if you have Python 2 with SimpleHTTPServer, or if you have Python 3 with http.server. For basic tutorials, you can consult the appropriate Python documentation pages. Structure of OpenLayers 3 OpenLayers 3 is a well structured, modular, and complex library, where flexibility, and consistency take a higher priority than performance. However, this does not mean OpenLayers 3 is slow. On the contrary, the library highly outperforms its predecessor; therefore its comfortable and logical design does not really adversely affect its performance. The relationship of some of the most essential parts of the library can be described with a radial UML (Universal Modeling Language) diagram, such as the following : Reading an UML scheme can seem difficult, and can be difficult if it is a proper one. However, this simplified scheme is quite easy to understand. With regard to the arrows, a single 1 represents a one-to-one relation, while the 0..n and 1 symbols denote a one-to-many relationship. You will probably never get into direct contact with the two superclasses at the top of the OpenLayers 3 hierarchy: ol.Observable, and ol.Object. 
However, most of the classes you actively use are children of these classes. You can always count with their methods, when you design a web mapping or WebGIS application. In the diagram we can see, that the parent of the most essential objects is the ol.Observable class. This superclass ensures all of its children have consistent listener methods. For example, every descendant of this superclass bears the on, once, and un functions, making registering event listeners to them as easy as possible. The next superclass, ol.Object, extends its parent with methods capable of easy property management. Every inner property managed by its methods (get, set, and unset) are observable. There are also convenience methods for bulk setting and getting properties, called getProperties, and setProperties. Most of the other frequently used classes are direct, or indirect, descendants of this superclass. Building the layout Now, that we covered some of the most essential structural aspects of the library, let's consider the architecture of an application deployed in a production environment. Take another look at the code. There is a chapters folder, in which you can access the examples within the appropriate subfolder. If you open ch01, you can see three file types in it. As you have noticed, the different parts of the web page (HTML, CSS, and JavaScript) are separated. There is one main reason behind this: the code remains as clean as possible. With a clean and rational design, you will always know where to look when you would like to make a modification. Moreover, if you're working for a company there is a good chance someone else will also work with your code. This kind of design will make sure your colleague can easily handle your code. On top of that, if you have to develop a wrapper API around OpenLayers 3, this is the only way your code can be integrated into future projects. Creating the appeal As the different parts of the application are separated, we will create a minimalistic HTML document. It will expand with time, as the application becomes more complicated and needs more container elements. For now, let's write a simple HTML document: <!DOCTYPE html> <html lang="en"> <head> <title>chapter 1 - Creating a simple map</title> <link href="../../js/ol3-3.11.0/ol.css" rel="stylesheet"> <link href="ch01.css" rel="stylesheet"> <script type="text/javascript" src="../../js/ol3- 3.11.0/ol.js"></script> <script type="text/javascript" src="ch01_simple_map.js"></script> </head> <body> <div id="map" class="map"></div> </body> </html> In this simple document, we defined the connection points between the external resources, and our web page. In the body, we created a simple div element with the required properties. We don't really need anything else; the magic will happen entirely in our code. Now we can go on with our CSS file and define one simple class, called map: .map { width: 100%; height: 100%; } Save this simple rule to a file named ch01.css, in the same folder you just saved the HTML file. If you are using a different file layout, don't forget to change the relative paths in the link, and script tags appropriately. Writing the code Now that we have a nice container for our map, let's concentrate on the code. In this book, most of the action will take place in the code; therefore this will be the most important part. First, we write the main function for our code. 
function init() { document.removeEventListener('DOMContentLoaded', init); } document.addEventListener('DOMContentLoaded', init); By using an event listener, we can make sure the code only runs when the structure of the web page has been initialized. This design enables us to use relative values for sizing, which is important for making adaptable applications. Also, we make sure the map variable is wrapped into a function (therefore we do not expose it) and seal a potential security breach. In the init function, we detach the event listener from the document, because it will not be needed once the DOM structure has been created. The DOMContentLoaded event waits for the DOM structure to build up. It does not wait for images, frames, and dynamically added content; therefore the application will load faster. Only IE 8, and prior versions, do not support this event type, but if you have to fall back you can always use the window object's load event. To check a feature's support in major browsers, you can consult the following site: http://www.caniuse.com/. Next, we extend the init function, by creating a vector layer and assigning it to a variable. Note that, in OpenLayers 3.5.0, creating vector layers has been simplified. Now, a vector layer has only a single source class, and the parser can be defined as a format in the source. var vectorLayer = new ol.layer.Vector({ source: new ol.source.Vector({ format: new ol.format.GeoJSON({ defaultDataProjection: 'EPSG:4326' }), url: '../../res/world_capitals.geojson', attributions: [ new ol.Attribution({ html: 'World Capitals © Natural Earth' }) ] }) }); We are using a GeoJSON data source with a WGS84 projection. As the map will use a Web Mercator projection, we provide a defaultDataProjection value to the parser, so the data will be transformed automatically into the view's projection. We also give attribution to the creators of the vector dataset. You can only give attribution with an array of ol.Attribution instances passed to the layer's source. Remember: giving attribution is not a matter of choice. Always give proper attribution to every piece of data used. This is the only way to avoid copyright infringement. Finally, construct the map object, with some extra controls and one extra interaction. var map = new ol.Map({ target: 'map', layers: [ new ol.layer.Tile({ source: new ol.source.OSM() }), vectorLayer ], controls: [ //Define the default controls new ol.control.Zoom(), new ol.control.Rotate(), new ol.control.Attribution(), //Define some new controls new ol.control.ZoomSlider(), new ol.control.MousePosition(), new ol.control.ScaleLine(), new ol.control.OverviewMap() ], interactions: ol.interaction.defaults().extend([ new ol.interaction.Select({ layers: [vectorLayer] }) ]), view: new ol.View({ center: [0, 0], zoom: 2 }) }); In this example, we provide two layers: a simple OpenStreetMap tile layer and the custom vector layer saved into a separate variable. For the controls, we define the default ones, then provide a zoom slider, a scale bar, a mouse position notifier, and an overview map. There are too many default interactions, therefore we extend the default set of interactions with ol.interaction.Select. This is the point where saving the vector layer into a variable becomes necessary. The view object is a simple view that defaults to projection EPSG:3857 (Web Mercator). OpenLayers 3 also has a default set of controls that can be accessed similarly to the interactions, under ol.control.defaults(). 
Default controls and interactions are instances of ol.Collection, therefore both of them can be extended and modified like any other collection object. Note that the extend method requires an array of features. Save the code to a file named ch01_simple_map.js in the same folder as your HTML file. If you open the HTML file, you should see the following map: You have different, or no results? Do not worry, not even a bit! Open up your browser's developer console (F12 in modern ones, or CTRL + J if F12 does not work), and resolve the error(s) noted there. If there is no result, double-check the HTML and CSS files; if you have a different result, check the code or the CORS requirements based on the error message. If you use Internet Explorer, make sure you have version 9, or better. Using the API documentation The API documentation for OpenLayers 3.11.0, the version we are using, can be found at http://www.openlayers.org/en/v3.11.0/apidoc/. The API docs, like the library itself, are versioned, thus you can browse the appropriate documentation for your OpenLayers 3 version by changing v3.11.0 in the URL to the version you are currently using. The development version of the API is also documented; you can always reach it at http://www.openlayers.org/en/master/apidoc/. Be careful when you use it, though. It contains all of the newly implemented methods, which probably won't work with the latest stable version. Check the API documentation by typing one of the preceding links in your browser. You should see the home page with the most frequently used classes. There is also a handy search box, with all of the classes listed on the left side. We have talked about default interactions, and their lengthy nature before. On the home page you can see a link to the default interactions. If you click on it, you will be directed to the following page: Now you can also see that nine interactions are added to the map by default. It would be quite verbose to add them one by one just to keep them when we define only one extra interaction, wouldn't it? You can see some features marked as experimental while you browse the API documentation with the Stable Only checkbox unchecked. Do not consider those features to be unreliable. They are stable, but experimental, and therefore they can be modified or removed in future versions. If the developer team considers a feature is useful and does not need further optimization or refactoring, it will be marked as stable. Understanding type definitions For every constructor and function in the API, the input and expected output types are well documented. To see a good example, let's search for a function with inputs and outputs as well. If you search for ol.proj.fromLonLat, you will see the following function: The function takes two arguments as input, one named coordinate and one named projection; projection is an optional one. coordinate is an ol.Coordinate type (an array with two numbers), while projection is an ol.proj.ProjectionLike type (a string representing the projection). The returned value, as we can see next to the white arrow, is also an ol.Coordinate type, with the transformed values. A good developer always keeps track of future changes in the library. This is especially important with OpenLayers 3, as it lacks backward-compatibility, when a major change occurs. You can see all of the major changes in the library in the OpenLayers 3 GitHub repository: https://github.com/openlayers/ol3/blob/master/changelog/upgrade-notes.md. 
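To make the type definitions more tangible, here is a short usage sketch of the ol.proj.fromLonLat function discussed above. The longitude/latitude pair is just an example value, and the map variable is assumed to be the map object we created earlier in this article:

// Convert a WGS84 longitude/latitude pair into the view's default
// Web Mercator (EPSG:3857) coordinate, then recenter and zoom the map.
var budapest = ol.proj.fromLonLat([19.04, 47.50]);
map.getView().setCenter(budapest);
map.getView().setZoom(10);

Note how we never have to deal with the projection math ourselves; the function returns a plain ol.Coordinate that the view can consume directly.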
Debugging the code As you will have noticed, there was a third file in the OpenLayers 3 folder discussed at the beginning of the article (js/ol3-3.11.0). This file, named ol-debug.js, is the uncompressed source file, in which the library is concatenated with all of its dependencies. We will use this file for two purpose in this book. Now, we will use it for debugging. First, open up ch01_simple_map.js. Next, extend the init function with an obvious mistake: var geometry = new ol.geom.Point([0, 0]); vectorLayer.getSource().addFeature(geometry); Don't worry if you can't spot the error immediately. That's what is debugging for. Save this extended JavaScript file with the name ch01_error.js. Next, replace the old script with the new one in the HTML file, like this: <script type="text/javascript" src="ch01_error.js"></script> If you open the updated HTML, and open your browser's developer console, you will see the following error message: Now that we have an error, let's check it in the source file by clicking on the error link on the right side of the error message: Quite meaningless, isn't it? The compiled library is created with Google's Closure Library, which obfuscates everything by default in order to compress the code. We have to tell it which precise part of the code should be exported. We will learn how to do that in the last article. For now, let's use the debug file. Change the ol.js in the HTML to ol-debug.js, load up the map, and check for the error again. Finally, we can see, in a well-documented form, the part that caused the error. This is a validating method, which makes sure the added feature is compatible with the library. It requires an ol.Feature as an input, which is how we caught our error. We passed a simple geometry to the function, instead of wrapping it in an ol.Feature first. Summary In this article, you were introduced to the basics of OpenLayers 3 with a more advanced approach. We also discussed some architectural considerations, and some of the structural specialties of the library. Hopefully, along with the general revision, we acquired some insight in using the API documentation and debugging practices. Congratulations! You are now on your way to mastering OpenLayers 3. Resources for Article: Further resources on this subject: What is OpenLayers? [article] OpenLayers' Key Components [article] OpenLayers: Overview of Vector Layer [article]

Your First Swift App

Packt
22 Jan 2016
13 min read
In this article by Giordano Scalzo, the author of the book Swift 2 by Example, learning a language is just half of the difficulty in building an app; the other half is the framework. This means that learning a language is not enough. In this article, we'll build a simple Guess a Number app just to become familiar with Xcode and a part of the CocoaTouch framework. (For more resources related to this topic, see here.) The app is… Our first complete Swift program is a Guess a Number app, a classic educational game for children where the player must guess a number that's generated randomly. For each guess, the game tells the player whether the guess is greater or lower than the generated number, which is also called the secret number. It is worth remembering that the goal is not to build an Apple's App Store-ready app with a perfect software architecture but to show you how to use Xcode to build software for iOS. So forgive me if the code is not exactly clean and the game is simple. Before diving into the code, we must define the interface of the app and the expected workflow. This game presents only one screen, which is shown in the following screenshot: At the top of the screen, a label reports the name of the app: Guess a Number. In the next row, another static label field with the word between connects the title with a dynamic label field that reports the current range. The text inside the label must change every time a new number is inserted. A text field at the center of the screen is where the player will insert their guess. A big button with OK written on it is the command that confirms that the player has inserted the chosen number. The last two labels give feedback to the player, as follows: Your last guess was too low is displayed if the number that was inserted is lower than the secret number Your last guess was too high is displayed if the number that was inserted is greater than the secret number The last label reports the current number of guesses. The workflow is straightforward: The app selects a random number. The player inserts their guess. If the number is equal to the secret number, a popup tells the player that they have won and shows them the number of guesses. If the number is lower than the secret number but greater than the lower bound, it becomes the new lower bound. Otherwise, it is silently discarded. If the number is greater and lower than the upper bound, it becomes the new upper bound. Otherwise, it's again silently discarded. Building a skeleton app Let's start building the app. There are two different ways to create a new project in Xcode: using a wizard or selecting a new project from the menu. When Xcode starts, it presents a wizard that shows the recently used projects and a shortcut to create a new project, as shown in the following screenshot: If you already have Xcode open, you can select a new project by navigating to File | New | Project…, as shown in the following screenshot: Whichever way you choose, Xcode will ask for the type of app that needs to be created. The app is really simple. Therefore, we choose Single View Application, as shown in the following screenshot: Before we start writing code, we need to complete the configuration by adding the organization identifier using the reverse domain name notation and Product Name. Together, they produce a Bundle Identifier, which is the unique identifier of the app. Pay attention to the selected language, which must obviously be Swift. 
Here is a screenshot that shows you how to fill the form: Once you're done with this data, you are ready to run the app by navigating to Product | Run, as shown in the following screenshot: After the simulator finishes loading the app, you can see our magnificent creation: a shiny, brilliant, white page! We can stop the app by navigating to Product | Stop, as shown in the following screenshot: Now, we are ready to implement the app. Adding the graphic components When we are developing an iOS app, it is considered good practice to implement the app outside-in, starting from the graphics. By taking a look at the files generated by the Xcode template, we can identify the two files that we'll use to build the Guess a Number app: Main.storyboard: This contains the graphics components ViewController.swift: This handles all the business logic of the app Here is a screenshot that presents the structure of the files in an Xcode project: Let's start by selecting the storyboard file to add the labels. The first thing that you will notice is that the canvas is not the same size or ratio as that of an iPhone and an iPad. To handle different sizes and different devices, Apple (since iOS 5) added a constraints system called Auto Layout as a system to connect the graphics components in a relative way regardless of the actual size of the running device. As Auto Layout is beyond the scope of this article, we'll implement the created app only for iPhone 6. After deciding upon the target device, we need to resize the canvas according to the real size of the device. From the tree structure to the right, we select View Controller, as shown in the following screenshot: After doing this, we move to the right, where you will see the properties of the View Controller. There, we select the tab containing Simulated Metrics; in this, we can insert the requested size. The following screenshot will help you locate the correct tab: Now that the size is what's expected, we can proceed to add labels, text fields, and the buttons from the list at the bottom-right corner of the screen. To add a component, we must choose it from the list of components. Then, we must drag it onto the screen, where we can place it at the expected coordinates. The following screenshot shows the list of UI components called an object library: When you add a text field, pay attention to how we select Number Pad as the value for Keyboard Type, as illustrated in the following screenshot: After selecting the values for all the components, the app should appear as shown in the mockup that we had drawn earlier, which can be confirmed in the following screenshot: Connecting the dots If we run the app, the screen is the same as the one in the storyboard, but if we try to insert a number into the text field and then press the OK button, nothing happens. This is so because the storyboard is still detached from the View Controller, which handles all the logic. To connect the labels to the View Controller, we need to create instances of a label prepended with the @IBOutlet keyword. Using this signature, the graphic editor inside Xcode named Interface Builder can recognize the instances available for a connection to the components: class ViewController: UIViewController { @IBOutlet weak var rangeLbl: UILabel! @IBOutlet weak var numberTxtField: UITextField! @IBOutlet weak var messageLbl: UILabel! @IBOutlet weak var numGuessesLbl: UILabel! 
@IBAction func onOkPressed(sender: AnyObject) { } } We have also added a method with the @IBAction prefix, which will be called when the button is pressed. Now, let's move on to Interface Builder to connect the labels and outlets. First of all, we need to select View Controller from the tree of components, as shown in the following screenshot: In the tabs to the right, select the outlet views; the last one with an arrow is a symbol. The following screenshot will help you find the correct symbol: This shows all the possible outlets to which a component can be connected. Upon moving the cursor onto the circle beside the rangeLbl label, we see that it changes to a cross. Now, we must click and drag a line to the label in the storyboard, as shown in the following screenshot: After doing the same for all the labels, the following screenshot shows the final configurations for the outlets: For the action of the button, the process is similar. Select the circle close to the onOkPressed action and drag a line to the OK button, as shown in the following screenshot: When the button is released, a popup appears with a list of the possible events that you can connect the action to. In our case, we connect the action to the Touch Up Inside event, which is triggered when we release the button without moving from its area. The following screenshot presents the list of the events raised by the UIButton component: Now, consider a situation where we add a log command like the following one: @IBAction func onOkPressed(sender: AnyObject) { println(numberTxtField.text) } Then, we can see the value of the text field that we insert and which is printed on the debug console. Now that all the components are connected to their respective outlets, we can add the simple code that's required to create the app. Adding the code First of all, we need to add a few instance variables to handle the state, as follows: private var lowerBound = 0 private var upperBound = 100 private var numGuesses = 0 private var secretNumber = 0 Just for the sake of clarity and the separation of responsibilities, we create two extensions to the View Controller. An extension in Swift is similar to a category in Objective-C programming language, a distinct data structure that adds a method to the class that it extends. Since we don't need the source of the class that the extension extends, we can use this mechanism to add features to third-party classes or even to the CocoaTouch classes. Given this original purpose, extensions can also be used to organize the code inside a source file. This may seem a bit unorthodox, but if it doesn't hurt and is useful. So why not use it? The first extension contains the following logic of the game: private extension ViewController{ enum Comparison{ case Smaller case Greater case Equals } func selectedNumber(number: Int){ } func compareNumber(number: Int, otherNumber: Int) -> Comparison { } } Note that the private keyword is added to the extension, making the methods inside private. This means that other classes that hold a reference to an instance of ViewController can't call these private methods. Also, this piece of code shows that it is possible to create enumerations inside a private extension. 
The second extension, which looks like this, is used to render all the labels: private extension ViewController{ func extractSecretNumber() { } func renderRange() { } func renderNumGuesses() { } func resetData() { } func resetMsg() { } func reset(){ resetData() renderRange() renderNumGuesses() extractSecretNumber() resetMsg() } } Let's start from the beginning, which is the viewDidLoad method in the case of the View Controller: override func viewDidLoad() { super.viewDidLoad() numberTxtField.becomeFirstResponder() reset() } When the becomeFirstResponder method is called, the component called numberTxtField in our case gets the focus and the keyboard appears. After this, the reset() method is called, as follows: func reset(){ resetData() renderRange() renderNumGuesses() extractSecretNumber() resetMsg() } This basically calls the reset method of each component, as follows: func resetData() { lowerBound = 0 upperBound = 100 numGuesses = 0 } func resetMsg() { messageLbl.text = "" } Then, the method is called and is used to render the two dynamic labels, as follows: func renderRange() { rangeLbl.text = "(lowerBound) and (upperBound)" } func renderNumGuesses() { numGuessesLbl.text = "Number of Guesses: (numGuesses)" } The reset method also extracts the secret number using the arc4random_uniform function and performs some typecast magic to align numbers to the expected numeric type, as follows: func extractSecretNumber() { let diff = upperBound - lowerBound let randomNumber = Int(arc4random_uniform(UInt32(diff))) secretNumber = randomNumber + Int(lowerBound) } Now, all the action is in the onOkPressed action (pun intended): @IBAction func onOkPressed(sender: AnyObject) { guard let number = Int(numberTxtField.text!) else { let alert = UIAlertController(title: nil, message: "Enter a number", preferredStyle: UIAlertControllerStyle.Alert) alert.addAction(UIAlertAction(title: "OK", style: UIAlertActionStyle.Default, handler: nil)) self.presentViewController(alert, animated: true, completion: nil) return } selectedNumber(number) } Here, we retrieve the inserted number. Then, if it is valid (that is, it's not empty, not a word, and so on), we call the selectedNumber method. Otherwise, we present a popup that asks for a number. This code uses the guard Swift 2.0 keyword that allows you to create a really clear code flow. Note that the text property of a UITextField function is optional, but because we are certain that it is present, we can safely unwrap it. Also, the handy Int(String) constructor converts a string into a number only if the string is a valid number. 
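If you want to see this validation pattern in isolation, the following is a small sketch that you can paste into a Playground; the parseGuess function is not part of the app, it simply shows how the failable Int(String) initializer and guard work together:

// Returns the parsed number, or nil after printing a message,
// mirroring the validation performed in onOkPressed.
func parseGuess(text: String?) -> Int? {
    guard let text = text, number = Int(text) else {
        print("Enter a number")
        return nil
    }
    return number
}
parseGuess("42")       // returns 42
parseGuess("fortytwo") // prints "Enter a number" and returns nil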
All the juice is in selectedNumber, where there is a switch case: func selectedNumber(number: Int){ switch compareNumber(number, otherNumber: secretNumber){ // The compareNumber basically transforms a compare check into an Enumeration: func compareNumber(number: Int, otherNumber: Int) -> Comparison{ if number < otherNumber { return .Smaller } else if number > otherNumber { return .Greater } return .Equals } Let's go back to the switch statement of selectedNumber; it first checks whether the number inserted is the same as the secret number: case .Equals: let alert = UIAlertController(title: nil, message: "You won in (numGuesses) guesses!", preferredStyle: UIAlertControllerStyle.Alert) alert.addAction(UIAlertAction(title: "OK", style: UIAlertActionStyle.Default, handler: { cmd in self.reset() self.numberTxtField.text = "" })) self.presentViewController(alert, animated: true, completion: nil) If this is the case, a popup with the number of guesses is presented, and when it is dismissed, all the data is cleaned and the game starts again. If the number is smaller, we calculate the lower bound again and then render the feedback labels, as follows: case .Smaller: lowerBound = max(lowerBound, number) messageLbl.text = "Your last guess was too low" numberTxtField.text = "" numGuesses++ renderRange() renderNumGuesses() If the number is greater, the code is similar, but instead of the lower bound, we calculate the upper bound, as follows: case .Greater: upperBound = min(upperBound, number) messageLbl.text = "Your last guess was too high" numberTxtField.text = "" numGuesses++ renderRange() renderNumGuesses() } Et voilà! With this simple code, we have implemented our app. You can download the code of the app from https://github.com/gscalzo/Swift2ByExample/tree/1_GuessTheNumber. Summary This article showed us how, by utilizing the power of Xcode and Swift, we can create a fully working app. Depending on your level of iOS knowledge, you may have found this app either too hard or too simple to understand. If the former is the case, don't loose your enthusiasm. Read the code again and try to execute the app by adding a few strategically placed println() instructions in the code to see the content of the various variables. If the latter is the case, I hope that you have found at least some tricks that you can start to use right now. Of course, simply after reading this article, nobody can be considered an expert in Swift and Xcode. However, the information here is enough to let you understand all the code. Resources for Article: Further resources on this subject: Exploring Swift [article] Swift Power and Performance [article] Creating Mutable and Immutable Classes in Swift [article]

Advanced Fetching

Packt
21 Jan 2016
6 min read
In this article by Ramin Rad, author of the book Mastering Hibernate, we will discuss various ways of fetching data from the permanent store, focusing a little more on the annotations related to data fetching. (For more resources related to this topic, see here.)

Fetching strategy

In the Java Persistence API (JPA), you can provide a hint to fetch the data lazily or eagerly using FetchType. However, some implementations may ignore the lazy strategy and just fetch everything eagerly. Hibernate's default strategy is FetchType.LAZY, to reduce the memory footprint of your application. Hibernate offers additional fetch modes in addition to the commonly used JPA fetch types. Here, we will discuss how they are related and provide an explanation, so you understand when to use which.

JOIN fetch mode

The JOIN fetch mode forces Hibernate to create a SQL join statement to populate both the entities and the related entities using just one SQL statement. However, the JOIN fetch mode also implies that the fetch type is EAGER, so there is no need to specify the fetch type. To understand this better, consider the following classes: @Entity public class Course { @Id @GeneratedValue private long id; private String title; @OneToMany(cascade=CascadeType.ALL, mappedBy="course") @Fetch(FetchMode.JOIN) private Set<Student> students = new HashSet<Student>(); // getters and setters } @Entity public class Student { @Id @GeneratedValue private long id; private String name; private char gender; @ManyToOne private Course course; // getters and setters } In this case, we are instructing Hibernate to use JOIN to fetch course and student in one SQL statement, and this is the SQL that is composed by Hibernate: select course0_.id as id1_0_0_, course0_.title as title2_0_0_, students1_.course_id as course_i4_0_1_, students1_.id as id1_1_1_, students1_.gender as gender2_1_2_, students1_.name as name3_1_2_ from Course course0_ left outer join Student students1_ on course0_.id=students1_.course_id where course0_.id=? As you can see, Hibernate is using a left join to fetch all courses and any students that may have signed up for those courses. Another important thing to note is that if you use HQL, Hibernate will ignore the JOIN fetch mode and you'll have to specify the join in the HQL. (We will discuss HQL in the next section.) In other words, if you fetch a course entity using a statement such as this: List<Course> courses = session .createQuery("from Course c where c.id = :courseId") .setLong("courseId", chemistryId) .list(); Then, Hibernate will use SELECT mode; but if you don't use HQL, as shown in the next example, Hibernate will pay attention to the fetch mode instructions provided by the annotation. Course course = (Course) session.get(Course.class, chemistryId);

SELECT fetch mode

In SELECT mode, Hibernate uses an additional SELECT statement to fetch the related entities. This mode doesn't affect the behavior of the fetch type (LAZY, EAGER), so they will work as expected. To demonstrate this, consider the same example used in the last section and let's examine the output:
select id, title from Course where id=?
select course_id, id, gender, name from Student where course_id=?
Note that Hibernate first fetches and populates the Course entity and then uses the course ID to fetch the related students. Also, if your fetch type is set to LAZY and you never reference the related entities, the second SELECT is never executed.
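If you do need the related entities eagerly from an HQL query (recall that the JOIN fetch mode annotation is ignored for HQL), you can ask for the join explicitly in the query itself. The following sketch reuses the Course and Student entities above; the distinct keyword is only there to avoid duplicate Course instances in the result list:

// Eagerly fetch the students of a course with an explicit HQL join fetch.
List<Course> courses = session
    .createQuery("select distinct c from Course c join fetch c.students where c.id = :courseId")
    .setLong("courseId", chemistryId)
    .list();

With this query, Hibernate produces a single SQL join, much like the annotation-based JOIN fetch mode does when loading the entity through session.get.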
SUBSELECT fetch mode The SUBSELECT fetch mode is used to minimize the number of SELECT statements executed to fetch the related entities. If you first fetch the owner entities and then try to access the associated owned entities, without SUBSELECT, Hibernate will issue an additional SELECT statement for every one of the owner entities. Using SUBSELECT, you instruct Hibernate to use a SQL sub-select to fetch all the owners for the list of owned entities already fetched. To understand this better, let's explore the following entity classes. @Entity public class Owner { @Id @GeneratedValue private long id; private String name; @OneToMany(cascade=CascadeType.ALL, mappedBy="owner") @Fetch(FetchMode.SUBSELECT) private Set<Car> cars = new HashSet<Car>(); // getters and setters } @Entity public class Car { @Id @GeneratedValue private long id; private String model; @ManyToOne private Owner owner; // getters and setters } If you try to fetch from the Owner table, Hibernate will only issue two select statements; one to fetch the owners and another to fetch the cars for those owners, by using a sub-select, as follows: select id, name from Owner select owner_id, id, model from Car where owner_id in (select id from Owner) Without the SUBSELECT fetch mode, instead of the second select statement as shown in the preceding section, Hibernate will execute a select statement for every entity returned by the first statement. This is known as the n+1 problem, where one SELECT statement is executed, then, for each returned entity another SELECT statement is executed to fetch the associated entities. Finally, SUBSELECT fetch mode is not supported in the ToOne associations, such as OneToOne or ManyToOne because it was designed for relationships where the ownership of the entities is clear. Batch fetching Another strategy offered by Hibernate is batch fetching. The idea is very similar to SUBSELECT, except that instead of using SUBSELECT, the entity IDs are explicitly listed in the SQL and the list size is determined by the @BatchSize annotation. This may perform slightly better for smaller batches. (Note that all the commercial database engines also perform query optimization.) To demonstrate this, let's consider the following entity classes: @Entity public class Owner { @Id @GeneratedValue private long id; private String name; @OneToMany(cascade=CascadeType.ALL, mappedBy="owner") @BatchSize(size=10) private Set<Car> cars = new HashSet<Car>(); // getters and setters } @Entity public class Car { @Id @GeneratedValue private long id; private String model; @ManyToOne private Owner owner; // getters and setters } Using @BatchSize, we are instructing Hibernate to fetch the related entities (cars) using a SQL statement that uses a where in clause; thus listing the relevant ID for the owner entity, as shown: select id, name from Owner select owner_id, id, model from Car where owner_id in (?, ?) In this case, the first select statement only returned two rows, but if it returns more than the batch size there would be multiple select statements to fetch the owned entities, each fetching 10 entities at a time. Summary In this article, we covered many ways of fetching datasets from the database. Resources for Article: Further resources on this subject: Hibernate Types[article] Java Hibernate Collections, Associations, and Advanced Concepts[article] Integrating Spring Framework with Hibernate ORM Framework: Part 1[article]

Creating Mutable and Immutable Classes in Swift

Packt
20 Jan 2016
8 min read
In this article by Gastón Hillar, author of the book Object-Oriented Programming with Swift, we will learn how to create mutable and immutable classes in Swift. (For more resources related to this topic, see here.) Creating mutable classes So far, we worked with different type of properties. When we declare stored instance properties with the var keyword, we create a mutable instance property, which means that we can change their values for each new instance we create. When we create an instance of a class that defines many public-stored properties, we create a mutable object, which is an object that can change its state. For example, let's think about a class named MutableVector3D that represents a mutable 3D vector with three public-stored properties: x, y, and z. We can create a new MutableVector3D instance and initialize the x, y, and z attributes. Then, we can call the sum method with the delta values for x, y, and z as arguments. The delta values specify the difference between the existing and new or desired value. So, for example, if we specify a positive value of 30 in the deltaX parameter, it means we want to add 30 to the X value. The following lines declare the MutableVector3D class that represents the mutable version of a 3D vector in Swift: public class MutableVector3D { public var x: Float public var y: Float public var z: Float init(x: Float, y: Float, z: Float) { self.x = x self.y = y self.z = z } public func sum(deltaX: Float, deltaY: Float, deltaZ: Float) { x += deltaX y += deltaY z += deltaZ } public func printValues() { print("X: (self.x), Y: (self.y), Z: (self.z))") } } Note that the declaration of the sum instance method uses the func keyword, specifies the arguments with their types enclosed in parentheses, and then declares the body for the method enclosed in curly brackets. The public sum instance method receives the delta values for x, y, and z (deltaX, deltaY and deltaZ) and mutates the object, which means that the method changes the values of x, y, and z. The public printValues method prints the values of the three instance-stored properties: x, y, and z. The following lines create a new MutableVector3D instance method called myMutableVector, initialized with the values for the x, y, and z properties. Then, the code calls the sum method with the delta values for x, y, and z as arguments and finally calls the printValues method to check the new values after the object mutated with the call to the sum method: var myMutableVector = MutableVector3D(x: 30, y: 50, z: 70) myMutableVector.sum(20, deltaY: 30, deltaZ: 15) myMutableVector.printValues() The results of the execution in the Playground are shown in the following screenshot: The initial values for the myMutableVector fields are 30 for x, 50 for y, and 70 for z. The sum method changes the values of the three instance-stored properties; therefore, the object state mutates as follows: myMutableVector.X mutates from 30 to 30 + 20 = 50 myMutableVector.Y mutates from 50 to 50 + 30 = 80 myMutableVector.Z mutates from 70 to 70 + 15 = 85 The values for the myMutableVector fields after the call to the sum method are 50 for x, 80 for y, and 85 for z. We can say that the method mutated the object's state; therefore, myMutableVector is a mutable object and an instance of a mutable class. It's a very common requirement to generate a 3D vector with all the values initialized to 0—that is, x = 0, y = 0, and z = 0. A 3D vector with these values is known as an origin vector. 
We can add a type method to the MutableVector3D class named originVector to generate a new instance of the class initialized with all the values in 0. Type methods are also known as class or static methods in other object-oriented programming languages. It is necessary to add the class keyword before the func keyword to generate a type method instead of an instance. The following lines define the originVector type method: public class func originVector() -> MutableVector3D { return MutableVector3D(x: 0, y: 0, z: 0) } The preceding method returns a new instance of the MutableVector3D class with 0 as the initial value for all the three elements. The following lines call the originVector type method to generate a 3D vector, the sum method for the generated instance, and finally, the printValues method to check the values for the three elements on the Playground: var myMutableVector2 = MutableVector3D.originVector() myMutableVector2.sum(5, deltaY: 10, deltaZ: 15) myMutableVector2.printValues() The following screenshot shows the results of executing the preceding code in the Playground: Creating immutable classes Mutability is very important in object-oriented programming. In fact, whenever we expose mutable properties, we create a class that will generate mutable instances. However, sometimes a mutable object can become a problem and in certain situations, we want to avoid the objects to change their state. For example, when we work with concurrent code, an object that cannot change its state solves many concurrency problems and avoids potential bugs. For example, we can create an immutable version of the previous MutableVector3D class to represent an immutable 3D vector. The new ImmutableVector3D class has three immutable instance properties declared with the let keyword instead of the previously used var[SI1]  keyword: x, y, and z. We can create a new ImmutableVector3D instance and initialize the immutable instance properties. Then, we can call the sum method with the delta values for x, y, and z as arguments. The sum public instance method receives the delta values for x, y, and z (deltaX, deltaY, and deltaZ), and returns a new instance of the same class with the values of x, y, and z initialized with the results of the sum. The following lines show the code of the ImmutableVector3D class: public class ImmutableVector3D { public let x: Float public let y: Float public let z: Float init(x: Float, y: Float, z: Float) { self.x = x self.y = y self.z = z } public func sum(deltaX: Float, deltaY: Float, deltaZ: Float) -> ImmutableVector3D { return ImmutableVector3D(x: x + deltaX, y: y + deltaY, z: z + deltaZ) } public func printValues() { print("X: (self.x), Y: (self.y), Z: (self.z))") } public class func equalElementsVector(initialValue: Float) -> ImmutableVector3D { return ImmutableVector3D(x: initialValue, y: initialValue, z: initialValue) } public class func originVector() -> ImmutableVector3D { return equalElementsVector(0) } } In the new ImmutableVector3D class, the sum method returns a new instance of the ImmutableVector3D class—that is, the current class. In this case, the originVector type method returns the results of calling the equalElementsVector type method with 0 as an argument. The equalElementsVector type method receives an initialValue argument for all the elements of the 3D vector, creates an instance of the actual class, and initializes all the elements with the received unique value. 
The originVector type method demonstrates how we can call another type method within a type method. Note that both the type methods specify the returned type with -> followed by the type name (ImmutableVector3D) after the arguments enclosed in parentheses. The following line shows the declaration for the equalElementsVector type method with the specified return type: public class func equalElementsVector(initialValue: Float) -> ImmutableVector3D { The following lines call the originVector type method to generate an immutable 3D vector named vector0 and the sum method for the generated instance and save the returned instance in the new vector1 variable. The call to the sum method generates a new instance and doesn't mutate the existing object: var vector0 = ImmutableVector3D.originVector() var vector1 = vector0.sum(5, deltaX: 10, deltaY: 15) vector1.printValues() The code doesn't allow the users of the ImmutableVector3D class to change the values of the x, y, and z properties declared with the let keyword. The code doesn't compile if you try to assign a new value to any of these properties after they were initialized. Thus, we can say that the ImmutableVector3D class is 100 percent immutable. Finally, the code calls the printValues method for the returned instance (vector1) to check the values for the three elements on the Playground, as shown in the following screenshot: The immutable version adds an overhead compared with the mutable version because it is necessary to create a new instance of the class as a result of calling the sum method. The previously analyzed mutable version just changed the values for the attributes, and it wasn't necessary to generate a new instance. Obviously, the immutable version has both a memory and performance overhead. However, when we work with concurrent code, it makes sense to pay for the extra overhead to avoid potential issues caused by mutable objects. We just have to make sure we analyze the advantages and tradeoffs in order to decide which is the most convenient way of coding our specific classes. Summary In this article, we learned how to create mutable and immutable classes in Swift. Resources for Article: Further resources on this subject: Exploring Swift[article] The Swift Programming Language[article] Playing with Swift[article]
article-image-how-create-themed-bootstrap-components-sass

How to create themed Bootstrap components with Sass

Cameron
20 Jan 2016
5 min read
Bootstrap is an essential tool for designers and front-end developers. It is packed with a variety of useful components, including grid systems and CSS styles, that can be easily extended and customized. When using Bootstrap, we often run into a situation in which we would like to use our own branding or theme variations instead of what Bootstrap offers by default. In this article, we’ll look at how to leverage Bootstrap with Sass to approach creating and extending themed Bootstrap components. Bootstrap Variants Bootstrap offers six different variations of components, which are based on different types of states an application might be in. For example, you may have a .label-default or .label-primary, which would be two different types, or you might have an .alert-success, which would be based on the state of the application. Depending on the application you’re building, you may or may not need these variations. Let’s look at how you might use these variants in Sass. Theme Variables With the Sass version of Bootstrap, you can customize variables, leverage component partials, and extend existing mixins to create variations that fit your website or application’s brand. To get started with a custom theme, let’s open the _variables.scss partial and find the variables that begin with $brand. Here we can customize our color palette to fit our brand and even utilize Sass functions like darken($color, $amount) or lighten($color, $amount) to adjust the percentage of color lightness. Let’s change these values to the following: $brand-primary: darken(#5733B7, 6.5%); $brand-success: #3C9514; $brand-info: #C88ED9; $brand-warning: #E1D241; $brand-danger: #C84D17; Buttons One of the most important elements on our webpage that helps to denote our brand is the button. Let’s customize our buttons using the Sass variables and variant mixins included with Bootstrap. First we can decide what variables we’ll need for each variant. By default Bootstrap gives you the color, background, and border. Let’s change the defaults to something that fits our buttons: $btn-default-color: #5733B7; $btn-default-bg: #ffffff; $btn-default-box-shadow: 0px 0px 2px rgba(0, 0, 0, 0.2); With our variables in place, let’s open up _buttons.scss and go to the Alternate buttons section. Here we can see that each class defines a button variation and uses Bootstrap’s button-variant mixin. Since we’ve modified the variables we’ll be using for our buttons, let’s also change the button-variant mixin arguments within mixins/_buttons.scss. @mixin button-variant($color, $background, $shadow) { color: $color; background-color: $background; box-shadow: $shadow; &:hover, &:focus, &.focus, &:active, &.active, .open > &.dropdown-toggle { color: $color; background-color: darken($background, 10%); } &:active, &.active, .open > &.dropdown-toggle { background-image: none; } } Now we can use our new button-variant to create custom buttons that fit our theme: .btn-default { @include button-variant($btn-default-color, $btndefault- bg, $btn-default-box-shadow); } Forms Another common set of elements that you might want to customize are form inputs. Bootstrap has a shared .form-control class that can be modified for this purpose and a few different mixins that help you to style the sizes and validation states. Let’s take a look at a few ways you might create themed form elements. First, in _forms.scss, let’s remove the border-radius, box-shadow, and border from .form-control. 
Remember, good design doesn't always need to add something; it is often about removing what's not needed. This class's styles will be shared across all our form elements.

.form-control {
  display: block;
  width: 100%;
  height: $input-height-base;
  padding: $padding-base-vertical $padding-base-horizontal;
  font-size: $font-size-base;
  line-height: $line-height-base;
  color: $input-color;
  background-color: $input-bg;
  background-image: none;
  @include transition(border-color ease-in-out .15s, box-shadow ease-in-out .15s);
  @include form-control-focus;
  // Placeholder
  @include placeholder;

  &[disabled],
  &[readonly],
  fieldset[disabled] & {
    background-color: $input-bg-disabled;
    opacity: 1;
  }

  &[disabled],
  fieldset[disabled] & {
    cursor: $cursor-disabled;
  }
}

Next, let's look at adjusting two mixins for our forms in mixins/_forms.scss. Here we'll remove the box-shadow from each input's :focus state and the border-radius from each input size.

@mixin form-control-focus($color: $input-border-focus) {
  &:focus {
    border-color: $color;
    outline: 0;
  }
}

@mixin input-size($parent, $input-height, $padding-vertical, $padding-horizontal, $font-size, $line-height) {
  #{$parent} {
    height: $input-height;
    padding: $padding-vertical $padding-horizontal;
    font-size: $font-size;
    line-height: $line-height;
  }

  select#{$parent} {
    height: $input-height;
    line-height: $input-height;
  }

  textarea#{$parent},
  select[multiple]#{$parent} {
    height: auto;
  }
}

Conclusion

Leveraging variables and mixins in Sass will help to speed up your design workflow and make future changes simple. Whether you're building a marketing site or a full style guide for a web application, using Sass with Bootstrap can give you the power and flexibility to create custom, themed components.

About the author

Cameron is a freelance web designer, developer, and consultant based in Brooklyn, NY. Whether he's shipping a new MVP feature for an early-stage startup or harnessing the power of cutting-edge technologies with a digital agency, his specialities in UX, Agile, and front-end development unlock the possibilities that help his clients thrive. He blogs about design, development, and entrepreneurship and is often tweeting something clever at @cameronjroe.

article-image-building-surveys-using-xcode

Building Surveys using Xcode

Packt
19 Jan 2016
14 min read
In this article by Dhanushram Balachandran and Edward Cessna author of book Getting Started with ResearchKit, you can find the Softwareitis.xcodeproj project in the Chapter_3/Softwareitis folder of the RKBook GitHub repository (https://github.com/dhanushram/RKBook/tree/master/Chapter_3/Softwareitis). (For more resources related to this topic, see here.) Now that you have learned about the results of tasks from the previous section, we can modify the Softwareitis project to incorporate processing of the task results. In the TableViewController.swift file, let's update the rows data structure to include the reference for processResultsMethod: as shown in the following: //Array of dictionaries. Each dictionary contains [ rowTitle : (didSelectRowMethod, processResultsMethod) ] var rows : [ [String : ( didSelectRowMethod:()->(), processResultsMethod:(ORKTaskResult?)->() )] ] = [] Update the ORKTaskViewControllerDelegate method taskViewController(taskViewController:, didFinishWithReason:, error:) in TableViewController to call processResultsMethod, as shown in the following: func taskViewController(taskViewController: ORKTaskViewController, didFinishWithReason reason: ORKTaskViewControllerFinishReason, error: NSError?) { if let indexPath = tappedIndexPath { //1 let rowDict = rows[indexPath.row] if let tuple = rowDict.values.first { //2 tuple.processResultsMethod(taskViewController.result) } } dismissViewControllerAnimated(true, completion: nil) } Retrieves the dictionary of the tapped row and its associated tuple containing the didSelectRowMethod and processResultsMethod references from rows. Invokes the processResultsMethod with taskViewController.result as the parameter. Now, we are ready to create our first survey. In Survey.swift, under the Surveys folder, you will find two methods defined in the TableViewController extension: showSurvey() and processSurveyResults(). These are the methods that we will be using to create the survey and process the results. Instruction step Instruction step is used to show instruction or introductory content to the user at the beginning or middle of a task. It does not produce any result as its an informational step. We can create an instruction step using the ORKInstructionStep object. It has title and detailText properties to set the appropriate content. It also has the image property to show an image. The ORKCompletionStep is a special type of ORKInstructionStep used to show the completion of a task. The ORKCompletionStep shows an animation to indicate the completion of the task along with title and detailText, similar to ORKInstructionStep. In creating our first Softwareitis survey, let's use the following two steps to show the information: func showSurvey() { //1 let instStep = ORKInstructionStep(identifier: "Instruction Step") instStep.title = "Softwareitis Survey" instStep.detailText = "This survey demonstrates different question types." //2 let completionStep = ORKCompletionStep(identifier: "Completion Step") completionStep.title = "Thank you for taking this survey!" //3 let task = ORKOrderedTask(identifier: "first survey", steps: [instStep, completionStep]) //4 let taskViewController = ORKTaskViewController(task: task, taskRunUUID: nil) taskViewController.delegate = self presentViewController(taskViewController, animated: true, completion: nil) } The explanation of the preceding code is as follows: Creates an ORKInstructionStep object with an identifier "Instruction Step" and sets its title and detailText properties. 
Creates an ORKCompletionStep object with an identifier "Completion Step" and sets its title property. Creates an ORKOrderedTask object with the instruction and completion step as its parameters. Creates an ORKTaskViewController object with the ordered task that was previously created and presents it to the user. Let's update the processSurveyResults method to process the results of the instruction step and the completion step as shown in the following: func processSurveyResults(taskResult: ORKTaskResult?) { if let taskResultValue = taskResult { //1 print("Task Run UUID : " + taskResultValue.taskRunUUID.UUIDString) print("Survey started at : (taskResultValue.startDate!) Ended at : (taskResultValue.endDate!)") //2 if let instStepResult = taskResultValue.stepResultForStepIdentifier("Instruction Step") { print("Instruction Step started at : (instStepResult.startDate!) Ended at : (instStepResult.endDate!)") } //3 if let compStepResult = taskResultValue.stepResultForStepIdentifier("Completion Step") { print("Completion Step started at : (compStepResult.startDate!) Ended at : (compStepResult.endDate!)") } } } The explanation of the preceding code is given in the following: As mentioned at the beginning, each task run is associated with a UUID. This UUID is available in the taskRunUUID property, which is printed in the first line. The second line prints the start and end date of the task. These are useful user analytics data with regards to how much time the user took to finish the survey. Obtains the ORKStepResult object corresponding to the instruction step using the stepResultForStepIdentifier method of the ORKTaskResult object. Prints the start and end date of the step result, which shows the amount of time for which the instruction step was shown before the user pressed the Get Started or Cancel buttons. Note that, as mentioned earlier, ORKInstructionStep does not produce any results. Therefore, the results property of the ORKStepResult object will be nil. You can use a breakpoint to stop the execution at this line of code and verify it. Obtains the ORKStepResult object corresponding to the completion step. Similar to the instruction step, this prints the start and end date of the step. The preceding code produces screens as shown in the following image: After the Done button is pressed in the completion step, Xcode prints the output that is similar to the following: Task Run UUID : 0A343E5A-A5CD-4E7C-88C6-893E2B10E7F7 Survey started at : 2015-08-11 00:41:03 +0000     Ended at : 2015-08-11 00:41:07 +0000Instruction Step started at : 2015-08-11 00:41:03 +0000   Ended at : 2015-08-11 00:41:05 +0000Completion Step started at : 2015-08-11 00:41:05 +0000   Ended at : 2015-08-11 00:41:07 +0000 Question step Question steps make up the body of a survey. ResearchKit supports question steps with various answer types such as boolean (Yes or No), numeric input, date selection, and so on. Let's first create a question step with the simplest boolean answer type by inserting the following line of code in showSurvey(): let question1 = ORKQuestionStep(identifier: "question 1", title: "Have you ever been diagnosed with Softwareitis?", answer: ORKAnswerFormat.booleanAnswerFormat()) The preceding code creates a ORKQuestionStep object with identifier question 1, title with the question, and an ORKBooleanAnswerFormat object created using the booleanAnswerFormat() class method of ORKAnswerFormat. 
The answer type for a question is determined by the type of the ORKAnswerFormat object that is passed in the answer parameter. The ORKAnswerFormat has several subclasses such as ORKBooleanAnswerFormat, ORKNumericAnswerFormat, and so on. Here, we are using ORKBooleanAnswerFormat. Don't forget to insert the created question step in the ORKOrderedTask steps parameter by updating the following line: let task = ORKOrderedTask(identifier: "first survey", steps: [instStep, question1, completionStep]) When you run the preceding changes in Xcode and start the survey, you will see the question step with the Yes or No options. We have now successfully added a boolean question step to our survey, as shown in the following image: Now, its time to process the results of this question step. The result is produced in an ORKBooleanQuestionResult object. Insert the following lines of code in processSurveyResults(): //1 if let question1Result = taskResultValue.stepResultForStepIdentifier("question 1")?.results?.first as? ORKBooleanQuestionResult { //2 if question1Result.booleanAnswer != nil { let answerString = question1Result.booleanAnswer!.boolValue ? "Yes" : "No" print("Answer to question 1 is (answerString)") } else { print("question 1 was skipped") } } The explanation of the preceding code is as follows: Obtains the ORKBooleanQuestionResult object by first obtaining the step result using the stepResultForStepIdentifier method, accessing its results property, and finally obtaining the only ORKBooleanQuestionResult object available in the results array. The booleanAnswer property of ORKBooleanQuestionResult contains the user's answer. We will print the answer if booleanAnswer is non-nil. If booleanAnswer is nil, it indicates that the user has skipped answering the question by pressing the Skip this question button. You can disable the skipping-of-a-question step by setting its optional property to false. We can add the numeric and scale type question steps using the following lines of code in showSurvey(): //1 let question2 = ORKQuestionStep(identifier: "question 2", title: "How many apps do you download per week?", answer: ORKAnswerFormat.integerAnswerFormatWithUnit("Apps per week")) //2 let answerFormat3 = ORKNumericAnswerFormat.scaleAnswerFormatWithMaximumValue(10, minimumValue: 0, defaultValue: 5, step: 1, vertical: false, maximumValueDescription: nil, minimumValueDescription: nil) let question3 = ORKQuestionStep(identifier: "question 3", title: "How many apps do you download per week (range)?", answer: answerFormat3) The explanation of the preceding code is as follows: Creates ORKQuestionStep with the ORKNumericAnswerFormat object, created using the integerAnswerFormatWithUnit method with Apps per week as the unit. Feel free to refer to the ORKNumericAnswerFormat documentation for decimal answer format and other validation options that you can use. First creates ORKScaleAnswerFormat with minimum and maximum values and step. Note that the number of step increments required to go from minimumValue to maximumValue cannot exceed 10. For example, maximum value of 100 and minimum value of 0 with a step of 1 is not valid and ResearchKit will raise an exception. The step needs to be at least 10. In the second line, ORKScaleAnswerFormat is fed in the ORKQuestionStep object. The following lines in processSurveyResults() process the results from the number and the scale questions: //1 if let question2Result = taskResultValue.stepResultForStepIdentifier("question 2")?.results?.first as? 
ORKNumericQuestionResult { if question2Result.numericAnswer != nil { print("Answer to question 2 is (question2Result.numericAnswer!)") } else { print("question 2 was skipped") } } //2 if let question3Result = taskResultValue.stepResultForStepIdentifier("question 3")?.results?.first as? ORKScaleQuestionResult { if question3Result.scaleAnswer != nil { print("Answer to question 3 is (question3Result.scaleAnswer!)") } else { print("question 3 was skipped") } } The explanation of the preceding code is as follows: Question step with ORKNumericAnswerFormat generates the result with the ORKNumericQuestionResult object. The numericAnswer property of ORKNumericQuestionResult contains the answer value if the question is not skipped by the user. The scaleAnswer property of ORKScaleQuestionResult contains the answer for a scale question. As you can see in the following image, the numeric type question generates a free form text field to enter the value, while scale type generates a slider: Let's look at a slightly complicated question type with ORKTextChoiceAnswerFormat. In order to use this answer format, we need to create the ORKTextChoice objects before hand. Each text choice object provides the necessary data to act as a choice in a single choice or multiple choice question. The following lines in showSurvey() create a single choice question with three options: //1 let textChoice1 = ORKTextChoice(text: "Games", detailText: nil, value: 1, exclusive: false) let textChoice2 = ORKTextChoice(text: "Lifestyle", detailText: nil, value: 2, exclusive: false) let textChoice3 = ORKTextChoice(text: "Utility", detailText: nil, value: 3, exclusive: false) //2 let answerFormat4 = ORKNumericAnswerFormat.choiceAnswerFormatWithStyle(ORKChoiceAnswerStyle.SingleChoice, textChoices: [textChoice1, textChoice2, textChoice3]) let question4 = ORKQuestionStep(identifier: "question 4", title: "Which category of apps do you download the most?", answer: answerFormat4) The explanation of the preceding code is as follows: Creates text choice objects with text and value. When a choice is selected, the object in the value property is returned in the corresponding ORKChoiceQuestionResult object. The exclusive property is used in multiple choice questions context. Refer to the documentation for its use. First, creates an ORKChoiceAnswerFormat object with the text choices that were previously created and specifies a single choice type using the ORKChoiceAnswerStyle enum. You can easily change this question to multiple choice question by changing the ORKChoiceAnswerStyle enum to multiple choice. Then, an ORKQuestionStep object is created using the answer format object. Processing the results from a single or multiple choice question is shown in the following. Needless to say, this code goes in the processSurveyResults() method: //1 if let question4Result = taskResultValue.stepResultForStepIdentifier("question 4")?.results?.first as? ORKChoiceQuestionResult { //2 if question4Result.choiceAnswers != nil { print("Answer to question 4 is (question4Result.choiceAnswers!)") } else { print("question 4 was skipped") } } The explanation of the preceding code is as follows: The result for a single or multiple choice question is returned in an ORKChoiceQuestionResult object. The choiceAnswers property holds the array of values for the chosen options. The following image shows the generated choice question UI for the preceding code: There are several other question types, which operate in a very similar manner like the ones we discussed so far. 
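For instance, a date question needs only a different answer format; the following is a minimal sketch that is not part of the Softwareitis project code, and the "question 5" identifier and question text are illustrative assumptions:

let question5 = ORKQuestionStep(identifier: "question 5",
    title: "When did you last update the apps on your device?",
    answer: ORKAnswerFormat.dateAnswerFormat())

Its result arrives as an ORKDateQuestionResult, whose dateAnswer property holds the selected NSDate (or nil if the user skipped the step), so it can be processed in processSurveyResults() with the same pattern used for the earlier question results. The other formats follow the same approach.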
You can find them in the documentations of ORKAnswerFormat and ORKResult classes. The Softwareitis project has implementation of two additional types: date format and time interval format. Using custom tasks, you can create surveys that can skip the display of certain questions based on the answers that the users have provided so far. For example, in a smoking habits survey, if the user chooses "I do not smoke" option, then the ability to not display the "How many cigarettes per day?" question. Form step A form step allows you to combine several related questions in a single scrollable page and reduces the number of the Next button taps for the user. The ORKFormStep object is used to create the form step. The questions in the form are represented using the ORKFormItem objects. The ORKFormItem is similar to ORKQuestionStep, in which it takes the same parameters (title and answer format). Let's create a new survey with a form step by creating a form.swift extension file and adding the form entry to the rows array in TableViewController.swift, as shown in the following: func setupTableViewRows() { rows += [ ["Survey" : (didSelectRowMethod: self.showSurvey, processResultsMethod: self.processSurveyResults)], //1 ["Form" : (didSelectRowMethod: self.showForm, processResultsMethod: self.processFormResults)] ] } The explanation of the preceding code is as follows: The "Form" entry added to the rows array to create a new form survey with the showForm() method to show the form survey and the processFormResults() method to process the results from the form. The following code shows the showForm() method in Form.swift file: func showForm() { //1 let instStep = ORKInstructionStep(identifier: "Instruction Step") instStep.title = "Softwareitis Form Type Survey" instStep.detailText = "This survey demonstrates a form type step." //2 let question1 = ORKFormItem(identifier: "question 1", text: "Have you ever been diagnosed with Softwareitis?", answerFormat: ORKAnswerFormat.booleanAnswerFormat()) let question2 = ORKFormItem(identifier: "question 2", text: "How many apps do you download per week?", answerFormat: ORKAnswerFormat.integerAnswerFormatWithUnit("Apps per week")) //3 let formStep = ORKFormStep(identifier: "form step", title: "Softwareitis Survey", text: nil) formStep.formItems = [question1, question2] //1 let completionStep = ORKCompletionStep(identifier: "Completion Step") completionStep.title = "Thank you for taking this survey!" //4 let task = ORKOrderedTask(identifier: "survey with form", steps: [instStep, formStep, completionStep]) let taskViewController = ORKTaskViewController(task: task, taskRunUUID: nil) taskViewController.delegate = self presentViewController(taskViewController, animated: true, completion: nil) } The explanation of the preceding code is as follows: Creates an instruction and a completion step, similar to the earlier survey. Creates two ORKFormItem objects using the questions from the earlier survey. Notice the similarity with the ORKQuestionStep constructors. Creates ORKFormStep object with an identifier form step and sets the formItems property of the ORKFormStep object with the ORKFormItem objects that are created earlier. Creates an ordered task using the instruction, form, and completion steps and presents it to the user using a new ORKTaskViewController object. The results are processed using the following processFormResults() method: func processFormResults(taskResult: ORKTaskResult?) 
{ if let taskResultValue = taskResult { //1 if let formStepResult = taskResultValue.stepResultForStepIdentifier("form step"), formItemResults = formStepResult.results { //2 for result in formItemResults { //3 switch result { case let booleanResult as ORKBooleanQuestionResult: if booleanResult.booleanAnswer != nil { let answerString = booleanResult.booleanAnswer!.boolValue ? "Yes" : "No" print("Answer to (booleanResult.identifier) is (answerString)") } else { print("(booleanResult.identifier) was skipped") } case let numericResult as ORKNumericQuestionResult: if numericResult.numericAnswer != nil { print("Answer to (numericResult.identifier) is (numericResult.numericAnswer!)") } else { print("(numericResult.identifier) was skipped") } default: break } } } } } The explanation of the preceding code is as follows: Obtains the ORKStepResult object of the form step and unwraps the form item results from the results property. Iterates through each of the formItemResults, each of which will be the result for a question in the form. The switch statement detects the different types of question results and accesses the appropriate property that contains the answer. The following image shows the form step: Considerations for real world surveys Many clinical research studies that are conducted using a pen and paper tend to have well established surveys. When you try to convert these surveys to ResearchKit, they may not convert perfectly. Some questions and answer choices may have to be reworded so that they can fit on a phone screen. You are advised to work closely with the clinical researchers so that the changes in the surveys still produce comparable results with their pen and paper counterparts. Another aspect to consider is to eliminate some of the survey questions if the answers can be found elsewhere in the user's device. For example, age, blood type, and so on, can be obtained from HealthKit if the user has already set them. This will help in improving the user experience of your app. Summary Here we have learned to build surveys using Xcode. Resources for Article: Further resources on this subject: Signing up to be an iOS developer[article] Code Sharing Between iOS and Android[article] Creating a New iOS Social Project[article]

article-image-configuration-manager-troubleshooting-toolkit

The Configuration Manager Troubleshooting Toolkit

Packt
18 Jan 2016
8 min read
In this article by Peter Egerton and Gerry Hampson, the author of the book Troubleshooting System Center Configuration Manager you will be able to dive deeper in the troubleshoot Configuration Manager concepts. In order to successfully troubleshoot Configuration Manager, there are a number of tools that are recommended to be always kept in your troubleshooting toolkit. These include a mixture of Microsoft provided tools, third-party tools, and some community developed tools. Best of all is that they are free. As it could be expected with the broad scope of functionality within Configuration Manager, there are also quite a variety of different utilities out there, so we need to know where to use the right tool for the problem. We are going to take a look at some commonly used tools and some not so commonly used ones and see what they do and where we can use them. These are not necessarily the be all and end all, but they will certainly help us get on the way to solving problems and undoubtedly save some time. In this article, we are going to cover the following: Registry editor Group policy tools Log file viewer PowerShell Community tools (For more resources related to this topic, see here.) Registry Editor Also worth a mention is the Registry Editor that is built into Microsoft Windows on both server and client operating systems. Most IT administrators know this as regedit.exe and it is the default tool of choice for making any changes to, or just simply viewing the contents of a registry key or value. Many of the Configuration Manager roles and the clients allow us to make changes to enables features such as extended logging or manually changing policy settings by using the registry to do so. It should be noted that changing the registry is not something that should be taken lightly however, as making incorrect changes can result in creating more problems not just in Configuration Manager but the operating system as a whole. If we stick to the published settings though, we should be fine and this can be a fine tool while troubleshooting oddities and problems in a Configuration Manager environment. Group Policy Tools As Configuration Manager is a client management tool, there are certain features and settings on a client such as software updates that may conflict with settings defined in Group Policy. In particular, in larger organizations, it can often be useful to compare and contrast the settings that may conflict between Group Policy and Configuration Manager. Using integrated tools such as Resultant Set of Policy (RSoP) and Group Policy Result (gpresult.exe) or the Group Policy management console as part of the Remote Server Administration Tools (RSAT) can help identify where and why clients are not functioning as expected. We can then move forward and amend group policies as and where required using the Group Policy object editor. Used in combination, these tools can prove essential while dealing with Configuration Manager clients in particular. Log file viewer Those who have spent any time at all working with Configuration Manager will know that it contains quite a few log files, literally hundreds. We will go through the log files in more detail in the next chapter but we will need to use something to read the logs. We can use something as simple as Notepad and to an extent there are some advantages with using this as it is a no nonsense text reader. 
Having said that, generally speaking most people want a little more when it comes to reading Configuration Manager logs as they can often be long, complex, and frequently refreshed. We have already seen one example of a log viewer as part of the Configuration Manager Support Center, but Configuration Manager includes its own log file viewer that is tailored to the needs of troubleshooting the product logs. In Configuration Manager 2012 versions, we are provided with CMTrace.exe. The previous versions provided us with Trace32.exe or SMSTrace.exe. They are very similar tools but we will highlight some of the features of CMTrace which is the more modern of the two. To begin with, we can typically find CMTrace in the following locations: <INSTALL DRIVE>Program FilesMicrosoft Configuration ManagerToolsCMTrace.exe <INSTALL MEDIA>SMSSETUPTOOLSCMTrace.exe Those that are running Configuration Manager 2012 R2 and up also have CMTrace available out of the box in WinPE when running Operating System Deployments. We can simply hit F8 if we have command support enabled in the WinPE image and type CMTrace. This can also be added to the later stages of a task sequence when running in the full operating system by copying the file onto the hard disk. The single biggest advantage of using CMTrace over a standard text reader is that it is a tail reader which by default is refreshed every 500 milliseconds or, in others words, it will update the window as new lines are logged in the log file; we also have the functionality to pause the file too. The other functionality of CMTrace is to allow filtering of the log based on certain conditions and there is also a highlight feature which can highlight a whole line in yellow if a word we are looking for is found on the line. The program automatically highlights lines if certain words are found such as error or warning, which is useful but can also be a red herring at times, so this is something to be aware of if we come across logs with these key words. We can also merge log files; this is particularly useful when looking at time critical incidents, as we can analyze data from multiple sources in the order they happened and understand the flow of information between the different components. PowerShell PowerShell is here to stay. A phrase often heard recently is Learn PowerShell or learn golf and like it or not you cannot get away from the emphasis on this homemade product from Microsoft. This is evident in just about all the current products, as PowerShell is so deeply embedded. Configuration Manager is no exception to this and although we cannot quite do everything you can in the console, there are an increasing number of cmdlets becoming available, more than 500 at the time of writing. So the question we may ask is where does this come into troubleshooting? Well, for the uninitiated in PowerShell, maybe it won't be the first tool they turn to, but with some experience, we can soon find that performing things like WMI queries and typical console tasks can be made quicker and slicker with PowerShell. If we prefer, we can also read log files from PowerShell and make remote changes to the machines. PowerShell can be a one-stop shop for our troubleshooting needs if we spend the time to pick up the skills. Community tools Finally, as user group community leaders, we couldn't leave this section out of the troubleshooting toolkit. 
Configuration Manager has such a great collection of community contributors that have likely to have been through our troubleshooting pain before us and either blog about it, post it on a forum or create a fix for it. There is such an array of free tools out there that people share that we cannot ignore them. Outside of troubleshooting specifically, some of the best add-ons available for Configuration Manager are community contributions whether that be from individuals or businesses. There are so many utilities which are ever evolving and not all will suit your needs, but if we browse the Microsoft TechNet galleries, Codeplex and GitHub, you are sure find a great resource to meet your requirements. Why not get involved with a user group too, in terms of troubleshooting, this is probably one of the best things I personally could recommend. It gives access to a network of people who work on the same product as us and are often using them in the same way, so it is quite likely that someone has seen our problem before and can fast forward us to a solution. Microsoft TechNet Galleries:https://www gallery.technet.microsoft.com/ Codeplex: https://www.codeplex.com/ GitHub: https://www.github.com/ Summary In this article, you learned about various troubleshoot Configuration Manager tools such as Registry editor, Group policy tools, Log file viewer, PowerShell, and Community tools. Resources for Article: Further resources on this subject: Basic Troubleshooting Methodology [article] Monitoring and Troubleshooting Networking [article] Troubleshooting your BAM Applications [article]
article-image-controlling-relevancy

Controlling Relevancy

Packt
18 Jan 2016
19 min read
In this article written by Bharvi Dixit, author of the book Elasticsearch Essentials, we understand that getting a search engine to behave can be very hard. It does not matter if you are a newbie or have years of experience with Elasticsearch or Solr, you must have definitely struggled with low-quality search results in your application. The default algorithm of Lucene does not come close to meeting your requirements, and there is always a struggle to deliver the relevant search results. We will be covering the following topics: (For more resources related to this topic, see here.) Introducing relevant search Out of the Box Tools from Elasticsearch Controlling relevancy with custom scoring Introducing relevant search Relevancy is the root of a search engine's value proposition and can be defined as the art of ranking content for a user's search based on how much that content satisfies the needs of the user or the business. In an application, it does not matter how beautiful your user interface looks or how many functionalities you are providing to the user; search relevancy cannot be avoided at any cost. So, despite of the mystical behavior of search engines, you have to find a solution to get the relevant results. The relevancy becomes more important because a user does not care about the whole bunch of documents that you have. The user enters his keywords, selects filters, and focuses on a very small amount of data—the relevant results. And if your search engine fails to deliver according to expectations, the user might be annoyed, which might be a loss for your business. A search engine like Elasticsearch comes with a built-in intelligence. You enter the keyword and within a blink of an eye, it returns to you the results that it thinks are relevant according to its intelligence. However, Elasticsearch does not a built-in intelligence according to your application domain. The relevancy is not defined by a search engine; rather it is defined by your users, their business needs, and the domains. Take an example of Google or Twitter, they have put in years of engineering experience, but still fail occasionally while providing relevancy. Don't they? Further, the challenges of search differ with the domain: the search on an e-commerce platform is about driving sales and bringing positive customer outcomes, whereas in fields such as medicine, it is about the matter of life and death. The lives of search engineers become more complicated because they do not have domain-specific knowledge, which can be used to understand the semantics of user queries. However, despite of all the challenges, the implementation of search relevancy is up to you, and it depends on what information you can extract from the users, their queries, and the content they see. We continuously take feedbacks from the users, create funnels, or enable loggings to capture the search behavior of the users so that we can improve our algorithms to provide the relevant results. The Elasticsearch out-of-the-box tools Elasticsearch primarily works with two models of information retrieval: the Boolean model and the Vector Space model. In addition to these, there are other scoring algorithms available in Elasticsearch as well, such as Okapi BM25, Divergence from Randomness (DFR), and Information Based (IB). Working with these three models requires an extensive mathematical knowledge and needs some extra configurations in Elasticsearch. The Boolean model uses the AND, OR, and NOT conditions in a query to find all the matching documents. 
This Boolean model can be further combined with the Lucene scoring formula, TF/IDF, to rank documents. The Vector Space model works differently from the Boolean model, as it represents both queries and documents as vectors. In the vector space model, each number in the vector is the weight of a term that is calculated using TF/IDF. The queries and documents are compared using a cosine similarity in which angles between two vectors are compared to find the similarity, which ultimately leads to finding the relevancy of the documents. An example: why defaults are not enough Let's build an index with sample documents to understand the examples in a better way. First, create an index with the name profiles: curl -XPUT 'localhost:9200/profiles' Then, put the mapping with the document type as candidate: curl -XPUT 'localhost:9200/profiles/candidate' {  "properties": {    "geo_code": {      "type": "geo_point",      "lat_lon": true    }  } } Please note that in preceding mapping, we are putting mapping only for the geo data type. The rest of the fields will be indexed dynamically. Now, you can create a data.json file with the following content in it: { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 1 }} { "name" : "Sam", "geo_code" : "12.9545163,77.3500487", "total_experience":5, "skills":["java","python"] } { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 2 }} { "name" : "Robert", "geo_code" : "28.6619678,77.225706", "total_experience":2, "skills":["java"] } { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 3 }} { "name" : "Lavleen", "geo_code" : "28.6619678,77.225706", "total_experience":4, "skills":["java","Elasticsearch"] } { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 4 }} { "name" : "Bharvi", "geo_code" : "28.6619678,77.225706", "total_experience":3, "skills":["java","lucene"] } { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 5 }} { "name" : "Nips", "geo_code" : "12.9545163,77.3500487", "total_experience":7, "skills":["grails","python"] } { "index" : { "_index" : "profiles", "_type" : "candidate", "_id" : 6 }} { "name" : "Shikha", "geo_code" : "28.4250666,76.8493508", "total_experience":10, "skills":["c","java"] }  If you are indexing skills, which are separated by spaces or which include non-English characters, that is, c++, c#, or core java, you need to create mapping for the skills field as not_analyzed in advance to have exact term matching. Once the file is created, execute the following command to put the data inside the index we have just created: curl -XPOST 'localhost:9200' --data-binary @data.json If you look carefully at the example, the documents contain the data of the candidates who might be looking for jobs. For hiring candidates, a recruiter can have the following criteria: Candidates should know about Java Candidate should have an experience between 3 to 5 years Candidate should fall in the distance range of 100 kilometers from the office of the recruiter. You can construct a simple bool query in combination with a term query on the skills field along with geo_distance and range filters on the geo_code and total_experience fields respectively. However, does this give a relevant set of results? The answer would be NO. The problem is that if you are restricting the range of experience and distance, you might even get zero results or no suitable candidate. 
For example, you can put a range of 0 to 100 kilometers of distance but your perfect candidate might be at a distance of 101 kilometers. At the same time, if you define a wide range, you might get a huge number of non-relevant results. The other problem is that if you search for candidates who know Java, there are chances that a person who knows only Java and not any other programming language will be at the top, while a person who knows other languages apart from Java will be at the bottom. This happens because during the ranking of documents with TF/IDF, the lengths of the fields are taken into account. If the length of a field is small, the document is more relevant. Elasticsearch is not intelligent enough to understand the semantic meaning of your queries but for these scenarios, it offers you the full power to redefine how scoring and document ranking should be done. Controlling relevancy with custom scoring In most cases, you are good to go with the default scoring algorithms of Elasticsearch to return the most relevant results. However, some cases require you to have more control on the calculation of a score. This is especially required while implementing a domain-specific logic such as finding the relevant candidates for a job, where you need to implement a very specific scoring formula. Elasticsearch provides you with the function_score query to take control of all these things. Here we cover the code examples only in Java because a Python client gives you the flexibility to pass the query inside the body parameter of a search function. Python programmers can simply use the example queries in the same way. There is no extra module required to execute these queries. function_score query Function score query allows you to take the complete control of how a score needs to be calculated for a particular query: Syntax of a function_score query: {   "query": {"function_score": {     "query": {},     "boost": "boost for the whole query",     "functions": [       {}     ],     "max_boost": number,     "score_mode": "(multiply|max|...)",     "boost_mode": "(multiply|replace|...)",     "min_score" : number   }} } The function_score query has two parts: the first is the base query that finds the overall pool of results you want. The second part is the list of functions, which are used to adjust the scoring. These functions can be applied to each document that matches the main query in order to alter or completely replace the original query _score. In a function_score query, each function is composed of an optional filter that tells Elasticsearch which records should have their scores adjusted (defaults to "all records") and a description of how to adjust the score. The other parameters that can be used with a functions_score query are as follows: boost: An optional parameter that defines the boost for the entire query. max_boost: The maximum boost that will be applied by a function score. boost_mode: An optional parameter, which defaults to multiply. Score mode defines how the combined result of the score functions will influence the final score together with the subquery score. This can be replace (only the function score is used, the query score is ignored), max (the maximum of the query score and the function score), min (the minimum of the query score and the function score), sum (the query score and the function score are added), avg, or multiply (the query score and the function score are multiplied). 
score_mode: This parameter specifies how the results of individual score functions will be aggregated. The possible values can be first (the first function that has a matching filter is applied), avg, max, sum, min, and multiply. min_score: The minimum score to be used. Excluding Non-Relevant Documents with min_score To exclude documents that do not meet a certain score threshold, the min_score parameter can be set to the desired score threshold. The following are the built-in functions that are available to be used with the function score query: weight field_value_factor script_score The decay functions—linear, exp, and gauss Let's see them one by one and then you will learn how to combine them in a single query. weight A weight function allows you to apply a simple boost to each document without the boost being normalized: a weight of 2 results in 2 * _score. For example: GET profiles/candidate/_search {   "query": {     "function_score": {       "query": {         "term": {           "skills": {             "value": "java"           }         }       },       "functions": [         {           "filter": {             "term": {               "skills": "python"             }           },           "weight": 2         }       ],       "boost_mode": "replace"     }   } } The preceding query will match all the candidates who know Java, but will give a higher score to the candidates who also know Python. Please note that boost_mode is set to replace, which will cause _score to be calculated by a query that is to be overridden by the weight function for our particular filter clause. The query output will contain the candidates on top with a _score of 2 who know both Java and Python. Java example The previous query can be implemented in Java in the following way: First, you need to import the following classes into your code: import org.elasticsearch.action.search.SearchResponse; import org.elasticsearch.client.Client; import org.elasticsearch.index.query.QueryBuilders; import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder; import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders; Then the following code snippets can be used to implement the query: FunctionScoreQueryBuilder functionQuery = new FunctionScoreQueryBuilder(QueryBuilders.termQuery("skills", "java"))     .add(QueryBuilders.termQuery("skills", "python"),   ScoreFunctionBuilders.weightFactorFunction(2)).boostMode("replace");   SearchResponse response = client.prepareSearch().setIndices(indexName)         .setTypes(docType).setQuery(functionQuery)         .execute().actionGet(); field_value_factor It uses the value of a field in the document to alter the _score: GET profiles/candidate/_search {   "query": {     "function_score": {       "query": {         "term": {           "skills": {             "value": "java"           }         }       },       "functions": [         {           "field_value_factor": {             "field": "total_experience"           }         }       ],       "boost_mode": "multiply"     }   } } The preceding query finds all the candidates with java in their skills, but influences the total score depending on the total experience of the candidate. So, the more experience the candidate will have, the higher ranking he will get. 
Please note that boost_mode is set to multiply, which will yield the following formula for the final scoring: _score = _score * doc['total_experience'].value However, there are two issues with the preceding approach: first are the documents that have the total experience value as 0 and will reset the final score to 0. Second, Lucene _score usually falls between 0 and 10, so a candidate with an experience of more than 10 years will completely swamp the effect of the full text search score. To get rid of this problem, apart from using the field parameter, the field_value_factor function provides you with the following extra parameters to be used: factor: This is an optional factor to multiply the field value with. This defaults to 1. modifier: This is a mathematical modifier to apply to the field value. This can be :none, log, log1p, log2p, ln, ln1p, ln2p, square, sqrt, or reciprocal. It defaults to none. Java example The preceding query can be implemented in Java in the following way: First, you need to import the following classes into your code: import org.elasticsearch.action.search.SearchResponse; import org.elasticsearch.client.Client; import org.elasticsearch.index.query.QueryBuilders; import org.elasticsearch.index.query.functionscore*; Then the following code snippets can be used to implement the query: FunctionScoreQueryBuilder functionQuery = new FunctionScoreQueryBuilder(QueryBuilders.termQuery("skills", "java"))     .add(new FieldValueFactorFunctionBuilder("total_experience")).boostMode("multiply");   SearchResponse response = client.prepareSearch().setIndices("profiles")         .setTypes("candidate").setQuery(functionQuery)         .execute().actionGet(); script_score script_score is the most powerful function available in Elasticsearch. It uses a custom script to take complete control of the scoring logic. You can write a custom script to implement the logic you need. Scripting allows you to write from a simple to very complex logic. Scripts are cached, too, to allow faster executions of repetitive queries. Let's see an example: {   "script_score": {     "script": "doc['total_experience'].value"   } } Look at the special syntax to access the field values inside the script parameter. This is how the value of the fields is accessed using groovy scripting language. Scripting is, by default, disabled in Elasticsearch, so to use script score functions, first you need to add this line in your elasticsearch.yml file: script.inline: on To see some of the power of this function, look at the following example: GET profiles/candidate/_search {   "query": {     "function_score": {       "query": {         "term": {           "skills": {             "value": "java"           }         }       },       "functions": [         {           "script_score": {             "params": {               "skill_array_provided": [                 "java",                 "python"               ]             },             "script": "final_score=0; skill_array = doc['skills'].toArray(); counter=0; while(counter<skill_array.size()){for(skill in skill_array_provided){if(skill_array[counter]==skill){final_score = final_score+doc['total_experience'].value};};counter=counter+1;};return final_score"           }         }       ],       "boost_mode": "replace"     }   } } Let's understand the preceding query: params is the placeholder where you can pass the parameters to your function, similar to how you use parameters inside a method signature in other languages. 
Inside the script parameter, you write your complete logic. This script iterates through each document that has Java mentioned in the skills, and for each document, it fetches all the skills and stores them inside the skill_array variable. Finally, each skill that we have passed inside the params section is compared with the skills inside skill_array. If this matches, the value of the final_score variable is incremented with the value of the total_experience field of that document. The score calculated by the script score will be used to rank the documents because boost_mode is set to replace the original _score value. Do not try to work with the analyzed fields while writing the scripts. You might get weird results. This is because, had our skills field contained a value such as "core java", you could not have got the exact matching for it inside the script section. So, the fields with space-separated values need to be set as not_analyzed or the keyword has to be analyzed in advance. To write these script functions, you need to have some command over groovy scripting. However, if you find it complex, you can write these scripts in other languages, such as python, using the language plugin of Elasticsearch. More on this can be found here: https://github.com/elastic/elasticsearch-lang-python For a fast performance, use Groovy or Java functions. Python and JavaScript code requires the marshalling and unmarshalling of values that kill performances due to more CPU/memory usage. Java example The previous query can be implemented in Java in the following way: First, you need to import the following classes into your code: import org.elasticsearch.action.search.SearchResponse; import org.elasticsearch.client.Client; import org.elasticsearch.index.query.QueryBuilders; import org.elasticsearch.index.query.functionscore.*; import org.elasticsearch.script.Script; Then, the following code snippets can be used to implement the query: String script = "final_score=0; skill_array =            doc['skills'].toArray(); "         + "counter=0; while(counter<skill_array.size())"         + "{for(skill in skill_array_provided)"         + "{if(skill_array[counter]==skill)"         + "{final_score =     final_score+doc['total_experience'].value};};"         + "counter=counter+1;};return final_score";   ArrayList<String> skills = new ArrayList<String>();   skills.add("java");   skills.add("python");   Map<String, Object> params = new HashMap<String, Object>();   params.put("skill_array_provided",skills);   FunctionScoreQueryBuilder functionQuery = new   FunctionScoreQueryBuilder(QueryBuilders.termQuery("skills", "java"))     .add(new ScriptScoreFunctionBuilder(new Script(script,   ScriptType.INLINE, "groovy", params))).boostMode("replace");   SearchResponse response =   client.prepareSearch().setIndices(indexName)         .setTypes(docType).setQuery(functionQuery)         .execute().actionGet(); As you can see, the script logic is a simple string that is used to instantiate the Script class constructor inside ScriptScoreFunctionBuilder. Decay functions - linear, exp, gauss We have seen the problems of restricting the range of experience and distance that could result in getting zero results or no suitable candidates. May be a recruiter would like to hire a candidate from a different province because of a good candidate profile. 
So, instead of completely restricting with the range filters, we can incorporate sliding-scale values such as geo_location or dates into _score to prefer documents near a latitude/longitude point or recently published documents. Function score provide to work with this sliding scale with the help of three decay functions: linear, exp (that is, exponential), and gauss (that is, Gaussian). All three functions take the same parameter as shown in the following code and are required to control the shape of the curve created for the decay function: origin, scale, decay, and offset. The point of origin is used to calculate distance. For date fields, the default is the current timestamp. The scale parameter defines the distance from the origin at which the computed score will be equal to the decay parameter. The origin and scale parameters can be thought of as your min and max that define a bounding box within which the curve will be defined. If we want to give more boosts to the documents that have been published in the past10 days, it would be best to define the origin as the current timestamp and the scale as 10d. The offset specifies that the decay function will only compute the decay function of the  documents with a distance greater that the defined offset. The default is 0. Finally, the decay option alters how severely the document is demoted based on its position. The default decay value is 0.5. All three decay functions work only on numeric, date, and geo-point fields. GET profiles/candidate/_search {   "query": {     "function_score": {       "query": {         "match_all": {}       },       "functions": [         {           "exp": {             "geo_code": {               "origin": {                 "lat": 28.66,                 "lon": 77.22               },               "scale": "100km"             }           }         }       ],"boost_mode": "multiply"     }   } } In the preceding query, we have used the exponential decay function that tells Elasticsearch to start decaying the score calculation after a distance of 100 km from the given origin. So, the candidates who are at a distance of greater than 100km from the given origin will be ranked low, but not discarded. These candidates can still get a higher rank if we combine other functions score queries such as weight or field_value_factor with the decay function and combine the result of all the functions together. 
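The following is one possible sketch of such a combined query; it is not one of the book's Softwareitis examples, and the filter term, weight, and scale values are illustrative assumptions:

GET profiles/candidate/_search
{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "functions": [
        {
          "filter": { "term": { "skills": "java" } },
          "weight": 2
        },
        {
          "exp": {
            "geo_code": {
              "origin": { "lat": 28.66, "lon": 77.22 },
              "scale": "100km"
            }
          }
        }
      ],
      "score_mode": "multiply",
      "boost_mode": "multiply"
    }
  }
}

In this sketch, score_mode controls how the result of the weight function and the result of the decay function are combined with each other, while boost_mode controls how that combined value is merged with the score of the base query.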
Java example
The exponential decay query shown earlier in the query DSL can be implemented in Java in the following way:

First, you need to import the following classes into your code:

import java.util.HashMap;
import java.util.Map;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.functionscore.*;

Then, the following code snippets can be used to implement the query:

Map<String, Object> origin = new HashMap<String, Object>();
String scale = "100km";
origin.put("lat", "28.66");
origin.put("lon", "77.22");

FunctionScoreQueryBuilder functionQuery = new FunctionScoreQueryBuilder()
    .add(new ExponentialDecayFunctionBuilder("geo_code", origin, scale))
    .boostMode("multiply");
//For the linear decay function, use the following syntax:
//.add(new LinearDecayFunctionBuilder("geo_code", origin, scale)).boostMode("multiply");
//For the Gauss decay function, use the following syntax:
//.add(new GaussDecayFunctionBuilder("geo_code", origin, scale)).boostMode("multiply");

SearchResponse response = client.prepareSearch().setIndices(indexName)
    .setTypes(docType).setQuery(functionQuery)
    .execute().actionGet();

In the preceding example, we have used the exp decay function, but the commented lines show how the other decay functions can be used. Lastly, remember that Elasticsearch lets you use multiple functions in a single function_score query to calculate a score that combines the results of each function.

Summary
Overall, we covered one of the most important aspects of search engines, that is, relevancy. We discussed the powerful scoring capabilities available in Elasticsearch and practical examples showing how you can control the scoring process according to your needs. Despite the relevancy challenges faced while working with search engines, out-of-the-box features such as function scores and custom scoring allow us to tackle these challenges with ease.

Resources for Article:

Further resources on this subject:
An Introduction to Kibana [article]
Extending Chef [article]
Introduction to Hadoop [article]

Integration with Hadoop

Packt
18 Jan 2016
18 min read
In this article by Cyrus Dasadia, author of the book, MongoDB Cookbook Second Edition, we will cover the following recipes:

Executing our first sample MapReduce job using the mongo-hadoop connector
Writing our first Hadoop MapReduce job

(For more resources related to this topic, see here.)

Hadoop is well-known open source software for processing large datasets. It also has an API for the MapReduce programming model, which is widely used. Nearly all big data solutions have some sort of support for integrating with Hadoop in order to use its MapReduce framework. MongoDB, too, has a connector that integrates with Hadoop and lets us write MapReduce jobs using the Hadoop MapReduce API, process data residing in MongoDB or MongoDB dumps, and write the results back to MongoDB or MongoDB dump files. In this article, we will be looking at some recipes around basic MongoDB and Hadoop integration.

Executing our first sample MapReduce job using the mongo-hadoop connector
In this recipe, we will see how to build the mongo-hadoop connector from source and set up Hadoop just for the purpose of running the examples in standalone mode. The connector is the backbone that runs MapReduce jobs on Hadoop using the data in Mongo.

Getting ready
There are various distributions of Hadoop; however, we will use Apache Hadoop (http://hadoop.apache.org/). The installation will be done on Ubuntu Linux. In production, Apache Hadoop always runs in a Linux environment, and Windows is not tested for production systems. For development purposes, however, Windows can be used. If you are a Windows user, I would recommend installing a virtualization environment such as VirtualBox (https://www.virtualbox.org/), setting up a Linux environment, and then installing Hadoop on it. Setting up VirtualBox and Linux on it is not shown in this recipe, but this is not a tedious task. The prerequisite for this recipe is a machine with a Linux operating system on it and an Internet connection. The version that we will set up here is Apache Hadoop 2.4.0, which is the latest version of Apache Hadoop supported by the mongo-hadoop connector. A Git client is needed to clone the repository of the mongo-hadoop connector to the local filesystem. Refer to http://git-scm.com/book/en/Getting-Started-Installing-Git to install Git. You will also need MongoDB to be installed on your operating system. Refer to http://docs.mongodb.org/manual/installation/ and install it accordingly. Start the mongod instance listening on port 27017. You are not expected to be an expert in Hadoop, but some familiarity with it will be helpful. Knowing the concept of MapReduce is important, and knowing about the Hadoop MapReduce API will be an advantage. In this recipe, we will explain what is needed to get the work done. You can get more details on Hadoop and its MapReduce API from other sources. The Wikipedia page, http://en.wikipedia.org/wiki/MapReduce, gives good enough information about MapReduce programming.

How to do it…
We will first install Java, Hadoop, and the required packages. We will start by installing the JDK on the operating system.
Type the following on the command prompt of the operating system:

$ javac -version

If the program doesn't execute and you are told about the various packages that contain the javac program, then we need to install Java as follows:

$ sudo apt-get install default-jdk

This is all we need to do to install Java.

Visit the URL http://www.apache.org/dyn/closer.cgi/hadoop/common/ and download version 2.4 (or the latest version that the mongo-hadoop connector supports). After the .tar.gz file has been downloaded, execute the following on the command prompt:

$ tar -xvzf <name of the downloaded .tar.gz file>
$ cd <extracted directory>

Open the etc/hadoop/hadoop-env.sh file and replace export JAVA_HOME = ${JAVA_HOME} with export JAVA_HOME = /usr/lib/jvm/default-java.

We will now get the mongo-hadoop connector code from GitHub on our local filesystem. Note that you don't need a GitHub account to clone a repository. Clone the Git project from the operating system command prompt as follows:

$ git clone https://github.com/mongodb/mongo-hadoop.git
$ cd mongo-hadoop

Create a soft link; the Hadoop installation directory is the same as the one that we extracted in step 3:

$ ln -s <hadoop installation directory> ~/hadoop-binaries

For example, if Hadoop is extracted/installed in the home directory, then this is the command to be executed:

$ ln -s ~/hadoop-2.4.0 ~/hadoop-binaries

By default, the mongo-hadoop connector will look for a Hadoop distribution in the ~/hadoop-binaries folder. So, even if the Hadoop archive is extracted elsewhere, we can create a soft link to it. Once this link is created, we should have the Hadoop binaries in the ~/hadoop-binaries/hadoop-2.4.0/bin path.

We will now build the mongo-hadoop connector from the source for Apache Hadoop version 2.4.0. The build, by default, builds for the latest version, so as of now the -Phadoop_version parameter can be left out, as 2.4 is the latest anyway.

$ ./gradlew jar -Phadoop_version='2.4'

This build process will take some time to complete. Once the build completes successfully, we are ready to execute our first MapReduce job. We will be doing it using a treasuryYield sample provided with the mongo-hadoop connector project. The first activity is to import the data to a collection in Mongo. Assuming that the mongod instance is up and running and listening on port 27017 for connections, and that the current directory is the root of the mongo-hadoop connector code base, execute the following command:

$ mongoimport -c yield_historical.in -d mongo_hadoop --drop examples/treasury_yield/src/main/resources/yield_historical_in.json

Once the import action is successful, we are left with copying two JAR files to the lib directory. Execute the following from the operating system shell:

$ wget http://repo1.maven.org/maven2/org/mongodb/mongo-java-driver/2.12.0/mongo-java-driver-2.12.0.jar
$ cp core/build/libs/mongo-hadoop-core-1.2.1-SNAPSHOT-hadoop_2.4.jar ~/hadoop-binaries/hadoop-2.4.0/lib/
$ mv mongo-java-driver-2.12.0.jar ~/hadoop-binaries/hadoop-2.4.0/lib

The JAR built for the mongo-hadoop core is named as above for the trunk version of the code, built for hadoop-2.4.0. Change the name of the JAR accordingly when you build it yourself for a different version of the connector and Hadoop. The Mongo driver can be the latest version; version 2.12.0 was the latest at the time of writing.
Now, execute the following command on the command prompt of the operating system shell:

~/hadoop-binaries/hadoop-2.4.0/bin/hadoop \
  jar examples/treasury_yield/build/libs/treasury_yield-1.2.1-SNAPSHOT-hadoop_2.4.jar \
  com.mongodb.hadoop.examples.treasury.TreasuryYieldXMLConfig \
  -Dmongo.input.split_size=8 \
  -Dmongo.job.verbose=true \
  -Dmongo.input.uri=mongodb://localhost:27017/mongo_hadoop.yield_historical.in \
  -Dmongo.output.uri=mongodb://localhost:27017/mongo_hadoop.yield_historical.out

The output should print a lot of things; however, the following line in the output tells us that the MapReduce job was successful:

14/05/11 21:38:54 INFO mapreduce.Job: Job job_local1226390512_0001 completed successfully

Connect to the mongod instance running on localhost from the mongo client and execute a find on the following collection:

$ mongo
> use mongo_hadoop
switched to db mongo_hadoop
> db.yield_historical.out.find()

How it works…
Installing Hadoop is not a trivial task, and we don't need to get into it to try our samples for the mongo-hadoop connector. To learn about Hadoop and its installation, there are dedicated books and articles available. For the purpose of this article, we simply download the archive, extract it, and run the MapReduce jobs in standalone mode. This is the quickest way to get going with Hadoop. All the steps up to step 6 are needed to install Hadoop. In the next couple of steps, we simply clone the mongo-hadoop connector repository. You can also download a stable, built version for your version of Hadoop from https://github.com/mongodb/mongo-hadoop/releases if you prefer not to build from source. We then build the connector for our version of Hadoop (2.4.0) up to step 13. From step 14 onwards is what we do to run the actual MapReduce job in order to work on the data in MongoDB.

We imported the data to the yield_historical.in collection, which is used as the input to the MapReduce job. Go ahead and query the collection from the Mongo shell using the mongo_hadoop database to see a document. Don't worry if you don't understand the contents; we want to see in this example what we intend to do with this data.

The next step was to invoke the MapReduce operation on the data. The hadoop command was executed giving the path of one JAR (examples/treasury_yield/build/libs/treasury_yield-1.2.1-SNAPSHOT-hadoop_2.4.jar). This is the JAR that contains the classes implementing a sample MapReduce operation for the treasury yield. The com.mongodb.hadoop.examples.treasury.TreasuryYieldXMLConfig class in this JAR file is the bootstrap class containing the main method. We will visit this class soon. There are lots of configurations supported by the connector. For now, we will just remember that mongo.input.uri and mongo.output.uri specify the input and output collections for the MapReduce operation.

With the project cloned, you can import it into any Java IDE of your choice. We are particularly interested in the project at /examples/treasury_yield and the core project present in the root of the cloned repository.

Let's look at the com.mongodb.hadoop.examples.treasury.TreasuryYieldXMLConfig class. This is the entry point to the MapReduce job and has a main method in it. To write MapReduce jobs for Mongo using the mongo-hadoop connector, the main class always has to extend com.mongodb.hadoop.util.MongoTool. This class implements the org.apache.hadoop.Tool interface, which has the run method, and this method is implemented for us by the MongoTool class; a simplified sketch of such a bootstrap class is shown next.
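The following is a minimal, hypothetical sketch of such a bootstrap class, loosely based on the treasury yield example described here; the class name is made up, and the exact configuration files and details in the cloned source may differ, so treat it as an illustration rather than the actual implementation:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;
import com.mongodb.hadoop.util.MongoTool;

public class SampleMongoJobConfig extends MongoTool {

    static {
        // Load the connector and job properties from XML resources on the classpath
        // (the treasury yield example uses hadoop-local.xml and mongo-defaults.xml).
        Configuration.addDefaultResource("hadoop-local.xml");
        Configuration.addDefaultResource("mongo-defaults.xml");
    }

    public static void main(final String[] args) throws Exception {
        // MongoTool implements Tool.run() for us; we only need to hand an
        // instance of this class to ToolRunner along with the command-line args.
        System.exit(ToolRunner.run(new SampleMongoJobConfig(), args));
    }
}

The explanation that follows walks through what this main method and static block actually do.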
All that the main method needs to do is execute this class using the org.apache.hadoop.util.ToolRunner class by invoking its static run method and passing it an instance of our main class (which is an instance of Tool). There is a static block that loads the configurations from two XML files, hadoop-local.xml and mongo-defaults.xml. The format of these files (or any XML configuration file) is as follows: the root node of the file is the configuration node, with multiple property nodes under it.

<configuration>
  <property>
    <name>{property name}</name>
    <value>{property value}</value>
  </property>
  ...
</configuration>

The property values that make sense in this context are all those that we mentioned in the URL provided earlier. We instantiate com.mongodb.hadoop.MongoConfig, wrapping an instance of org.apache.hadoop.conf.Configuration, in the constructor of the bootstrap class, TreasuryYieldXmlConfig. The MongoConfig class provides sensible defaults that are enough to satisfy the majority of use cases. Some of the most important things that we need to set on the MongoConfig instance are the input and output formats, the mapper and reducer classes, the output key and value types of the mapper, and the output key and value types of the reducer. The input format and output format will always be the com.mongodb.hadoop.MongoInputFormat and com.mongodb.hadoop.MongoOutputFormat classes, which are provided by the mongo-hadoop connector library. For the mapper and reducer output keys and values, we have the org.apache.hadoop.io.Writable implementations. Refer to the Hadoop documentation for the different types of writable implementations in the org.apache.hadoop.io package. Apart from these, the mongo-hadoop connector also provides us with some implementations in the com.mongodb.hadoop.io package. For the treasury yield example, we used the BSONWritable instance.

These configurable values can either be provided in the XML file that we saw earlier or set programmatically. Finally, we also have the option of providing them as VM arguments, as we did for mongo.input.uri and mongo.output.uri. These parameters can be provided either in the XML or set directly from the code on the MongoConfig instance; the two methods are setInputURI and setOutputURI, respectively.

We will now look at the mapper and reducer class implementations. We will copy the important portion of the class here to analyze it. Refer to the cloned project for the entire implementation.

public class TreasuryYieldMapper
    extends Mapper<Object, BSONObject, IntWritable, DoubleWritable> {

    @Override
    public void map(final Object pKey,
                    final BSONObject pValue,
                    final Context pContext)
        throws IOException, InterruptedException {
        final int year = ((Date) pValue.get("_id")).getYear() + 1900;
        double bid10Year = ((Number) pValue.get("bc10Year")).doubleValue();
        pContext.write(new IntWritable(year), new DoubleWritable(bid10Year));
    }
}

Our mapper extends the org.apache.hadoop.mapreduce.Mapper class. The four generic parameters are the key class, the type of the input value, the type of the output key, and the type of the output value. The body of the map method reads the _id value from the input document, which is the date, and extracts the year out of it. Then, it gets the double value from the document for the bc10Year field and simply writes a key-value pair to the context, where the key is the year and the value is the double.
The implementation here doesn't rely on the value of the pKey parameter passed in, which could have been used as the key instead of hardcoding the _id value in the implementation. This value is basically the same field that would be set using the mongo.input.key property in the XML or the MongoConfig.setInputKey method. If none is set, _id is the default value anyway.

Let's look at the reducer implementation (with the logging statements removed):

public class TreasuryYieldReducer
    extends Reducer<IntWritable, DoubleWritable, IntWritable, BSONWritable> {

    @Override
    public void reduce(final IntWritable pKey, final Iterable<DoubleWritable> pValues, final Context pContext)
        throws IOException, InterruptedException {
        int count = 0;
        double sum = 0;
        for (final DoubleWritable value : pValues) {
            sum += value.get();
            count++;
        }
        final double avg = sum / count;
        BasicBSONObject output = new BasicBSONObject();
        output.put("count", count);
        output.put("avg", avg);
        output.put("sum", sum);
        pContext.write(pKey, new BSONWritable(output));
    }
}

This class extends org.apache.hadoop.mapreduce.Reducer and again has four generic parameters: the input key, input value, output key, and output value. The input to the reducer is the output from the mapper, and if you look carefully, the types of the first two generic parameters are the same as the last two generic parameters of the mapper that we saw earlier. The third and fourth parameters in this case are the types of the key and the value emitted from the reducer. The value is a BSON document, and thus we have BSONWritable as its type.

We now have the reduce method, which has two parameters: the first one is the key, which is the same as the key emitted from the map function, and the second parameter is a java.lang.Iterable of the values emitted for the same key. This is how standard MapReduce functions work. For instance, if the map function emitted the key-value pairs (1950, 10), (1960, 20), (1950, 20), and (1950, 30), then reduce will be invoked with two unique keys, 1950 and 1960; the values for the key 1950 will be an iterable with (10, 20, 30), whereas the value for 1960 will be an iterable with a single element, (20). The reducer's reduce function simply iterates through this iterable of doubles, finds the sum and count of these numbers, and writes one key-value pair, where the key is the same as the incoming key and the output value is a BasicBSONObject containing the sum, count, and average of the computed values.

There are some good samples, including the enron dataset, in the examples of the cloned mongo-hadoop connector. If you would like to play around a bit, I would recommend that you take a look at these example projects too and run them.

There's more…
What we saw here is a readymade sample that we executed. There is nothing like writing a MapReduce job ourselves for our understanding. In the next recipe, we will write one sample MapReduce job using the Hadoop API in Java and see it in action.
See also
If you're wondering what the Writable interface is all about and why you should not use plain old serialization instead, refer to this URL, which gives the explanation by the creator of Hadoop himself: http://www.mail-archive.com/hadoop-user@lucene.apache.org/msg00378.html

Writing our first Hadoop MapReduce job
In this recipe, we will write our first MapReduce job using the Hadoop MapReduce API and run it using the mongo-hadoop connector, getting the data from MongoDB.

Getting ready
Refer to the previous recipe, Executing our first sample MapReduce job using the mongo-hadoop connector, for setting up the mongo-hadoop connector. This is a Maven project and thus Maven needs to be set up and installed. This project, however, is built on Ubuntu Linux, and you need to execute the following command from the operating system shell to get Maven:

$ sudo apt-get install maven

How to do it…
We have a Java mongo-hadoop-mapreduce-test project that can be downloaded from the Packt site. We invoked that MapReduce job using the Python and Java clients on previous occasions.

With the current directory at the root of the project, where the pom.xml file is present, execute the following command on the command prompt:

$ mvn clean package

The JAR file, mongo-hadoop-mapreduce-test-1.0.jar, will be built and kept in the target directory.

With the assumption that the CSV file has already been imported to the postalCodes collection, execute the following command with the current directory still at the root of the mongo-hadoop-mapreduce-test project that we just built:

~/hadoop-binaries/hadoop-2.4.0/bin/hadoop \
  jar target/mongo-hadoop-mapreduce-test-1.0.jar \
  com.packtpub.mongo.cookbook.TopStateMapReduceEntrypoint \
  -Dmongo.input.split_size=8 \
  -Dmongo.job.verbose=true \
  -Dmongo.input.uri=mongodb://localhost:27017/test.postalCodes \
  -Dmongo.output.uri=mongodb://localhost:27017/test.postalCodesHadoopmrOut

Once the MapReduce job is completed, open the Mongo shell by typing the following command on the operating system command prompt and execute the following query from the shell:

$ mongo
> db.postalCodesHadoopmrOut.find().sort({count:-1}).limit(5)

Compare the output to the one that we got earlier when we executed the MapReduce jobs using Mongo's MapReduce framework.

How it works…
We have kept the classes very simple, with just the bare minimum that we need. We have only three classes in our project, TopStateMapReduceEntrypoint, TopStateReducer, and TopStatesMapper, all in the same com.packtpub.mongo.cookbook package.

The mapper's map function just writes a key-value pair to the context, where the key is the name of the state and the value is an integer value, 1. The following is the code snippet from the mapper function:

context.write(new Text((String)value.get("state")), new IntWritable(1));

What the reducer gets is the same key, that is, the name of the state, and an iterable of integer values, each of them 1. All we do is write the same name of the state and the sum of the iterable's values to the context. Now, as there is no size method on the iterable that can give the count in constant time, we are left with adding up all the ones that we get, in linear time. The following is the code in the reducer method:

int sum = 0;
for(IntWritable value : values) {
  sum += value.get();
}
BSONObject object = new BasicBSONObject();
object.put("count", sum);
context.write(text, new BSONWritable(object));

We write to the context the text string that is the key and a value that is the JSON document containing the count. A sketch of how this snippet fits into the complete reducer class follows.
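For reference, the reducer snippet above sits inside a class along the following lines. This is a sketch assembled from the snippet and the configuration described in this recipe (the mapper emits Text keys and IntWritable values, and the reducer emits Text keys and BSONWritable values); the downloadable mongo-hadoop-mapreduce-test project remains the authoritative version, so exact signatures may differ slightly.

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.bson.BSONObject;
import org.bson.BasicBSONObject;

import com.mongodb.hadoop.io.BSONWritable;

public class TopStateReducer
    extends Reducer<Text, IntWritable, Text, BSONWritable> {

    @Override
    public void reduce(final Text text, final Iterable<IntWritable> values, final Context context)
        throws IOException, InterruptedException {
        // Sum up the 1s emitted by the mapper for this state
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        // Emit the state name with a BSON document holding the city count
        BSONObject object = new BasicBSONObject();
        object.put("count", sum);
        context.write(text, new BSONWritable(object));
    }
}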
The mongo-hadoop connector is then responsible for writing to our output collection, postalCodesHadoopmrOut, a document with the _id field set to the emitted key. Thus, when we execute the following, we get the top five states with the largest number of cities in our database:

> db.postalCodesHadoopmrOut.find().sort({count:-1}).limit(5)
{ "_id" : "Maharashtra", "count" : 6446 }
{ "_id" : "Kerala", "count" : 4684 }
{ "_id" : "Tamil Nadu", "count" : 3784 }
{ "_id" : "Andhra Pradesh", "count" : 3550 }
{ "_id" : "Karnataka", "count" : 3204 }

Finally, the main method of the main entry point class is as follows:

Configuration conf = new Configuration();
MongoConfig config = new MongoConfig(conf);
config.setInputFormat(MongoInputFormat.class);
config.setMapperOutputKey(Text.class);
config.setMapperOutputValue(IntWritable.class);
config.setMapper(TopStatesMapper.class);
config.setOutputFormat(MongoOutputFormat.class);
config.setOutputKey(Text.class);
config.setOutputValue(BSONWritable.class);
config.setReducer(TopStateReducer.class);
ToolRunner.run(conf, new TopStateMapReduceEntrypoint(), args);

All we do is wrap the org.apache.hadoop.conf.Configuration object with the com.mongodb.hadoop.MongoConfig instance to set the various properties and then submit the MapReduce job for execution using ToolRunner.

See also
In this recipe, we executed a simple MapReduce job on Hadoop using the Hadoop API, sourcing the data from MongoDB and writing the result to a MongoDB collection. What if we want to write the map and reduce functions in a different language? Fortunately, this is possible by using a concept called Hadoop streaming, where stdout is used as the means of communication between the program and the Hadoop MapReduce framework.

Summary
In this article, you learned about executing our first sample MapReduce job using the mongo-hadoop connector and writing our first Hadoop MapReduce job.

You can also refer to the following books related to MongoDB that are available on our website:

MongoDB Cookbook: https://www.packtpub.com/big-data-and-business-intelligence/mongodb-cookbook
Instant MongoDB: https://www.packtpub.com/big-data-and-business-intelligence/instant-mongodb-instant
MongoDB High Availability: https://www.packtpub.com/big-data-and-business-intelligence/mongodb-high-availability

Resources for Article:

Further resources on this subject:
About MongoDB [article]
Ruby with MongoDB for Web Development [article]
Sharding in Action [article]