Packt
04 Mar 2015
21 min read

Working with VMware Infrastructure

In this article by Daniel Langenhan, the author of VMware vRealize Orchestrator Cookbook, we will take a closer look at how Orchestrator interacts with vCenter Server and vRealize Automation (vRA, formerly known as vCloud Automation Center, vCAC). vRA uses Orchestrator to access and automate infrastructure using Orchestrator plugins. We will take a look at how to make Orchestrator workflows available to vRA.

We will investigate the following recipes:

  • Unmounting all the CD-ROMs of all VMs in a cluster
  • Provisioning a VM from a template
  • An approval process for VM provisioning

There are quite a lot of plugins that let Orchestrator interact with VMware infrastructure and programs:

  • vCenter Server
  • vCloud Director (vCD)
  • vRealize Automation (vRA, formerly known as vCloud Automation Center, vCAC)
  • Site Recovery Manager (SRM)
  • VMware Auto Deploy
  • Horizon (View and Virtual Desktops)
  • vRealize Configuration Manager (earlier known as vCenter Configuration Manager)
  • vCenter Update Manager
  • vCenter Operation Manager, vCOPS (only example packages)

VMware, as of the writing of this article, is still renaming its products. An overview of all plugins, their names, and download links can be found at http://www.vcoteam.info/links/plug-ins.html.

We will not be able to cover all of these plugins, so we will focus on the one that is used most, vCenter. Sadly, vCloud Director is earmarked by VMware to disappear for everyone but service providers, so there is no real need to show any workflow for it. We will also work with vRA and see how it interacts with Orchestrator.

vSphere automation

The interaction between Orchestrator and vCenter is done using the vCenter API. The following figure illustrates the interaction, and the numbers in the explanation refer to it.
A user starts an Orchestrator workflow (1), either interactively via the vSphere Web Client, the Orchestrator Web Operator, or the Orchestrator Client, or via the API. The workflow in Orchestrator will then send a job (2) to vCenter and receive a task ID back (type VC:Task). vCenter will then start enacting the job (3). Using the vim3WaitTaskEnd action (4), Orchestrator pauses until the task has been completed. If we do not use the wait task, we can't be certain whether the task has ended or failed. It is extremely important to use the vim3WaitTaskEnd action whenever we send a job to vCenter. When the wait task reports that the job has finished, the workflow will be marked as finished.

The vCenter MoRef

The MoRef (Managed Object Reference) is a unique ID for every object inside vCenter. MoRefs are basically strings; some examples are shown here:

Object | Example MoRef
VM | vm-301
Network | network-312, dvportgroup-242
Datastore | datastore-101
ESXi host | host-44
Data center | datacenter-21
Cluster | domain-c41

The MoRefs are typically stored in the attribute .id or .key of the Orchestrator API object. For example, the MoRef of a vSwitch Network is VC:Network.id. To browse for MoRefs, you can use the Managed Object Browser (MOB), documented at https://pubs.vmware.com/vsphere-55/index.jsp#com.vmware.wssdk.pg.doc/PG_Appx_Using_MOB.20.1.html.

The vim3WaitTaskEnd action

As already said, vim3WaitTaskEnd is one of the most central actions while interacting with vCenter. The action has the following variables:

Category | Name | Type | Usage
IN | vcTask | VC:Task | Carries the reconfiguration task from the script to the wait task
IN | progress | Boolean | Writes the progress of a task, as a percentage, to the logs
IN | pollRate | Number | How often the action checks vCenter for task completion
OUT | ActionResult | Any | Returns the task's result

The wait task checks the status of a task that has been submitted to vCenter at regular intervals (pollRate).
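As a rough illustration of this polling behaviour, the following plain JavaScript sketch loops over a task until it succeeds or fails. It is not the source of vim3WaitTaskEnd: makeFakeTask, getState, and the injected sleep and log functions are hypothetical stand-ins so that the sketch can run outside Orchestrator, and the lower-case state names are assumptions modelled on the task states vCenter reports.

```javascript
// Hypothetical stand-in for a VC:Task: reports one state per poll,
// then stays on the last state in the list.
function makeFakeTask(states, result) {
    var polls = 0;
    return {
        result: result,
        getState: function () {
            var state = states[Math.min(polls, states.length - 1)];
            polls += 1;
            return state;
        }
    };
}

// Sketch of a wait-task loop. sleep and log are passed in (assumption)
// so the same code can be exercised without an Orchestrator server.
function waitTaskEnd(task, progress, pollRate, sleep, log) {
    while (true) {
        var state = task.getState();
        if (state === "success") {
            return task.result;          // hand the task's result back to the workflow
        }
        if (state === "error") {
            throw new Error("Task failed");
        }
        if (progress) {
            log("Task is " + state);     // only written when progress logging is on
        }
        sleep(pollRate);                 // wait before asking again
    }
}
```

Passing sleep and log as parameters is a design choice for this sketch only; inside Orchestrator the library action takes care of both.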
The task can have the following states:

State | Meaning
Queued | The task is queued and will be executed as soon as possible.
Running | The task is currently running. If progress is set to true, the progress in percentage will be displayed in the logs.
Success | The task has finished successfully.
Error | The task has failed and an error will be thrown.

Other vCenter wait actions

There are actually five waiting tasks that come with the vCenter Server plugin. Here's an overview of the other four:

Task | Description
vim3WaitToolsStarted | This task waits until the VMware tools are started on a VM or until a timeout is reached.
vim3WaitForPrincipalIP | This task waits until the VMware tools report the primary IP of a VM or until a timeout is reached. This typically indicates that the operating system is ready to receive network traffic. The action will return the primary IP.
vim3WaitDnsNameInTools | This task waits until the VMware tools report a given DNS name of a VM or until a timeout is reached. The in-parameter addNumberToName is not used and can be set to Null.
WaitTaskEndOrVMQuestion | This task waits until a task is finished or until a VM raises a question. A vCenter question is related to user interaction.

vRealize Automation (vRA)

Automation has changed since the beginning of Orchestrator. Before tools such as vCloud Director or vCloud Automation Center (vCAC)/vRealize Automation (vRA) existed, Orchestrator was the main tool for automating vCenter resources. With version 6.2 of vCloud Automation Center (vCAC), the product has been renamed vRealize Automation.

Now vRA is set to become the cornerstone of the VMware automation effort. vRealize Orchestrator (vRO) is used by vRA to interact with and automate VMware and non-VMware products and infrastructure elements.

Throughout the various vCAC/vRA iterations, the role of Orchestrator has changed substantially. Orchestrator started off as an extension to vCAC and became a central part of vRA.
  • In vCAC 5.x, Orchestrator was only an extension of the IaaS life cycle. Orchestrator was tied in using the stubs.
  • vCAC 6.0 integrated Orchestrator as an XaaS service (Everything as a Service) using the Advanced Service Designer (ASD).
  • In vCAC 6.1, Orchestrator is used to perform all VMware NSX operations (VMware's new network virtualization and automation), meaning that it became even more of a central part of the IaaS services.
  • With vCAC 6.2, the Advanced Service Designer (ASD) was enhanced to allow more complex form designs, allowing better leverage of Orchestrator workflows.

As you can see in the following figure, vRA connects to the vCenter Server using an infrastructure endpoint that allows vRA to conduct basic infrastructure actions, such as power operations, cloning, and so on. It doesn't allow any complex interactions with the vSphere infrastructure, such as HA configurations.

Using the Advanced Service Endpoints, vRA integrates the Orchestrator (vRO) plugins as additional services. This allows vRA to offer the entire plugin infrastructure as services. The vCenter Server, AD, and PowerShell plugins are typical integrations that are used with vRA.

Using the Advanced Service Designer (ASD), you can create integrations that use Orchestrator workflows. ASD allows you to offer Orchestrator workflows as vRA catalog items, making it possible for tenants to access any IT service that can be configured with Orchestrator via its plugins.

The following diagram shows an example using the Active Directory plugin. The Orchestrator plugin provides access to the AD services. By creating a custom resource using the exposed AD infrastructure, we can create a service blueprint and resource actions, both of which are based on Orchestrator workflows that use the AD plugin.

The other method of integrating Orchestrator into the IaaS life cycle, which was predominantly used in vCAC 5.x, was to use the stubs.
The build process of a VM has several steps; each step can be assigned a customizable workflow (called a stub). You can configure vRA to run an Orchestrator workflow at these stubs in order to carry out customized actions. Such actions could change the VM's HA or DRS configuration, or use the guest integration to install or configure a program on a VM.

Installation

How to install and configure vRA is out of the scope of this article, but take a look at http://www.kendrickcoleman.com/index.php/Tech-Blog/how-to-install-vcloud-automation-center-vcac-60-part-1-identity-appliance.html for more information.

If you don't have the hardware or the time to install vRA yourself, you can use the VMware Hands-on Labs, which can be accessed after clicking on Try for Free at http://hol.vmware.com.

The vRA Orchestrator plugin

Due to the renaming, the vRA plugin is called vRealize Orchestrator vRA Plug-in 6.2.0; however, the file you download and use is named o11nplugin-vcac-6.2.0-2287231.vmoapp. The plugin currently creates a workflow folder called vCloud Automation Center.

vRA-integrated Orchestrator

The vRA appliance comes with an installed and configured vRO instance; however, the best practice for a production environment is to use a dedicated Orchestrator installation, or better still, an Orchestrator cluster.

Dynamic Types or XaaS

XaaS means Everything (X) as a Service. The introduction of Dynamic Types in Orchestrator version 5.5.1 does exactly that; it allows you to build your own plugins and interact with infrastructure that has not yet received its own plugin.

Take a look at this article by Christophe Decanini; it integrates Twitter with Orchestrator using Dynamic Types: http://www.vcoteam.info/articles/learn-vco/282-dynamic-types-tutorial-implement-your-own-twitter-plug-in-without-any-scripting.html.

Read more…

To read more about Orchestrator integration with vRA, please take a look at the official VMware documentation.
Please note that the official documentation you need to look at is about vRealize Automation, and not about vCloud Automation Center. As of writing this article, the documentation can be found at https://www.vmware.com/support/pubs/vrealize-automation-pubs.html.

  • The document called Advanced Service Design deals with vRO and the Advanced Service Designer
  • The document called Machine Extensibility discusses customization using stubs

Unmounting all the CD-ROMs of all VMs in a cluster

This is an easy recipe to start with, but one you can really put to work in your existing infrastructure. The workflow will unmount all CD-ROMs from the running VMs of a cluster. A mounted CD-ROM may block a VM from being vMotioned.

Getting ready

We need a VM that can mount a CD-ROM either as an ISO from a host or from the client.

Before you start the workflow, make sure that the VM is powered on and has an ISO connected to it.

How to do it...

Create a new workflow with the following variables:

Name | Type | Section | Use
cluster | VC:ClusterComputeResource | IN | Used to input the cluster
clusterVMs | Array of VC:VirtualMachine | Attribute | Used to capture all VMs in a cluster

Add the getAllVMsOfCluster action to the schema and assign the cluster in-parameter and the clusterVMs attribute to it as actionResult.

Now, add a Foreach element to the schema and assign it the workflow Disconnect all detachable devices from a running virtual machine.

Assign clusterVMs to the Foreach element as a parameter.

Save and run the workflow.

How it works...

This recipe shows how quickly and easily you can design solutions that help you with everyday vCenter problems. The problem is that VMs that have CD-ROMs or floppies mounted may experience problems using vMotion, making it impossible for them to be used with DRS. The reality is that a lot of admins mount CD-ROMs and then forget to disconnect them.
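The Foreach element effectively runs one workflow per VM in clusterVMs. The following plain JavaScript sketch shows the same loop and adds the exclusion list that this recipe suggests as an improvement. It is only a sketch under stated assumptions: the VM objects are plain objects with a name property, and disconnect is a hypothetical stand-in for the Disconnect all detachable devices from a running virtual machine workflow.

```javascript
// Loop over all VMs of a cluster and disconnect their detachable devices,
// skipping any VM whose name appears on the exclusion list.
// disconnect is a hypothetical callback standing in for the library workflow.
function unmountCdromsInCluster(clusterVMs, excludedNames, disconnect) {
    var processed = [];
    for (var i = 0; i < clusterVMs.length; i++) {
        var vm = clusterVMs[i];
        if (excludedNames.indexOf(vm.name) !== -1) {
            continue;                 // VM is on the exclusion list, leave it alone
        }
        disconnect(vm);
        processed.push(vm.name);      // remember which VMs were handled
    }
    return processed;
}
```

Returning the list of processed VM names makes it easy to log or mail a summary after a scheduled run.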
Scheduling this script every evening just before the nighttime backups will make sure that a production cluster is able to make full use of DRS and is therefore better load-balanced.

You can improve this workflow by integrating an exclusion list.

See also

Refer to the example workflow, 7.01 UnMount CD-ROM from Cluster.

Provisioning a VM from a template

In this recipe, we will build a deployment workflow for Windows and Linux VMs. We will learn how to create workflows and reduce the number of input variables.

Getting ready

We need a Linux or Windows template that we can clone and provision.

How to do it…

We have split this recipe into two sections. In the first section, we will create a configuration element, and in the second, we will create the workflow.

Creating a configuration

We will use a configuration for all reusable variables. Build a configuration element that contains the following items:

Name | Type | Use
productId | String | This is the Windows product ID (the licensing code)
joinDomain | String | This is the Windows domain FQDN to join
domainAdmin | Credential | These are the credentials to join the domain
licenseMode | VC:CustomizationLicenseDataMode | For example, perServer
licenseUsers | Number | This denotes the number of licensed concurrent users
inTimezone | Enums:MSTimeZone | Time zone
fullName | String | Full name of the user
orgName | String | Organization name
newAdminPassword | String | New admin password
dnsServerList | Array of String | List of DNS servers
dnsDomain | String | DNS domain
gateway | Array of String | List of gateways

Creating the base workflow

Now we will create the base workflow:

Create the workflow as shown in the following figure by adding the given elements:

  • Clone, Windows with single NIC and credential
  • Clone, Linux with single NIC
  • Custom decision

Use the Clone, Windows… workflow to create all variables. Link up the ones that you have defined in the configuration as attributes.
The rest are defined as follows:

Name | Type | Section | Use
vmName | String | IN | This is the new virtual machine's name
vm | VC:VirtualMachine | IN | Virtual machine to clone
folder | VC:VmFolder | IN | This is the virtual machine folder
datastore | VC:Datastore | IN | This is the datastore in which you store the virtual machine
pool | VC:ResourcePool | IN | This is the resource pool in which you create the virtual machine
network | VC:Network | IN | This is the network to which you attach the virtual network interface
ipAddress | String | IN | This is the fixed valid IP address
subnetMask | String | IN | This is the subnet mask
template | Boolean | Attribute | Set to No (mark new VM as template)
powerOn | Boolean | Attribute | Set to Yes (power on the VM after creation)
doSysprep | Boolean | Attribute | Set to Yes (run Windows Sysprep)
dhcp | Boolean | Attribute | Set to No (use DHCP)
newVM | VC:VirtualMachine | OUT | This is the newly created VM

The following sub-workflow in-parameters will be set to special values:

Workflow | In-parameter | Value
Clone, Windows with single NIC and credential | host | Null
 | joinWorkgroup | Null
 | macAddress | Null
 | netBIOS | Null
 | primaryWINS | Null
 | secondaryWINS | Null
 | name | vmName
 | clientName | vmName
Clone, Linux with single NIC | host | Null
 | macAddress | Null
 | name | vmName
 | clientName | vmName

Define the in-parameter vm as input for the Custom decision and add the following script. The script will check whether the name of the OS contains the word Microsoft:

    // read the guest OS name that vCenter reports for the VM to clone
    var guestOS = vm.config.guestFullName;
    System.log(guestOS);
    // true selects the Windows clone workflow, false the Linux one
    return guestOS.indexOf("Microsoft") >= 0;

Save and run the workflow.

This workflow will now create a new VM from an existing VM and customize it with a fixed IP.

How it works…

As you can see, creating workflows to automate vCenter deployments is pretty straightforward. Dealing with the various in-parameters of workflows, however, can be quite overwhelming. The best way to deal with this problem is to hide away variables by defining them centrally using a configuration, or to define them locally as attributes.
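The idea of keeping shared values in one place and asking the user only for what changes per request can be sketched in plain JavaScript. This is an illustration only: in Orchestrator the values live in configuration elements that are linked to workflow attributes, not in a script object. The configurations object, the buildCloneParameters function, the environment names, and all addresses are hypothetical placeholders; only the item names dnsDomain, dnsServerList, and gateway come from the configuration element in this recipe.

```javascript
// Hypothetical central configurations, one per environment (placeholder values).
var configurations = {
    DEV: {
        dnsDomain: "dev.example.com",
        dnsServerList: ["192.0.2.10"],
        gateway: ["192.0.2.1"]
    },
    PROD: {
        dnsDomain: "prod.example.com",
        dnsServerList: ["198.51.100.10"],
        gateway: ["198.51.100.1"]
    }
};

// Combine the centrally defined values with the few inputs a requester supplies.
function buildCloneParameters(environment, inputs) {
    var config = configurations[environment];
    if (!config) {
        throw new Error("Unknown configuration: " + environment);
    }
    var params = {};
    var key;
    for (key in config) {
        if (config.hasOwnProperty(key)) {
            params[key] = config[key];   // values defined once, centrally
        }
    }
    for (key in inputs) {
        if (inputs.hasOwnProperty(key)) {
            params[key] = inputs[key];   // per-request values such as vmName
        }
    }
    return params;
}
```

A call such as buildCloneParameters("DEV", { vmName: "web01", ipAddress: "192.0.2.50" }) then yields one parameter set in which only vmName and ipAddress had to be typed in.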
Using configurations has the advantage that you can create them once and reuse them as needed. You can even push the concept a bit further by defining multiple configurations for multiple purposes, such as different environments.

While creating a new workflow for automation, a typical approach is as follows:

  • Look for a workflow that you need.
  • Run the workflow normally to check out what it actually does.
  • Either create a new workflow that uses the original, or duplicate and edit the one you tried, modifying it until it does what you want.

A fast way to deal with a lot of variables is to drag every element you need into the schema and then use the binding to create the variables as needed.

You may have noticed that this workflow only lets you select vSwitch networks, not distributed vSwitch networks. You can improve this workflow with the following features:

  • Read the existing Sysprep information stored in your vCenter Server
  • Generate different predefined configurations (for example, DEV or Prod)

There's more...

We can improve the workflow by implementing the ability to change the vCPU and the memory of the VM. Follow these steps to implement it:

Move the out-parameter newVM to be an attribute.

Add the following variables:

Name | Type | Section | Use
vCPU | Number | IN | This variable denotes the number of vCPUs
Memory | Number | IN | This variable denotes the amount of VM memory
vcTask | VC:Task | Attribute | This variable will carry the reconfiguration task from the script to the wait task
progress | Boolean | Attribute | Value NO, vim3WaitTaskEnd
pollRate | Number | Attribute | Value 5, vim3WaitTaskEnd
ActionResult | Any | Attribute | vim3WaitTaskEnd

Add the following actions and workflows according to the next figure:

  • shutdownVMAndForce
  • changeVMvCPU
  • vim3WaitTaskEnd
  • changeVMRAM
  • Start virtual machine

Bind newVM to all the appropriate input parameters of the added actions and workflows.

Bind actionResults (VC:tasks) of the change actions to vim3WaitTasks.
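The order in which these elements run can be sketched as a plain JavaScript function. This is a sketch under assumptions, not the workflow itself: every function on the steps object is a hypothetical stand-in for the action or workflow of a similar name, their parameter lists are invented for the illustration, and the sketch assumes that each change action is followed by its own wait task.

```javascript
// Sequence sketch for the hardware customization: shut down, change vCPU,
// wait, change memory, wait, power on. All steps.* functions are hypothetical
// stand-ins for the Orchestrator actions and workflows named in the recipe.
function reconfigureVm(newVM, vCPU, memory, steps) {
    steps.shutdownVMAndForce(newVM);              // the VM must be off before the change

    var cpuTask = steps.changeVMvCPU(newVM, vCPU);
    steps.waitTaskEnd(cpuTask);                   // never continue before the task ended

    var ramTask = steps.changeVMRAM(newVM, memory);
    steps.waitTaskEnd(ramTask);

    steps.startVM(newVM);                         // stands in for Start virtual machine
    return newVM;
}
```

Keeping each change action paired with a wait step mirrors the rule stated earlier in this article: every job sent to vCenter should be followed by a wait task.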
See also

Refer to the example workflows, 7.02.1 Provision VM (Base) and 7.02.2 Provision VM (HW custom), as well as the configuration element, 7 VM provisioning.

An approval process for VM provisioning

In this recipe, we will see how to create a workflow that waits for an approver to approve the VM creation before provisioning it. We will learn how to combine mail and external events in a workflow to make it interact with different users.

Getting ready

For this recipe, we first need the provisioning workflow that we created in the Provisioning a VM from a template recipe. You can use the example workflow, 7.02.1 Provision VM (Base).

Additionally, we need a functional e-mail system as well as a workflow to send e-mails. You can use the example workflow, 4.02.1 SendMail, as well as its configuration item, 4.2.1 Working with e-mail.

How to do it…

We will split this recipe into three parts. First, we will create a configuration element; then, we will create the workflow; and lastly, we will use a presentation to make the workflow usable.

Creating a configuration element

We will use a configuration for all reusable variables. Build a configuration element that contains the following items:

Name | Type | Use
templates | Array/VC:VirtualMachine | This contains all the VMs that serve as templates
folders | Array/VC:VmFolder | This contains all the VM folders that are targets for VM provisioning
networks | Array/VC:Network | This contains all VM networks that are targets for VM provisioning
resourcePools | Array/VC:ResourcePool | This contains all resource pools that are targets for VM provisioning
datastores | Array/VC:Datastore | This contains all datastores that are targets for VM provisioning
daysToApproval | Number | This is the number of days for which the approval should be available
approver | String | This is the e-mail of the approver

Please note that you also have to define or use the configuration elements for SendMail, as well as the Provision VM workflows.
You can use the examples contained in the example package.

Creating a workflow

Create a new workflow and add the following variables:

Name | Type | Section | Use
mailRequester | String | IN | This is the e-mail address of the requester
vmName | String | IN | This is the name of the new virtual machine
vm | VC:VirtualMachine | IN | This is the virtual machine to be cloned
folder | VC:VmFolder | IN | This is the virtual machine folder
datastore | VC:Datastore | IN | This is the datastore in which you store the virtual machine
pool | VC:ResourcePool | IN | This is the resource pool in which you create the virtual machine
network | VC:Network | IN | This is the network to which you attach the virtual network interface
ipAddress | String | IN | This is the fixed valid IP address
subnetMask | String | IN | This is the subnet mask
isExternalEvent | Boolean | Attribute | A value of true defines this event as external
mailApproverSubject | String | Attribute | This is the subject line of the mail sent to the approver
mailApproverContent | String | Attribute | This is the content of the mail that is sent to the approver
mailRequesterSubject | String | Attribute | This is the subject line of the mail sent to the requester when the VM is provisioned
mailRequesterContent | String | Attribute | This is the content of the mail that is sent to the requester when the VM is provisioned
mailRequesterDeclinedSubject | String | Attribute | This is the subject line of the mail sent to the requester when the VM is declined
mailRequesterDeclinedContent | String | Attribute | This is the content of the mail that is sent to the requester when the VM is declined
eventName | String | Attribute | This is the name of the external event
endDate | Date | Attribute | This is the end date for the wait of the external event
approvalSuccess | Boolean | Attribute | This checks whether the VM has been approved

Now add all the attributes we defined in the configuration element and link them to the configuration.
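How the daysToApproval, endDate, and approvalSuccess values relate to each other can be sketched in plain JavaScript. The two functions below are hypothetical helpers written for this illustration, not part of Orchestrator, and the sketch assumes that an approval counts only if the approval event arrives on or before the end date; anything later, or no event at all, is treated as not approved.

```javascript
// Compute the last moment at which an approval is still accepted.
// The caller's date object is copied, not modified.
function approvalDeadline(today, daysToApproval) {
    var endDate = new Date(today.getTime());
    endDate.setDate(endDate.getDate() + daysToApproval); // rolls over month ends
    return endDate;
}

// Decide the value of approvalSuccess. eventReceivedAt is the time the
// approver triggered the external event, or null if that never happened.
function isApproved(eventReceivedAt, endDate) {
    if (eventReceivedAt === null) {
        return false;                                    // waited until the end date in vain
    }
    return eventReceivedAt.getTime() <= endDate.getTime();
}
```

With a request made on 4 March and daysToApproval set to 30, the deadline falls on 3 April, because setDate carries the overflow into the next month.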
Create the workflow as shown in the following figure by adding the given elements:

  • Scriptable task
  • 4.02.1 SendMail (example workflow)
  • Wait for custom event
  • Decision
  • Provision VM (example workflow)

Edit the scriptable task and bind the following variables to it:

In | Out
vmName | mailApproverSubject
ipAddress | mailApproverContent
mailRequester | mailRequesterSubject
template | mailRequesterContent
approver | mailRequesterDeclinedSubject
daysToApproval | mailRequesterDeclinedContent
 | eventName
 | endDate

Add the following script to the scriptable task:

    //construct event name
    eventName = "provision-" + vmName;
    //add days to today for approval
    var today = new Date();
    var endDate = new Date(today);
    endDate.setDate(today.getDate() + daysToApproval);
    //construct external URL for approval
    var myURL = System.customEventUrl(eventName, false);
    var externalURL = myURL.url;
    //mail to approver
    mailApproverSubject = "Approval needed: " + vmName;
    mailApproverContent = "Dear Approver,\n the user " + mailRequester + " would like to provision a VM from template " + template.name + ".\n To approve please click here: " + externalURL;
    //VM provisioned
    mailRequesterSubject = "VM ready :" + vmName;
    mailRequesterContent = "Dear Requester,\n the VM " + vmName + " has been provisioned and is now available under IP :" + ipAddress;
    //declined
    mailRequesterDeclinedSubject = "Declined :" + vmName;
    mailRequesterDeclinedContent = "Dear Requester,\n the VM " + vmName + " has been declined by " + approver;

Bind the out-parameter of Wait for custom event to approvalSuccess.

Configure the Decision element with approvalSuccess as true.

Bind all the other variables to the workflow elements.

Improving with the presentation

We will now edit the workflow's presentation in order to make it workable for the requester.
To do so, follow the given steps:

Click on Presentation and follow the steps to alter the presentation, as seen in the following screenshot.

Add the following properties to the in-parameters:

In-parameter | Property | Value
template | Predefined list of elements | #templates
folder | Predefined list of elements | #folders
datastore | Predefined list of elements | #datastores
pool | Predefined list of elements | #resourcePools
network | Predefined list of elements | #networks

You can now use the General tab of each in-parameter to change the displayed text.

Save and close the workflow.

How it works…

This is a very simplified example of an approval workflow to create VMs. The aim of this recipe is to introduce you to the method and ideas of how to build such a workflow.

This workflow will only give a requester the choices that are configured in the configuration element, making the workflow quite safe for users who have only limited know-how of the IT environment.

When the requester submits the workflow, an e-mail is sent to the approver. The e-mail contains a link which, when clicked, triggers the external event and approves the VM. If the VM is approved, it will get provisioned, and when the provisioning has finished, an e-mail is sent to the requester stating that the VM is now available. If the VM is not approved within a certain timeframe, the requester will receive an e-mail that the VM was not approved.

To make this workflow fully functional, you can add permissions for a requester group to the workflow and Orchestrator so that the user can use vCenter to request a VM.

Things you can do to improve the workflow are as follows:

  • Schedule the provisioning to a future date.
  • Use the resources for the e-mail and replace the content.
  • Add an error workflow in case the provisioning fails.
  • Use AD to read out the current user's e-mail and full name to improve the workflow.
  • Create a workflow that lets an approver configure the configuration elements that a requester can choose from.
  • Reduce the selections by creating, for instance, a development and production configuration that contains the correct folders, datastores, networks, and so on.
  • Create a decommissioning workflow that is automatically scheduled so that the VM is destroyed automatically after a given period of time.

See also

Refer to the example workflow, 7.03 Approval, and the configuration element, 7 approval.

Summary

In this article, we discussed one of the important aspects of the interaction of Orchestrator with vCenter Server and vRealize Automation, that is, VM provisioning.

Resources for Article:

Further resources on this subject:

  • Importance of Windows RDS in Horizon View [article]
  • Metrics in vRealize Operations [article]
  • Designing and Building a Horizon View 6.0 Infrastructure [article]
Packt
04 Mar 2015
20 min read

iOS Security Overview

In this article by Allister Banks and Charles S. Edge, the authors of the book, Learning iOS Security, we will go through an overview of the basic security measures in iOS.

Out of the box, iOS is one of the most secure operating systems available. There are a number of factors that contribute to the elevated security level. These include the fact that users cannot access the underlying operating system. Apps also have data in a silo (sandbox), so instead of accessing the system's internals they can access the silo. App developers choose whether to store settings such as passwords in the app or on iCloud Keychain, which is a secure location for such data on a device. Finally, Apple has a number of controls in place on devices to help protect users while providing an elegant user experience.

However, devices can be made even more secure than they are now. In this article, we're going to get some basic security tasks under our belt in order to establish some basic security best practices. Where we feel more explanation is needed about what we did on devices, we'll explore a part of the technology itself in this article.

This article will cover the following topics:

  • Pairing
  • Backing up your device
  • Initial security checklist
  • Safari and built-in app protection
  • Predictive search and spotlight

To kick off the overview of iOS security, we'll quickly secure our systems by initially providing a simple checklist of tasks, where we'll configure a few device protections that we feel everyone should use. Then, we'll look at how to take a backup of our devices and finally, at how to use the built-in web browser and the protections around it.

Pairing

When you connect a device to a computer that runs iTunes for the first time, you are prompted to enter a password. Doing so allows you to synchronize the device to a computer. Applications that can communicate over this channel include iTunes, iPhoto, Xcode, and others.
To pair a device to a Mac, simply plug the device in (if you have a passcode, you'll need to enter it in order to pair the device). When the device is plugged in, you'll be prompted on both the device and the computer to establish a trust. Simply tap on Trust on the iOS device, as shown in the following screenshot:

Trusting a computer

For the computer to communicate with the iOS device, you'll also need to accept the pairing on your computer (although if you pair using the libimobiledevice command-line tools, this is not required, because you accept on the command line). When prompted, click on Continue to establish the pairing, as seen in the following screenshot (the screenshot is the same in Windows):

Trusting a device

When a device is paired, a file is created in /var/db/lockdown; its name is the UDID of the device with a property list (plist) extension. A property list is an Apple XML file that stores a variety of attributes. In Windows, iOS data is stored in the MobileSync folder, which you can access by navigating to Users\(username)\AppData\Roaming\Apple Computer\MobileSync.

The information in this file sets up a trust between the computers and includes the following attributes:

  • DeviceCertificate: This certificate is unique to each device.
  • EscrowBag: The key bag of EscrowBag contains class keys used to decrypt the device.
  • HostCertificate: This certificate is for the host that is paired with iOS devices (usually the same in all pairing files on your computer).
  • HostID: This is a generated ID for the host.
  • HostPrivateKey: This is the private key for your Mac (should be the same in all files on a given computer).
  • RootCertificate: This is the certificate used to generate keys (should be the same in all files on a given computer).
  • RootPrivateKey: This is the private key of the computer that runs iTunes for that device.
  • SystemBUID: This refers to the ID of the computer that runs iTunes.
  • WiFiMACAddress: This is the MAC address of the Wi-Fi interface of the device that is paired to the computer. If you do not have an active Wi-Fi interface, the MAC address is still used while pairing.

Why does this matter? It's important to know how a device interfaces with a computer. These files can be moved between computers and contain a variety of information about a device, including private keys.

Having keys isn't all that is required for a computer to communicate with a device. When the devices are interfacing with a computer over USB, if you have a passcode enabled on the device, you will be required to enter that passcode in order to unlock the device.

Once a computer is able to communicate with a device, you need to be careful, as the backups of a device, apps that get synchronized to a device, and other data that gets exchanged with a device can be exposed while at rest on devices.

Backing up your device

What do most people do to maximize the security of iOS devices? Before we do anything, we need to take a backup of our devices. This protects the device from us by providing a restore point. This also protects the data from being lost through a silly mistake.

The two most commonly used ways to take backups are iCloud and iTunes. As the names imply, the first stores backups on Apple's cloud service and the second on desktop computers. We'll cover how to take a backup on iCloud first.

iCloud backups

An iCloud account comes with free storage to back up your Apple devices. An iOS device takes a backup to Apple servers and can be restored from those same servers when a new device is set up (the restore option appears as a screen during the activation process of a new device, and also as an option in iTunes if you back up to iTunes over USB, which is covered later in this article).

Setting up and checking the status of iCloud backups is a straightforward process. From the Settings app, tap on iCloud and then Backup.
As you can see from the Backup screen, you have two options: iCloud Backup, which enables automatic backups of the device to your iCloud account, and Back Up Now, which runs an immediate backup of the device.

iCloud backups

Allowing iCloud to take backups on devices is optional. You can disable access to iCloud and iCloud backups. However, doing so is rarely a good idea, as you are limiting the functionality of the device and putting the data on your device at risk if that data isn't backed up another way, such as through iTunes. Many people have reservations about storing data on public clouds, especially data as private as phone data (texts, phone call history, and so on). For more information on Apple's security and privacy around iCloud, refer to http://support.apple.com/en-us/HT202303. If you do not trust Apple or its cloud, then you can also take a backup of your device using iTunes, as described in the next section.

Taking backups using iTunes

Originally, iTunes was used to take backups for iOS devices. You can still use iTunes, and you will likely want a second backup there even if you are using iCloud, simply for a quick restore if nothing else. Backups are usually pretty small. The reason is that the operating system is not part of backups, since users can't edit any of those files. Therefore, you can use an ipsw file (the operating system) to restore a device. These files are accessed through Apple Configurator, or through iTunes if you have a restore file waiting to be installed. They can be seen in ~/Library/iTunes, under a folder named for the device and its software updates, as can be seen in the following screenshot:

IPSW files

Backups are stored in the ~/Library/Application Support/MobileSync/Backup directory. Here, you'll see a number of directories that are associated with the UDID of the devices, and within those, you'll see a number of files that make up the modular incremental backups beyond the initial backup.
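To get a feel for that layout, the following is a minimal Python sketch that walks a backup root and reports, for each per-device directory, how many files it holds and how large they are in total. The function name summarize_backups is made up for this illustration, and nothing here parses the backup contents; it only counts files on disk.

```python
from pathlib import Path


def summarize_backups(backup_root):
    """Return (directory name, file count, total bytes) per backup folder.

    Hypothetical helper for illustration: it treats every subdirectory
    of backup_root as one device backup and ignores loose files.
    """
    root = Path(backup_root).expanduser()
    summaries = []
    for entry in sorted(root.iterdir()):
        if not entry.is_dir():
            continue  # skip anything that is not a per-device folder
        files = [p for p in entry.rglob("*") if p.is_file()]
        total_bytes = sum(p.stat().st_size for p in files)
        summaries.append((entry.name, len(files), total_bytes))
    return summaries
```

On a Mac you could call summarize_backups("~/Library/Application Support/MobileSync/Backup"), assuming your account is allowed to read that directory.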
It's a pretty smart system and allows you to restore a device at different points in time without taking too long to perform each backup. Backups are stored in the Documents and Settings\USERNAME\Application Data\Apple Computer\MobileSync\Backup directory on Windows XP and in the Users\USERNAME\AppData\Roaming\Apple Computer\MobileSync\Backup directory for newer operating systems.

To enable an iTunes backup, plug a device into a computer, and then open iTunes. Click on the device to show the device details screen. The top section of the screen is for Backups (in the following screenshot, you can set a backup to This computer, which takes a backup on the computer you are on). I recommend always choosing the Encrypt iPhone backup option, as it forces you to save a password that is then required in order to restore the backup. Additionally, you can use the Back Up Now button to kick off the first backup, as shown in the following screenshot:

iTunes

Viewing iOS data in iTunes

To show why it's important to encrypt backups, let's look at what can be pulled out of those backups. There are a few tools that can extract backups, provided you have a password. Here, we'll look at iBackup Extractor to view the backup of your browsing history, calendars, call history, contacts, iMessages, notes, photos, and voicemails. To get started, download iBackup Extractor from http://www.wideanglesoftware.com/ibackupextractor. When you open iBackup Extractor for the first time, simply choose the device backup you wish to extract. As you can see in the following screenshot, you will be prompted for a password in order to unlock the Backup key bag. Enter the password to unlock the system.

Unlock the backups

Note that the file tree in the following screenshot gives away some information on the structure of the iOS filesystem, or at least, the data stored in the backups of the iOS device.
For now, simply click on Browser to see a list of files that can be extracted from the backup, as you can see in the next screenshot: View Device Contents Using iBackup Extractor Note the prevalence of SQL databases in the files. Most apps use these types of databases to store data on devices. Also, check out the other options such as extracting notes (many that were possibly deleted), texts (some that have been deleted from devices), and other types of data from devices. Now that we've exhausted backups and proven that you should really put a password in place for your back ups, let's finally get to some basic security tasks to be performed on these devices! Initial security checklist Apple has built iOS to be one of the most secure operating systems in the world. This has been made possible by restricting access to much of the operating system by end users, unless you jailbreak a device. In this article, we won't cover jail-breaking devices much due to the fact that securing the devices then becomes a whole new topic. Instead, we have focused on what you need to do, how you can do those tasks, what the impacts are, and, how to manage security settings based on a policy. The basic steps required to secure an iOS device start with encrypting devices, which is done by assigning a passcode to a device. We will then configure how much inactive time before a device requires a PIN and accordingly manage the privacy settings. These settings allow us to get some very basic security features under our belt, and set the stage to explain what some of the features actually do. Configuring a passcode The first thing most of us need to do on an iOS device is configure a passcode for the device. Several things happen when a passcode is enabled, as shown in the following steps: The device is encrypted. The device then requires a passcode to wake up. An idle timeout is automatically set that puts the device to sleep after a few minutes of inactivity. 
This means that three of the most important things you can do to secure a device are enabled when you set up a passcode. Best of all, Apple recommends setting up a passcode during the initial setup of new devices. You can manage passcode settings using policies (or profiles, as Apple likes to call them in iOS). You can also set a passcode and then use your fingerprint on the Home button instead of that passcode. We have found that if our finger is on the Home button as the phone comes out of our pocket, the device is unlocked by the time we look at it. With iPhone 6 and higher versions, you can now use that same fingerprint to secure payment information.

Check whether a passcode has been configured, and if needed, configure a passcode using the Settings app. The Settings app is by default on the Home screen, and it is where many settings on the device, including the Wi-Fi networks the device has joined, app preferences, mail accounts, and other settings, are configured.

To set a passcode, open the Settings app and tap on Touch ID & Passcode.
If a passcode has been set, you will see the Turn Passcode Off option (as seen in the following screenshot).
If a passcode has not been set, then you can do so at this screen as well.
Additionally, you can change a passcode that has been set using the Change Passcode button, and define a fingerprint or additional fingerprints that can be used with Touch ID.

There are two options in the USE TOUCH ID FOR section of the screen. The first controls whether you need to enter the passcode in order to unlock a phone, which you should require unless the device is also used by small children or as a kiosk. In these cases, you don't need to encrypt or take a backup of the device anyway. The second option is to force the entering of a passcode while using the App Store and iTunes. Purchases can cost you money if someone else is using your device, so let the default value remain, which requires you to enter a passcode to unlock the options.
Configure a Passcode The passcode settings are very easy to configure; so, they should be configured when possible. Scroll down on this screen and you'll see several other features, as shown in the next screenshot. The first option on the screen is Simple Passcode. Most users want to use a simple pin with an iOS device. Trying to use alphanumeric and long passcodes simply causes most users to try to circumvent the requirement. To add a fingerprint as a passcode, simply tap on Add a Fingerprint…, which you can see in the preceding screenshot, and follow the onscreen instructions. Additionally, the following can be accessed when the device is locked, and you can choose to turn them off: Today: This shows an overview of upcoming calendar items Notifications View: This shows you the recent push notifications (apps that have updates on the device) Siri: This represents the voice control of the device Passbook: This tool is used to make payments and display tickets for concert venues and meetups Reply with Message: This tool allows you to send a text reply to an incoming call (useful if you're on the treadmill) Each organization can decide whether it considers these options to be a security risk and direct users how to deal with them, or they can implement a policy around these options. Passcode Settings There aren't a lot of security options around passcodes and encryption, because by and large, Apple secures the device by giving you fewer options than you'll actually use. Under the hood, (for example, through Apple Configurator and Mobile Device Management) there are a lot of other options, but these aren't exposed to end users of devices. For the most part, a simple four-character passcode will suffice for most environments. When you complicate passcodes, devices become much more difficult to unlock, and users tend to look for ways around passcode enforcement policies. 
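To put rough numbers on the trade-off between simple and complex passcodes, here is a small Python sketch of the arithmetic. It assumes a four-character simple passcode made of digits only and an attacker limited to 10 guesses; both figures are illustrative assumptions, and the function names are made up for this example.

```python
def keyspace(alphabet_size, length):
    """Number of distinct passcodes of a given length."""
    return alphabet_size ** length


def guess_probability(attempts, alphabet_size, length):
    """Chance that `attempts` distinct guesses include the passcode,
    assuming every passcode is equally likely."""
    total = keyspace(alphabet_size, length)
    return min(attempts, total) / total


# Assumption: a simple passcode is four digits (alphabet of 10 symbols).
simple = keyspace(10, 4)
# Assumption: a "complex" passcode is six characters from a-z and 0-9.
complex_six = keyspace(36, 6)

print(simple, guess_probability(10, 10, 4))
print(complex_six, guess_probability(10, 36, 6))
```

Under these assumptions, the four-digit passcode has 10,000 possible values, so 10 guesses succeed one time in a thousand. The longer alphanumeric passcode is far harder to guess, but as noted above, that gain has to be weighed against users working around the policy.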
The passcode is only used on the device, so complicating the passcode will only reduce the likelihood that a passcode would be guessed before the device is wiped, which typically occurs after 10 failed tries. Finally, to disable a passcode and therefore encryption, simply go to the Touch ID & Passcode option in the Settings app and tap on Turn Passcode Off.

Configuring privacy settings

Once a passcode is set and the device is encrypted, it's time to configure the privacy settings. Third-party apps cannot communicate with one another by default in iOS. Therefore, you must enable communication between them (and between third-party apps and built-in iOS apps that have APIs). This is a fundamental concept when it comes to securing iOS devices. To configure privacy options, open the Settings app and tap on the entry for Privacy. On the Privacy screen, you'll see a list of each app or service that other apps can communicate with, as shown in the following screenshot:

Privacy Options

As an example, tap on the Location Services entry, as shown in the next screenshot. Here, you can set which apps can communicate with Location Services and when. If an app is set to While Using, the app can communicate with Location Services only while the app is open. If an app is set to Always, then the app can communicate with Location Services both when it is open and when it runs in the background.

Configure Location Services

On the Privacy screen, tap on Photos. Here, you have fewer options because, unlike the location of a device, photos can't be accessed while the app is running in the background. Here, you can enable or disable an app's ability to communicate with the photo library on a device, as seen in the next screenshot:

Configure What Apps Can Access Your Camera Roll

Each app should be configured so that it can communicate only with the features of iOS or other apps that are absolutely necessary. Other privacy options that you can consider disabling include Siri and Handoff.
Siri provides the voice controls of iOS. Because Siri can be used even when your phone is locked, consider disabling it: open the Settings app, tap on General and then on Siri, and you will be able to disable the voice controls. To disable Handoff, use the General System Preference pane on any OS X computer paired to an iOS device. There, uncheck the Allow Handoff between this Mac and your iCloud devices option.

Safari and built-in App protections

Web browsers have access to a lot of data, and they have been one of the most popular targets on other platforms. The default browser on an iOS device is Safari. Open the Settings app and then tap on Safari. The Safari preferences to secure iOS devices include the following:

Passwords & AutoFill: This is a screen that includes contact information and a list of saved passwords and credit cards used in web browsers. This data is stored in an iCloud Keychain if iCloud Keychain has been enabled on your phone.
Favorites: This manages bookmarks and shows the bookmarks in iOS.
Open Links: This configures how links are managed.
Block Pop-ups: This enables a pop-up blocker.

Scroll down and you'll see the Privacy & Security options (as seen in the next screenshot). Here, you can do the following:

Do Not Track: With this, you can ask websites not to track your browsing activity.
Block Cookies: A cookie is a small piece of data sent from a website to a visitor's browser. Many sites will send cookies to third-party sites, so the management of cookies becomes an obstacle to the privacy of many. By default, Safari only allows cookies from websites that you visit (Allow from Websites I Visit). Set the Cookies option to Always Block in order to disable its ability to accept any cookies; set the option to Always Allow to accept cookies from any source; and set the option to Allow from Current Website Only to only allow cookies from the website you are currently visiting.
Fraudulent Website Warning: This blocks phishing attacks (sites that only exist to steal personal information). Clear History and Website Data: This clears any cached history, web files, and passwords from the Safari browser. Use Cellular Data: When this option is turned off, it disables web traffic over cellular connections (so web traffic will only work when the phone is connected to a Wi-Fi network). Configure Privacy Settings for Safari There are also a number of advanced options that can be accessed by clicking on the Advanced button, as shown in the following screenshot: Configure the Advanced Safari Options These advanced options include the following: Website Data: This option (as you can see in the next screenshot) shows the amount of data stored from each site that caches files on the device, and allows you to swipe left on these entries to access any files saved for the site. Tap on Remove All Website Data to remove data for all the sites at once. JavaScript: This allows you to disable any JavaScripts from running on sites the device browses. Web Inspector: This shows the device in the Develop menu on a computer connected to the device. If the Web Inspector option has been disabled, use Advanced Preferences in the Safari Preferences option of Safari. View Website Data On Devices Browser security is an important aspect of any operating system. Predictive search and spotlight The final aspect of securing the settings on an iOS device that we'll cover in this article includes predictive search and spotlight. When you use the spotlight feature in iOS, usage data is sent to Apple along with the information from Location Services. Additionally, you can search for anything on a device, including items previously blocked from being accessed. The ability to search for blocked content warrants the inclusion in locking down a device. That data is then used to generate future searches. 
This feature can be disabled by opening the Settings app, tap on Privacy, then Location Services, and then System Services. Simply slide Spotlight Suggestions to Off to disable the location data from going over that connection. To limit the type of data that spotlight sends, open the Settings app, tap on General, and then on Spotlight Search. Uncheck each item you don't want indexed in the Spotlight database. The following screenshot shows the mentioned options: Configure What Spotlight Indexes These were some of the basic tactical tasks that secure devices. Summary This article was a whirlwind of quick changes that secure a device. Here, we paired devices, took a backup, set a passcode, and secured app data and Safari. We showed how to manually do some tasks that are set via policies. Resources for Article: Further resources on this subject: Creating a Brick Breaking Game [article] New iPad Features in iOS 6 [article] Sparrow iOS Game Framework - The Basics of Our Game [article]

Packt
04 Mar 2015
12 min read

Your first FuelPHP application in 7 easy steps

In this article by Sébastien Drouyer, author of the book FuelPHP Application Development Blueprints we will see that FuelPHP is an open source PHP framework using the latest technologies. Its large community regularly creates and improves packages and extensions, and the framework’s core is constantly evolving. As a result, FuelPHP is a very complete solution for developing web applications. (For more resources related to this topic, see here.) In this article, we will also see how easy it is for developers to create their first website using the PHP oil utility. The target application Suppose you are a zoo manager and you want to keep track of the monkeys you are looking after. For each monkey, you want to save: Its name If it is still in the zoo Its height A description input where you can enter custom information You want a very simple interface with five major features. You want to be able to: Create new monkeys Edit existing ones List all monkeys View a detailed file for each monkey Delete monkeys These preceding five major features, very common in computer applications, are part of the Create, Read, Update and Delete (CRUD) basic operations. Installing the environment The FuelPHP framework needs the three following components: Webserver: The most common solution is Apache PHP interpreter: The 5.3 version or above Database: We will use the most popular one, MySQL The installation and configuration procedures of these components will depend on the operating system you use. We will provide here some directions to get you started in case you are not used to install your development environment. Please note though that these are very generic guidelines. Feel free to search the web for more information, as there are countless resources on the topic. Windows A complete and very popular solution is to install WAMP. This will install Apache, MySQL and PHP, in other words everything you need to get started. 
It can be accessed at the following URL: http://www.wampserver.com/en/

Mac

PHP and Apache are generally installed on the latest version of the OS, so you just have to install MySQL. To do that, we recommend reading the official documentation: http://dev.mysql.com/doc/refman/5.1/en/macosx-installation.html

A very convenient solution for those of you with limited system administration skills is to install MAMP, the equivalent of WAMP but for the Mac operating system. It can be downloaded through the following URL: http://www.mamp.info/en/downloads/

Ubuntu

As this is the most popular Linux distribution, we will limit our instructions to Ubuntu. You can install a complete environment by executing the following command lines:

    # Apache, MySQL, PHP
    sudo apt-get install lamp-server^

    # PHPMyAdmin allows you to handle the administration of MySQL DB
    sudo apt-get install phpmyadmin

    # Curl is useful for doing web requests
    sudo apt-get install curl libcurl3 libcurl3-dev php5-curl

    # Enabling the rewrite module as it is needed by FuelPHP
    sudo a2enmod rewrite

    # Restarting Apache to apply the new configuration
    sudo service apache2 restart

Getting the FuelPHP framework

There are four common ways to download FuelPHP:

Downloading and unzipping the compressed package, which can be found on the FuelPHP website.
Executing the FuelPHP quick command-line installer.
Downloading and installing FuelPHP using Composer.
Cloning the FuelPHP GitHub repository. It is a little bit more complicated but allows you to select exactly the version (or even the commit) you want to install.

The easiest way is to download and unzip the compressed package located at: http://fuelphp.com/files/download/28

You can get more information about this step in Chapter 1 of FuelPHP Application Development Blueprints, which can be accessed freely.
It is also well documented on the website's installation instructions page: http://fuelphp.com/docs/installation/instructions.html

Installation directory and apache configuration

Now that you know how to install FuelPHP in a given directory, we will explain where to install it and how to configure Apache.

The simplest way

The simplest way is to install FuelPHP in the root folder of your web server (generally the /var/www directory on Linux systems). If you install fuel in the DIR directory inside the root folder (/var/www/DIR), you will be able to access your project on the following URL: http://localhost/DIR/public/

However, be warned that fuel has not been designed to support this, and if you publish your project this way on a production server, it will introduce security issues you will have to handle. In such cases, we recommend using the second way, explained in the section below, although you might not have the choice if, for instance, you plan to use a shared host to publish your project. Complete and up-to-date documentation about this issue can be found on the Fuel installation instructions page: http://fuelphp.com/docs/installation/instructions.html

By setting up a virtual host

Another way is to create a virtual host to access your application. You will need a *nix environment and a little more Apache and system administration skill, but the benefit is that it is more secure and you will be able to choose your working directory. You will need to change two files:

Your Apache virtual host file(s), in order to link a virtual host to your application
Your system host file, in order to redirect the wanted URL to your virtual host

In both cases, the location of the files depends heavily on your operating system and the server environment you are using, so you will have to figure out their location yourself (if you are using a common configuration, you will have no trouble finding instructions on the web).
In the following example, we will set up your system to call your application when requesting the my.app URL on your local environment. Let's first edit the virtual host file(s); add the following code at the end:

    <VirtualHost *:80>
        ServerName my.app
        DocumentRoot YOUR_APP_PATH/public
        SetEnv FUEL_ENV "development"
        <Directory YOUR_APP_PATH/public>
            DirectoryIndex index.php
            AllowOverride All
            Order allow,deny
            Allow from all
        </Directory>
    </VirtualHost>

Then, open your system host file and add the following line at the end:

    127.0.0.1 my.app

Depending on your environment, you might need to restart Apache after that. You can now access your website on the following URL: http://my.app/

Checking that everything works

Whether you used a virtual host or not, the following should now appear when accessing your website:

Congratulations! You have successfully installed the FuelPHP framework. The welcome page shows some recommended directions to continue your project.

Database configuration

As we will store our monkeys in a MySQL database, it is time to configure FuelPHP to use our local database. If you open fuel/app/config/db.php, all you will see is an empty array, but this configuration file is merged with fuel/app/config/ENV/db.php, ENV being Fuel's current environment, which in this case is development. You should therefore open fuel/app/config/development/db.php:

    <?php
    //...
    return array(
        'default' => array(
            'connection' => array(
                'dsn'      => 'mysql:host=localhost;dbname=fuel_dev',
                'username' => 'root',
                'password' => 'root',
            ),
        ),
    );

You should adapt this array to your local configuration, particularly the database name (currently set to fuel_dev), the username, and the password. You must create your project's database manually.

Scaffolding

Now that the database configuration is set, we will be able to generate a scaffold. For that, we will use the generate feature of the oil utility.
Open the command-line utility and go to your website root directory. To generate a scaffold for a new model, you will need to enter the following line:

    php oil generate scaffold/crud MODEL ATTR_1:TYPE_1 ATTR_2:TYPE_2 ...

Where:

MODEL is the model name
ATTR_1, ATTR_2... are the names of the model's attributes
TYPE_1, TYPE_2... are the types of each attribute

In our case, it should be:

    php oil generate scaffold/crud monkey name:string still_here:bool height:float description:text

Here we are telling oil to generate a scaffold for the monkey model with the following attributes:

name: The name of the monkey. Its type is string and the associated MySQL column type will be VARCHAR(255).
still_here: Whether or not the monkey is still in the facility. Its type is boolean and the associated MySQL column type will be TINYINT(1).
height: Height of the monkey. Its type is float and its associated MySQL column type will be FLOAT.
description: Description of the monkey. Its type is text and its associated MySQL column type will be TEXT.

You can do much more using the oil generate feature, such as generating models, controllers, migrations, tasks, packages, and so on. We will see some of these in the FuelPHP Application Development Blueprints book, and we also recommend taking a look at the official documentation: http://fuelphp.com/docs/packages/oil/generate.html

When you press Enter, you will see the following lines appear:

    Creating migration: APPPATH/migrations/001_create_monkeys.php
    Creating model: APPPATH/classes/model/monkey.php
    Creating controller: APPPATH/classes/controller/monkey.php
    Creating view: APPPATH/views/monkey/index.php
    Creating view: APPPATH/views/monkey/view.php
    Creating view: APPPATH/views/monkey/create.php
    Creating view: APPPATH/views/monkey/edit.php
    Creating view: APPPATH/views/monkey/_form.php
    Creating view: APPPATH/views/template.php

Where APPPATH is your website directory/fuel/app.
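The NAME:TYPE arguments and the column types listed above can be pictured with the following Python sketch. It is only an illustration of the mapping described in this article: oil itself is written in PHP, and the names COLUMN_TYPES and parse_fields are invented here rather than taken from the framework.

```python
# Mapping of scaffold attribute types to MySQL column types, copied from
# the attribute descriptions in this article (not from oil's source code).
COLUMN_TYPES = {
    "string": "VARCHAR(255)",
    "bool": "TINYINT(1)",
    "float": "FLOAT",
    "text": "TEXT",
}


def parse_fields(args):
    """Turn ['name:string', ...] into [('name', 'VARCHAR(255)'), ...]."""
    fields = []
    for arg in args:
        name, _, type_name = arg.partition(":")
        if type_name not in COLUMN_TYPES:
            raise ValueError(f"unsupported type in this sketch: {arg!r}")
        fields.append((name, COLUMN_TYPES[type_name]))
    return fields


command_args = "name:string still_here:bool height:float description:text"
for column, sql_type in parse_fields(command_args.split()):
    print(f"{column} -> {sql_type}")
```

The sketch only covers the four types used for the monkey model; oil accepts more than that, so treat the dictionary as a partial picture.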
Oil has generated nine files for us:

A migration file, containing all the necessary information to create the model's associated table
The model
A controller
Five view files and a template file

More explanation about these files and how they interact with each other can be found in Chapter 1 of the FuelPHP Application Development Blueprints book, which is freely available. For those of you who are not yet familiar with MVC and HMVC frameworks, don't worry; the chapter contains an introduction to the most important concepts.

Migrating

One of the generated files was APPPATH/migrations/001_create_monkeys.php. It is a migration file and contains the required information to create our monkey table. Notice that the name is structured as VER_NAME, where VER is the version number and NAME is the name of the migration. If you execute the following command line:

    php oil refine migrate

All migration files that have not yet been executed will be executed, from the oldest version to the latest version (001, 002, 003, and so on). Once all files are executed, oil will display the latest version number.

Once executed, if you take a look at your database, you will observe that not one, but two tables have been created:

monkeys: As expected, a table has been created to handle your monkeys. Notice that the table name is the plural version of the word we typed for generating the scaffold; this transformation was done internally using the Inflector::pluralize method. The table will contain the specified columns (name, still_here, and so on), the id column, but also created_at and updated_at. These columns respectively store the time an object was created and updated, and are added by default each time you generate your models. It is possible, though, to not generate them by using the --no-timestamp argument.
migration: This other table was automatically created. It keeps track of the migrations that were executed.
If you look into its content, you will see that it already contains one row; this is the migration you just executed. Notice that the row indicates not only the migration itself, but also a type and a name. This is because migration files can be placed in many locations, such as modules or packages.

The oil utility allows you to do much more. Don't hesitate to take a look at the official documentation: http://fuelphp.com/docs/packages/oil/intro.html

Or, again, read Chapter 1 of FuelPHP Application Development Blueprints, which is available for free.

Using your application

Now that we have generated the code and migrated the database, our application is ready to be used. Request the following URL:

If you created a virtual host: http://my.app/monkey
Otherwise (don't forget to replace DIR): http://localhost/DIR/public/monkey

As you can see, this webpage is intended to display the list of all monkeys, but since none have been added, the list is empty. Let's add a new monkey by clicking on the Add new Monkey button. The following webpage should appear:

You can enter your monkey's information here. The form is certainly not perfect (for instance, the Still here field uses a standard input although a checkbox would be more appropriate), but it is a great start. All we will have to do is refine the code a little bit.

Once you have added several monkeys, you can again take a look at the listing page:

Again, this is a great start, though we might want to refine it. Each item on the list has three associated actions: View, Edit, and Delete. Let's first click on View:

Again a great start, though we will refine this webpage. You can return to the listing by clicking on Back or edit the monkey file by clicking on Edit. Whether accessed from the listing page or the view page, Edit will display the same form as when creating a new monkey, except that the form will of course be prefilled.
Finally, if you click on Delete, a confirmation box will appear to guard against accidental clicks.

Want to learn more?

Don't hesitate to check out Chapter 1 of FuelPHP Application Development Blueprints, which is freely available on Packt Publishing's website. In this chapter, you will find a more thorough introduction to FuelPHP, and we will show how to improve this first application. We also recommend exploring the FuelPHP website, which contains a lot of useful information and excellent documentation: http://www.fuelphp.com

There is much more to discover about this wonderful framework.

Summary

In this article, we learned how to install the FuelPHP environment, where to install the framework, and how to generate, migrate, and use a first scaffolded application.

Resources for Article:

Further resources on this subject:
PHP Magic Features [Article]
FuelPHP [Article]
Building a To-do List with Ajax [Article]

Packt
04 Mar 2015
33 min read

Our App and Tool Stack

In this article by Zachariah Moreno, author of the book AngularJS Deployment Essentials, you will learn how to do the following: Minimize efforts and maximize results using a tool stack optimized for AngularJS development Access the krakn app via GitHub Scaffold an Angular app with Yeoman, Grunt, and Bower Set up a local Node.js development server Read through krakn's source code Before NASA or Space X launches a vessel into the cosmos, there is a tremendous amount of planning and preparation involved. The guiding principle when planning for any successful mission is similar to minimizing efforts and resources while retaining maximum return on the mission. Our principles for development and deployment are no exception to this axiom, and you will gain a firmer working knowledge of how to do so in this article. (For more resources related to this topic, see here.) The right tools for the job Web applications can be compared to buildings; without tools, neither would be a pleasure to build. This makes tools an indispensable factor in both development and construction. When tools are combined, they form a workflow that can be repeated across any project built with the same stack, facilitating the practices of design, development, and deployment. The argument can be made that it is just as paramount to document workflow as an application's source code or API. Along with grouping tools into categories based on the phases of building applications, it is also useful to group tools based on the opinions of a respective project—in our case, Angular, Ionic, and Firebase. I call tools grouped into opinionated workflows tool stacks. For example, the remainder of this article discusses the tool stack used to build the application that we will deploy across environments in this book. In contrast, if you were to build a Ruby on Rails application, the tool stack would be completely different because the project's opinions are different. 
Our app is called krakn, and it functions as a real-time chat application built on top of the opinions of Angular, the Ionic Framework, and Firebase. You can find all of krakn's source code at https://github.com/zachmoreno/krakn.

Version control with Git and GitHub

Git is a distributed version control system, driven from a command-line interface (CLI), that Linus Torvalds originally developed for use on the famed Linux kernel. Git is popular largely because of its distributed architecture: every clone of a repository carries the full history, so any remote repository has all of the same information as your local repository, and losing a single copy does not lose the project. It is useful to think of Git as a free insurance policy for your code. You will need to install Git using the instructions provided at www.git-scm.com/ for your development workstation's operating system. GitHub.com has played a notable role in Git's popularization, turning its functionality into a social network focused on open source code contributions. With a pricing model that keeps public, open source repositories free and charges for private ones, GitHub elevated the use of Git to heights never seen before. If you don't already have an account on GitHub, now is the perfect time to visit github.com to provision a free account. I mentioned earlier that krakn's code is available for forking at github.com/ZachMoreno/krakn. This means that any person with a GitHub account can view my version of krakn and fork a copy of their own for further modifications or contributions. In GitHub's web application, forking manifests itself as a button located to the right of the repository's title, which in this case is ZachMoreno/krakn. When you click on the button, you will see an animation that simulates the hardcore forking action. This results in a copy of the repository under your account that will have a title to the tune of YourName/krakn.
Node.js

Node.js, commonly known as Node, is a community-driven server environment built on Google Chrome's V8 JavaScript engine that is entirely event driven and facilitates a nonblocking I/O model. According to www.nodejs.org, it is best suited for: "Data-intensive real-time applications that run across distributed devices." So what does all this boil down to? Node empowers web developers to write JavaScript both on the client and server with bidirectional real-time I/O. The advent of Node has empowered developers to take their skills from the client to the server, evolving from frontend to full stack (like a caterpillar evolving into a butterfly). Not only do these skills facilitate a pay increase, they also advance the Web towards the same functionality as the traditional desktop or native application. For our purposes, we use Node as a tool; a tool to build real-time applications with as few keystrokes, videos watched, and words read as possible. Node is, in fact, a modular tool through its extensible package interface, called Node Package Manager (NPM). You will use NPM as a means to install the remainder of our tool stack.

NPM

NPM is a means to install Node packages on your local workstation or a remote server. NPM is how we will install the majority of the tools and software used in this book. This is achieved by running the $ npm install -g [PackageName] command in your command line or terminal. To search the full list of Node packages, visit www.npmjs.org or run $ npm search [Search Term] in your command line or terminal.

Yeoman's workflow

Yeoman is a CLI that glues your tools together into an opinionated workflow. Although the term opinionated might sound off-putting, you must first consider the wisdom and experience of the developers and community before you who maintain Yeoman.
In this context, opinionated means a little more than a collection of the best practices that are all aimed at improving your developer's experience of building static websites, single page applications, and everything in between. Opinionated does not mean that you are locked into what someone else feels is best for you, nor does it mean that you must strictly adhere to the opinions or best practices included. Yeoman is general enough to help you build nearly anything for the Web as well as improving your workflow while developing it. The tools that make up Yeoman's workflow are Yo, Grunt.js, Bower, and a few others that are more-or-less optional, but are probably worth your time. Yo Apart from having one of the hippest namespaces, Yo is a powerful code generator that is intelligent enough to scaffold most sites and applications. By default, instantiating a yo command assumes that you mean to scaffold something at a project level, but yo can also be scoped more granularly by means of sub-generators. For example, the command for instantiating a new vanilla Angular project is as follows: $ yo angular radicalApp Yo will not finish your request until you provide some further information about your desired Angular project. This is achieved by asking you a series of relevant questions, and based on your answers, yo will scaffold a familiar application folder/file structure, along with all the boilerplate code. Note that if you have worked with the angular-seed project, then the Angular application that yo generates will look very familiar to you. Once you have an Angular app scaffolded, you can begin using sub-generator commands. 
The following command scaffolds a new route, radicalRoute, within radicalApp:

    $ yo angular:route radicalRoute

The :route sub-generator is a very powerful command, as it automates all of the following key tasks:

It creates a new file, radicalApp/scripts/controllers/radicalRoute.js, that contains the controller logic for the radicalRoute view
It creates another new file, radicalApp/views/radicalRoute.html, that contains the associated view markup and directives
Lastly, it adds an additional route within radicalApp/scripts/app.js that connects the view to the controller

Additionally, the sub-generators for yo angular include the following:

:controller
:directive
:filter
:service
:provider
:factory
:value
:constant
:decorator
:view

All of these sub-generators scaffold smaller, more finely grained components than :route, which executes a combination of sub-generators.

Installing Yo

In your workstation's terminal or command-line application, type the following command and press Return:

    $ npm install -g yo

If you are a Linux or Mac user, you might want to prefix the command with sudo, as follows:

    $ sudo npm install -g yo

Grunt

Grunt.js is a task runner that enhances your existing and/or Yeoman's workflow by automating repetitive tasks. Each time you generate a new project with yo, it creates a /Gruntfile.js file that wires up all of the curated tasks. You might have noticed that installing Yo also installs all of Yo's dependencies. Reading through /Gruntfile.js should incite a fair amount of awe, as it gives you a snapshot of what is going on under the hood of Yeoman's curated Grunt tasks and its dependencies.
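To give a feel for what such a file wires up, here is a heavily trimmed sketch of a Gruntfile. It is not the file Yeoman generates: the paths and the choice of just two tasks are assumptions made for illustration. The calls grunt.initConfig, grunt.loadNpmTasks, and grunt.registerTask are Grunt's standard entry points, and grunt-contrib-jshint and grunt-contrib-watch are real plugins that would need to be installed locally with npm.

```javascript
// Gruntfile.js (illustrative sketch, not Yeoman's generated file)
function gruntfile(grunt) {
  grunt.initConfig({
    // Lint the Gruntfile and every script under app/scripts
    // (the app/scripts path is an assumption for this example)
    jshint: {
      all: ['Gruntfile.js', 'app/scripts/**/*.js']
    },
    // Re-run JSHint whenever a JavaScript file is saved
    watch: {
      js: {
        files: ['app/scripts/**/*.js'],
        tasks: ['jshint']
      }
    }
  });

  // Load the plugins that provide the tasks configured above
  grunt.loadNpmTasks('grunt-contrib-jshint');
  grunt.loadNpmTasks('grunt-contrib-watch');

  // Running `grunt` with no arguments lints once, then starts watching
  grunt.registerTask('default', ['jshint', 'watch']);
}

// Grunt expects the Gruntfile to export this function
if (typeof module !== 'undefined') {
  module.exports = gruntfile;
}
```

The generated file follows the same shape, only with many more task sections than this two-task example.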
Generating a vanilla Angular app produces a /Gruntfile.js file, as it is responsible for performing the following tasks: It defines where Yo places Bower packages, which is covered in the next section It defines the path where the grunt build command places the production-ready code It initializes the watch task to run: JSHint when JavaScript files are saved Karma's test runner when JavaScript files are saved Compass when SCSS or SASS files are saved The saved /Gruntfile.js file It initializes LiveReload when any HTML or CSS files are saved It configures the grunt server command to run a Node.js server on localhost:9000, or to show test results on localhost:9001 It autoprefixes CSS rules on LiveReload and grunt build It renames files for optimizing browser caching It configures the grunt build command to minify images, SVG, HTML, and CSS files or to safely minify Angular files Let us pause for a moment to reflect on the amount of time it would take to find, learn, and implement each dependency into our existing workflow for each project we undertake. Ok, we should now have a greater appreciation for Yeoman and its community. For the vast majority of the time, you will likely only use a few Grunt commands, which include the following: $ grunt server $ grunt test $ grunt build Bower If Yo scaffolds our application's structure and files, and Grunt automates repetitive tasks for us, then what does Bower bring to the party? Bower is web development's missing package manager. Its functionality parallels that of Ruby Gems for the Ruby on Rails MVC framework, but is not limited to any single framework or technology stack. The explicit use of Bower is not required by the Yeoman workflow, but as I mentioned previously, the use of Bower is configured automatically for you in your project's /Gruntfile.js file. How does managing packages improve our development workflow? 
With all of the time we've been spending in our command lines and terminals, it is handy to have the ability to automate the management of third-party dependencies within our application. This ability manifests itself in a few simple commands, the most ubiquitous being the following command: $ bower install [PackageName] --save With this command, Bower will automate the following steps: First, search its packages for the specified package name Download the latest stable version of the package if found Move the package to the location defined in your project's /Gruntfile.js file, typically a folder named /bower_components Insert dependencies in the form of <link> elements for CSS files in the document's <head> element, and <script> elements for JavaScript files right above the document's closing </body> tag, to the package's files within your project's /index.html file This process is one that web developers are more than familiar with because adding a JavaScript library or new dependency happens multiple times within every project. Bower speeds up our existing manual process through automation and improves it by providing the latest stable version of a package and then notifying us of an update if one is available. This last part, "notifying us of an update if … available", is important because as a web developer advances from one project to the next, it is easy to overlook keeping dependencies as up to date as possible. This is achieved by running the following command: $ bower update This command returns all the available updates, if available, and will go through the same process of inserting new references where applicable. Bower.io includes all of the documentation on how to use Bower to its fullest potential along with the ability to search through all of the available Bower packages. 
Searching for available Bower packages can also be achieved by running the following command: $ bower search [SearchTerm] If you cannot find the specific dependency for which you search, and the project is on GitHub, consider contributing a bower.json file to the project's root and inviting the owner to register it by running the following command: $ bower register [ThePackageName] [GitEndpoint] Registration allows you to install your dependency by running the next command: $ bower install [ThePackageName] The Ionic framework The Ionic framework is a truly remarkable advancement in bridging the gap between web applications and native mobile applications. In some ways, Ionic parallels Yeoman where it assembles tools that were already available to developers into a neat package, and structures a workflow around them, inherently improving our experience as developers. If Ionic is analogous to Yeoman, then what are the tools that make up Ionic's workflow? The tools that, when combined, make Ionic noteworthy are Apache Cordova, Angular, Ionic's suite of Angular directives, and Ionic's mobile UI framework. Batarang An invaluable piece to our Angular tool stack is the Google Chrome Developer Tools extension, Batarang, by Brian Ford. Batarang adds a third-party panel (on the right-hand side of Console) to DevTools that facilitates Angular's specific inspection in the event of debugging. We can view data in the scopes of each model, analyze each expression's performance, and view a beautiful visualization of service dependencies all from within Batarang. Because Angular augments the DOM with ng- attributes, it also provides a Properties pane within the Elements panel, to inspect the models attached to a given element's scope. The extension is easy to install from either the Chrome Web Store or the project's GitHub repository and inspection can be enabled by performing the following steps: Firstly, open the Chrome Developer Tools. 
You should then navigate to the AngularJS panel. Finally, select the Enable checkbox on the far right tab. Your active Chrome tab will then be reloaded automatically, and the AngularJS panel will begin populating the inspection data. In addition, you can leverage the Angular pane within the Elements panel to view Angular-specific properties at an elemental level, and observe the $scope variable from within the Console panel.

Sublime Text and Editor Integration

While developing any Angular app, it is helpful to augment our workflow further with Angular-specific syntax completion, snippets, go to definition, and quick panel search in the form of a Sublime Text package. Perform the following steps:

If you haven't installed Package Control for Sublime Text already, you need to install it first. Otherwise, continue with the next step.
Once installed, press Command + Shift + P in Sublime (Ctrl + Shift + P on Windows and Linux).
Then, you need to select the Package Control: Install Package option.
Finally, type angularjs and press Enter on your keyboard.

In addition to support within Sublime, Angular enhancements exist for lots of popular editors, including WebStorm, Coda, and TextMate.

Krakn

As a quick refresher, krakn was constructed using all of the tools that are covered in this article. These include Git, GitHub, Node.js, NPM, Yeoman's workflow, Yo, Grunt, Bower, Batarang, and Sublime Text. The application builds on Angular, Firebase, the Ionic Framework, and a few other minor dependencies. The workflow I used to develop krakn went something like the following. Follow these steps to achieve the same thing. Note that you can skip the remainder of this section if you'd like to get straight to the deployment action, and feel free to rename things where necessary.

Setting up Git and GitHub

The workflow I followed while developing krakn begins with initializing our local Git repository and connecting it to our remote master repository on GitHub.
In order to install and set up both, perform the following steps:

Firstly, install all the tool stack dependencies, and create a folder called krakn.
Following this, run $ git init inside that folder and create a README.md file.
You should then run $ git add README.md and commit README.md to the local master branch.
You then need to create a new remote repository on GitHub called krakn under your own account (in my case, ZachMoreno/krakn).
Following this, run the following command: $ git remote add origin git@github.com:[YourGitHubUserName]/krakn.git
Conclude the setup by running $ git push -u origin master.

Scaffolding the app with Yo

Scaffolding our app couldn't be easier with the yo ionic generator. To do this, perform the following steps:

Firstly, install Yo by running $ npm install -g yo.
After this, install generator-ionicjs by running $ npm install -g generator-ionicjs.
To conclude the scaffolding of your application, run the yo ionic command.

Development

After scaffolding the folder structure and boilerplate code, our workflow advances to the development phase, which is encompassed in the following steps:

To begin, run grunt server.
You are now in a position to make changes, such as additions or deletions.
Once these are saved, LiveReload will automatically reload your browser.
You can then review the changes in the browser.
Repeat steps 2-4 until you are ready to advance to the predeployment phase.

Views, controllers, and routes

Being a simple chat application, krakn has only a handful of views/routes. They are login, chat, account, menu, and about. The menu view is present in all the other views in the form of an off-canvas menu.

The login view

The default view/route/controller is named login. The login view utilizes Firebase's Simple Login feature to authenticate users before proceeding to the rest of the application. Apart from logging into krakn, users can register a new account by entering their desired credentials.
An interesting part of the login view is the use of the ng-show directive to reveal the second password field when the user selects the Register button. However, the ng-model directive is the first step here, as it is used to pass the input text from the view to the controller and, ultimately, to Firebase Simple Login. Other than the Angular magic, this view uses the ion-view directive, grid, and buttons that are all core to Ionic. Each view within an Ionic app is wrapped within an ion-view directive that contains a title attribute as follows:

    <ion-view title="Login">

The login view uses standard input elements that contain an ng-model attribute to bind the input's value back to the controller's $scope as follows:

    <input type="text" placeholder="you@email.com" ng-model="data.email" />
    <input type="password" placeholder="embody strength" ng-model="data.pass" />
    <input type="password" placeholder="embody strength" ng-model="data.confirm" />

The Log In and Register buttons call their respective functions using the ng-click attribute, with the value set to the function's name as follows:

    <button class="button button-block button-positive" ng-click="login()" ng-hide="createMode">Log In</button>

The Register and Cancel buttons set the value of $scope.createMode to true or false to show or hide the correct buttons for either action:

    <button class="button button-block button-calm" ng-click="createMode = true" ng-hide="createMode">Register</button>
    <button class="button button-block button-calm" ng-show="createMode" ng-click="createAccount()">Create Account</button>
    <button class="button button-block button-assertive" ng-show="createMode" ng-click="createMode = false">Cancel</button>

$scope.err is displayed only when there is feedback to show to the user:

    <p ng-show="err" class="assertive text-center">{{err}}</p>
    </ion-view>

The login controller is dependent on Firebase's loginService module and Angular's core $location
module: controller('LoginCtrl', ['$scope', 'loginService', '$location',   function($scope, loginService, $location) { Ionic's directives tend to create isolated scopes, so it was useful here to wrap our controller's variables within a $scope.data object to avoid issues within the isolated scope as follows:     $scope.data = {       "email"   : null,       "pass"   : null,       "confirm"  : null,       "createMode" : false     } The login() function easily checks the credentials before authentication and sends feedback to the user if needed:     $scope.login = function(cb) {       $scope.err = null;       if( !$scope.data.email ) {         $scope.err = 'Please enter an email address';       }       else if( !$scope.data.pass ) {         $scope.err = 'Please enter a password';       } If the credentials are sound, we send them to Firebase for authentication, and when we receive a success callback, we route the user to the chat view using $location.path() as follows:       else {         loginService.login($scope.data.email,         $scope.data.pass, function(err, user) {          $scope.err = err? err + '' : null;          if( !err ) {           cb && cb(user);           $location.path('krakn/chat');          }        });       }     }; The createAccount() function works in much the same way as login(), except that it ensures that the users don't already exist before adding them to your Firebase and logging them in:     $scope.createAccount = function() {       $scope.err = null;       if( assertValidLoginAttempt() ) {        loginService.createAccount($scope.data.email,    $scope.data.pass,          function(err, user) {           if( err ) {             $scope.err = err? 
err + '' : null;           }           else {             // must be logged in before I can write to     my profile             $scope.login(function() {              loginService.createProfile(user.uid,     user.email);              $location.path('krakn/account');             });           }          });       }     }; The assertValidLoginAttempt() function is a function used to ensure that no errors are received through the account creation and authentication flows:     function assertValidLoginAttempt() {       if( !$scope.data.email ) {        $scope.err = 'Please enter an email address';       }       else if( !$scope.data.pass ) {        $scope.err = 'Please enter a password';       }       else if( $scope.data.pass !== $scope.data.confirm ) {        $scope.err = 'Passwords do not match';       }       return !$scope.err;     }    }]) The chat view Keeping vegan practices aside, the meat and potatoes of krakn's functionality lives within the chat view/controller/route. The design is similar to most SMS clients, with the input in the footer of the view and messages listed chronologically in the main content area. The ng-repeat directive is used to display a message every time a message is added to the messages collection in Firebase. If you submit a message successfully, unsuccessfully, or without any text, feedback is provided via the placeholder attribute of the message input. There are two filters being utilized within the chat view: orderByPriority and timeAgo. The orderByPriority filter is defined within the firebase module that uses the Firebase object IDs that ensure objects are always chronological. The timeAgo filter is an open source Angular module that I found. You can access it at JS Fiddle. The ion-view directive is used once again to contain our chat view: <ion-view title="Chat"> Our list of messages is composed using the ion-list and ion-item directives, in addition to a couple of key attributes. 
The ion-list directive gives us some nice interactive controls using the option-buttons and can-swipe attributes. This results in each list item being swipable to the left, revealing our option-buttons as follows:    <ion-list option-buttons="itemButtons" can-swipe=     "true" ng-show="messages"> Our workhorse in the chat view is the trusty ng-repeat directive, responsible for persisting our data from Firebase to our service to our controller and into our view and back again:    <ion-item ng-repeat="message in messages |      orderByPriority" item="item" can-swipe="true"> Then, we bind our data into vanilla HTML elements that have some custom styles applied to them:     <h2 class="user">{{ message.user }}</h2> The third-party timeago filter converts the time into something such as, "5 min ago", similar to Instagram or Facebook:     <small class="time">{{ message.receivedTime |       timeago }}</small>     <p class="message">{{ message.text }}</p>    </ion-item>   </ion-list> A vanilla input element is used to accept chat messages from our users. 
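The timeago filter is third-party, and its source is not reproduced in this article. As a rough idea of what such a filter does, here is a minimal sketch of the formatting logic as a plain function. The name formatTimeAgo, the thresholds, and the wording of the labels are all assumptions made for this example, not the actual module's behavior.

```javascript
// Hypothetical helper: turns a millisecond timestamp into a relative label.
// `now` is optional and injectable so the function is easy to test.
function formatTimeAgo(timestamp, now) {
  var current = (now === undefined) ? Date.now() : now;
  // Clamp to zero so timestamps slightly in the future read as "just now"
  var seconds = Math.max(0, Math.floor((current - timestamp) / 1000));

  if (seconds < 60) { return 'just now'; }

  var minutes = Math.floor(seconds / 60);
  if (minutes < 60) { return minutes + ' min ago'; }

  var hours = Math.floor(minutes / 60);
  if (hours < 24) { return hours + ' hr ago'; }

  var days = Math.floor(hours / 24);
  return days + (days === 1 ? ' day ago' : ' days ago');
}

// A message received five minutes before "now"
console.log(formatTimeAgo(0, 5 * 60 * 1000)); // "5 min ago"
```

In an Angular module, a function like this could be exposed to templates by registering it with .filter('timeago', function() { return formatTimeAgo; }), after which {{ message.receivedTime | timeago }} would call it with the timestamp as its only argument.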
The input data is bound to $scope.data.newMessage for sending data to Firebase, and $scope.feedback is used to keep our users informed:

    <input type="text" class="{{ feeling }}" placeholder="{{ feedback }}" ng-model="data.newMessage" />

When you click on the send/submit button, the addMessage() function sends the message to your Firebase, and adds it to the list of chat messages, in real time:

    <button type="submit" id="chat-send" class="button button-small button-clear" ng-click="addMessage()"><span class="ion-android-send"></span></button>
    </ion-view>

The ChatCtrl controller is dependent on a few more modules than our LoginCtrl, including syncData, $ionicScrollDelegate, $ionicLoading, and $rootScope:

    controller('ChatCtrl', ['$scope', 'syncData', '$ionicScrollDelegate', '$ionicLoading', '$rootScope',
      function($scope, syncData, $ionicScrollDelegate, $ionicLoading, $rootScope) {

The userName variable is derived from the authenticated user's e-mail address (saved within the application's $rootScope) by splitting the e-mail and using everything before the @ symbol:

        var userEmail = $rootScope.auth.user.email,
            userName = userEmail.split('@');

We avoid the isolated scope issue in the same fashion as we did in LoginCtrl:

        $scope.data = {
          newMessage : null,
          user       : userName[0]
        }

Our view will only contain the latest 20 messages that have been synced from Firebase:

        $scope.messages = syncData('messages', 20);

When a new message is saved/synced, it is added to the bottom of the ng-repeated list, so we use the $ionicScrollDelegate variable to automatically scroll the new message into view on the display as follows:

        $ionicScrollDelegate.scrollBottom(true);

Our default chat input placeholder text is something on your mind?:

        $scope.feedback = 'something on your mind?';
        // displays as class on chat input placeholder
        $scope.feeling = 'stable';

If we have a new message and a valid username (shortened), then we can call the $add() function, which syncs the new message to Firebase and our view as follows:

        $scope.addMessage = function() {
          if( $scope.data.newMessage && $scope.data.user ) {
            // new data elements cannot be synced without adding them to FB Security Rules
            $scope.messages.$add({
              text         : $scope.data.newMessage,
              user         : $scope.data.user,
              receivedTime : Number(new Date())
            });
            // clean up
            $scope.data.newMessage = null;

On a successful sync, the feedback updates to say Done! What's next?, as shown in the following code snippet:

            $scope.feedback = 'Done! What\'s next?';
            $scope.feeling = 'stable';
          }
          else {
            $scope.feedback = 'Please write a message before sending';
            $scope.feeling = 'assertive';
          }
        };
        $ionicScrollDelegate.scrollBottom(true);
      }])

The account view

The account view allows logged-in users to view their current name and e-mail address, and provides them with the ability to update their password and e-mail address.
The input fields interact with Firebase in the same way as the chat view does using the syncData method defined in the firebase module: <ion-view title="'Account'" left-buttons="leftButtons"> The $scope.user object contains our logged in user's account credentials, and we bind them into our view as follows:   <p>{{ user.name }}</p>  …   <p>{{ user.email }}</p> The basic account management functionality is provided within this view; so users can update their e-mail address and or password if they choose to, using the following code snippet:   <input type="password" ng-keypress=    "reset()" ng-model="oldpass"/>  …   <input type="password" ng-keypress=    "reset()" ng-model="newpass"/>  …   <input type="password" ng-keypress=    "reset()" ng-model="confirm"/> Both the updatePassword() and updateEmail() functions work in much the same fashion as our createAccount() function within the LoginCtrl controller. They check whether the new e-mail or password is not the same as the old, and if all is well, it syncs them to Firebase and back again:   <button class="button button-block button-calm" ng-click=    "updatePassword()">update password</button>  …    <p class="error" ng-show="err">{{err}}</p>   <p class="good" ng-show="msg">{{msg}}</p>  …   <input type="text" ng-keypress="reset()" ng-model="newemail"/>  …   <input type="password" ng-keypress="reset()" ng-model="pass"/>  …   <button class="button button-block button-calm" ng-click=    "updateEmail()">update email</button>  …   <p class="error" ng-show="emailerr">{{emailerr}}</p>   <p class="good" ng-show="emailmsg">{{emailmsg}}</p>  … </ion-view> The menu view Within krakn/app/scripts/app.js, the menu route is defined as the only abstract state. Because of its abstract state, it can be presented in the app along with the other views by the ion-side-menus directive provided by Ionic. 
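For readers who have not met abstract states before, the sketch below shows roughly what such a definition looks like with ui-router's $stateProvider, which Ionic uses for routing. It is not a copy of krakn's app.js: the state names, URLs, and template paths are assumptions, loosely based on the #/app/... links and the menuContent view name that appear in the menu markup.

```javascript
// Sketch of a ui-router configuration with one abstract parent state.
// State names, URLs, and template paths are illustrative assumptions.
function configureStates($stateProvider, $urlRouterProvider) {
  $stateProvider
    // Abstract: never navigated to directly, it only provides the menu shell
    .state('app', {
      url: '/app',
      abstract: true,
      templateUrl: 'templates/menu.html',
      controller: 'MenuCtrl'
    })
    // Child state rendered inside <ion-nav-view name="menuContent">
    .state('app.login', {
      url: '/login',
      views: {
        menuContent: {
          templateUrl: 'templates/login.html',
          controller: 'LoginCtrl'
        }
      }
    });

  // Send unknown URLs to the default login view
  $urlRouterProvider.otherwise('/app/login');
}
```

In an Angular module, this function would be passed to .config(['$stateProvider', '$urlRouterProvider', configureStates]). Because the parent state is abstract, its template (the side menu) is always rendered around whichever child state is active.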
You might have noticed that only two menu options are available before signing into the application and that the rest appear only after authenticating. This is achieved using the ng-show-auth directive on the chat, account, and log out menu items. The majority of the options for Ionic's directives are available through attributes, making them simple to use. For example, take a look at the animation="slide-left-right" attribute. You will find Ionic's use of custom attributes within the directives to be one of the ways that the Ionic Framework is setting itself apart from other options within this space. The ion-side-menus directive wraps our menu and content, much as the ion-view directive wrapped the views we covered previously:

    <ion-side-menus>
      <ion-pane ion-side-menu-content>
        <ion-nav-bar class="bar-positive">

Our back button is displayed by including the ion-nav-back-button directive within the ion-nav-bar directive:

          <ion-nav-back-button class="button-clear"><i class="icon ion-chevron-left"></i> Back</ion-nav-back-button>
        </ion-nav-bar>

Animations within Ionic are exposed and used through the animation attribute, which is built atop the ngAnimate module. In this case, we are doing a simple animation that replicates the experience of a native mobile app:

        <ion-nav-view name="menuContent" animation="slide-left-right"></ion-nav-view>
      </ion-pane>
      <ion-side-menu side="left">
        <header class="bar bar-header bar-positive">
          <h1 class="title">Menu</h1>
        </header>
        <ion-content class="has-header">

A simple ion-list directive/element is used to display our navigation items in a vertical list. The ng-show-auth attribute handles the display of menu items before and after a user has authenticated. Before a user logs in, they can access the navigation, but only the About and Log In views are available until after successful authentication.
          <ion-list>
            <ion-item nav-clear menu-close href="#/app/chat" ng-show-auth="'login'">
              Chat
            </ion-item>
            <ion-item nav-clear menu-close href="#/app/about">
              About
            </ion-item>
            <ion-item nav-clear menu-close href="#/app/login" ng-show-auth="['logout','error']">
              Log In
            </ion-item>

The Log Out navigation item is only displayed once logged in, and upon a click, it calls the logout() function in addition to navigating to the login view:

            <ion-item nav-clear menu-close href="#/app/login" ng-click="logout()" ng-show-auth="'login'">
              Log Out
            </ion-item>
          </ion-list>
        </ion-content>
      </ion-side-menu>
    </ion-side-menus>

The MenuCtrl controller is the simplest controller in this application, as all it contains is the toggleMenu() and logout() functions:

    controller("MenuCtrl", ['$scope', 'loginService', '$location', '$ionicScrollDelegate',
      function($scope, loginService, $location, $ionicScrollDelegate) {
        $scope.toggleMenu = function() {
          $scope.sideMenuController.toggleLeft();
        };
        $scope.logout = function() {
          loginService.logout();
          $scope.toggleMenu();
        };
      }])

The about view

The about view is 100 percent static, and its only real purpose is to present the credits for all the open source projects used in the application.

Global controller constants

All of krakn's controllers share only two dependencies: ionic and ngAnimate. Because Firebase's modules are defined within /app/scripts/app.js, they are available for consumption by all the controllers without the need to define them as dependencies. Therefore, the firebase service's syncData and loginService are available to ChatCtrl and LoginCtrl for use. The syncData service is how krakn utilizes the three-way data binding provided by Firebase. For example, within the ChatCtrl controller, we use syncData( 'messages', 20 ) to bind the latest twenty messages within the messages collection to $scope for consumption by the chat view.
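To make the idea of binding "the latest twenty messages" concrete without a live Firebase, here is a small in-memory stand-in. The createSyncedList function and its internals are hypothetical and exist only for this sketch; the real syncData service delegates to AngularFire and synchronizes with the server, which this code does not attempt. Only the $add() name is borrowed from the controller code shown earlier.

```javascript
// Hypothetical in-memory stand-in for a synced, size-limited collection.
function createSyncedList(limit) {
  var items = [];
  return {
    // Mimics the shape of the $add() call used in addMessage()
    $add: function (item) {
      items.push(item);
      // Keep only the newest `limit` entries, dropping the oldest first
      if (items.length > limit) {
        items.splice(0, items.length - limit);
      }
    },
    // Returns a copy so callers cannot mutate the internal array
    toArray: function () {
      return items.slice();
    }
  };
}

var messages = createSyncedList(20);
for (var i = 1; i <= 25; i++) {
  messages.$add({ text: 'message ' + i, user: 'demo', receivedTime: i });
}
console.log(messages.toArray().length);  // 20
console.log(messages.toArray()[0].text); // "message 6"
```

After twenty-five additions, only messages 6 through 25 remain, which is the windowing behavior the chat view relies on when it asks for the latest twenty.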
Conversely, when a user clicks the submit button (wired up with ng-click), we write the data to the messages collection by calling the $add() method on the synced messages object inside the $scope.addMessage() function:

    $scope.addMessage = function() {
      if(...) { $scope.messages.$add({ ... });
      }
    };

Models and services

The model for krakn is www.krakn.firebaseio.com. The services that consume krakn's Firebase API are as follows:

The firebase service in krakn/app/scripts/service.firebase.js
The login service in krakn/app/scripts/service.login.js
The changeEmail service in krakn/app/scripts/changeEmail.firebase.js

The firebase service defines the syncData service that is responsible for routing data bidirectionally between krakn/app/bower_components/angularfire.js and our controllers. Please note that the reason I have not mentioned angularfire.js until this point is that it is essentially an abstract data translation layer between firebaseio.com and Angular applications that intend to consume data as a service.

Predeployment

Once the majority of an application's development phase has been completed, at least for the initial launch, it is important to run all of the code through a build process that optimizes the file size through compression of images and minification of text files. This piece of the workflow was not overlooked by Yeoman and is available through the $ grunt build command. As mentioned in the section on Grunt, the /Gruntfile.js file defines where built code is placed once it is optimized for deployment. Yeoman's default location for built code is the /dist folder, which might or might not exist depending on whether you have run the grunt build command before.

Summary

In this article, we discussed the tool stack and workflow used to build the app. Together, Git and Yeoman formed a solid foundation for building krakn. Git and GitHub provided us with distributed version control and a platform for sharing the application's source code with you and the world.
Yeoman facilitated the remainder of the workflow: scaffolding with Yo, automation with Grunt, and package management with Bower. With our app fully scaffolded, we were able to build our interface with the directives provided by the Ionic Framework, and wire up the real-time data synchronization forged by our Firebase instance. With a few key tools, we were able to minimize our development time while maximizing our return.

Packt
04 Mar 2015
18 min read

Prototyping Arduino Projects using Python

In this article by Pratik Desai, the author of Python Programming for Arduino, we will cover the following topics:

Working with pyFirmata methods
Servomotor – moving the motor to a certain angle
The Button() widget – interfacing GUI with Arduino and LEDs

Working with pyFirmata methods

The pyFirmata package provides useful methods to bridge the gap between Python and Arduino's Firmata protocol. Although these methods are described with specific examples, you can use them in many different ways. This section also provides a detailed description of a few additional methods.

Setting up the Arduino board

To set up your Arduino board in a Python program using pyFirmata, follow the steps given here. The code required for the setup process is split into small snippets, one or more per step. While writing your own code, use only the snippets that are appropriate for your application. You can always refer to the example Python files containing the complete code. Before we go ahead, make sure that your Arduino board is loaded with the latest version of the StandardFirmata program and is connected to your computer.

Depending upon the Arduino board that is being used, start by importing the appropriate pyFirmata class into the Python code. Currently, the inbuilt pyFirmata classes only support the Arduino Uno and Arduino Mega boards:

    from pyfirmata import Arduino

In the case of Arduino Mega, use the following line of code:

    from pyfirmata import ArduinoMega

Before we start executing any methods that are associated with handling pins, the Arduino board must be set up properly. To perform this task, we first have to identify the USB port to which the Arduino board is connected and assign this location to a variable in the form of a string object.
For Mac OS X, the port string should look approximately like this:

    port = '/dev/cu.usbmodemfa1331'

For Windows, use the following string structure:

    port = 'COM3'

In the case of the Linux operating system, use the following line of code:

    port = '/dev/ttyACM0'

The port's location might be different according to your computer configuration. You can identify the correct location of your Arduino USB port by using the Arduino IDE.

Once you have imported the Arduino class and assigned the port to a variable, it's time to connect pyFirmata to the Arduino and store the resulting board object in another variable:

    board = Arduino(port)

Similarly, for Arduino Mega, use this:

    board = ArduinoMega(port)

The synchronization between the Arduino board and pyFirmata requires some time. Adding sleep time between the preceding assignment and the next set of instructions can help to avoid issues related to serial port buffering. The easiest way to add sleep time is to use the sleep(time) function from Python's time module:

    from time import sleep
    sleep(1)

The sleep() function takes the time in seconds as its parameter, and a floating-point number can be used to specify fractions of a second. For example, for 200 milliseconds, it will be sleep(0.2).

At this point, you have successfully synchronized your Arduino Uno or Arduino Mega board to the computer using pyFirmata. What if you want to use a different variant of the Arduino board (other than Arduino Uno or Arduino Mega)? Any board layout in pyFirmata is defined as a dictionary object. The following is a sample of the dictionary object for the Arduino board:

    arduino = {
        'digital' : tuple(x for x in range(14)),
        'analog' : tuple(x for x in range(6)),
        'pwm' : (3, 5, 6, 9, 10, 11),
        'use_ports' : True,
        'disabled' : (0, 1) # Rx, Tx, Crystal
        }

For your variant of the Arduino board, you have to first create a custom dictionary object. To create this object, you need to know the hardware layout of your board.
For example, an Arduino Nano board has a layout similar to a regular Arduino board, but it has eight instead of six analog ports. Therefore, the preceding dictionary object can be customized as follows:

    nano = {
        'digital' : tuple(x for x in range(14)),
        'analog' : tuple(x for x in range(8)),
        'pwm' : (3, 5, 6, 9, 10, 11),
        'use_ports' : True,
        'disabled' : (0, 1) # Rx, Tx, Crystal
        }

As you have already synchronized the Arduino board earlier, modify the layout of the board using the setup_layout(layout) method:

    board.setup_layout(nano)

This command will change the default layout of the synchronized Arduino board to the Arduino Nano layout, or to any other variant for which you have customized the dictionary object.

Configuring Arduino pins

Once your Arduino board is synchronized, it is time to configure the digital and analog pins that are going to be used as part of your program. The Arduino board has digital I/O pins and analog input pins that can be used to perform various operations. As we already know, some of these digital pins are also capable of PWM.

The direct method

Before we start writing data to or reading data from these pins, we first have to assign modes to them. In an Arduino sketch, we use the pinMode function, for example, pinMode(11, INPUT), for this operation. Similarly, in pyFirmata, this assignment is performed using the mode attribute of a pin on the board object, as shown in the following code snippet:

    from pyfirmata import Arduino
    from pyfirmata import INPUT, OUTPUT, PWM

    # Setting up Arduino board
    port = '/dev/cu.usbmodemfa1331'
    board = Arduino(port)

    # Assigning modes to digital pins
    board.digital[13].mode = OUTPUT
    board.analog[0].mode = INPUT

The pyFirmata library defines constants for the INPUT and OUTPUT modes, which need to be imported before you use them. The preceding example shows the assignment of digital pin 13 as an output and the analog pin 0 as an input.
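Since PWM output is only possible on the pins listed under 'pwm' in a board layout, it can be useful to check a pin against the layout before assigning a mode to it. The following is a minimal sketch that works on plain dictionaries shaped like the layouts shown earlier. The helper names make_layout and supports_pwm are hypothetical and are not part of pyFirmata:

```python
def make_layout(digital_count, analog_count, pwm_pins, disabled=(0, 1)):
    """Build a layout dictionary shaped like the ones shown above.

    Illustrative helper only; this is not a pyFirmata function.
    """
    return {
        'digital': tuple(range(digital_count)),
        'analog': tuple(range(analog_count)),
        'pwm': tuple(pwm_pins),
        'use_ports': True,
        'disabled': tuple(disabled),  # Rx and Tx pins
    }


def supports_pwm(layout, pin_number):
    """Return True if the digital pin is listed as PWM-capable."""
    return pin_number in layout['pwm']


# Nano-style layout: 14 digital pins, 8 analog pins
nano = make_layout(14, 8, (3, 5, 6, 9, 10, 11))
print(supports_pwm(nano, 11))
print(supports_pwm(nano, 13))
```

A dictionary built this way has the same keys as the nano example, so it could be passed to board.setup_layout() in the same manner.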
The mode is set on the variable assigned to the configured Arduino board, using the digital[] and analog[] arrays to index the pin. The pyFirmata library also supports additional modes such as PWM and SERVO. The PWM mode is used to get analog results from digital pins, while the SERVO mode lets a digital pin set the angle of a servo shaft between 0 and 180 degrees. If you are using any of these modes, import the appropriate constants from the pyFirmata library. Once they are imported from the pyFirmata package, the modes for the appropriate pins can be assigned using the following lines of code:

    board.digital[3].mode = PWM
    board.digital[10].mode = SERVO

Assigning pin modes

The direct method of configuring a pin is mostly used for single-line calls. In a project containing a large amount of code and complex logic, it is more convenient to assign a pin, together with its role, to a variable. With an assignment like this, you can use the variable throughout the program for various actions, instead of calling the direct method every time you need to use that pin. In pyFirmata, this assignment can be performed using the get_pin(pin_def) method:

    from pyfirmata import Arduino
    port = '/dev/cu.usbmodemfa1311'
    board = Arduino(port)

    # pin mode assignment
    ledPin = board.get_pin('d:13:o')

The get_pin() method lets you assign pin modes using the pin_def string parameter, 'd:13:o'. The three components of pin_def are the pin type, the pin number, and the pin mode, separated by colons (:). The pin types (analog and digital) are denoted by a and d respectively. The get_pin() method supports three modes: i for input, o for output, and p for PWM. In the previous code sample, 'd:13:o' specifies the digital pin 13 as an output. As another example, if you want to set up the analog pin 1 as an input, the parameter string will be 'a:1:i'.

Working with pins

Now that you have configured your Arduino pins, it's time to start performing actions using them.
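Before performing those actions, the pin_def format described above can be illustrated with a small standalone parser. This is only a sketch of how the three colon-separated fields map to a pin type, number, and mode. The function parse_pin_def and the two lookup tables are hypothetical names, and pyFirmata's get_pin() does its own parsing internally:

```python
PIN_TYPES = {'a': 'analog', 'd': 'digital'}
PIN_MODES = {'i': 'input', 'o': 'output', 'p': 'pwm'}


def parse_pin_def(pin_def):
    """Split a 'type:number:mode' string into readable parts.

    Illustrative only; it mirrors the format described in the text.
    """
    parts = pin_def.split(':')
    if len(parts) != 3:
        raise ValueError("expected 'type:number:mode', got %r" % pin_def)
    pin_type, number, mode = parts
    if pin_type not in PIN_TYPES:
        raise ValueError("unknown pin type %r" % pin_type)
    if mode not in PIN_MODES:
        raise ValueError("unknown pin mode %r" % mode)
    return PIN_TYPES[pin_type], int(number), PIN_MODES[mode]


print(parse_pin_def('d:13:o'))
print(parse_pin_def('a:1:i'))
```

Validating the string up front like this can make typos such as 'x:13:o' fail early with a clear message, before any hardware is involved.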
Two different types of methods are supported while working with pins: reporting methods and I/O operation methods.

Reporting data

When pins are configured in a program as analog input pins, they start sending input values to the serial port. If the program does not use this incoming data, the data gets buffered at the serial port and quickly overflows. The pyFirmata library provides the reporting and iterator methods to deal with this phenomenon.

The enable_reporting() method is used to set the input pin to start reporting. This method needs to be called before performing a reading operation on the pin:

    board.analog[3].enable_reporting()

Once the reading operation is complete, the pin can be set to disable reporting:

    board.analog[3].disable_reporting()

In the preceding example, we assumed that you have already set up the Arduino board and configured the mode of the analog pin 3 as INPUT.

The pyFirmata library also provides the Iterator() class to read and handle data over the serial port. While working with analog pins, we recommend that you start an iterator thread before the main loop so that the pin value is kept up to date. If the iterator is not used, the buffered data might overflow your serial port. This class is defined in the util module of the pyFirmata package and needs to be imported before it is used in the code:

    from time import sleep
    from pyfirmata import Arduino, util

    # Setting up the Arduino board
    port = 'COM3'
    board = Arduino(port)
    sleep(5)

    # Start Iterator to avoid serial overflow
    it = util.Iterator(board)
    it.start()

Manual operations

As we have configured the Arduino pins with suitable modes and reporting behavior, we can start using them. The pyFirmata library provides the write() and read() methods for the configured pins.

The write() method

The write() method is used to write a value to the pin.
If the pin's mode is set to OUTPUT, the value parameter is a Boolean, that is, 0 or 1:

    board.digital[pin].mode = OUTPUT
    board.digital[pin].write(1)

If you have used the alternative method of assigning the pin's mode, you can use the write() method as follows:

    ledPin = board.get_pin('d:13:o')
    ledPin.write(1)

In the case of a PWM signal, the Arduino accepts a value between 0 and 255 that represents a duty cycle between 0 and 100 percent. The pyFirmata library simplifies this: instead of a value between 0 and 255, you provide a float value between 0 and 1.0. For example, if you want a 50 percent duty cycle (2.5V analog value), you can specify 0.5 with the write() method. The pyFirmata library will take care of the translation and send the corresponding 8-bit value (approximately 127) to the Arduino board via the Firmata protocol:

    board.digital[pin].mode = PWM
    board.digital[pin].write(0.5)

Similarly, for the indirect method of assignment, you can use code similar to the following. Note that the pin must be one of the PWM-capable pins listed in the board layout:

    pwmPin = board.get_pin('d:11:p')  # pin 11 is PWM-capable on the Uno
    pwmPin.write(0.5)

If you are using the SERVO mode, you need to provide the value in degrees between 0 and 180. At the time of writing, the SERVO mode is only available through direct assignment of the pins; support for indirect assignment is expected in the future:

    board.digital[pin].mode = SERVO
    board.digital[pin].write(90)

The read() method

The read() method returns the current value at the specified Arduino pin. When the Iterator() class is being used, the value received using this method is the latest value received over the serial port. When you read a digital pin, you can get only one of two values, HIGH or LOW, which translate to 1 or 0 in Python:

    board.digital[pin].read()

The analog pins of Arduino linearly translate input voltages between 0 and +5V to values between 0 and 1023. However, in pyFirmata, the voltages between 0 and +5V are linearly translated into float values between 0 and 1.0.
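The two scalings can be compared with a few lines of arithmetic. The following is a rough sketch that assumes a 10-bit converter and a 5V reference, as described in the text. The function names are hypothetical, and the exact rounding used on real hardware or inside pyFirmata may differ slightly:

```python
ADC_MAX = 1023   # largest reading of the Arduino's 10-bit converter
V_REF = 5.0      # reference voltage assumed in the text


def volts_to_raw(volts):
    """Approximate integer an Arduino sketch would report for a voltage."""
    return int(volts / V_REF * ADC_MAX)


def raw_to_fraction(raw):
    """Scale a 0-1023 reading to the 0-1.0 range described for read()."""
    return raw / float(ADC_MAX)


def fraction_to_volts(fraction):
    """Convert a 0-1.0 reading back to volts."""
    return fraction * V_REF


raw = volts_to_raw(1.0)
fraction = raw_to_fraction(raw)
print(raw, round(fraction, 2), round(fraction_to_volts(fraction), 2))
```

Multiplying a value returned by read() by the reference voltage, as fraction_to_volts() does, is a convenient way to get back to volts in a Python program.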
For example, if the voltage at the analog pin is 1V, an Arduino program will measure a value somewhere around 204, but you will receive the float value 0.2 while using pyFirmata's read() method in Python.

Servomotor – moving the motor to a certain angle

Servomotors are widely used electronic components in applications such as pan-tilt camera control, robotic arms, and mobile robot movement, where precise movement of the motor shaft is required. This precise control of the motor shaft is possible because of the position sensing decoder, which is an integral part of the servomotor assembly. A standard servomotor allows the angle of the shaft to be set between 0 and 180 degrees. The pyFirmata library provides the SERVO mode, which can be used on every digital pin. This prototyping exercise provides a template and guidelines to interface a servomotor with Python.

Connections

Typically, a servomotor has wires that are color-coded red, black, and yellow, which connect to the power, ground, and signal of the Arduino board respectively. Connect the power and the ground of the servomotor to the 5V and the ground of the Arduino board. As displayed in the following diagram, connect the yellow signal wire to the digital pin 13:

If you want to use any other digital pin, make sure that you change the pin number in the Python program in the next section. Once you have made the appropriate connections, let's move on to the Python program.

The Python code

The Python file containing this code is named servoCustomAngle.py and is located in the code bundle of this book, which can be downloaded from https://www.packtpub.com/books/content/support/19610. Open this file in your Python editor.
Like other examples, the starting section of the program contains the code to import the libraries and set up the Arduino board:

    from pyfirmata import Arduino, SERVO
    from time import sleep

    # Setting up the Arduino board
    port = 'COM5'
    board = Arduino(port)
    # Need to give some time to pyFirmata and Arduino to synchronize
    sleep(5)

Now that you have Python ready to communicate with the Arduino board, let's configure the digital pin that is going to be used to connect the servomotor to the Arduino board. We will complete this task by setting the mode of pin 13 to SERVO:

    # Set mode of the pin 13 as SERVO
    pin = 13
    board.digital[pin].mode = SERVO

The setServoAngle(pin, angle) custom function takes the pin to which the servomotor is connected and the desired angle as input parameters. This function can be used as a part of larger projects that involve servos:

    # Custom function to set the servo motor angle
    def setServoAngle(pin, angle):
      board.digital[pin].write(angle)
      sleep(0.015)

In the main logic of this template, we want to incrementally move the motor shaft in one direction until it reaches the maximum achievable angle (180 degrees) and then move it back to the original position at the same incremental speed. In the while loop, we ask the user whether to continue this routine, and the answer is captured using the raw_input() function. The user can enter the character y to continue this routine or enter any other character to abort the loop:

    # Testing the function by rotating motor in both directions
    while True:
      # Sweep up to 180 degrees, then back down to 0
      for i in range(0, 181):
        setServoAngle(pin, i)
      for i in range(180, -1, -1):
        setServoAngle(pin, i)

      # Continue or break the testing process
      i = raw_input("Enter 'y' to continue or Enter to quit: ")
      if i == 'y':
        pass
      else:
        board.exit()
        break

While working with all these prototyping examples, we used the direct communication method by using digital and analog pins to connect the sensor with Arduino.
Now, let's get familiar with another widely used communication method between Arduino and the sensors. This is called I2C communication.

The Button() widget – interfacing GUI with Arduino and LEDs

Now that you have had your first hands-on experience in creating a Python graphical interface, let's integrate Arduino with it. Python makes it easy to combine heterogeneous packages with each other, and that is what you are going to do. In the next coding exercise, we will use Tkinter and pyFirmata to make the GUI work with Arduino. In this exercise, we are going to use the Button() widget to control the LEDs interfaced with the Arduino board.

Before we jump to the exercises, let's build the circuit that we will need for all upcoming programs. The following is a Fritzing diagram of the circuit where we use two different colored LEDs with pull up resistors. Connect these LEDs to digital pins 10 and 11 on your Arduino Uno board, as displayed in the following diagram:

While working with the code provided in this section, you will have to replace the Arduino port that is used to define the board variable according to your operating system. Also, make sure that you provide the correct pin number in the code if you are planning to use any pins other than 10 and 11. For some exercises, you will have to use the PWM pins, so make sure that you have the correct pins.

You can use the entire code snippet as a Python file and run it. However, this might not be possible in the upcoming exercises due to the length of the program and the complexity involved. For the Button() widget exercise, open the exampleButton.py file. The code contains three main components:

pyFirmata and Arduino configurations
Tkinter widget definitions for a button
The LED blink function that gets executed when you press the button

As you can see in the following code snippet, we have first imported libraries and initialized the Arduino board using the pyFirmata methods.
For this exercise, we are only going to work with one LED and we have initialized only the ledPin variable for it:

    import Tkinter
    import pyfirmata
    from time import sleep
    port = '/dev/cu.usbmodemfa1331'
    board = pyfirmata.Arduino(port)
    sleep(5)
    ledPin = board.get_pin('d:11:o')

As we are using the pyFirmata library for all the exercises in this article, make sure that you have uploaded the latest version of the standard Firmata sketch to your Arduino board.

In the second part of the code, we have initialized the root Tkinter widget as top and provided a title string. We have also set the minimum size of this window using the minsize() method. In order to get more familiar with the root widget, you can play around with the minimum and maximum size of the window:

    top = Tkinter.Tk()
    top.title("Blink LED using button")
    top.minsize(300,30)

The Button() widget is a standard Tkinter widget that is mostly used to obtain manual input from the user. Like the Label() widget, the Button() widget can be used to display text or images. Unlike the Label() widget, it can be associated with actions or methods that run when it is pressed. When the button is pressed, Tkinter executes the methods or commands specified by the command option:

    startButton = Tkinter.Button(top,
                                 text="Start",
                                 command=onStartButtonPress)
    startButton.pack()

In this initialization, the function associated with the button is onStartButtonPress and the "Start" string is displayed as the title of the button. Similarly, the top object specifies the parent or the root widget. Once the button is instantiated, you will need to use the pack() method to make it available in the main window.

In the preceding lines of code, the onStartButtonPress() function includes the code required to blink the LED and change the state of the button. A button can be in one of three states: NORMAL, ACTIVE, or DISABLED.
If it is not specified, the default state of any button is NORMAL. The ACTIVE and DISABLED states are useful in applications where repeated pressing of the button needs to be avoided. After turning the LED on using the write(1) method, we add a time delay of 5 seconds using the sleep(5) function before turning it off with the write(0) method:

    def onStartButtonPress():
      startButton.config(state=Tkinter.DISABLED)
      ledPin.write(1)
      # LED is on for the fixed amount of time specified below
      sleep(5)
      ledPin.write(0)
      startButton.config(state=Tkinter.ACTIVE)

At the end of the program, we execute the mainloop() method to start the Tkinter loop. Until this function is executed, the main window won't appear.

To run the code, make the appropriate changes to the Arduino port variable and execute the program. A window with a button and title bar will appear as the output of the program. Clicking on the Start button will turn on the LED on the Arduino board for the specified time delay. While the LED is on, you will not be able to click on the Start button again. In this particular program, we haven't provided the code needed to safely disengage the Arduino board; this will be covered in upcoming exercises.

Summary

In this article, we learned about the Python library pyFirmata, which interfaces Arduino with your computer using the Firmata protocol. We built a prototype using pyFirmata and Arduino to control a servomotor and also developed another one with a GUI, based on the Tkinter library, to control LEDs.

Packt
04 Mar 2015
20 min read

Writing Consumers

This article by Nishant Garg, the author of the book Learning Apache Kafka Second Edition, focuses on the details of Writing Consumers. Consumers are the applications that consume the messages published by Kafka producers and process the data extracted from them. Like producers, consumers can also be different in nature, such as applications doing real-time or near real-time analysis, applications with NoSQL or data warehousing solutions, backend services, consumers for Hadoop, or other subscriber-based solutions. These consumers can also be implemented in different languages such as Java, C, and Python. (For more resources related to this topic, see here.) In this article, we will focus on the following topics: The Kafka Consumer API Java-based Kafka consumers Java-based Kafka consumers consuming partitioned messages At the end of the article, we will explore some of the important properties that can be set for a Kafka consumer. So, let's start. The preceding diagram explains the high-level working of the Kafka consumer when consuming the messages. The consumer subscribes to the message consumption from a specific topic on the Kafka broker. The consumer then issues a fetch request to the lead broker to consume the message partition by specifying the message offset (the beginning position of the message offset). Therefore, the Kafka consumer works in the pull model and always pulls all available messages after its current position in the Kafka log (the Kafka internal data representation). While subscribing, the consumer connects to any of the live nodes and requests metadata about the leaders for the partitions of a topic. This allows the consumer to communicate directly with the lead broker receiving the messages. Kafka topics are divided into a set of ordered partitions and each partition is consumed by one consumer only. Once a partition is consumed, the consumer changes the message offset to the next partition to be consumed. 
This represents the states about what has been consumed and also provides the flexibility of deliberately rewinding back to an old offset and re-consuming the partition. In the next few sections, we will discuss the API provided by Kafka for writing Java-based custom consumers. All the Kafka classes referred to in this article are actually written in Scala. Kafka consumer APIs Kafka provides two types of API for Java consumers: High-level API Low-level API The high-level consumer API The high-level consumer API is used when only data is needed and the handling of message offsets is not required. This API hides broker details from the consumer and allows effortless communication with the Kafka cluster by providing an abstraction over the low-level implementation. The high-level consumer stores the last offset (the position within the message partition where the consumer left off consuming the message), read from a specific partition in Zookeeper. This offset is stored based on the consumer group name provided to Kafka at the beginning of the process. The consumer group name is unique and global across the Kafka cluster and any new consumers with an in-use consumer group name may cause ambiguous behavior in the system. When a new process is started with the existing consumer group name, Kafka triggers a rebalance between the new and existing process threads for the consumer group. After the rebalance, some messages that are intended for a new process may go to an old process, causing unexpected results. To avoid this ambiguous behavior, any existing consumers should be shut down before starting new consumers for an existing consumer group name. 
The following are the classes that are imported to write Java-based basic consumers using the high-level consumer API for a Kafka cluster:

ConsumerConnector: Kafka provides the ConsumerConnector interface (interface ConsumerConnector) that is further implemented by the ZookeeperConsumerConnector class (kafka.javaapi.consumer.ZookeeperConsumerConnector). This class is responsible for all the interaction a consumer has with ZooKeeper. The following is the class diagram for the ConsumerConnector class:

KafkaStream: Objects of the kafka.consumer.KafkaStream class are returned by the createMessageStreams call from the ConsumerConnector implementation. A list of KafkaStream objects is returned for each topic, and each stream can be used to create an iterator over the messages in the stream. The following is the Scala-based class declaration:

    class KafkaStream[K,V](private val queue:
                          BlockingQueue[FetchedDataChunk],
                          consumerTimeoutMs: Int,
                          private val keyDecoder: Decoder[K],
                          private val valueDecoder: Decoder[V],
                          val clientId: String)

Here, the parameters K and V specify the type for the partition key and message value, respectively. In the create call from the ConsumerConnector class, clients can specify the number of desired streams, where each stream object is used for single-threaded processing. These stream objects may represent the merging of multiple unique partitions.

ConsumerConfig: The kafka.consumer.ConsumerConfig class encapsulates the property values required for establishing the connection with ZooKeeper, such as the ZooKeeper URL, ZooKeeper session timeout, and ZooKeeper sync time. It also contains the property values required by the consumer, such as the group ID.

A high-level API-based working consumer example is discussed after the next section.

The low-level consumer API

The high-level API does not allow consumers to control interactions with brokers.
Also known as "simple consumer API", the low-level consumer API is stateless and provides fine grained control over the communication between Kafka broker and the consumer. It allows consumers to set the message offset with every request raised to the broker and maintains the metadata at the consumer's end. This API can be used by both online as well as offline consumers such as Hadoop. These types of consumers can also perform multiple reads for the same message or manage transactions to ensure the message is consumed only once. Compared to the high-level consumer API, developers need to put in extra effort to gain low-level control within consumers by keeping track of offsets, figuring out the lead broker for the topic and partition, handling lead broker changes, and so on. In the low-level consumer API, consumers first query the live broker to find out the details about the lead broker. Information about the live broker can be passed on to the consumers either using a properties file or from the command line. The topicsMetadata() method of the kafka.javaapi.TopicMetadataResponse class is used to find out metadata about the topic of interest from the lead broker. For message partition reading, the kafka.api.OffsetRequest class defines two constants: EarliestTime and LatestTime, to find the beginning of the data in the logs and the new messages stream. These constants also help consumers to track which messages are already read. The main class used within the low-level consumer API is the SimpleConsumer (kafka.javaapi.consumer.SimpleConsumer) class. The following is the class diagram for the SimpleConsumer class:   A simple consumer class provides a connection to the lead broker for fetching the messages from the topic and methods to get the topic metadata and the list of offsets. 
A few more important classes for building different request objects are FetchRequest (kafka.api.FetchRequest), OffsetRequest (kafka.javaapi.OffsetRequest), OffsetFetchRequest (kafka.javaapi.OffsetFetchRequest), OffsetCommitRequest (kafka.javaapi.OffsetCommitRequest), and TopicMetadataRequest (kafka.javaapi.TopicMetadataRequest).

All the examples in this article are based on the high-level consumer API. For examples based on the low-level consumer API, refer to https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example.

Simple Java consumers

Now we will start writing a single-threaded simple Java consumer developed using the high-level consumer API for consuming the messages from a topic. This SimpleHLConsumer class is used to fetch a message from a specific topic and consume it, assuming that there is a single partition within the topic.

Importing classes

As a first step, we need to import the following classes:

    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

Defining properties

As a next step, we need to define properties for making a connection with Zookeeper and pass these properties to the Kafka consumer using the following code:

    Properties props = new Properties();
    props.put("zookeeper.connect", "localhost:2181");
    props.put("group.id", "testgroup");
    props.put("zookeeper.session.timeout.ms", "500");
    props.put("zookeeper.sync.time.ms", "250");
    props.put("auto.commit.interval.ms", "1000");
    new ConsumerConfig(props);

Now let us see the major properties mentioned in the code:

zookeeper.connect: This property specifies the ZooKeeper <node:port> connection detail that is used to find the Zookeeper running instance in the cluster. In the Kafka cluster, Zookeeper is used to store offsets of messages consumed for a specific topic and partition by this consumer group.
group.id: This property specifies the name for the consumer group shared by all the consumers within the group. This is also the process name used by ZooKeeper to store offsets.
zookeeper.session.timeout.ms: This property specifies the ZooKeeper session timeout in milliseconds and represents the amount of time Kafka will wait for ZooKeeper to respond to a request before giving up and continuing to consume messages.
zookeeper.sync.time.ms: This property specifies the ZooKeeper sync time in milliseconds between the ZooKeeper leader and the followers.
auto.commit.interval.ms: This property defines the frequency in milliseconds at which consumer offsets get committed to ZooKeeper.

Reading messages from a topic and printing them

As a final step, we need to read the message using the following code:

    Map<String, Integer> topicMap = new HashMap<String, Integer>();
    // 1 represents the single thread
    topicMap.put(topic, new Integer(1));

    Map<String, List<KafkaStream<byte[], byte[]>>> consumerStreamsMap =
        consumer.createMessageStreams(topicMap);

    // Get the list of message streams for each topic, using the default decoder.
    List<KafkaStream<byte[], byte[]>> streamList = consumerStreamsMap.get(topic);

    for (final KafkaStream<byte[], byte[]> stream : streamList) {
      ConsumerIterator<byte[], byte[]> consumerIte = stream.iterator();
      while (consumerIte.hasNext())
        System.out.println("Message from Single Topic :: "
          + new String(consumerIte.next().message()));
    }

So the complete program will look like the following code:

    package kafka.examples.ch5;

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class SimpleHLConsumer {
      private final ConsumerConnector consumer;
      private final String topic;

      public SimpleHLConsumer(String zookeeper, String groupId, String topic) {
        consumer = kafka.consumer.Consumer
            .createJavaConsumerConnector(createConsumerConfig(zookeeper, groupId));
        this.topic = topic;
      }

      private static ConsumerConfig createConsumerConfig(String zookeeper,
          String groupId) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zookeeper);
        props.put("group.id", groupId);
        props.put("zookeeper.session.timeout.ms", "500");
        props.put("zookeeper.sync.time.ms", "250");
        props.put("auto.commit.interval.ms", "1000");

        return new ConsumerConfig(props);
      }

      public void testConsumer() {
        Map<String, Integer> topicMap = new HashMap<String, Integer>();

        // Define single thread for topic
        topicMap.put(topic, new Integer(1));

        Map<String, List<KafkaStream<byte[], byte[]>>> consumerStreamsMap =
            consumer.createMessageStreams(topicMap);

        List<KafkaStream<byte[], byte[]>> streamList = consumerStreamsMap.get(topic);

        for (final KafkaStream<byte[], byte[]> stream : streamList) {
          ConsumerIterator<byte[], byte[]> consumerIte = stream.iterator();
          while (consumerIte.hasNext())
            System.out.println("Message from Single Topic :: "
              + new String(consumerIte.next().message()));
        }
        if (consumer != null)
          consumer.shutdown();
      }

      public static void main(String[] args) {
        String zooKeeper = args[0];
        String groupId = args[1];
        String topic = args[2];
        SimpleHLConsumer simpleHLConsumer = new SimpleHLConsumer(
            zooKeeper, groupId, topic);
        simpleHLConsumer.testConsumer();
      }
    }

Before running this, make sure you have created the topic kafkatopic from the command line:

    [root@localhost kafka_2.9.2-0.8.1.1]# bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic kafkatopic

Before compiling and running a Java-based Kafka program in the console, make sure you download the slf4j-1.7.7.tar.gz file from http://www.slf4j.org/download.html and copy slf4j-log4j12-1.7.7.jar contained within slf4j-1.7.7.tar.gz to the /opt/kafka_2.9.2-0.8.1.1/libs directory. Also add all the libraries available in /opt/kafka_2.9.2-0.8.1.1/libs to the classpath using the following commands:

    [root@localhost kafka_2.9.2-0.8.1.1]# export KAFKA_LIB=/opt/kafka_2.9.2-0.8.1.1/libs
    [root@localhost kafka_2.9.2-0.8.1.1]# export CLASSPATH=.:$KAFKA_LIB/jopt-simple-3.2.jar:$KAFKA_LIB/kafka_2.9.2-0.8.1.1.jar:$KAFKA_LIB/log4j-1.2.15.jar:$KAFKA_LIB/metrics-core-2.2.0.jar:$KAFKA_LIB/scala-library-2.9.2.jar:$KAFKA_LIB/slf4j-api-1.7.2.jar:$KAFKA_LIB/slf4j-log4j12-1.7.7.jar:$KAFKA_LIB/snappy-java-1.0.5.jar:$KAFKA_LIB/zkclient-0.3.jar:$KAFKA_LIB/zookeeper-3.3.4.jar

Multithreaded Java consumers

The previous example is a very basic example of a consumer that consumes messages from a single broker with no explicit partitioning of messages within the topic. Let's jump to the next level and write another program that consumes messages from multiple partitions connecting to single/multiple topics.
A multithreaded, high-level, consumer-API-based design is usually based on the number of partitions in the topic and follows a one-to-one mapping approach between the thread and the partitions within the topic. For example, if four partitions are defined for any topic, as a best practice, only four threads should be initiated with the consumer application to read the data; otherwise, some conflicting behavior, such as threads never receiving a message or a thread receiving messages from multiple partitions, may occur. Also, receiving multiple messages will not guarantee that the messages will be placed in order. For example, a thread may receive two messages from the first partition and three from the second partition, then three more from the first partition, followed by some more from the first partition, even if the second partition has data available. Let's move further on. Importing classes As a first step, we need to import the following classes: import kafka.consumer.ConsumerConfig; import kafka.consumer.ConsumerIterator; import kafka.consumer.KafkaStream; import kafka.javaapi.consumer.ConsumerConnector; Defining properties As the next step, we need to define properties for making a connection with Zookeeper and pass these properties to the Kafka consumer using the following code: Properties props = new Properties(); props.put("zookeeper.connect", "localhost:2181"); props.put("group.id", "testgroup"); props.put("zookeeper.session.timeout.ms", "500"); props.put("zookeeper.sync.time.ms", "250"); props.put("auto.commit.interval.ms", "1000"); new ConsumerConfig(props); The preceding properties have already been discussed in the previous example. For more details on Kafka consumer properties, refer to the last section of this article. 
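The thread-to-partition mapping described at the start of this section can be illustrated with a small model. The following is only a sketch in plain Python (chosen for brevity, not part of the Kafka API): the function name assign_partitions is hypothetical, and the contiguous-range split is a simplifying assumption rather than Kafka's actual rebalancing code. It shows why matching the thread count to the partition count avoids idle threads and threads that serve several partitions.

```python
def assign_partitions(num_partitions, num_threads):
    """Split partition ids into contiguous ranges, one range per thread.

    Simplified model (an assumption, not Kafka code): earlier threads
    receive one extra partition when the division is not even.
    """
    base, extra = divmod(num_partitions, num_threads)
    assignment = {}
    start = 0
    for thread in range(num_threads):
        size = base + (1 if thread < extra else 0)
        assignment[thread] = list(range(start, start + size))
        start += size
    return assignment


# Four partitions with four, two, and six consumer threads
for threads in (4, 2, 6):
    print(threads, assign_partitions(4, threads))
```

With four threads each thread owns exactly one partition; with two threads each thread reads from two partitions, so ordering across them is not guaranteed; with six threads, two threads receive an empty list and never see a message.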
Reading the message from threads and printing it

The only difference in this section from the previous section is that we first create a thread pool and get the Kafka streams associated with each thread within the thread pool, as shown in the following code:

    // Define thread count for each topic
    topicMap.put(topic, new Integer(threadCount));

    // Here we have used a single topic but we can also add
    // multiple topics to topicMap
    Map<String, List<KafkaStream<byte[], byte[]>>> consumerStreamsMap =
        consumer.createMessageStreams(topicMap);

    List<KafkaStream<byte[], byte[]>> streamList = consumerStreamsMap.get(topic);

    // Launching the thread pool
    executor = Executors.newFixedThreadPool(threadCount);

The complete program listing for the multithreaded Kafka consumer based on the Kafka high-level consumer API is as follows:

    package kafka.examples.ch5;

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class MultiThreadHLConsumer {

      private ExecutorService executor;
      private final ConsumerConnector consumer;
      private final String topic;

      public MultiThreadHLConsumer(String zookeeper, String groupId, String topic) {
        consumer = kafka.consumer.Consumer
            .createJavaConsumerConnector(createConsumerConfig(zookeeper, groupId));
        this.topic = topic;
      }

      private static ConsumerConfig createConsumerConfig(String zookeeper,
          String groupId) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zookeeper);
        props.put("group.id", groupId);
        props.put("zookeeper.session.timeout.ms", "500");
        props.put("zookeeper.sync.time.ms", "250");
        props.put("auto.commit.interval.ms", "1000");

        return new ConsumerConfig(props);
      }

      public void shutdown() {
        if (consumer != null)
          consumer.shutdown();
        if (executor != null)
          executor.shutdown();
      }

      public void testMultiThreadConsumer(int threadCount) {
        Map<String, Integer> topicMap = new HashMap<String, Integer>();

        // Define thread count for each topic
        topicMap.put(topic, new Integer(threadCount));

        // Here we have used a single topic but we can also add
        // multiple topics to topicMap
        Map<String, List<KafkaStream<byte[], byte[]>>> consumerStreamsMap =
            consumer.createMessageStreams(topicMap);

        List<KafkaStream<byte[], byte[]>> streamList = consumerStreamsMap.get(topic);

        // Launching the thread pool
        executor = Executors.newFixedThreadPool(threadCount);

        // Creating one message-consuming task per stream
        int count = 0;
        for (final KafkaStream<byte[], byte[]> stream : streamList) {
          final int threadNumber = count;
          executor.submit(new Runnable() {
            public void run() {
              ConsumerIterator<byte[], byte[]> consumerIte = stream.iterator();
              while (consumerIte.hasNext()) {
                System.out.println("Thread Number " + threadNumber + ": "
                  + new String(consumerIte.next().message()));
              }
              System.out.println("Shutting down Thread Number: " + threadNumber);
            }
          });
          count++;
        }
        // The consumer and executor are not shut down here: doing so right
        // after submitting the tasks would end the streams before the worker
        // threads read anything. main() calls shutdown() after a pause instead.
      }

      public static void main(String[] args) {
        String zooKeeper = args[0];
        String groupId = args[1];
        String topic = args[2];
        int threadCount = Integer.parseInt(args[3]);
        MultiThreadHLConsumer multiThreadHLConsumer =
            new MultiThreadHLConsumer(zooKeeper, groupId, topic);
        multiThreadHLConsumer.testMultiThreadConsumer(threadCount);
        try {
          // Let the worker threads consume for ten seconds
          Thread.sleep(10000);
        } catch (InterruptedException ie) {
        }
        multiThreadHLConsumer.shutdown();
      }
    }
Compile the preceding program, and before running it, read the following tip.

Before we run this program, we need to make sure our cluster is running as a multi-broker cluster (comprising either single or multiple nodes). Once your multi-broker cluster is up, create a topic with four partitions and set the replication factor to 2 before running this program using the following command:

    [root@localhost kafka-0.8]# bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic kafkatopic --partitions 4 --replication-factor 2

The Kafka consumer property list

The following is a list of a few important properties that can be configured for high-level, consumer-API-based Kafka consumers. The Scala class kafka.consumer.ConsumerConfig provides implementation-level details for consumer configurations. For a complete list, visit http://kafka.apache.org/documentation.html#consumerconfigs.

| Property name | Description | Default value |
|---|---|---|
| group.id | This property defines a unique identity for the set of consumers within the same consumer group. | |
| consumer.id | This property is specified for the Kafka consumer and generated automatically if not defined. | null |
| zookeeper.connect | This property specifies the ZooKeeper connection string, hostname:port/chroot/path. Kafka uses ZooKeeper to store offsets of messages consumed for a specific topic and partition by the consumer group. /chroot/path defines the data location in a global ZooKeeper namespace. | |
| client.id | The client.id value is specified by the Kafka client with each request and is used to identify the client making the requests. | ${group.id} |
| zookeeper.session.timeout.ms | This property defines the time (in milliseconds) for a Kafka consumer to wait for a ZooKeeper pulse before it is declared dead and a rebalance is initiated. | 6000 |
| zookeeper.connection.timeout.ms | This value defines the maximum waiting time (in milliseconds) for the client to establish a connection with ZooKeeper. | 6000 |
| zookeeper.sync.time.ms | This property defines the time it takes to sync a ZooKeeper follower with the ZooKeeper leader (in milliseconds). | 2000 |
| auto.commit.enable | This property enables a periodic commit to ZooKeeper of the offsets of messages already fetched by the consumer. In the event of consumer failures, these committed offsets are used as a starting position by the new consumers. | true |
| auto.commit.interval.ms | This property defines the frequency (in milliseconds) at which the consumed offsets get committed to ZooKeeper. | 60 * 1000 |
| auto.offset.reset | This property defines what to do when no initial offset is available in ZooKeeper or the offset is out of range. Possible values are: largest (reset to largest offset), smallest (reset to smallest offset), anything else (throw an exception). | largest |
| consumer.timeout.ms | If no message is available for consumption after the specified interval, a timeout exception is thrown to the consumer. | -1 |

Summary

In this article, we have learned how to write basic consumers and learned about some advanced levels of Java consumers that consume messages from partitions.

Resources for Article:

Further resources on this subject:
Introducing Kafka? [article]
Introduction To Apache Zookeeper [article]
Creating Apache Jmeter™ Test Workbench [article]
Joe Masilotti
04 Mar 2015
8 min read

Test Driving UITableViews with Cedar

One of the first things a developer does when learning iOS development is to display a list of items to the user. In iOS we use UITableViews to show one-dimensional tables of information. In practice they look like a long list of data and should be used in that way. UITableViews get their information from a UITableViewDataSource, which responds to a few delegate methods for the number of cells and the information each cell contains.

This post will follow a step-by-step guide to test driving UITableViews in iOS. All code samples will use the behavior-driven testing framework Cedar. Cedar can be installed as a Cocoapod by adding the following to your Podfile:

    target 'Specs' do
      pod 'Cedar'
    end

Follow this guide for installation and configuration instructions if you are having trouble or want a crash course on the framework.

Unit-Style Approach

One way to test table views is to follow a unit-style approach on the data source. The goal there is to call single public methods and assert that the correct state was altered or the return value was configured correctly. The target for unit testing a UITableView is its UITableViewDataSource property. The tests for this are fairly straightforward as they call -tableView:cellForRowAtIndexPath: and -tableView:numberOfRowsInSection: directly.

For example, let's say we want our controller to display a table with the current list of iPhones. Our mental assertions are that this table should show a single section with nine items, one for each of the iPhone, iPhone 3G, iPhone 3GS, iPhone 4, iPhone 4s, iPhone 5, iPhone 5s, iPhone 6, and iPhone 6 Plus. The unit tests will follow a very similar pattern. Since a table defaults to one section we don't need to write a test asserting the number of sections. We can just test that there are nine cells and assume that if the first and last cells' text is correct, everything is working.
    describe(@"ViewController", ^{
        __block ViewController *subject;

        beforeEach(^{
            subject = [[ViewController alloc] init];
        });

        describe(@"-tableView:numberOfRowsInSection:", ^{
            it(@"should have nine cells", ^{
                [subject tableView:subject.tableView numberOfRowsInSection:0] should equal(9);
            });
        });

        describe(@"-tableView:cellForRowAtIndexPath:", ^{
            __block UITableViewCell *cell;

            context(@"the first cell", ^{
                beforeEach(^{
                    NSIndexPath *indexPath = [NSIndexPath indexPathForRow:0 inSection:0];
                    cell = [subject tableView:subject.tableView cellForRowAtIndexPath:indexPath];
                });

                it(@"should display 'iPhone'", ^{
                    cell.textLabel.text should equal(@"iPhone");
                });
            });

            context(@"the last cell", ^{
                beforeEach(^{
                    NSIndexPath *indexPath = [NSIndexPath indexPathForRow:8 inSection:0];
                    cell = [subject tableView:subject.tableView cellForRowAtIndexPath:indexPath];
                });

                it(@"should display 'iPhone 6 Plus'", ^{
                    cell.textLabel.text should equal(@"iPhone 6 Plus");
                });
            });
        });
    });

Now the good part about these tests is that they are easy to follow and straight to the point. When we ask how many items there are we expect the right amount. And when we want to ensure the first cell is set up correctly we test just that.

Issues

Unfortunately there are a few problems with this approach. The biggest issue is that we can get these tests to pass without actually displaying anything on the screen. A simple implementation of these two methods in our controller will make everything green but gives no guarantee that a table view is on the screen (or that one even exists!). The first step in remedying this is to write a test asserting that the table view is a subview. Another, albeit minor, issue is that we are breaking encapsulation; we are exposing that our controller conforms to the UITableViewDataSource protocol. Let's see what we can do about these two problems.

Benefits

Don't think that unit-style is bad; it just has different uses.
If you have an app that uses multiple instances you will see benefits from this approach. This is because all you would need in your controller is to ensure the right type of data source was configured. You could take this one step farther by injecting the array of items to display and unit testing that. Then you have a repeatable unit of code that shows a list of data conforming to your app's specifications, which is quite powerful.

Behavior-Driven Approach

Let's take a more behavioral approach to our problem. Our goal is to display to the user the list of iPhones. If we care about what the user sees, what is the closest way of replicating that? How about what cells are visible to the user? From Apple's documentation, -visibleCells on UITableView: Returns the table cells that are visible in the receiver.

This sounds interesting. Let's restructure our tests to run assertions on the cells that the user sees, not some made up world of delegates and data sources.

    describe(@"when the view loads", ^{
        beforeEach(^{
            subject.view should_not be_nil;
            [subject.view layoutIfNeeded];
        });

        it(@"should display the first iPhone, first", ^{
            UITableViewCell *firstCell = subject.tableView.visibleCells.firstObject;
            firstCell.textLabel.text should equal(@"iPhone");
        });

        it(@"should display the iPhone 6 Plus, last", ^{
            UITableViewCell *lastCell = subject.tableView.visibleCells.lastObject;
            lastCell.textLabel.text should equal(@"iPhone 6 Plus");
        });
    });

Note that in the beforeEach we assert that the view should exist. This is to kick off the controller's view lifecycle methods, namely -loadView and -viewDidLoad. We then tell its view to lay out its subviews if need be. This ensures that anything we add as a subview has its layout constraints configured and applied.

To get this to pass we have a few things to take care of.
Create the backing array of iPhones
Create the table view and add it as a subview
Become the data source and respond to the calls

The first one is easy so let's knock that out first.

    @interface ViewController () <UITableViewDataSource>
    @property (nonatomic) UITableView *tableView;
    @property (nonatomic, strong) NSArray *iPhones;
    @end

    @implementation ViewController

    - (instancetype)init {
        if (self = [super init]) {
            self.iPhones = @[ @"iPhone", @"iPhone 3G", @"iPhone 3GS", @"iPhone 4", @"iPhone 4s",
                              @"iPhone 5", @"iPhone 5s", @"iPhone 6", @"iPhone 6 Plus" ];
        }
        return self;
    }

Note the opening up of the -tableView property in the interface extension. This allows us to keep it private in the header and the outside world while still being able to modify it internally.

Next let's add the table view and its auto layout constraints.

    - (void)viewDidLoad {
        [super viewDidLoad];
        self.tableView = [[UITableView alloc] init];
        [self.view addSubview:self.tableView];
        [self addTableViewConstraints];
    }

    #pragma mark - Private

    - (void)addTableViewConstraints {
        self.tableView.translatesAutoresizingMaskIntoConstraints = NO;
        NSDictionary *views = @{ @"tableView": self.tableView };
        [self.view addConstraints:[NSLayoutConstraint constraintsWithVisualFormat:@"V:|[tableView]|"
                                                                          options:kNilOptions
                                                                          metrics:nil
                                                                            views:views]];
        [self.view addConstraints:[NSLayoutConstraint constraintsWithVisualFormat:@"H:|[tableView]|"
                                                                          options:kNilOptions
                                                                          metrics:nil
                                                                            views:views]];
    }

Since we aren't working with Storyboards or xibs/nibs we create the table view manually and add it as a subview. We also will need to add some simple auto layout constraints to have it fill the screen. Check out Apple's Auto Layout by Example guide if you would like a deeper explanation.

Finally let's get to the meat of the issue and respond to the data source methods.
    #pragma mark - <UITableViewDataSource>

    - (NSInteger)tableView:(UITableView *)tableView numberOfRowsInSection:(NSInteger)section {
        return self.iPhones.count;
    }

    - (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath {
        UITableViewCell *cell = [tableView dequeueReusableCellWithIdentifier:kCellIdentifier
                                                                forIndexPath:indexPath];
        cell.textLabel.text = self.iPhones[indexPath.row];
        return cell;
    }

We also need to become the data source of the table so do that and register the cell in -viewDidLoad.

    [self.tableView registerClass:[UITableViewCell class] forCellReuseIdentifier:kCellIdentifier];
    self.tableView.dataSource = self;

Finally add the constant to the top of the file.

    NSString * const kCellIdentifier = @"CellIdentifier";

What's interesting with this approach is that not until you have every line correct will the tests pass. This helps ensure that what is happening under spec is closer to the real experience of the app. For example, having a table view on the screen and responding to the data source calls, but not assigning the data source, won't get you anywhere. In the unit approach you could have done just that but still seen your tests go green.

Benefits of Behavior Testing

When testing behavior you put yourself in a world that more closely represents the state of the app when a user is interacting with it. It also enables you to test collaboration between objects without having to single out every simple piece of the architecture. This means it can be easy to get carried away and start writing full integration tests from controllers. If you keep to only testing one or two layers of abstraction, in this case the table view through its data source, your code and specs remain easy to read and understand.

A side effect of this approach is that it enabled us to hide some implementation details in the production code. This means we are freer to do a green-to-green refactor without having to change our specs.
For example, we could extract the UITableViewDataSource into its own object and know that it works correctly when all of the existing tests still pass. If we wanted to then reuse that collaborator we could then extract the specs and have it stand on its own. Or if our backing array turned into an NSDictionary and found everything by key nothing in our tests would have to change. There are many styles of testing and even more ways to test Objective-C code and the Cocoa Touch framework. Behavior testing is just one approach that has proved to be the most flexible and easy to understand for me. What other techniques and methods have you implemented to ensure code coverage on your own iOS apps? About the author Joe Masilotti is a test-driven iOS developer living in Brooklyn, NY. He contributes to open-source testing tools on GitHub and talks about development, cooking, and craft beer on Twitter.

Packt
04 Mar 2015
22 min read

Python functions – Avoid repeating code

In this article by Silas Toms, author of the book ArcPy and ArcGIS – Geospatial Analysis with Python we will see how programming languages share a concept that has aided programmers for decades: functions. The idea of a function, loosely speaking, is to create blocks of code that will perform an action on a piece of data, transforming it as required by the programmer and returning the transformed data back to the main body of code. Functions are used because they solve many different needs within programming. Functions reduce the need to write repetitive code, which in turn reduces the time needed to create a script. They can be used to create ranges of numbers (the range() function), or to determine the maximum value of a list (the max function), or to create a SQL statement to select a set of rows from a feature class. They can even be copied and used in another script or included as part of a module that can be imported into scripts. Function reuse has the added bonus of making programming more useful and less of a chore. When a scripter starts writing functions, it is a major step towards making programming part of a GIS workflow. (For more resources related to this topic, see here.) Technical definition of functions Functions, also called subroutines or procedures in other programming languages, are blocks of code that have been designed to either accept input data and transform it, or provide data to the main program when called without any input required. In theory, functions will only transform data that has been provided to the function as a parameter; it should not change any other part of the script that has not been included in the function. To make this possible, the concept of namespaces is invoked. Namespaces make it possible to use a variable name within a function, and allow it to represent a value, while also using the same variable name in another part of the script. 
This becomes especially important when importing modules from other programmers; within that module and its functions, a variable might have the same name as a variable within the main script.

In a high-level programming language such as Python, there is built-in support for functions, including the ability to define function names and the data inputs (also known as parameters). Functions are created using the keyword def plus a function name, along with parentheses that may or may not contain parameters. Parameters can also be defined with default values, so parameters only need to be passed to the function when they differ from the default. The values that are returned from the function are also easily defined.

A first function

Let's create a function to get a feel for what is possible when writing functions. First, we need to define the function by providing the def keyword and a name along with the parentheses. The firstFunction() will return a string when called:

    def firstFunction():
        'a simple function returning a string'
        return "My First Function"

    >>> firstFunction()

The output is as follows:

    'My First Function'

Notice that this function has a documentation string or doc string (a simple function returning a string) that describes what the function does; this string can be retrieved later to find out what the function does, using the __doc__ attribute:

    >>> print firstFunction.__doc__

The output is as follows:

    a simple function returning a string

The function is defined and given a name, and then the parentheses are added followed by a colon. The following lines must then be indented (a good IDE will add the indentation automatically). The function does not have any parameters, so the parentheses are empty. The function then uses the keyword return to return a value, in this case a string, from the function. Next, the function is called by adding parentheses to the function name. When it is called, it will do what it has been instructed to do: return the string.

Functions with parameters

Now let's create a function that accepts parameters and transforms them as needed. This function will accept a number and multiply it by 3:

    def secondFunction(number):
        'this function multiplies numbers by 3'
        return number * 3

    >>> secondFunction(4)

The output is as follows:

    12

The function has one flaw, however; there is no assurance that the value passed to the function is a number. We need to add a conditional to the function so that non-numeric values are not processed (a dictionary would raise an exception, and a string would silently be repeated three times):

    def secondFunction(number):
        'this function multiplies numbers by 3'
        if type(number) == type(1) or type(number) == type(1.0):
            return number * 3

    >>> secondFunction(4.0)

The output is as follows:

    12.0

    >>> secondFunction(4)

The output is as follows:

    12

    >>> secondFunction("String")
    >>>

The function now accepts a parameter, checks what type of data it is, and returns a multiple of the parameter whether it is an integer or a float. If it is a string or some other data type, as shown in the last example, no value is returned.

There is one more adjustment to the simple function that we should discuss: parameter defaults. By including default values in the definition of the function, we avoid having to provide parameters that rarely change. If, for instance, we wanted a different multiplier than 3 in the simple function, we would define it like this:

    def thirdFunction(number, multiplier=3):
        'this function multiplies numbers by a multiplier (3 by default)'
        if type(number) == type(1) or type(number) == type(1.0):
            return number * multiplier

    >>> thirdFunction(4)

The output is as follows:

    12

    >>> thirdFunction(4, 5)

The output is as follows:

    20

The function will work when only the number to be multiplied is supplied, as the multiplier has a default value of 3. However, if we need another multiplier, the value can be adjusted by adding another value when calling the function.
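As a variation on the type check and the default parameter, the sketch below uses isinstance() instead of comparing type() results and also validates the multiplier. The name multiply is hypothetical and not from the book, and the code is written for Python 3, whereas the article's own interpreter sessions use Python 2 syntax.

```python
def multiply(number, multiplier=3):
    """Return number * multiplier, or None when either value is not numeric."""
    # isinstance() accepts a tuple of types, so one call covers int and float
    if not isinstance(number, (int, float)):
        return None
    if not isinstance(multiplier, (int, float)):
        return None
    return number * multiplier


print(multiply(4))                # default multiplier of 3
print(multiply(4, 5))             # positional override
print(multiply(4, multiplier=2))  # override by name
print(multiply("String"))         # not numeric, so None is returned
```

Returning None keeps the behavior of the original functions; raising a TypeError would be a reasonable alternative when silent failures are undesirable.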
Note that the second value doesn't have to be a number as there is no type checking on it. Also, parameters with default values must follow the parameters with no defaults (or all parameters can have a default value and the parameters can be supplied to the function in order or by name).

Using functions to replace repetitive code

One of the main uses of functions is to ensure that the same code does not have to be written over and over. The first portion of the script that we could convert into a function is the three ArcPy function calls. Doing so will allow the script to be applicable to any of the stops in the Bus Stop feature class and have an adjustable buffer distance:

    bufferDist = 400
    buffDistUnit = "Feet"
    lineName = '71 IB'
    busSignage = 'Ferry Plaza'
    sqlStatement = "NAME = '{0}' AND BUS_SIGNAG = '{1}'"

    def selectBufferIntersect(selectIn, selectOut, bufferOut, intersectIn,
                              intersectOut, sqlStatement, bufferDist,
                              buffDistUnit, lineName, busSignage):
        'a function to perform a bus stop analysis'
        arcpy.Select_analysis(selectIn, selectOut,
                              sqlStatement.format(lineName, busSignage))
        # Combine the distance and its unit, for example "400 Feet"
        arcpy.Buffer_analysis(selectOut, bufferOut,
                              "{0} {1}".format(bufferDist, buffDistUnit),
                              "FULL", "ROUND", "NONE", "")
        arcpy.Intersect_analysis("{0} #;{1} #".format(bufferOut, intersectIn),
                                 intersectOut, "ALL", "", "INPUT")
        return intersectOut

This function demonstrates how the analysis can be adjusted to accept the input and output feature class variables as parameters, along with some new variables. The function adds a variable to replace the SQL statement and variables to adjust the bus stop, and also tweaks the buffer distance statement so that both the distance and the unit can be adjusted. The feature class name variables, defined earlier in the script, have all been replaced with local variable names; while the global variable names could have been retained, doing so would reduce the portability of the function.
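The two strings that selectBufferIntersect() builds with format() can be checked on their own, without ArcGIS installed. The following is a minimal sketch written for Python 3; the helper name build_analysis_strings is hypothetical, and it only reproduces the string formatting, not the ArcPy calls.

```python
def build_analysis_strings(line_name, bus_signage, buffer_dist, buffer_unit):
    """Return the SQL where clause and the buffer distance text."""
    # Same template as the sqlStatement variable in the article
    sql_template = "NAME = '{0}' AND BUS_SIGNAG = '{1}'"
    where_clause = sql_template.format(line_name, bus_signage)
    # A template with two placeholders needs two arguments: distance and unit
    buffer_text = "{0} {1}".format(buffer_dist, buffer_unit)
    return where_clause, buffer_text


where_clause, buffer_text = build_analysis_strings('71 IB', 'Ferry Plaza', 400, 'Feet')
print(where_clause)
print(buffer_text)
```

Printing the strings before handing them to a geoprocessing tool is a cheap way to catch a placeholder that has no matching argument.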
The next function will accept the result of the selectBufferIntersect() function and search it using the Search Cursor, passing the results into a dictionary. The dictionary will then be returned from the function for later use:

    def createResultDic(resultFC):
        'search results of analysis and create results dictionary'
        dataDictionary = {}
        with arcpy.da.SearchCursor(resultFC, ["STOPID","POP10"]) as cursor:
            for row in cursor:
                busStopID = row[0]
                pop10 = row[1]
                if busStopID not in dataDictionary.keys():
                    dataDictionary[busStopID] = [pop10]
                else:
                    dataDictionary[busStopID].append(pop10)
        return dataDictionary

This function only requires one parameter: the feature class returned from the selectBufferIntersect() function. The dictionary that holds the results is created first and then populated by the search cursor, with the STOPID attribute used as the key and the census block population attribute (POP10) added to a list assigned to that key.

The dictionary, having been populated with the grouped data, is returned from the function for use in the final function, createCSV(). This function accepts the dictionary and the name of the output CSV file as a string:

    def createCSV(dictionary, csvname):
        'a function takes a dictionary and creates a CSV file'
        with open(csvname, 'wb') as csvfile:
            csvwriter = csv.writer(csvfile, delimiter=',')
            for busStopID in dictionary.keys():
                popList = dictionary[busStopID]
                averagePop = sum(popList)/len(popList)
                data = [busStopID, averagePop]
                csvwriter.writerow(data)

The final function creates the CSV using the csv module. The name of the file, a string, is now a customizable parameter (meaning that the file name can be any valid file path that ends in a text file with the .csv extension).
The csvfile file object is passed to the csv module's writer() function and the result is assigned to the variable csvwriter. The dictionary is then accessed and processed, and each result is passed as a list to csvwriter to be written to the CSV file. The csvwriter.writerow() method converts each item in the list into the CSV format and writes it to the file. Open the resulting CSV file with Excel or a text editor such as Notepad.

To run the functions, we will call them in the script following the function definitions:

    analysisResult = selectBufferIntersect(Bus_Stops, Inbound71, Inbound71_400ft_buffer,
                                           CensusBlocks2010, Intersect71Census,
                                           bufferDist, lineName, busSignage)
    dictionary = createResultDic(analysisResult)
    createCSV(dictionary, r'C:\Projects\Output\Averages.csv')

Now, the script has been divided into three functions, which replace the code of the first modified script. The modified script looks like this:

    # -*- coding: utf-8 -*-
    # ---------------------------------------------------------------------------
    # 8662_Chapter4Modified1.py
    # Created on: 2014-04-22 21:59:31.00000
    #   (generated by ArcGIS/ModelBuilder)
    # Description:
    # Adjusted by Silas Toms
    # 2014 05 05
    # ---------------------------------------------------------------------------

    # Import arcpy module
    import arcpy
    import csv

    # Local variables:
    Bus_Stops = r"C:\Projects\PacktDB.gdb\SanFrancisco\Bus_Stops"
    CensusBlocks2010 = r"C:\Projects\PacktDB.gdb\SanFrancisco\CensusBlocks2010"
    Inbound71 = r"C:\Projects\PacktDB.gdb\Chapter3Results\Inbound71"
    Inbound71_400ft_buffer = r"C:\Projects\PacktDB.gdb\Chapter3Results\Inbound71_400ft_buffer"
    Intersect71Census = r"C:\Projects\PacktDB.gdb\Chapter3Results\Intersect71Census"
    bufferDist = 400
    lineName = '71 IB'
    busSignage = 'Ferry Plaza'

    def selectBufferIntersect(selectIn, selectOut, bufferOut, intersectIn,
                              intersectOut, bufferDist, lineName, busSignage):
        arcpy.Select_analysis(selectIn,
                              selectOut,
                              "NAME = '{0}' AND BUS_SIGNAG = '{1}'".format(lineName, busSignage))
        arcpy.Buffer_analysis(selectOut,
                              bufferOut,
                              "{0} Feet".format(bufferDist),
                              "FULL", "ROUND", "NONE", "")
        arcpy.Intersect_analysis("{0} #;{1} #".format(bufferOut, intersectIn),
                                 intersectOut, "ALL", "", "INPUT")
        return intersectOut

    def createResultDic(resultFC):
        dataDictionary = {}
        with arcpy.da.SearchCursor(resultFC,
                                   ["STOPID","POP10"]) as cursor:
            for row in cursor:
                busStopID = row[0]
                pop10 = row[1]
                if busStopID not in dataDictionary.keys():
                    dataDictionary[busStopID] = [pop10]
                else:
                    dataDictionary[busStopID].append(pop10)
        return dataDictionary

    def createCSV(dictionary, csvname):
        with open(csvname, 'wb') as csvfile:
            csvwriter = csv.writer(csvfile, delimiter=',')
            for busStopID in dictionary.keys():
                popList = dictionary[busStopID]
                averagePop = sum(popList)/len(popList)
                data = [busStopID, averagePop]
                csvwriter.writerow(data)

    analysisResult = selectBufferIntersect(Bus_Stops, Inbound71, Inbound71_400ft_buffer,
                                           CensusBlocks2010, Intersect71Census,
                                           bufferDist, lineName, busSignage)
    dictionary = createResultDic(analysisResult)
    createCSV(dictionary, r'C:\Projects\Output\Averages.csv')
    print "Data Analysis Complete"

Further generalization of the functions

While we have created functions from the original script that can be used to extract more data about bus stops in San Francisco, our new functions are still very specific to the dataset and analysis for which they were created. Such specific functions can be very useful for long and laborious analyses for which creating reusable functions is not necessary. The first use of functions is to get rid of the need to repeat code. The next goal is to then make that code reusable.
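As a small illustration of what reusable can mean here, the grouping and averaging steps of the script can be separated from ArcPy entirely, so that they work on any sequence of key and value pairs. The sketch below is an assumption-laden example rather than part of the original script: the names group_values and write_averages and the sample rows are hypothetical, and it targets Python 3 (the script above is written for Python 2).

```python
import csv
import io
from collections import defaultdict


def group_values(rows):
    """Group (key, value) pairs into a dictionary of lists keyed by the first item."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return dict(groups)


def write_averages(groups, fileobj):
    """Write one 'key,average' row per group to an open text file object."""
    writer = csv.writer(fileobj, delimiter=',')
    for key in sorted(groups):
        values = groups[key]
        # In Python 3 the / operator always performs true division.
        writer.writerow([key, sum(values) / len(values)])


# Made-up sample rows standing in for the (STOPID, POP10) cursor results.
sample_rows = [(1001, 100), (1001, 200), (1002, 301)]
groups = group_values(sample_rows)

# An in-memory text stream is used here instead of a file on disk.
stream = io.StringIO()
write_averages(groups, stream)
csv_text = stream.getvalue()
```

Because write_averages() accepts any open text file object, the same function could be pointed at a real file opened with open(path, 'w', newline='') in Python 3.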
Let's discuss some ways in which we can convert the functions from one-offs into reusable functions or even modules. First, let's examine the first function: def selectBufferIntersect(selectIn,selectOut,bufferOut,intersectIn,                          intersectOut, bufferDist,lineName, busSignage ):    arcpy.Select_analysis(selectIn,                          selectOut,                          "NAME = '{0}' AND BUS_SIGNAG = '{1}'".format(lineName, busSignage))    arcpy.Buffer_analysis(selectOut,                          bufferOut,                          "{0} Feet".format(bufferDist),                          "FULL", "ROUND", "NONE", "")    arcpy.Intersect_analysis("{0} #;{1} #".format(bufferOut,intersectIn),                              intersectOut, "ALL", "", "INPUT")    return intersectOut This function appears to be pretty specific to the bus stop analysis. It's so specific, in fact, that while there are a few ways in which we can tweak it to make it more general (that is, useful in other scripts that might not have the same steps involved), we should not convert it into a separate function. When we create a separate function, we introduce too many variables into the script in an effort to simplify it, which is a counterproductive effort. Instead, let's focus on ways to generalize the ArcPy tools themselves. The first step will be to split the three ArcPy tools and examine what can be adjusted with each of them. The Select tool should be adjusted to accept a string as the SQL select statement. The SQL statement can then be generated by another function or by parameters accepted at runtime. For instance, if we wanted to make the script accept multiple bus stops for each run of the script (for example, the inbound and outbound stops for each line), we could create a function that would accept a list of the desired stops and a SQL template, and would return a SQL statement to plug into the Select tool. 
Here is an example of how it would look:

    def formatSQLIN(dataList, sqlTemplate):
        'a function to generate a SQL statement'
        sql = sqlTemplate  # for example, "OBJECTID IN "
        # separate the values with commas so that the IN list is valid SQL
        step = "(" + ", ".join(str(data) for data in dataList)
        sql += step + ")"
        return sql

    def formatSQL(dataList, sqlTemplate):
        'a function to generate a SQL statement'
        sql = ''
        for count, data in enumerate(dataList):
            if count != len(dataList)-1:
                sql += sqlTemplate.format(data) + ' OR '
            else:
                sql += sqlTemplate.format(data)
        return sql

    >>> dataVals = [1,2,3,4]
    >>> sqlOID = "OBJECTID = {0}"
    >>> sql = formatSQL(dataVals, sqlOID)
    >>> print sql

The output is as follows:

    OBJECTID = 1 OR OBJECTID = 2 OR OBJECTID = 3 OR OBJECTID = 4

This new function, formatSQL(), is a very useful function. Let's review what it does by comparing the function to the results following it. The function is defined to accept two parameters: a list of values and a SQL template. The first local variable is the empty string sql, which will be added to using string addition. The function inserts each value into the SQL template using string formatting, and the formatted template is then added to the SQL statement string (note that sql += is equivalent to sql = sql +). Also, an operator (OR) is used to make the SQL statement inclusive of all data rows that match the pattern. This function uses the built-in enumerate function to count the iterations of the list; once it has reached the last value in the list, the operator is not added to the SQL statement.
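The same statements can also be built with str.join(), which inserts a separator between items and so removes the need to count iterations. This is offered as an alternative sketch rather than the article's method; the function names format_sql_or and format_sql_in are hypothetical, and the IN variant assumes numeric values (string values would also need quoting).

```python
def format_sql_or(values, template):
    """Join one formatted condition per value with ' OR '."""
    return " OR ".join(template.format(value) for value in values)


def format_sql_in(values, field):
    """Build an IN clause for numeric values, such as 'OBJECTID IN (1, 2, 3)'."""
    return "{0} IN ({1})".format(field, ", ".join(str(value) for value in values))


data_vals = [1, 2, 3, 4]
sql_or = format_sql_or(data_vals, "OBJECTID = {0}")
sql_in = format_sql_in(data_vals, "OBJECTID")
print(sql_or)
print(sql_in)
```

For a long list of ObjectIDs, the IN form gives a much shorter statement than a chain of OR conditions.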
Note that we could also add one more parameter to the function to make it possible to use an AND operator instead of OR, while still keeping OR as the default: def formatSQL2(dataList, sqlTemplate, operator=" OR "):    'a function to generate a SQL statement'    sql = ''    for count, data in enumerate(dataList):        if count != len(dataList)-1:            sql += sqlTemplate.format(data) + operator        else:            sql += sqlTemplate.format(data)    return sql   >>> sql = formatSQL2(dataVals, sqlOID," AND ") >>> print sql The output is as follows: OBJECTID = 1 AND OBJECTID = 2 AND OBJECTID = 3 AND OBJECTID = 4 While it would make no sense to use an AND operator on ObjectIDs, there are other cases where it would make sense, hence leaving OR as the default while allowing for AND. Either way, this function can now be used to generate our bus stop SQL statement for multiple stops (ignoring, for now, the bus signage field): >>> sqlTemplate = "NAME = '{0}'" >>> lineNames = ['71 IB','71 OB'] >>> sql = formatSQL2(lineNames, sqlTemplate) >>> print sql The output is as follows: NAME = '71 IB' OR NAME = '71 OB' However, we can't ignore the Bus Signage field for the inbound line, as there are two starting points for the line, so we will need to adjust the function to accept multiple values: def formatSQLMultiple(dataList, sqlTemplate, operator=" OR "):    'a function to generate a SQL statement'    sql = ''    for count, data in enumerate(dataList):        if count != len(dataList)-1:            sql += sqlTemplate.format(*data) + operator        else:            sql += sqlTemplate.format(*data)    return sql   >>> sqlTemplate = "(NAME = '{0}' AND BUS_SIGNAG = '{1}')" >>> lineNames = [('71 IB', 'Ferry Plaza'),('71 OB','48th Avenue')] >>> sql = formatSQLMultiple(lineNames, sqlTemplate) >>> print sql The output is as follows: (NAME = '71 IB' AND BUS_SIGNAG = 'Ferry Plaza') OR (NAME = '71 OB' AND BUS_SIGNAG = '48th Avenue') The slight difference in this function, the 
asterisk before the data variable, allows the values inside the data variable to be correctly formatted into the SQL template by unpacking the values within the tuple. Notice that the SQL template has been created to segregate each conditional by using parentheses. The functions are now ready for reuse, and the SQL statement is now ready for insertion into the Select tool:

    sql = formatSQLMultiple(lineNames, sqlTemplate)
    arcpy.Select_analysis(Bus_Stops, Inbound71, sql)

Next up is the Buffer tool. We have already taken steps towards making it generalized by adding a variable for the distance. In this case, we will only add one more variable to it, a unit variable that will make it possible to adjust the buffer unit from feet to meters or any other allowed unit. We will leave the other defaults alone. Here is an adjusted version of the Buffer tool:

    bufferDist = 400
    bufferUnit = "Feet"
    arcpy.Buffer_analysis(Inbound71,
                          Inbound71_400ft_buffer,
                          "{0} {1}".format(bufferDist, bufferUnit),
                          "FULL", "ROUND", "NONE", "")

Now, both the buffer distance and the buffer unit are controlled by variables defined earlier in the script, which makes them easy to adjust if it is decided that the distance was not sufficient.

The next step towards adjusting the ArcPy tools is to write a function that will allow any number of feature classes to be intersected using the Intersect tool. This new function will be similar to the formatSQL functions shown previously, as it will use string formatting and addition to process a list of feature classes into the string format that the Intersect tool accepts.
However, as this function will be built to be as general as possible, it must be designed to accept any number of feature classes to be intersected:

    def formatIntersect(features):
        'a function to generate an intersect string'
        formatString = ''
        for count, feature in enumerate(features):
            if count != len(features)-1:
                formatString += feature + " #;"
            else:
                formatString += feature + " #"
        # return only after every feature has been added to the string
        return formatString

    >>> shpNames = ["example.shp","example2.shp"]
    >>> iString = formatIntersect(shpNames)
    >>> print iString

The output is as follows:

    example.shp #;example2.shp #

Now that we have written the formatIntersect() function, all that needs to be created is a list of the feature classes to be passed to the function. The string returned by the function can then be passed to the Intersect tool:

    intersected = [Inbound71_400ft_buffer, CensusBlocks2010]
    iString = formatIntersect(intersected)
    # Process: Intersect
    arcpy.Intersect_analysis(iString,
                             Intersect71Census, "ALL", "", "INPUT")

Because we avoided creating a function that only fits this script or analysis, we now have two (or more) useful functions that can be applied in later analyses, and we know how to manipulate the ArcPy tools to accept the data that we want to supply to them.

Summary

In this article, we discussed how to take autogenerated code and make it generalized, while adding functions that can be reused in other scripts and will make the generation of the necessary code components, such as SQL statements, much easier.

Packt
04 Mar 2015
19 min read

Native MS Security Tools and Configuration

This article, written by Santhosh Sivarajan, the author of Getting Started with Windows Server Security, will introduce another powerful Microsoft tool called Microsoft Security Compliance Manager (SCM). As its name suggests, it is a platform for managing and maintaining your security and compliance policies.

At this point, we have established baseline security based on your business requirements, using Microsoft SCW. These policies can be a pure reflection of your business requirements. However, in an enterprise world, you have to consider compliance, regulations, other industry standards, and best practices to maximize the effectiveness of the security policy. That's where Microsoft SCM can provide more business value. We will talk more about the included SCM baselines later in the article.

The goal of the article is to walk you through the configuration and administration process of Microsoft SCM and explain how it can be used in an enterprise environment to support your security needs. Then we will talk about a method to maintain the desired state of the server using a Microsoft tool called Attack Surface Analyzer (ASA). At the end of the article, you will see an option to add more security restrictions using another Microsoft tool called AppLocker.

Microsoft SCM

Microsoft SCM is a centralized security and compliance policy manager product from Microsoft. It is a standalone application. Microsoft develops these baselines and best practice recommendations based on customer feedback and other agencies' recommendations. These policies are regularly reviewed and updated, so it is important that you use the latest policy baseline. If there is a new policy, you will be able to download and update the baseline from the Microsoft SCM console itself.
Since Microsoft SCM supports multiple input and output formats such as XML, Group Policy Objects (GPO), Desired Configuration Management (DCM), Security Content Automation Protocol (SCAP), and so on, it can be a centralized platform for your network infrastructure and other security and compliance products. It is also possible to integrate SCM with Microsoft System Center 2012 Process Pack for IT GRC. More details can be found at http://technet.microsoft.com/en-us/library/dd206732.aspx.

Installing Microsoft SCM

We will start with the installation process. As mentioned earlier, it is a standalone product. It uses Microsoft SQL Server 2008 or higher as the database. If you don't have a SQL database already installed on your system, the SCM installation process will automatically install Microsoft SQL Server 2008 Express Edition. You can perform the following steps to install Microsoft SCM:

1. Download Microsoft Security Compliance Manager from http://www.microsoft.com/en-us/download/details.aspx?id=16776.
2. Double-click on Security_Compliance_Manager_Setup.exe to start the installation process.
3. Click on Next on the welcome window. Make sure to select the Always check for SCM and baseline updates option.
4. Accept the License Agreement option and click on Next.
5. Select the installation folder from the Installation Folder window by clicking on the Browse button. Click on Next.
6. On the Microsoft SQL Server 2008 Express window, click on Next to install Microsoft SQL Server 2008 Express Edition. If you have Microsoft SQL Server already installed on your system, you can select the correct server details from this window.
7. Accept the License Agreement option for SQL Server 2008 Express and click on Next.
8. Click on Install on the Ready to Install window to begin the installation. You will see the progress in the Installing the Microsoft Security Compliance Manager window. If it asks you to restart the computer, click on OK.
9. Click on Finish to complete the installation.
This section provides a high-level overview of the product before starting the administration and management process. The left pane of the SCM console provides the list of all available baselines. This is the baseline library inside SCM. The center pane displays more information based on your policy selection from the baseline library. The right pane, also called the Actions pane, provides commands and options to manage your policies. As you can see in the following screenshot, it provides a few options to export these policies into different formats. So, if you have a different compliance manager tool, you can use these files with your existing tool.

[Figure: SCM - Export options]

In line with other products, Microsoft SCM supports different severity levels: critical, optional, important, and none. As you can see in the following screenshot, on a custom policy, the severity levels can be changed to None, Important, Optional, or Critical based on your requirements. For each of these settings, you will see additional details and reference articles (CCE, OVAL, and so on) in the Setting Details section.

Administering Microsoft SCM

This section provides you with an overview of Microsoft SCM and some administration procedures to create and manage policies. These tasks can be achieved by performing the following steps:

1. Open Security Compliance Manager. If you see a Download Updates popup window, click on the Download button to start the download and complete the database update process.
2. Security Compliance Manager consists of mainly two sections: Custom Baselines and Microsoft Baselines. We will go through the details later in this article.
   [Figure: SCM - Baselines]
3. Expand Microsoft Baselines. Since we are focusing more on Windows Server 2012, I will start with this section.
4. Select the Windows Server 2012 node. This node contains predefined security policies based on Microsoft and industry best practices. I will use the predefined WS2012 Web Server Security template for this exercise. You will not be able to make changes to the settings in the default template. If you need to make changes, you can make a copy of the template and make changes there.
5. Select the WS2012 Web Server Security template. From the right pane, select the Duplicate option.
6. In the Duplicate window, enter the name for this new security policy. Click on Save.
7. The new template will be saved under the Custom Baselines node. You can review the policy and make necessary changes in the newly created policy.

Creating and implementing security policies

At this point, you have installed SCM and are familiar with the basic administration tasks. From this section onwards, you will be working on a real-world scenario in which you export an existing web server policy from Active Directory, import it into SCM, merge it with an SCM baseline, and import the result back into Active Directory.

Exporting GPO from Active Directory

We will start by exporting the existing web server policy from Active Directory. The following steps can be performed to export (back up) an Active Directory GPO-based policy:

1. Open the Group Policy Management Console.
2. Expand Forest | Domain | Domain Name | Group Policy Objects.
3. Right-click on the appropriate GPO and select Back Up.
   [Figure: GPO - Back up]
4. In the Back Up Group Policy Object window, enter the Location and Description details for the backup file. Click on the Back Up button to start the backup operation.
5. You will see the progress in the Backup window. Click on OK when it completes the backup operation.

A GPO can also be backed up using the Backup-GPO PowerShell cmdlet. The following is an example:

    Backup-GPO -Name "WebServerbaselineV2.0" -Path D:\Backup -Comment "Baseline Backup"

The backup folder name will be the GUID of the GPO itself.

Importing GPO into SCM

An exported GPO-based policy can be imported directly into SCM.
An administrator can perform the following steps to complete this task:

1. Open Microsoft Security Compliance Manager.
2. From the Import section on the right pane, select the GPO Backup (Folder) option.
   [Figure: SCM - Import]
3. In the Browse For Folder window, select the GPO backup folder. Click on OK.
4. In the GPO Name window, confirm or change the baseline name. Click on OK.
5. In the SCM Log window, you will see the status. Click on OK to close the window.
6. You will see the imported policy under Custom Baselines | GPO Import | Policy Name.

Currently, SCM supports importing from GPO backup and SCM CAB files. If you have some other policy or baseline (for example, DISA STIGs) that you would like to import into SCM, you need to import these policies into Active Directory first, and then export (back up) them as a GPO before you can import them into SCM.

Merging imported GPO with the SCM baseline policy

The third step in this process is to merge the imported policy with the SCM baseline policy. Keep in mind that some configurations and settings will be lost when you merge an existing GPO with the SCM baseline policy. For example, service-related or ACL configurations may not be preserved when you associate and merge with an SCM baseline policy. If you have these types of configuration in your GPO and want to retain them, you may need to split the GPO and use two separate GPOs. Inside SCM, the import process maps these configurations to the SCM library in order to preserve them. If a setting doesn't match or map, it will be dropped from the new baseline policy. For this exercise, my assumption is that you don't have any custom configuration or settings in the imported policy.

The following steps can be used to associate and merge a GPO-based policy into an SCM-based policy:

1. Select the imported policy in Microsoft Security Compliance Manager.
2. From the right pane, select the Associate option from the Baseline section.
   [Figure: Selecting the Associate option]
3. From the Associate Product with GPO window, select the appropriate baseline policy. Since we are working with a Windows Server 2012 policy, I will be selecting Windows Server 2012 as the product. If you have a different operating system, select the correct policy from the product list. Click on Associate. Your custom policy must have unique settings in the baseline policy in order to associate a custom policy with the SCM baseline policy; otherwise, the Associate button will be grayed out.
4. Enter a name for this policy in the Baseline Policy window. You will see this policy in the Custom Baselines | Windows Server 2012 section.
5. Select this policy. From the right pane, select the Compare/Merge option from the Baseline section.
   [Figure: Selecting the Compare / Merge option]
   Now you have associated your policy with an SCM baseline policy. The next step is to compare and merge your policy with a baseline SCM policy.
6. From the Compare Baseline window, select the appropriate baseline policy. Since we are working with a web server baseline, we will be selecting WS2012 Web Server Security 1.0 as the policy. Click on OK.
7. You will see the result in the Compare Baselines window. You can review the differences and matches here. Since we are planning to merge these two policies, we will be selecting the Merge Baselines option.
8. You will see the summary report in the Merge Baselines window. Click on OK.
9. In the Specify a name for the merged baseline window, enter a new name for this policy. Click on OK. This merged policy will be stored in the Custom Baselines | Windows Server 2012 section.

Exporting the SCM baseline policy

At this point, you have created a new policy that contains your custom policy and best practices provided by SCM. The next step is to export this policy to a supported format.
Since we are dealing with Active Directory and GPO, we will be exporting it into a GPO-based policy. You can perform the following steps to export an SCM policy to a GPO-based backup policy:

1. Select the policy from Microsoft Security Compliance Manager.
2. From the Export section, select the GPO Backup (Folder) option.
   [Figure: GPO Backup (Folder)]
3. From the Browse for Folder window, select the folder to store this policy in. Click on OK.

Importing a policy into Active Directory

The final step in this process is to import these settings back to Active Directory. This can be achieved by using Group Policy Management Console (GPMC). The following steps can be used to import an SCM-based policy into Active Directory:

1. Open Group Policy Management Console.
2. Expand Forest | Domain | Domain Name | Group Policy Objects.
3. Right-click on the appropriate policy. Select the Import Settings option.
   [Figure: The Import Settings option]
4. Click on Next in the Welcome window.
5. It is always a best practice to back up the existing settings. Click on Backup to continue with the backup operation.
6. Once you have completed the backup, click on Next in the Backup GPO window.
7. In the Backup Location window, select the backup location folder. Click on Next.
8. Confirm the GPO name in the Source GPO window. Click on Next.
9. You will see the scanning settings in the Scanning Backup window. Click on Next to continue.
10. Click on Finish in the Completing the Import Settings Wizard window to complete the import operation.
11. Click on OK in the Import window.

Maintaining and monitoring the integrity of a baseline policy

Once you have baseline security in place, whether it is a true business policy or a combination of business and industry practices, you will need to maintain this state to ensure security and integrity. The whole idea is to compare your baseline image with the current image in order to validate the settings. There are many ways to achieve this.
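Conceptually, this comparison is a diff between two snapshots of settings. The following Python sketch illustrates only that idea: the function compare_states and the setting names and values are hypothetical, and it does not represent how any Microsoft tool is implemented or the format in which such tools store their data.

```python
def compare_states(baseline, current):
    """Return settings that were added, removed, or changed relative to the baseline."""
    added = {k: current[k] for k in current if k not in baseline}
    removed = {k: baseline[k] for k in baseline if k not in current}
    # For changed settings, keep both values as a (baseline, current) pair.
    changed = {k: (baseline[k], current[k])
               for k in baseline
               if k in current and baseline[k] != current[k]}
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical setting names and values, for illustration only.
baseline = {"MinimumPasswordLength": 14, "TelnetService": "Disabled"}
current = {"MinimumPasswordLength": 8, "TelnetService": "Disabled",
           "FtpService": "Running"}
drift = compare_states(baseline, current)
print(drift)
```

Anything reported under added, removed, or changed is a deviation from the baseline that would need to be reviewed.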
Microsoft has a free tool called Attack Surface Analyzer (ASA) that can be used to compare two states of the system. The details and capabilities of this tool can be found at http://www.microsoft.com/en-us/download/details.aspx?id=24487.

Microsoft ASA

An administrator can perform the following steps to install, configure, and generate an Attack Surface Report using Microsoft ASA:

1. Download Attack Surface Analyzer from http://www.microsoft.com/en-us/download/details.aspx?id=24487.
2. Complete the installation. It is a standalone, simple MSI installation process.
3. Open the Attack Surface Analyzer tool.
4. The first step is to create the baseline state. Select the Run New Scan option and enter a name for the CAB file. Click on Run Scan to start the scanning process.
5. You will see the status and progress in the Collecting Data window. When it completes, it will create a CAB file with the result.
6. The second step in this process is to analyze the baseline state against the existing server so as to identify the differences. You will need to create a second scan (the product CAB) to compare with the baseline CAB. Select the Run New Scan option again and enter a name for the product CAB file. Click on Run Scan to start the scanning process.
7. Complete the CAB creation process.
8. The third step in the process is to compare the baseline CAB with the product CAB to get the delta. Select the Generate Standard Attack Surface Report option.
9. In the Select Options section, select the baseline CAB name, select the product CAB name, and enter a name for the attack report. Click on Generate to start the process.
10. You will see the status in the Running Analysis window. The report will be opened automatically in the web browser.

This report has three sections: Report Summary, Security Issues, and Attack Surface. The following is an example of a Security Issues report:

[Figure: Security Issues report]

Application control and management

At this point, you have a baseline policy for your server platform.
Now we can add more restrictions based on your requirements to provide a more secure environment. In the following section, my plan is to introduce an option to "blacklist" and "whitelist" some of the applications using a built-in native option called AppLocker. The details of the AppLocker application can be found at http://technet.microsoft.com/en-us/library/hh831409.aspx. AppLocker AppLocker polices are part of Application Control Policies in GPOs. There are four types of built-in rules: Executable, Windows Installable, Script, and Packed App rules. Before you create or enforce a policy, you need to perform an inventory check to identify the current usage of these applications in your environment. AppLocker has an inventory process called Auditing that helps you to achieve this. In this scenario, our goal is to block unauthorized access of the NLTEST application from all servers. Creating a policy As the first step, you need to identify the current usage of the application in your environment. The following steps can be performed to create a new AppLocker policy in an Active Directory environment: Open Group Policy Manager Console. Expand Forest | Domain | Domain Name. Right-click on the Group Policy Object node and select New. Enter a name for the GPO in the New GPO window. Leave Source Starter GPO as (none). Click on OK. This will create a new blank GPO in the Group Policy Object node. We will be using this GPO to configure the AppLocker settings. Right-click on the newly created GPO and select Edit. This will open the Group Policy Management Editor window. Expand Policies | Windows Settings | Security Settings | AppLocker. Right-click on Executable Polices and select Create Default Rules. These default rules allow users and built-in administrators to run default programs and administrators to run files and applications. Based on your requirements, you can modify and delete these rules. 
The default AppLocker rules allow everyone to run files located in the Windows and Program Files folders, and allow administrators to run all files.

The default AppLocker rule

Expand Policies | Windows Settings | Security Settings | AppLocker. Right-click on Executable Rules and select Create New Rule. Click on Next in the Create Executable Rules window. In the Permissions window, select Deny. In the User or Group section, click on Select and select the Server Admins group. Here, I have created a security group that contains all server administrators. In the Conditions window, select the File Hash option. Click on Next. In the File Hash window, select the correct file using the Browse Files option. In this scenario, I will be selecting the NLTEST.exe file. Click on Next. In the Name and Description window, enter an appropriate name for this rule. Click on Create.

Auditing a policy

The next step in this process is to audit the previously created policies to ensure that there will not be any adverse effects on your environment. An administrator can perform the following steps to audit an existing policy in an Active Directory environment: Right-click on AppLocker (Policies | Windows Settings | Security Settings) and go to Properties. On the Enforcement tab, select the appropriate rule types as Configured. From the drop-down list, select Audit only. Click on OK.

GPO – AppLocker policy

You can see the application usage and history in the Event log. Open Event Viewer. Navigate to Applications and Services Logs | Microsoft | Windows | AppLocker. Based on your policy configuration, you will see the appropriate event information in the AppLocker section.

In an enterprise environment, manually checking the items in an event log is not a viable option. You have a few options available to automate this process.
You can forward the event log to a central server (Event Forwarding) and verify from that single console, or you can use the Get-WinEvent PowerShell cmdlet to collect these events remotely. The following section shows how to evaluate these logs using the Get-WinEvent PowerShell cmdlet. By default, AppLocker events are located in the Applications and Services Logs | Microsoft | Windows | AppLocker section of the Event Viewer. The following command collects all AppLocker-related events from Server01 and writes them to the output file Server01.txt:

    Get-WinEvent -ComputerName "SERVER01.MYINFRALAB.COM" -LogName *AppLocker* | fl | Out-File Server01.txt

Here are some of the events that you will see in the event log:

If you have multiple computers to evaluate, you can create a simple PowerShell script that reads the computer names from an input file. The following is a sample PowerShell script. The Servers.txt file is the input file that contains all of the server names, one per line:

    $OutPut = "C:\Input\Output.txt"
    Get-Content "C:\Input\Servers.txt" | ForEach-Object {
        # Write the server name as a header, then append its AppLocker events
        $_ | Out-File $OutPut -Append -Encoding ascii
        Get-WinEvent -ComputerName $_ -LogName *AppLocker* | fl | Out-File $OutPut -Append -Encoding ascii
    }

Implementing the policy

Once you have verified the audit result, you can enforce the policy using the AppLocker GPO. The following steps can be used to implement the AppLocker GPO in an Active Directory environment: Open the Group Policy Management Console. Expand the Forest | Domains | Domain Name | Group Policy Objects node. Right-click on the Server Application Restriction GPO and select Edit. This will open the Group Policy Management Editor MMC window.

Opening the Group Policy Management Editor MMC window

From Group Policy Management Editor, expand Policies | Windows Settings | Security Settings. Right-click on AppLocker and select Properties. In the AppLocker Properties window, change Executable rules to Enforce rules.
Click on OK. Close the Group Policy Management Editor MMC window. The new policy will apply to the server based on your Active Directory replication interval and GPO refresh cycle. You can run the gpupdate /force command on a server to apply the GPO to it immediately. Two different results are shown in the following screenshots. As you can see in the following screenshot, the user Johndoe was denied the execution of the NLTEST.exe application:

Since the following user was part of the Server Admins group, the user was allowed to execute the NLTEST.exe application:

Some additional security recommendations to consider when installing and configuring AppLocker are included at http://technet.microsoft.com/en-us/library/ee844118(WS.10).aspx.

AppLocker and PowerShell

AppLocker can be managed from PowerShell through a module called AppLocker. An administrator can create, test, and troubleshoot AppLocker policies using its cmdlets. You need to import the AppLocker module before these cmdlets can be used. The following are the supported cmdlets in the module: Get-AppLockerFileInformation, Get-AppLockerPolicy, New-AppLockerPolicy, Set-AppLockerPolicy, and Test-AppLockerPolicy.

Summary

We started this article with a baseline security policy for your server platform, which was originally created using Microsoft SCW. You learned how to combine this policy with the baseline and best practice recommendations using Microsoft SCM. Then you used AppLocker to enforce more application-based security. We also learned how to monitor the state of the server and compare it with the baseline to identify security vulnerabilities and issues using Microsoft ASA.

Resources for Article: Further resources on this subject: Active Directory migration [article] Microsoft DAC 2012 [article] Insight into Hyper-V Storage [article]
Packt
04 Mar 2015
38 min read

KnockoutJS Templates

In this article by Jorge Ferrando, author of the book KnockoutJS Essentials, we are going to talk about how to design our templates with the native engine, and then we will look at mechanisms and external libraries we can use to improve the Knockout template engine. When our code begins to grow, it's necessary to split it into several parts to keep it maintainable. When we split JavaScript code, we talk about modules, classes, functions, libraries, and so on. When we talk about HTML, we call these parts templates. KnockoutJS has a native template engine that we can use to manage our HTML. It is very simple, but it also has a big inconvenience: all templates must be loaded in the current HTML page. This is not a problem if our app is small, but it could be a problem if our application begins to need more and more templates.

Preparing the project

First of all, we are going to add some style to the page. Add a file called style.css into the css folder. Add a reference in the index.html file, just below the bootstrap reference.
The following is the content of the file: .container-fluid { margin-top: 20px; } .row { margin-bottom: 20px; } .cart-unit { width: 80px; } .btn-xs { font-size:8px; } .list-group-item { overflow: hidden; } .list-group-item h4 { float:left; width: 100px; } .list-group-item .input-group-addon { padding: 0; } .btn-group-vertical > .btn-default { border-color: transparent; } .form-control[disabled], .form-control[readonly] { background-color: transparent !important; } Now remove all the content from the body tag except for the script tags and paste in these lines: <div class="container-fluid"> <div class="row" id="catalogContainer">    <div class="col-xs-12"       data-bind="template:{name:'header'}"></div>    <div class="col-xs-6"       data-bind="template:{name:'catalog'}"></div>    <div id="cartContainer" class="col-xs-6 well hidden"       data-bind="template:{name:'cart'}"></div> </div> <div class="row hidden" id="orderContainer"     data-bind="template:{name:'order'}"> </div> <div data-bind="template: {name:'add-to-catalog-modal'}"></div> <div data-bind="template: {name:'finish-order-modal'}"></div> </div> Let's review this code. We have two row classes. They will be our containers. The first container is named with the id value as catalogContainer and it will contain the catalog view and the cart. The second one is referenced by the id value as orderContainer and we will set our final order there. We also have two more <div> tags at the bottom that will contain the modal dialogs to show the form to add products to our catalog and the other one will contain a modal message to tell the user that our order is finished. Along with this code you can see a template binding inside the data-bind attribute. This is the binding that Knockout uses to bind templates to the element. It contains a name parameter that represents the ID of a template. 
<div class="col-xs-12" data-bind="template:{name:'header'}"></div> In this example, this <div> element will contain the HTML that is inside the <script> tag with the ID header. Creating templates Template elements are commonly declared at the bottom of the body, just above the <script> tags that have references to our external libraries. We are going to define some templates and then we will talk about each one of them: <!-- templates --> <script type="text/html" id="header"></script> <script type="text/html" id="catalog"></script> <script type="text/html" id="add-to-catalog-modal"></script> <script type="text/html" id="cart-widget"></script> <script type="text/html" id="cart-item"></script> <script type="text/html" id="cart"></script> <script type="text/html" id="order"></script> <script type="text/html" id="finish-order-modal"></script> Each template name is descriptive enough by itself, so it's easy to know what we are going to set inside them. Let's see a diagram showing where we dispose each template on the screen:   Notice that the cart-item template will be repeated for each item in the cart collection. Modal templates will appear only when a modal dialog is displayed. Finally, the order template is hidden until we click to confirm the order. In the header template, we will have the title and the menu of the page. The add-to-catalog-modal template will contain the modal that shows the form to add a product to our catalog. The cart-widget template will show a summary of our cart. The cart-item template will contain the template of each item in the cart. The cart template will have the layout of the cart. The order template will show the final list of products we want to buy and a button to confirm our order. 
The header template Let's begin with the HTML markup that should contain the header template: <script type="text/html" id="header"> <h1>    Catalog </h1>   <button class="btn btn-primary btn-sm" data-toggle="modal"     data-target="#addToCatalogModal">    Add New Product </button> <button class="btn btn-primary btn-sm" data-bind="click:     showCartDetails, css:{ disabled: cart().length < 1}">    Show Cart Details </button> <hr/> </script> We define a <h1> tag, and two <button> tags. The first button tag is attached to the modal that has the ID #addToCatalogModal. Since we are using Bootstrap as the CSS framework, we can attach modals by ID using the data-target attribute, and activate the modal using the data-toggle attribute. The second button will show the full cart view and it will be available only if the cart has items. To achieve this, there are a number of different ways. The first one is to use the CSS-disabled class that comes with Twitter Bootstrap. This is the way we have used in the example. CSS binding allows us to activate or deactivate a class in the element depending on the result of the expression that is attached to the class. The other method is to use the enable binding. This binding enables an element if the expression evaluates to true. We can use the opposite binding, which is named disable. There is a complete documentation on the Knockout website http://knockoutjs.com/documentation/enable-binding.html: <button class="btn btn-primary btn-sm" data-bind="click:   showCartDetails, enable: cart().length > 0"> Show Cart Details </button>   <button class="btn btn-primary btn-sm" data-bind="click:   showCartDetails, disable: cart().length < 1"> Show Cart Details </button> The first method uses CSS classes to enable and disable the button. The second method uses the HTML attribute, disabled. We can use a third option, which is to use a computed observable. 
We can create a computed observable in our view-model that returns true or false depending on the length of the cart:

    // In the viewmodel. Remember to expose it
    var cartHasProducts = ko.computed(function(){
      return (cart().length > 0);
    });

    // HTML
    <button class="btn btn-primary btn-sm"
      data-bind="click: showCartDetails, enable: cartHasProducts">
      Show Cart Details
    </button>

To show the cart, we will use the click binding. Now we should go to our viewmodel.js file and add all the information we need to make this template work:

    var cart = ko.observableArray([]);
    var showCartDetails = function () {
      if (cart().length > 0) {
        $("#cartContainer").removeClass("hidden");
      }
    };

And you should expose these two objects in the view-model:

    return {
      searchTerm: searchTerm,
      catalog: filteredCatalog,
      newProduct: newProduct,
      totalItems: totalItems,
      addProduct: addProduct,
      cart: cart,
      showCartDetails: showCartDetails
    };

The catalog template

The next step is to define the catalog template just below the header template:

    <script type="text/html" id="catalog">
      <div class="input-group">
        <span class="input-group-addon">
          <i class="glyphicon glyphicon-search"></i> Search
        </span>
        <input type="text" class="form-control"
          data-bind="textInput: searchTerm">
      </div>
      <table class="table">
        <thead>
          <tr>
            <th>Name</th>
            <th>Price</th>
            <th>Stock</th>
            <th></th>
          </tr>
        </thead>
        <tbody data-bind="foreach:catalog">
          <tr data-bind="style:{color: stock() < 5 ? 'red' : 'black'}">
            <td data-bind="text:name"></td>
            <td data-bind="text:price"></td>
            <td data-bind="text:stock"></td>
            <td>
              <button class="btn btn-primary"
                data-bind="click:$parent.addToCart">
                <i class="glyphicon glyphicon-plus-sign"></i> Add
              </button>
            </td>
          </tr>
        </tbody>
        <tfoot>
          <tr>
            <td colspan="3">
              <strong>Items:</strong><span
                data-bind="text:catalog().length"></span>
            </td>
            <td colspan="1">
              <span data-bind="template:{name:'cart-widget'}"></span>
            </td>
          </tr>
        </tfoot>
      </table>
    </script>

Now, each line uses the style binding to alert the user, while they are shopping, that the stock is running low. The style binding works the same way that the css binding does with classes. It allows us to add style attributes depending on the value of the expression, and it takes an object whose keys are style names. In this case, the color of the text in the line is black if the stock is five or more, and red if it is four or less. We can use other CSS attributes, so feel free to try other behaviors. For example, set the line of the catalog to green if the element is inside the cart. We should remember that if a style name has dashes, you should wrap it in single quotes. For example, background-color will throw an error, so you should write 'background-color'.

When we work with bindings that are activated depending on the values of the viewmodel, it is good practice to use computed observables. Therefore, we can create a computed value in our product model that returns the value of the color that should be displayed:

    // In Product.js
    var _lineColor = ko.computed(function(){
      return (_stock() < 5) ? 'red' : 'black';
    });
    return {
      lineColor: _lineColor
    };

    // In the template
    <tr data-bind="style:{color: lineColor}"> ... </tr>

It would be even better if we create a class in our style.css file called stock-alert and use the css binding, which takes an object that maps a class name to the condition that activates it:

    /* In the style file */
    .stock-alert {
      color: #f00;
    }

    // In Product.js (note that hasStock is true when fewer than five units remain)
    var _hasStock = ko.computed(function(){
      return (_stock() < 5);
    });
    return {
      hasStock: _hasStock
    };

    // In the template
    <tr data-bind="css:{'stock-alert': hasStock}"> ... </tr>

Now, look inside the <tfoot> tag.

    <td colspan="1">
      <span data-bind="template:{name:'cart-widget'}"></span>
    </td>

As you can see, we can have nested templates. In this case, we have the cart-widget template inside our catalog template.
This gives us the possibility of building very complex templates by splitting them into very small pieces and combining them, which keeps our code clean and maintainable. Finally, look at the last cell of each row:

    <td>
      <button class="btn btn-primary"
        data-bind="click:$parent.addToCart">
        <i class="glyphicon glyphicon-plus-sign"></i> Add
      </button>
    </td>

Look at how we call the addToCart method using the magic variable $parent. Knockout gives us some magic words to navigate through the different contexts we have in our app. In this case, we are in the catalog context and we want to call a method that lies one level up, so we use the magic variable called $parent. There are other variables we can use when we are inside a Knockout context. There is complete documentation on the Knockout website http://knockoutjs.com/documentation/binding-context.html. In this project, we are not going to use all of them, but we are going to quickly explain these binding context variables, just to understand them better.

If we don't know how many levels deep we are, we can navigate to the top of the view-model using the magic word $root. When we have many parents, we can get the magic array $parents and access each parent using indexes, for example, $parents[0], $parents[1]. Imagine that you have a list of categories where each category contains a list of products. These products are a list of IDs and the category has a method to get the name of its products. We can use the $parents array to obtain the reference to the category:

    <ul data-bind="foreach: {data: categories}">
      <li data-bind="text: $data.name"></li>
      <ul data-bind="foreach: {data: $data.products, as: 'prod'}">
        <li data-bind="text:
          $parents[0].getProductName(prod.ID)"></li>
      </ul>
    </ul>

Look how helpful the as attribute is inside the foreach binding. It makes code more readable.
But if you are inside a foreach loop, you can also access each item using the $data magic variable, and you can access the position index that each element has in the collection using the $index magic variable. For example, if we have a list of products, we can do this: <ul data-bind="foreach: cart"> <li><span data-bind="text:$index">    </span> - <span data-bind="text:$data.name"></span> </ul> This should display: 0 – Product 1 1 – Product 2 2 – Product 3 ...  KnockoutJS magic variables to navigate through contexts Now that we know more about what binding variables are, let's go back to our code. We are now going to write the addToCart method. We are going to define the cart items in our js/models folder. Create a file called CartProduct.js and insert the following code in it: //js/models/CartProduct.js var CartProduct = function (product, units) { "use strict";   var _product = product,    _units = ko.observable(units);   var subtotal = ko.computed(function(){    return _product.price() * _units(); });   var addUnit = function () {    var u = _units();    var _stock = _product.stock();    if (_stock === 0) {      return;    } _units(u+1);    _product.stock(--_stock); };   var removeUnit = function () {    var u = _units();    var _stock = _product.stock();    if (u === 0) {      return;    }    _units(u-1);    _product.stock(++_stock); };   return {    product: _product,    units: _units,    subtotal: subtotal,    addUnit : addUnit,    removeUnit: removeUnit, }; }; Each cart product is composed of the product itself and the units of the product we want to buy. We will also have a computed field that contains the subtotal of the line. We should give the object the responsibility for managing its units and the stock of the product. For this reason, we have added the addUnit and removeUnit methods. These methods add one unit or remove one unit of the product if they are called. 
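The bookkeeping between units and stock can be checked without a browser. What follows is a minimal sketch, not the book's code: `observable` is a hypothetical stand-in that only imitates the read/write call style of ko.observable (no subscriptions or dependency tracking), CartProduct is condensed from the listing above, and the sample product is invented for illustration.

```javascript
// Hypothetical stand-in for ko.observable: call it with no argument to read
// the value and with one argument to write it. No subscribers are notified.
function observable(initial) {
  var value = initial;
  return function (next) {
    if (arguments.length > 0) {
      value = next;
    }
    return value;
  };
}

// Condensed CartProduct: same unit/stock rules as the article's version.
var CartProduct = function (product, units) {
  var _units = observable(units);

  var addUnit = function () {
    var stock = product.stock();
    if (stock === 0) {
      return; // nothing left to take from the catalog
    }
    _units(_units() + 1);
    product.stock(stock - 1);
  };

  var removeUnit = function () {
    var u = _units();
    if (u === 0) {
      return; // the line is already empty
    }
    _units(u - 1);
    product.stock(product.stock() + 1);
  };

  // Plain function here; in the article this is a ko.computed.
  var subtotal = function () {
    return product.price() * _units();
  };

  return {
    product: product,
    units: _units,
    subtotal: subtotal,
    addUnit: addUnit,
    removeUnit: removeUnit
  };
};

// Invented sample product with two units in stock.
var shirt = {
  name: observable("T-shirt"),
  price: observable(10),
  stock: observable(2)
};

var line = new CartProduct(shirt, 0);
line.addUnit();
line.addUnit();
line.addUnit(); // ignored because the stock is already 0

console.log(line.units(), shirt.stock(), line.subtotal()); // logs: 2 0 20
```

In this sketch, the units in the cart line plus the stock left in the catalog always add up to the original stock, which is the invariant the two methods are meant to protect.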
We should reference this JavaScript file in our index.html file with the other <script> tags. In the viewmodel, we should create a cart array and expose it in the return statement, as we have done earlier:

    var cart = ko.observableArray([]);

It's time to write the addToCart method:

    var addToCart = function(data) {
      var item = null;
      var tmpCart = cart();
      var n = tmpCart.length;
      // Look for an existing cart line for this product
      while(n--) {
        if (tmpCart[n].product.id() === data.id()) {
          item = tmpCart[n];
        }
      }
      if (item) {
        item.addUnit();
      } else {
        item = new CartProduct(data,0);
        item.addUnit();
        tmpCart.push(item);
      }
      cart(tmpCart);
    };

This method searches for the product in the cart. If it exists, it updates its units, and if not, it creates a new cart line. Since the cart is an observable array, we need to get the underlying array, manipulate it, and write it back, because we need to access the product objects to know whether the product is already in the cart. Remember that observable arrays do not observe the objects they contain, just the array itself.

The add-to-catalog-modal template

This is a very simple template.
We just wrap the code to add a product to a Bootstrap modal: <script type="text/html" id="add-to-catalog-modal"> <div class="modal fade" id="addToCatalogModal">    <div class="modal-dialog">      <div class="modal-content">        <form class="form-horizontal" role="form"           data-bind="with:newProduct">          <div class="modal-header">            <button type="button" class="close"               data-dismiss="modal">              <span aria-hidden="true">&times;</span>              <span class="sr-only">Close</span>            </button><h3>Add New Product to the Catalog</h3>          </div>          <div class="modal-body">            <div class="form-group">              <div class="col-sm-12">                <input type="text" class="form-control"                  placeholder="Name" data-bind="textInput:name">              </div>            </div>            <div class="form-group">              <div class="col-sm-12">                <input type="text" class="form-control"                   placeholder="Price" data-bind="textInput:price">              </div>            </div>            <div class="form-group">              <div class="col-sm-12">                <input type="text" class="form-control"                   placeholder="Stock" data-bind="textInput:stock">              </div>            </div>          </div>          <div class="modal-footer">            <div class="form-group">              <div class="col-sm-12">                <button type="submit" class="btn btn-default"                  data-bind="{click:$parent.addProduct}">                  <i class="glyphicon glyphicon-plus-sign">                  </i> Add Product                </button>              </div>            </div>          </div>        </form>      </div><!-- /.modal-content -->    </div><!-- /.modal-dialog --> </div><!-- /.modal --> </script> The cart-widget template This template gives the user information quickly about how many items are in the cart and how much all 
of them cost: <script type="text/html" id="cart-widget"> Total Items: <span data-bind="text:totalItems"></span> Price: <span data-bind="text:grandTotal"></span> </script> We should define totalItems and grandTotal in our viewmodel: var totalItems = ko.computed(function(){ var tmpCart = cart(); var total = 0; tmpCart.forEach(function(item){    total += parseInt(item.units(),10); }); return total; }); var grandTotal = ko.computed(function(){ var tmpCart = cart(); var total = 0; tmpCart.forEach(function(item){    total += (item.units() * item.product.price()); }); return total; }); Now you should expose them in the return statement, as we always do. Don't worry about the format now, you will learn how to format currency or any kind of data in the future. Now you must focus on learning how to manage information and how to show it to the user. The cart-item template The cart-item template displays each line in the cart: <script type="text/html" id="cart-item"> <div class="list-group-item" style="overflow: hidden">    <button type="button" class="close pull-right" data-bind="click:$root.removeFromCart"><span>&times;</span></button>    <h4 class="" data-bind="text:product.name"></h4>    <div class="input-group cart-unit">      <input type="text" class="form-control" data-bind="textInput:units" readonly/>        <span class="input-group-addon">          <div class="btn-group-vertical">            <button class="btn btn-default btn-xs"               data-bind="click:addUnit">              <i class="glyphicon glyphicon-chevron-up"></i>            </button>            <button class="btn btn-default btn-xs"               data-bind="click:removeUnit">              <i class="glyphicon glyphicon-chevron-down"></i>            </button>          </div>        </span>    </div> </div> </script> We set an x button in the top-right of each line to easily remove a line from the cart. 
As you can see, we have used the $root magic variable to navigate to the top context because we are going to use this template inside a foreach loop, which means this template will be rendered in the loop context. If we consider this template as an isolated element, we can't be sure how deep we are in the context navigation. To be sure that we reach the right context to call the removeFromCart method, it's better to use $root instead of $parent in this case. The code for removeFromCart should lie in the viewmodel context and should look like this:

    var removeFromCart = function (data) {
      var units = data.units();
      var stock = data.product.stock();
      // Return the units of the removed line to the catalog stock
      data.product.stock(units+stock);
      cart.remove(data);
    };

Notice that in the addToCart method, we get the array that is inside the observable. We did that because we need to navigate inside the elements of the array. Here we don't need to, because Knockout observable arrays have a method called remove that allows us to remove the object that we pass as a parameter. If the object is in the array, it will be removed. Remember that the data context is always passed as the first parameter to the function we use in click events.

The cart template

The cart template should display the layout of the cart:

    <script type="text/html" id="cart">
      <button type="button" class="close pull-right"
        data-bind="click:hideCartDetails">
        <span>&times;</span>
      </button>
      <h1>Cart</h1>
      <div data-bind="template: {name: 'cart-item', foreach:cart}"
        class="list-group"></div>
      <div data-bind="template:{name:'cart-widget'}"></div>
      <button class="btn btn-primary btn-sm"
        data-bind="click:showOrder">
        Confirm Order
      </button>
    </script>

It's important that you notice the template binding that we have just below <h1>Cart</h1>. We are binding a template with an array using the foreach argument. With this binding, Knockout renders the cart-item template for each element inside the cart collection.
This considerably reduces the code we write in each template and in addition makes them more readable. We have once again used the cart-widget template to show the total items and the total amount. This is one of the good features of templates, we can reuse content over and over. Observe that we have put a button at the top-right of the cart to close it when we don't need to see the details of our cart, and the other one to confirm the order when we are done. The code in our viewmodel should be as follows: var hideCartDetails = function () { $("#cartContainer").addClass("hidden"); }; var showOrder = function () { $("#catalogContainer").addClass("hidden"); $("#orderContainer").removeClass("hidden"); }; As you can see, to show and hide elements we use jQuery and CSS classes from the Bootstrap framework. The hidden class just adds the display: none style to the elements. We just need to toggle this class to show or hide elements in our view. Expose these two methods in the return statement of your view-model. We will come back to this when we need to display the order template. This is the result once we have our catalog and our cart:   The order template Once we have clicked on the Confirm Order button, the order should be shown to us, to review and confirm if we agree. 
<script type="text/html" id="order"> <div class="col-xs-12">    <button class="btn btn-sm btn-primary"       data-bind="click:showCatalog">      Back to catalog    </button>    <button class="btn btn-sm btn-primary"       data-bind="click:finishOrder">      Buy & finish    </button> </div> <div class="col-xs-6">    <table class="table">      <thead>      <tr>        <th>Name</th>        <th>Price</th>        <th>Units</th>        <th>Subtotal</th>      </tr>      </thead>      <tbody data-bind="foreach:cart">      <tr>        <td data-bind="text:product.name"></td>        <td data-bind="text:product.price"></td>        <td data-bind="text:units"></td>        <td data-bind="text:subtotal"></td>      </tr>      </tbody>      <tfoot>      <tr>        <td colspan="3"></td>        <td>Total:<span data-bind="text:grandTotal"></span></td>      </tr>      </tfoot>    </table> </div> </script> Here we have a read-only table with all cart lines and two buttons. One is to confirm, which will show the modal dialog saying the order is completed, and the other gives us the option to go back to the catalog and keep on shopping. There is some code we need to add to our viewmodel and expose to the user: var showCatalog = function () { $("#catalogContainer").removeClass("hidden"); $("#orderContainer").addClass("hidden"); }; var finishOrder = function() { cart([]); hideCartDetails(); showCatalog(); $("#finishOrderModal").modal('show'); }; As we have done in previous methods, we add and remove the hidden class from the elements we want to show and hide. The finishOrder method removes all the items of the cart because our order is complete; hides the cart and shows the catalog. It also displays a modal that gives confirmation to the user that the order is done.  
Order details template The finish-order-modal template The last template is the modal that tells the user that the order is complete: <script type="text/html" id="finish-order-modal"> <div class="modal fade" id="finishOrderModal">    <div class="modal-dialog">            <div class="modal-content">        <div class="modal-body">        <h2>Your order has been completed!</h2>        </div>        <div class="modal-footer">          <div class="form-group">            <div class="col-sm-12">              <button type="submit" class="btn btn-success"                 data-dismiss="modal">Continue Shopping              </button>            </div>          </div>        </div>      </div><!-- /.modal-content -->    </div><!-- /.modal-dialog --> </div><!-- /.modal --> </script> The following screenshot displays the output:   Handling templates with if and ifnot bindings You have learned how to show and hide templates with the power of jQuery and Bootstrap. This is quite good because you can use this technique with any framework you want. The problem with this type of code is that since jQuery is a DOM manipulation library, you need to reference elements to manipulate them. This means you need to know over which element you want to apply the action. Knockout gives us some bindings to hide and show elements depending on the values of our view-model. Let's update the show and hide methods and the templates. Add both the control variables to your viewmodel and expose them in the return statement. var visibleCatalog = ko.observable(true); var visibleCart = ko.observable(false); Now update the show and hide methods: var showCartDetails = function () { if (cart().length > 0) {    visibleCart(true); } };   var hideCartDetails = function () { visibleCart(false); };   var showOrder = function () { visibleCatalog(false); };   var showCatalog = function () { visibleCatalog(true); }; We can appreciate how the code becomes more readable and meaningful. 
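The effect of these four methods on the two flags can be traced in plain JavaScript. This is only a sketch under assumptions: `observable` is a hypothetical stand-in for ko.observable and ko.observableArray that supports reading and writing but no change notification, and the cart item is a placeholder string.

```javascript
// Hypothetical stand-in for ko.observable: read with no argument,
// write with one argument. It does not notify any bindings.
function observable(initial) {
  var value = initial;
  return function (next) {
    if (arguments.length > 0) {
      value = next;
    }
    return value;
  };
}

var cart = observable([]);
var visibleCatalog = observable(true);
var visibleCart = observable(false);

// The four methods as rewritten in the article.
var showCartDetails = function () {
  if (cart().length > 0) {
    visibleCart(true);
  }
};
var hideCartDetails = function () {
  visibleCart(false);
};
var showOrder = function () {
  visibleCatalog(false);
};
var showCatalog = function () {
  visibleCatalog(true);
};

// Record both flags after each step so the transitions can be inspected.
var trace = [];
function snapshot(label) {
  trace.push(label + ": catalog=" + visibleCatalog() + ", cart=" + visibleCart());
}

showCartDetails(); // the cart is empty, so nothing changes
snapshot("empty cart");

cart(["placeholder item"]);
showCartDetails();
snapshot("cart opened");

showOrder();
snapshot("order shown");

showCatalog();
hideCartDetails();
snapshot("back to catalog");

console.log(trace.join("\n"));
```

Note that the methods no longer mention any DOM element: they only change state, and the bindings decide what that state means for the page.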
Now, update the cart template, the catalog template, and the order template. In index.html, consider this line: <div class="row" id="catalogContainer"> Replace it with the following line: <div class="row" data-bind="if: visibleCatalog"> Then consider the following line: <div id="cartContainer" class="col-xs-6 well hidden"   data-bind="template:{name:'cart'}"></div> Replace it with this one: <div class="col-xs-6" data-bind="if: visibleCart"> <div class="well" data-bind="template:{name:'cart'}"></div> </div> It is important to know that the if binding and the template binding can't share the same data-bind attribute. This is why we go from one element to two nested elements in this template. In other words, this example is not allowed: <div class="col-xs-6" data-bind="if:visibleCart,   template:{name:'cart'}"></div> Finally, consider this line: <div class="row hidden" id="orderContainer"   data-bind="template:{name:'order'}"> Replace it with this one: <div class="row" data-bind="ifnot: visibleCatalog"> <div data-bind="template:{name:'order'}"></div> </div> With the changes we have made, showing or hiding elements now depends on our data and not on our CSS. This is much better because now we can show and hide any element we want using the if and ifnot binding. 
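Because the cart's if binding is nested inside the row controlled by visibleCatalog, the cart is only rendered when both flags are true. A small sketch makes the combinations explicit; `visibleSections` is a hypothetical helper written for this illustration and is not part of Knockout or of the project.

```javascript
// Hypothetical helper: given the two flags, list which parts of the page
// the if / ifnot bindings in index.html would render. The two modal
// templates are always rendered, so they are left out.
function visibleSections(visibleCatalog, visibleCart) {
  var sections = [];
  if (visibleCatalog) {
    // <div class="row" data-bind="if: visibleCatalog">
    sections.push("header", "catalog");
    if (visibleCart) {
      // nested <div class="col-xs-6" data-bind="if: visibleCart">
      sections.push("cart");
    }
  } else {
    // <div class="row" data-bind="ifnot: visibleCatalog">
    sections.push("order");
  }
  return sections;
}

console.log(visibleSections(true, false).join(", "));  // header, catalog
console.log(visibleSections(true, true).join(", "));   // header, catalog, cart
console.log(visibleSections(false, true).join(", "));  // order
```

The last call shows why showOrder only needs to change visibleCatalog: once the outer row is removed, the value of visibleCart no longer matters.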
Let's review, roughly speaking, how our files are now: We have our index.html file that has the main container, templates, and libraries: <!DOCTYPE html> <html> <head> <title>KO Shopping Cart</title> <meta name="viewport" content="width=device-width,     initial-scale=1"> <link rel="stylesheet" type="text/css"     href="css/bootstrap.min.css"> <link rel="stylesheet" type="text/css" href="css/style.css"> </head> <body>   <div class="container-fluid"> <div class="row" data-bind="if: visibleCatalog">    <div class="col-xs-12"       data-bind="template:{name:'header'}"></div>    <div class="col-xs-6"       data-bind="template:{name:'catalog'}"></div>    <div class="col-xs-6" data-bind="if: visibleCart">      <div class="well" data-bind="template:{name:'cart'}"></div>    </div> </div> <div class="row" data-bind="ifnot: visibleCatalog">    <div data-bind="template:{name:'order'}"></div> </div> <div data-bind="template: {name:'add-to-catalog-modal'}"></div> <div data-bind="template: {name:'finish-order-modal'}"></div> </div>   <!-- templates --> <script type="text/html" id="header"> ... </script> <script type="text/html" id="catalog"> ... </script> <script type="text/html" id="add-to-catalog-modal"> ... </script> <script type="text/html" id="cart-widget"> ... </script> <script type="text/html" id="cart-item"> ... </script> <script type="text/html" id="cart"> ... </script> <script type="text/html" id="order"> ... </script> <script type="text/html" id="finish-order-modal"> ... 
</script> <!-- libraries --> <script type="text/javascript"   src="js/vendors/jquery.min.js"></script> <script type="text/javascript"   src="js/vendors/bootstrap.min.js"></script> <script type="text/javascript"   src="js/vendors/knockout.debug.js"></script> <script type="text/javascript"   src="js/models/product.js"></script> <script type="text/javascript"   src="js/models/cartProduct.js"></script> <script type="text/javascript" src="js/viewmodel.js"></script> </body> </html> We also have our viewmodel.js file: var vm = (function () { "use strict"; var visibleCatalog = ko.observable(true); var visibleCart = ko.observable(false); var catalog = ko.observableArray([...]); var cart = ko.observableArray([]); var newProduct = {...}; var totalItems = ko.computed(function(){...}); var grandTotal = ko.computed(function(){...}); var searchTerm = ko.observable(""); var filteredCatalog = ko.computed(function () {...}); var addProduct = function (data) {...}; var addToCart = function(data) {...}; var removeFromCart = function (data) {...}; var showCartDetails = function () {...}; var hideCartDetails = function () {...}; var showOrder = function () {...}; var showCatalog = function () {...}; var finishOrder = function() {...}; return {    searchTerm: searchTerm,    catalog: filteredCatalog,    cart: cart,    newProduct: newProduct,    totalItems:totalItems,    grandTotal:grandTotal,    addProduct: addProduct,    addToCart: addToCart,    removeFromCart:removeFromCart,    visibleCatalog: visibleCatalog,    visibleCart: visibleCart,    showCartDetails: showCartDetails,    hideCartDetails: hideCartDetails,    showOrder: showOrder,    showCatalog: showCatalog,    finishOrder: finishOrder }; })(); ko.applyBindings(vm); It is useful to debug to globalize the view-model. It is not good practice in production environments, but it is good when you are debugging your application. 
window.vm = vm; Now you have easy access to your view-model from the browser debugger or from your IDE debugger. In addition to the product model, we have created a new model called CartProduct: var CartProduct = function (product, units) { "use strict"; var _product = product,    _units = ko.observable(units); var subtotal = ko.computed(function(){...}); var addUnit = function () {...}; var removeUnit = function () {...}; return {    product: _product,    units: _units,    subtotal: subtotal,    addUnit : addUnit,    removeUnit: removeUnit }; }; You have learned how to manage templates with Knockout, but maybe you have noticed that having all templates in the index.html file is not the best approach. We are going to talk about two mechanisms. The first one is more home-made and the second one is an external library used by lots of Knockout developers, created by Jim Cowart, called Knockout.js-External-Template-Engine (https://github.com/ifandelse/Knockout.js-External-Template-Engine). Managing templates with jQuery Since we want to load templates from different files, let's move all our templates to a folder called views and make one file per template. Each file will have the same name as the ID of the template it contains. So if the template has the ID cart-item, the file should be called cart-item.html and will contain the full cart-item template: <script type="text/html" id="cart-item"></script>  The views folder with all templates Now in the viewmodel.js file, remove the last line (ko.applyBindings(vm)) and add this code: var templates = [ 'header', 'catalog', 'cart', 'cart-item', 'cart-widget', 'order', 'add-to-catalog-modal', 'finish-order-modal' ];   var busy = templates.length; templates.forEach(function(tpl){ "use strict"; $.get('views/'+ tpl + '.html').then(function(data){    $('body').append(data);    busy--;    if (!busy) {      ko.applyBindings(vm);    } }); }); This code gets all the templates we need and appends them to the body.
Once all the templates are loaded, we call the applyBindings method. We should do it this way because we are loading templates asynchronously and we need to make sure that we bind our view-model only when all templates are loaded. This is good enough to make our code more maintainable and readable, but it is still problematic if we need to handle lots of templates. Furthermore, if we have nested folders, listing all our templates in one array becomes a headache. There should be a better approach. Managing templates with koExternalTemplateEngine We have seen two ways of loading templates. Both of them are good enough to manage a small number of templates, but when the code base begins to grow, we need something that allows us to forget about template management. We just want to call a template and get the content. For this purpose, Jim Cowart's library, koExternalTemplateEngine, is a good fit. This project was abandoned by the author in 2014, but it is still a good library that we can use when we develop simple projects. We just need to download the library into the js/vendors folder and then link it in our index.html file just below the Knockout library. <script type="text/javascript" src="js/vendors/knockout.debug.js"></script> <script type="text/javascript"   src="js/vendors/koExternalTemplateEngine_all.min.js"></script> Now you should configure it in the viewmodel.js file. Remove the templates array and the forEach call, and add these three lines of code: infuser.defaults.templateSuffix = ".html"; infuser.defaults.templateUrl = "views"; ko.applyBindings(vm); Here, infuser is a global variable that we use to configure the template engine. We should indicate which suffix our templates have and which folder they are in. We don't need the <script type="text/html" id="template-id"></script> tags any more, so we should remove them from each file. Now everything should be working, and it took very little code to get there.
KnockoutJS has its own template engine, but you can see that adding new ones is not difficult. If you have experience with other template engines such as jQuery Templates, Underscore, or Handlebars, just load them in your index.html file and use them. This is one of Knockout's strengths: you can use any tool you like with it. You have learned a lot of things in this article: Knockout gives us the css binding to activate and deactivate CSS classes according to an expression. We can use the style binding to add CSS rules to elements. The template binding helps us to manage templates that are already loaded in the DOM. We can iterate over collections with the foreach binding. Inside a foreach, Knockout gives us some context variables such as $parent, $parents, $index, $data, and $root. We can use the as option along with the foreach binding to get an alias for each element. We can show and hide content using just jQuery and CSS. We can show and hide content using the bindings: if, ifnot, and visible. jQuery helps us to load Knockout templates asynchronously. You can use the koExternalTemplateEngine plugin to manage templates in a more efficient way. The project is abandoned but it is still a good solution. Summary In this article, you have learned how to split an application using templates that share the same view-model. Now that we know the basics, it would be interesting to extend the application. Maybe we can try to create a detailed view of the product, or maybe we can give the user the option to register where to send the order. Resources for Article: Further resources on this subject: Components [article] Web Application Testing [article] Top features of KnockoutJS [article]
Packt
03 Mar 2015
18 min read

Time Travelling with Spring

This article by Sujoy Acharya, the author of the book Mockito for Spring, delves into the details of time travelling with Spring. Spring 4.0 is the Java 8-enabled latest release of the Spring Framework. In this article, we'll discover the major changes in the Spring 4.x release and the four important features of the Spring 4 framework. We will cover the following topics in depth: @RestController AsyncRestTemplate Async tasks Caching (For more resources related to this topic, see here.) Discovering the new Spring release This section deals with the new features and enhancements in Spring Framework 4.0. The following are the features: Spring 4 supports Java 8 features such as Java lambda expressions and java.time. Spring 4 supports JDK 6 as the minimum. All deprecated packages/methods are removed. Java Enterprise Edition 6 or 7 is the base of Spring 4, which is based on JPA 2 and Servlet 3.0. Bean configuration using the Groovy DSL is supported in Spring Framework 4.0. Hibernate 4.3 is supported by Spring 4. Custom annotations are supported in Spring 4. Autowired lists and arrays can be ordered. The @Order annotation and the Ordered interface are supported. The @Lazy annotation can now be used on injection points as well as on the @Bean definitions. For the REST application, Spring 4 provides a new @RestController annotation. We will discuss this in detail in the following section. The AsyncRestTemplate feature (class) is added for asynchronous REST client development. Different time zones are supported in Spring 4.0. New spring-websocket and spring-messaging modules have been added. The SocketUtils class is added to examine the free TCP and UDP server ports on localhost. All the mocks under the org.springframework.mock.web package are now based on the Servlet 3.0 specification. Spring supports JCache annotations and new improvements have been made in caching.
The @Conditional annotation has been added to conditionally enable or disable an @Configuration class or even individual @Bean methods. In the test module, SQL script execution can now be configured declaratively via the new @Sql and @SqlConfig annotations on a per-class or per-method basis. You can visit the Spring Framework reference at http://docs.spring.io/spring/docs/4.1.2.BUILD-SNAPSHOT/spring-framework-reference/htmlsingle/#spring-whats-new for more details. Also, you can watch a video at http://zeroturnaround.com/rebellabs/spring-4-on-java-8-geekout-2013-video/ for more details on the changes in Spring 4. Working with asynchronous tasks Java has provided the Future interface since Java 5. A Future lets you retrieve the result of an asynchronous operation at a later time. A FutureTask can be run on a separate thread, which allows you to perform non-blocking asynchronous operations. Spring provides an @Async annotation to make this easier to use. We'll explore Java's Future feature and Spring's @Async declarative approach: Create a project, TimeTravellingWithSpring, and add a package, com.packt.async. We'll exercise a bank's use case, where an automated job will run and settle loan accounts. It will also find all the defaulters who haven't paid the loan EMI for a month and then send an SMS to their number. The job takes time to process thousands of accounts, so it makes sense to send the SMSes asynchronously and reduce the load on the job. We'll create a service class to represent the job, as shown in the following code snippet: @Service public class AccountJob {    @Autowired    private SMSTask smsTask; public void process() throws InterruptedException, ExecutionException { System.out.println("Going to find defaulters... "); Future<Boolean> asyncResult = smsTask.send("1", "2", "3"); System.out.println("Defaulter Job Complete. SMS will be sent to all defaulter"); Boolean result = asyncResult.get(); System.out.println("Was SMS sent?
" + result); } } The job class autowires an SMSTask class and invokes the send method with phone numbers. The send method is executed asynchronously and Future is returned. When the job calls the get() method on Future, a result is returned. If the result is not processed before the get() method invocation, the ExecutionException is thrown. We can use a timeout version of the get() method. Create the SMSTask class in the com.packt.async package with the following details: @Component public class SMSTask { @Async public Future<Boolean> send(String... numbers) { System.out.println("Selecting SMS format "); try { Thread.sleep(2000); } catch (InterruptedException e) { e.printStackTrace(); return new AsyncResult<>(false); } System.out.println("Async SMS send task is Complete!!!"); return new AsyncResult<>(true); } } Note that the method returns Future, and the method is annotated with @Async to signify asynchronous processing. Create a JUnit test to verify asynchronous processing: @RunWith(SpringJUnit4ClassRunner.class) @ContextConfiguration(locations="classpath:com/packt/async/          applicationContext.xml") public class AsyncTaskExecutionTest { @Autowired ApplicationContext context; @Test public void jobTest() throws Exception { AccountJob job = (AccountJob)context.getBean(AccountJob.class); job.process(); } } The job bean is retrieved from the applicationContext file and then the process method is called. When we execute the test, the following output is displayed: Going to find defaulters... Defaulter Job Complete. SMS will be sent to all defaulter Selecting SMS format Async SMS send task is Complete!!! Was SMS sent? true During execution, you might feel that the async task is executed after a delay of 2 seconds as the SMSTask class waits for 2 seconds. Exploring @RestController JAX-RS provides the functionality for Representational State Transfer (RESTful) web services. REST is well-suited for basic, ad hoc integration scenarios. 
Spring MVC offers controllers to create RESTful web services. In Spring MVC 3.0, we need to explicitly annotate a class with the @Controller annotation in order to mark it as a controller and annotate each and every method with @ResponseBody to serve JSON, XML, or a custom media type. The Spring 4.0 @RestController stereotype annotation combines @ResponseBody and @Controller. The following example will demonstrate the usage of @RestController: Create a dynamic web project, RESTfulWeb. Modify the web.xml file and add a configuration to intercept requests with a Spring DispatcherServlet: <web-app xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_3_0.xsd" id="WebApp_ID" version="3.0"> <display-name>RESTfulWeb</display-name> <servlet> <servlet-name>dispatcher</servlet-name> <servlet-class> org.springframework.web.servlet.DispatcherServlet </servlet-class> <load-on-startup>1</load-on-startup> </servlet> <servlet-mapping> <servlet-name>dispatcher</servlet-name> <url-pattern>/</url-pattern> </servlet-mapping> <context-param> <param-name>contextConfigLocation</param-name> <param-value> /WEB-INF/dispatcher-servlet.xml </param-value> </context-param> </web-app> The DispatcherServlet expects a configuration file with the naming convention [servlet-name]-servlet.xml. Create an application context XML, dispatcher-servlet.xml. We'll use annotations to configure Spring beans, so we need to tell the Spring container to scan the Java package in order to create the beans. Add the following lines to the application context in order to instruct the container to scan the com.packt.controller package: <context:component-scan base-package="com.packt.controller" /> <mvc:annotation-driven /> We need a REST controller class to handle the requests and generate a JSON output. Go to the com.packt.controller package and add a SpringService controller class.
To configure the class as a REST controller, we need to annotate it with the @RestController annotation. The following code snippet represents the class: @RestController @RequestMapping("/hello") public class SpringService { private Set<String> names = new HashSet<String>(); @RequestMapping(value = "/{name}", method = RequestMethod.GET) public String displayMsg(@PathVariable String name) {    String result = "Welcome " + name;    names.add(name);    return result; } @RequestMapping(value = "/all/", method = RequestMethod.GET) public String anotherMsg() {    StringBuilder result = new StringBuilder("We greeted so far ");    for(String name:names){      result.append(name).append(", ");    }    return result.toString();  } } We annotated the class with @RequestMapping("/hello"). This means that the SpringService class will cater for the requests with the http://{site}/{context}/hello URL pattern, or since we are running the app in localhost, the URL can be http://localhost:8080/RESTfulWeb/hello. The displayMsg method is annotated with @RequestMapping(value = "/{name}", method = RequestMethod.GET). So, the method will handle all HTTP GET requests with the URL pattern /hello/{name}. The name can be any String, such as /hello/xyz or /hello/john. In turn, the method stores the name in the Set for later use and returns a greeting message, Welcome {name}. The anotherMsg method is annotated with @RequestMapping(value = "/all/", method = RequestMethod.GET), which means that the method accepts all the requests with the http://{site}/{context}/hello/all/ URL pattern. Moreover, this method builds a list of all users who visited the /hello/{name} URL. Remember, the displayMsg method stores the names in the Set; this method iterates over the Set and builds a list of the names that visited the /hello/{name} URL. There is some confusion though: what will happen if you enter the /hello/all URL in the browser?
When we pass only a String literal after /hello/, the displayMsg method handles it, so you will be greeted with Welcome all. However, if you type /hello/all/ instead (note the slash we added after all), the URL does not match the /hello/{name} pattern, so the second method handles the request and shows you the list of users who visited the first URL. When we run the application and access the /hello/{name} URL, the following output is displayed: When we access http://localhost:8080/RESTfulWeb/hello/all/, the following output is displayed: Therefore, our RESTful application is ready for use, but just remember that in the real world, you need to secure the URLs against unauthorized access. In web service development, security plays a key role. You can read the Spring Security reference manual for additional information. Learning AsyncRestTemplate We live in a small, wonderful world where everybody is interconnected and impatient! We are interconnected through technology and applications, such as social networks, Internet banking, telephones, chats, and so on. Likewise, our applications are interconnected; often, an application housed in India may need to query an external service hosted in Philadelphia to get some significant information. We are impatient as we expect everything to be done in seconds; we get frustrated when we make an HTTP call to a remote service and this blocks the processing until the remote response is back. We cannot finish everything in milliseconds or nanoseconds, but we can process long-running tasks asynchronously or in a separate thread, allowing the user to work on something else. To handle RESTful web service calls asynchronously, Spring offers two useful classes: AsyncRestTemplate and ListenableFuture. We can make an async call using the template and get a Future back, continue with other processing, and finally ask the Future for the result.
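The "call, carry on, then collect" flow described above can be sketched with plain java.util.concurrent classes, without Spring. This is only an illustration under stated assumptions: the class FutureSketch and the methods slowRemoteCall and callWithTimeout are hypothetical names invented here, and the sleep merely stands in for a slow remote service.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureSketch {

    // Hypothetical stand-in for a slow remote call; the delay is made up for illustration.
    static String slowRemoteCall(long delayMillis) throws InterruptedException {
        Thread.sleep(delayMillis);
        return "remote response";
    }

    // Submits the slow call to a worker thread, then waits at most timeoutMillis for its result.
    static String callWithTimeout(long delayMillis, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> slowRemoteCall(delayMillis);
            Future<String> future = executor.submit(task);
            // The caller is free to do other work here while the task runs.
            try {
                // Blocks until the result is ready or the timeout expires.
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // give up and interrupt the worker
                return "timed out";
            } catch (ExecutionException e) {
                // Thrown only when the task itself failed; the cause is the original exception.
                return "failed: " + e.getCause();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return "interrupted";
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(callWithTimeout(100, 2000)); // remote response
        System.out.println(callWithTimeout(2000, 100)); // timed out
    }
}
```

The timeout variant of get() is used so that a slow service cannot hold the caller forever; a plain get() would simply block until the task finishes.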
This section builds an asynchronous RESTful client to query the RESTful web service we developed in the preceding section. The AsyncRestTemplate class defines a range of overloaded methods to access RESTful web services asynchronously. We'll explore the exchange and execute methods. The following are the steps to explore the template: Create a package, com.packt.rest.template. Add an AsyncRestTemplateTest JUnit test. Create an exchange() test method and add the following lines: @Test public void exchange(){ AsyncRestTemplate asyncRestTemplate = new AsyncRestTemplate(); String url ="http://localhost:8080/RESTfulWeb/hello/all/"; HttpMethod method = HttpMethod.GET; Class<String> responseType = String.class; HttpHeaders headers = new HttpHeaders(); headers.setContentType(MediaType.TEXT_PLAIN); HttpEntity<String> requestEntity = new HttpEntity<String>("params", headers); ListenableFuture<ResponseEntity<String>> future = asyncRestTemplate.exchange(url, method, requestEntity, responseType); try { //waits for the result ResponseEntity<String> entity = future.get(); //prints body of the given URL System.out.println(entity.getBody()); } catch (InterruptedException e) { e.printStackTrace(); } catch (ExecutionException e) { e.printStackTrace(); } } The exchange() method has six overloaded versions. We used the one that takes a URL, an HttpMethod such as GET or POST, an HttpEntity that carries the headers, and finally a response type class. We called the exchange method, which in turn called the execute method and returned a ListenableFuture. The ListenableFuture is the handle to our output; we invoked the get() method on the ListenableFuture to obtain the RESTful service call response. The ResponseEntity has the getBody, getClass, getHeaders, and getStatusCode methods for extracting the web service call response.
We invoked the http://localhost:8080/RESTfulWeb/hello/all/ URL and got back the following response: Now, create an execute test method and add the following lines: @Test public void execute(){ AsyncRestTemplate asyncTemp = new AsyncRestTemplate(); String url ="http://localhost:8080/RESTfulWeb/hello/reader"; HttpMethod method = HttpMethod.GET; HttpHeaders headers = new HttpHeaders(); headers.setContentType(MediaType.TEXT_PLAIN); AsyncRequestCallback requestCallback = new AsyncRequestCallback (){ @Override public void doWithRequest(AsyncClientHttpRequest request) throws IOException { System.out.println(request.getURI()); } }; ResponseExtractor<String> responseExtractor = new ResponseExtractor<String>(){ @Override public String extractData(ClientHttpResponse response) throws IOException { return response.getStatusText(); } }; Map<String,String> urlVariable = new HashMap<String, String>(); ListenableFuture<String> future = asyncTemp.execute(url, method, requestCallback, responseExtractor, urlVariable); try { //wait for the result String result = future.get(); System.out.println("Status =" +result); } catch (InterruptedException e) { e.printStackTrace(); } catch (ExecutionException e) { e.printStackTrace(); } } The execute method has several variants. We invoke the one that takes a URL, an HttpMethod such as GET or POST, an AsyncRequestCallback that is invoked from the execute method just before the request is executed asynchronously, a ResponseExtractor to extract part of the response, such as the body, status code, or headers, and a map of URL variables for URLs that take parameters. We invoked the execute method and received a future. Our ResponseExtractor extracts the status text, so when we ask the future for the result, it returns the response status text, which is OK for an HTTP 200 response. In the AsyncRequestCallback, we printed the request URI; hence, the output first displays the request URI and then prints the response status.
The following is the output: Caching objects Scalability is a major concern in web application development. Generally, most web traffic is focused on a small set of information, so only those records are queried very often. If we can cache these records, the performance and scalability of the system can improve considerably. The Spring Framework provides support for adding caching into an existing Spring application. In this section, we'll work with EhCache, a widely used caching solution. Download the EhCache JAR from the Maven repository; the URL to download version 2.7.2 is http://mvnrepository.com/artifact/net.sf.ehcache/ehcache/2.7.2. We will use two of Spring's caching annotations: @Cacheable and @CacheEvict. These annotations allow methods to trigger cache population or cache eviction, respectively. The @Cacheable annotation is used to identify a cacheable method, which means that for an annotated method the result is stored in the cache. Therefore, on subsequent invocations (with the same arguments), the value in the cache is returned without actually executing the method. The cache abstraction also allows eviction, which removes stale or unused data from the cache. The @CacheEvict annotation demarcates the methods that perform cache eviction, that is, methods that act as triggers to remove data from the cache. The following are the steps to build a cacheable application with EhCache: Create a serializable Employee POJO class in the com.packt.cache package to store the employee ID and name. The following is the class definition: public class Employee implements Serializable { private static final long serialVersionUID = 1L; private final String firstName, lastName, empId;   public Employee(String empId, String fName, String lName) {    this.firstName = fName;    this.lastName = lName;    this.empId = empId; } //Getter methods } Spring caching supports two storages: the ConcurrentMap and ehcache libraries.
To configure caching, we need to configure a manager in the application context. The org.springframework.cache.ehcache.EhCacheCacheManager class manages ehcache. Then, we need to define an ehcache factory bean with a configLocation attribute. The configLocation attribute points to the configuration resource. The ehcache-specific configuration is read from the resource ehcache.xml. <beans   xsi:schemaLocation=" http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-4.1.xsd http://www.springframework.org/schema/cache http://www.springframework.org/schema/cache/spring-cache-4.1.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context-4.1.xsd "> <context:component-scan base-package="com.packt.cache" /> <cache:annotation-driven/> <bean id="cacheManager" class="org.springframework.cache.ehcache.EhCacheCacheManager" p:cacheManager-ref="ehcache"/> <bean id="ehcache" class="org.springframework.cache.ehcache.EhCacheManagerFactoryBean" p:configLocation="classpath:com/packt/cache/ehcache.xml"/> </beans> The <cache:annotation-driven/> tag informs the Spring container that caching and eviction are driven by annotated methods. We defined a cacheManager bean and then defined an ehcache bean. The ehcache bean's configLocation points to an ehcache.xml file. We'll create the file next. Create an XML file, ehcache.xml, under the com.packt.cache package and add the following cache configuration data: <ehcache>    <diskStore path="java.io.tmpdir"/>    <cache name="employee"            maxElementsInMemory="100"            eternal="false"            timeToIdleSeconds="120"            timeToLiveSeconds="120"            overflowToDisk="true"            maxElementsOnDisk="10000000"            diskPersistent="false"            diskExpiryThreadIntervalSeconds="120"            memoryStoreEvictionPolicy="LRU"/>   </ehcache> The XML configures many things.
The cache is stored in memory, but memory has a limit, so we need to define maxElementsInMemory. EhCache needs to store data on disk when the number of elements in memory reaches this threshold. We provide diskStore for this purpose. The eviction policy is set to LRU (least recently used), but the most important thing is the cache name. The name employee will be used to access the cache configuration. Now, create a service that stores the Employee objects in a map (a ConcurrentHashMap here). The following is the service: @Service public class EmployeeService { private final Map<String, Employee> employees = new ConcurrentHashMap<String, Employee>(); @PostConstruct public void init() { saveEmployee (new Employee("101", "John", "Doe")); saveEmployee (new Employee("102", "Jack", "Russell")); } @Cacheable("employee") public Employee getEmployee(final String employeeId) { System.out.println(String.format("Loading a employee with id of : %s", employeeId)); return employees.get(employeeId); } @CacheEvict(value = "employee", key = "#emp.empId") public void saveEmployee(final Employee emp) { System.out.println(String.format("Saving a emp with id of : %s", emp.getEmpId())); employees.put(emp.getEmpId(), emp); } } The getEmployee method is a cacheable method; it uses the cache named employee. When the getEmployee method is invoked more than once with the same employee ID, the object is returned from the cache instead of the original method being invoked. The saveEmployee method is a CacheEvict method; it removes the cached entry for the saved employee's ID so that a stale object is not served. Now, we'll examine caching. We'll call the getEmployee method twice; the first call will populate the cache and the subsequent call will be answered by the cache.
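The populate-then-serve behaviour just described can be imitated in plain Java to make the idea concrete. The sketch below is a hand-rolled analogy, not how Spring or EhCache implement caching; the class CacheSketch and its methods (get, evict, loadCount) are hypothetical names chosen for illustration, and the load method stands in for an expensive lookup.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class CacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loads = new AtomicInteger();

    // Hypothetical expensive lookup; counts how often it really runs.
    private String load(String id) {
        loads.incrementAndGet();
        return "employee-" + id;
    }

    // Roughly what a cacheable method does: run load() only on a cache miss.
    String get(String id) {
        return cache.computeIfAbsent(id, this::load);
    }

    // Roughly what an evicting method does: drop the entry so the next get() reloads it.
    void evict(String id) {
        cache.remove(id);
    }

    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        CacheSketch sketch = new CacheSketch();
        sketch.get("101");
        sketch.get("101");                       // answered from the cache
        System.out.println(sketch.loadCount());  // 1
        sketch.evict("101");
        sketch.get("101");                       // reloaded after eviction
        System.out.println(sketch.loadCount());  // 2
    }
}
```

The load counter plays the same role as the "Loading a employee" message in the service: it shows when the real method ran and when the cache answered instead.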
Create a JUnit test, CacheConfiguration, and add the following lines: @RunWith(SpringJUnit4ClassRunner.class) @ContextConfiguration(locations="classpath:com/packt/cache/applicationContext.xml") public class CacheConfiguration { @Autowired ApplicationContext context; @Test public void jobTest() throws Exception { EmployeeService employeeService = (EmployeeService)context.getBean(EmployeeService.class); long time = System.currentTimeMillis(); employeeService.getEmployee("101"); System.out.println("time taken ="+(System.currentTimeMillis() - time)); time = System.currentTimeMillis(); employeeService.getEmployee("101"); System.out.println("time taken to read from cache ="+(System.currentTimeMillis() - time)); time = System.currentTimeMillis(); employeeService.getEmployee("102"); System.out.println("time taken ="+(System.currentTimeMillis() - time)); time = System.currentTimeMillis(); employeeService.getEmployee("102"); System.out.println("time taken to read from cache ="+(System.currentTimeMillis() - time)); employeeService.saveEmployee(new Employee("1000", "Sujoy", "Acharya")); time = System.currentTimeMillis(); employeeService.getEmployee("1000"); System.out.println("time taken ="+(System.currentTimeMillis() - time)); time = System.currentTimeMillis(); employeeService.getEmployee("1000"); System.out.println("time taken to read from cache ="+(System.currentTimeMillis() - time)); } } Note that the getEmployee method is invoked twice for each employee, and we recorded the method execution time in milliseconds. You will find from the output that every second call is answered by the cache: the first call prints Loading a employee with id of : 101, while the next call doesn't print the message and only prints the time taken. You will also find that the time taken for the cached objects is zero, or at least less than the time of the first invocation.
The following screenshot shows the output: Summary This article started with discovering the features of the new major Spring release 4.0, such as Java 8 support and so on. Then, we picked four Spring 4 topics and explored them one by one. The @Async section showcased the execution of long-running methods asynchronously and provided an example of how to handle asynchronous processing. The @RestController section eased the RESTful web service development with the advent of the @RestController annotation. The AsyncRestTemplate section explained the RESTful client code to invoke RESTful web service asynchronously. Caching is inevitable for a high-performance, scalable web application. The caching section explained the EhCache and Spring integrations to achieve a high-availability caching solution. Resources for Article: Further resources on this subject: Getting Started with Mockito [article] Progressive Mockito [article] Understanding outside-in [article]

Packt
03 Mar 2015
11 min read

Getting Started with PostgreSQL

In this article by Ibrar Ahmed, Asif Fayyaz, and Amjad Shahzad, authors of the book PostgreSQL Developer's Guide, we will cover the basic features and functions of PostgreSQL, such as writing queries using psql, defining data in tables, and manipulating data in tables.

(For more resources related to this topic, see here.)

PostgreSQL is widely considered to be one of the most stable database servers available today, with features that include:

A wide range of built-in types
MVCC (multiversion concurrency control)
SQL enhancements, including foreign keys, primary keys, and constraints
Open source code, maintained by a team of developers
Trigger and procedure support with multiple procedural languages
Extensibility, in the sense of adding new data types and client languages

Since the early releases of PostgreSQL (from version 6.0, that is), many changes have been made, with each new major version adding new and more advanced features. At the time of writing, the current version is PostgreSQL 9.4, which is available from several sources and in various binary formats.

Writing queries using psql

Throughout this article, we will use a warehouse database called warehouse_db. In this section, I will show you how to create such a database, with sample code for assistance. We are assuming here that you have successfully installed PostgreSQL and faced no issues. First, you will need to connect to the default database that is created by the PostgreSQL installer.
To do this, navigate from your command line to the default installation path, which is /opt/PostgreSQL/9.4/bin, and execute the following command, which prompts for the postgres user password that you provided during the installation:

/opt/PostgreSQL/9.4/bin$ ./psql -U postgres
Password for user postgres:

After you enter the password, you are logged in to the default database as the user postgres, and you will see the following on your command line:

psql (9.4beta1)
Type "help" for help
postgres=#

You can then create a new database called warehouse_db using the following statement in the terminal:

postgres=# CREATE DATABASE warehouse_db;

You can then connect to the warehouse_db database using the following psql meta-command:

postgres=# \c warehouse_db

You are now connected to the warehouse_db database as the user postgres, and you will have the following warehouse_db shell:

warehouse_db=#

Let's summarize what we have achieved so far. We are now able to connect to the default database postgres, and we have created the warehouse_db database successfully. It's now time to actually write queries using psql and perform some Data Definition Language (DDL) and Data Manipulation Language (DML) operations, which we will cover in the following sections.

In PostgreSQL, we can have multiple databases. Inside the databases, we can have multiple extensions and schemas. Inside each schema, we can have database objects such as tables, views, sequences, procedures, and functions. We are first going to create a schema named record and then we will create some tables. To create a schema named record in the warehouse_db database, use the following statement:

warehouse_db=# CREATE SCHEMA record;

Creating, altering, and truncating a table

In this section, we will learn about creating a table, altering the table definition, and truncating the table.

Creating tables

Now, let's perform some DDL operations starting with creating tables.
To create a table named warehouse_tbl, execute the following statements: warehouse_db=# CREATE TABLE warehouse_tbl ( warehouse_id INTEGER NOT NULL, warehouse_name TEXT NOT NULL, year_created INTEGER, street_address TEXT, city CHARACTER VARYING(100), state CHARACTER VARYING(2), zip CHARACTER VARYING(10), CONSTRAINT "PRIM_KEY" PRIMARY KEY (warehouse_id) ); The preceding statements created the table warehouse_tbl that has the primary key warehouse_id. Now, as you are familiar with the table creation syntax, let's create a sequence and use that in a table. You can create the hist_id_seq sequence using the following statement: warehouse_db=# CREATE SEQUENCE hist_id_seq; The preceding CREATE SEQUENCE command creates a new sequence number generator. This involves creating and initializing a new special single-row table with the name hist_id_seq. The user issuing the command will own the generator. You can now create the table that implements the hist_id_seq sequence using the following statement: warehouse_db=# CREATE TABLE history ( history_id INTEGER NOT NULL DEFAULT nextval('hist_id_seq'), date TIMESTAMP WITHOUT TIME ZONE, amount INTEGER, data TEXT, customer_id INTEGER, warehouse_id INTEGER, CONSTRAINT "PRM_KEY" PRIMARY KEY (history_id), CONSTRAINT "FORN_KEY" FOREIGN KEY (warehouse_id) REFERENCES warehouse_tbl(warehouse_id) ); The preceding query will create a history table in the warehouse_db database, and the history_id column uses the sequence as the default input value. In this section, we successfully learned how to create a table and also learned how to use a sequence inside the table creation syntax. Altering tables Now that we have learned how to create multiple tables, we can practice some ALTER TABLE commands by following this section. With the ALTER TABLE command, we can add, remove, or rename table columns. 
Firstly, with the help of the following example, we will add the phone_no column to the previously created table warehouse_tbl:

warehouse_db=# ALTER TABLE warehouse_tbl ADD COLUMN phone_no INTEGER;

We can then verify that the column has been added by describing the table with the \d meta-command as follows:

warehouse_db=# \d warehouse_tbl
         Table "public.warehouse_tbl"
     Column     |          Type          | Modifiers
----------------+------------------------+-----------
 warehouse_id   | integer                | not null
 warehouse_name | text                   | not null
 year_created   | integer                |
 street_address | text                   |
 city           | character varying(100) |
 state          | character varying(2)   |
 zip            | character varying(10)  |
 phone_no       | integer                |
Indexes:
    "PRIM_KEY" PRIMARY KEY, btree (warehouse_id)
Referenced by:
    TABLE "history" CONSTRAINT "FORN_KEY" FOREIGN KEY (warehouse_id) REFERENCES warehouse_tbl(warehouse_id)

To drop a column from a table, we can use the following statement:

warehouse_db=# ALTER TABLE warehouse_tbl DROP COLUMN phone_no;

We can then finally verify that the column has been removed from the table by describing the table again as follows:

warehouse_db=# \d warehouse_tbl
         Table "public.warehouse_tbl"
     Column     |          Type          | Modifiers
----------------+------------------------+-----------
 warehouse_id   | integer                | not null
 warehouse_name | text                   | not null
 year_created   | integer                |
 street_address | text                   |
 city           | character varying(100) |
 state          | character varying(2)   |
 zip            | character varying(10)  |
Indexes:
    "PRIM_KEY" PRIMARY KEY, btree (warehouse_id)
Referenced by:
    TABLE "history" CONSTRAINT "FORN_KEY" FOREIGN KEY (warehouse_id) REFERENCES warehouse_tbl(warehouse_id)

Truncating tables

The TRUNCATE command is used to remove all rows from a table without providing any criteria. The DELETE command, by contrast, takes a WHERE clause that selects which rows to remove. Because the history table has a foreign key that references warehouse_tbl, PostgreSQL rejects a plain TRUNCATE of warehouse_tbl alone; the referencing table has to be truncated in the same command (or the CASCADE option has to be used). To truncate data from the table, we can use the following statement:

warehouse_db=# TRUNCATE TABLE warehouse_tbl, history;

We can then verify that the warehouse_tbl table has been truncated by performing a SELECT COUNT(*) query on it using the following statement:

warehouse_db=# SELECT COUNT(*) FROM warehouse_tbl;
 count
-------
     0
(1 row)

Inserting, updating, and deleting data from tables

In this section, we will play around with data and learn how to insert, update, and delete data from a table.

Inserting data

So far, we have learned how to create and alter a table. Now it's time to play around with some data. Let's start by inserting a record in the warehouse_tbl table using the following statement:

warehouse_db=# INSERT INTO warehouse_tbl ( warehouse_id, warehouse_name, year_created, street_address, city, state, zip ) VALUES ( 1, 'Mark Corp', 2009, '207-F Main Service Road East', 'New London', 'CT', 4321 );

We can then verify that the record has been inserted by performing a SELECT query on the warehouse_tbl table as follows:

warehouse_db=# SELECT warehouse_id, warehouse_name, street_address FROM warehouse_tbl;
 warehouse_id | warehouse_name |        street_address
--------------+----------------+------------------------------
            1 | Mark Corp      | 207-F Main Service Road East
(1 row)

Updating data

Once we have inserted data in our table, we should know how to update it.
This can be done using the following statement:

warehouse_db=# UPDATE warehouse_tbl SET year_created=2010 WHERE year_created=2009;

To verify that a record is updated, let's perform a SELECT query on the warehouse_tbl table as follows:

warehouse_db=# SELECT warehouse_id, year_created FROM warehouse_tbl;
 warehouse_id | year_created
--------------+--------------
            1 |         2010
(1 row)

Deleting data

To delete data from a table, we can use the DELETE command. Let's add a few records to the table and then delete data on the basis of certain conditions:

warehouse_db=# INSERT INTO warehouse_tbl ( warehouse_id, warehouse_name, year_created, street_address, city, state, zip ) VALUES ( 2, 'Bill & Co', 2014, 'Lilly Road', 'New London', 'CT', 4321 );
warehouse_db=# INSERT INTO warehouse_tbl ( warehouse_id, warehouse_name, year_created, street_address, city, state, zip ) VALUES ( 3, 'West point', 2013, 'Down Town', 'New London', 'CT', 4321 );

We can then delete the rows of the warehouse_tbl table where warehouse_name is Bill & Co by executing the following statement:

warehouse_db=# DELETE FROM warehouse_tbl WHERE warehouse_name='Bill & Co';

To verify that the record has been deleted, we will execute the following SELECT query:

warehouse_db=# SELECT warehouse_id, warehouse_name FROM warehouse_tbl WHERE warehouse_name='Bill & Co';
 warehouse_id | warehouse_name
--------------+----------------
(0 rows)

The DELETE command removes rows from a table, whereas the DROP command removes a complete table, including its definition. The TRUNCATE command empties the whole table but keeps its definition.

Summary

In this article, we learned how to use SQL for a collection of everyday DBMS tasks in an easy-to-use, practical way. We also worked through a complete database example that incorporates DDL (create, alter, and truncate) and DML (insert, update, and delete) operations.
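The insert, update, and delete steps from this article can be rehearsed without a running PostgreSQL server. The sketch below is an assumption-laden stand-in rather than the article's setup: it uses Python's built-in sqlite3 module with an in-memory SQLite database instead of PostgreSQL and psql, and it trims warehouse_tbl to three columns for brevity. SQLite has no TRUNCATE command, so only the DML statements are mirrored.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for warehouse_db (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Trimmed-down version of warehouse_tbl with the same primary key.
cur.execute("""
    CREATE TABLE warehouse_tbl (
        warehouse_id   INTEGER NOT NULL PRIMARY KEY,
        warehouse_name TEXT NOT NULL,
        year_created   INTEGER
    )
""")

# INSERT: the three warehouses used in the article.
rows = [(1, "Mark Corp", 2009), (2, "Bill & Co", 2014), (3, "West point", 2013)]
cur.executemany("INSERT INTO warehouse_tbl VALUES (?, ?, ?)", rows)

# UPDATE and DELETE with the same WHERE conditions as in the article.
cur.execute("UPDATE warehouse_tbl SET year_created = 2010 WHERE year_created = 2009")
cur.execute("DELETE FROM warehouse_tbl WHERE warehouse_name = ?", ("Bill & Co",))
conn.commit()

remaining = cur.execute(
    "SELECT warehouse_id, warehouse_name, year_created "
    "FROM warehouse_tbl ORDER BY warehouse_id"
).fetchall()
print(remaining)  # [(1, 'Mark Corp', 2010), (3, 'West point', 2013)]
```

Passing values as ? parameters, as in the DELETE call, avoids quoting problems with names such as Bill & Co.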
Resources for Article: Further resources on this subject: Indexes [Article] Improving proximity filtering with KNN [Article] Using Unrestricted Languages [Article]

Packt
03 Mar 2015
11 min read

Basic SQL Server Administration

In this article by Donabel Santos, the author of PowerShell for SQL Server Essentials, we will look at how to accomplish typical SQL Server administration tasks by using PowerShell. Many of the tasks that we will see can be accomplished by using SQL Server Management Objects (SMO). As we encounter new SMO classes, it is best to verify the properties and methods of that class using Get-Member, or by directly visiting the TechNet or MSDN website.

(For more resources related to this topic, see here.)

Listing databases and tables

Let's start out by listing the current databases. The SMO Server class has access to all the databases in that instance, so a server variable will have to be created first. To create one using Windows Authentication, you can use the following snippet:

Import-Module SQLPS -DisableNameChecking

#current server name
$servername = "ROGUE"

$server = New-Object "Microsoft.SqlServer.Management.Smo.Server" $servername

If you need to use SQL Server Authentication, you can set the LoginSecure property to false, and prompt the user for the database credentials:

#with SQL authentication, we need
#to supply the SQL Login and password
$server.ConnectionContext.LoginSecure = $false
$credential = Get-Credential
$server.ConnectionContext.set_Login($credential.UserName)
$server.ConnectionContext.set_SecurePassword($credential.Password)

Another way is to build a database connection string and assign it to a System.Data.SqlClient.SqlConnection object, which a Microsoft.SqlServer.Management.Common.ServerConnection object can then wrap:

$connectionString = "Server=$dataSource;uid=$username;pwd=$password;Database=$database;Integrated Security=False"
$connection = New-Object System.Data.SqlClient.SqlConnection
$connection.ConnectionString = $connectionString

To find out how many databases there are, you can use the Count property of the Databases property:

$server.Databases.Count

In addition to simply displaying the number of databases in an instance, we can also
find out additional information such as creation date, recovery model, number of tables, stored procedures, and user-defined functions. The following is a sample script that pulls this information:

#create empty array
$result = @()

$server.Databases |
Where-Object IsSystemObject -eq $false |
ForEach-Object {
    $db = $_
    $object = [PSCustomObject] @{
        Name          = $db.Name
        CreateDate    = $db.CreateDate
        RecoveryModel = $db.RecoveryModel
        NumTables     = $db.Tables.Count
        NumUsers      = $db.Users.Count
        NumSP         = $db.StoredProcedures.Count
        NumUDF        = $db.UserDefinedFunctions.Count
    }
    $result += $object
}

$result | Format-Table -AutoSize

A sample result looks like the following screenshot:

In this script, we have manipulated the output a little. Since we want information in a format different from the default, we created a custom object using the PSCustomObject class to store all this information. The PSCustomObject class was introduced in PowerShell V3. You can also use PSCustomObject to draw data points from different objects and pull them together in a single result set. Each line in the sample result shown in the preceding screenshot is a single PSCustomObject. All of these, in turn, are stored in the $result array, which can be piped to the Format-Table cmdlet for an easier-to-read display. After learning these basics about PSCustomObject, you can adapt this script to extend the list of properties you are querying and change the formatting of the display. You can also export these to a file if you need to.
To find out additional properties, you can pipe $server.Databases to the Get-Member cmdlet:

$server.Databases | Get-Member | Where-Object MemberType -eq "Property"

Once you execute this, your resulting screen should look similar to the following screenshot:

To find out which methods are available for SMO database objects, we can use a very similar snippet, but this time, we will filter based on methods:

$server.Databases | Get-Member | Where-Object MemberType -eq "Method"

Once you execute this, your resulting screen should look similar to the following screenshot:

Listing database files and filegroups

Managing databases also incorporates monitoring and managing the files and filegroups associated with these databases. Still using SMO, we can pull this information via PowerShell. You can start by pulling all non-system databases:

$server.Databases | Where-Object IsSystemObject -eq $false

The preceding snippet returns all the databases in the instance except the system databases. You can use the ForEach-Object cmdlet to iterate over them, and for each iteration, you can get a handle to the current database object. The SMO database object has a FileGroups property, which you can query to find out more about the filegroups associated with each database:

ForEach-Object {
    $db = $_
    $db.FileGroups
}

Each filegroup, in turn, can access all the files in that specific filegroup. Here is the complete script that lists all files and filegroups for all databases. Note that we use ForEach-Object several times: once to loop through all databases, then to loop through all filegroups for each database, and again to loop through all files in each filegroup:

Import-Module SQLPS -DisableNameChecking

#current server name
$servername = "ROGUE"

$server = New-Object "Microsoft.SqlServer.Management.Smo.Server" $servername

$result = @()

$server.Databases |
Where-Object IsSystemObject -eq $false |
ForEach-Object {
    $db = $_
    $db.FileGroups |
    ForEach-Object {
        $fg = $_
        $fg.Files |
        ForEach-Object {
            $file = $_
            $object = [PSCustomObject] @{
                Database        = $db.Name
                FileGroup       = $fg.Name
                FileName        = $file.FileName | Split-Path -Leaf
                "Size(MB)"      = "{0:N2}" -f ($file.Size/1024)
                "UsedSpace(MB)" = "{0:N2}" -f ($file.UsedSpace/1024)
            }
            $result += $object
        }
    }
}

$result | Format-Table -AutoSize

A sample result looks like the following screenshot:

We have adjusted the result to make the display a bit more readable. For the FileName property, we extracted just the actual filename and did not report the path by piping the FileName property to the Split-Path cmdlet. The -Leaf option provides the filename part of the full path:

$file.FileName | Split-Path -Leaf

With Size and UsedSpace, we report the value in megabytes (MB). Since SMO reports both values in kilobytes (KB), we have to divide each value by 1024. We also display the values with two decimal places:

"Size(MB)" = "{0:N2}" -f ($file.Size/1024)
"UsedSpace(MB)" = "{0:N2}" -f ($file.UsedSpace/1024)

If you simply want to get the directory where the primary datafile is stored, you can use the following command:

$db.PrimaryFilePath

If you want to export the results to Excel or CSV, you simply need to take $result and, instead of piping it to Format-Table, use one of the Export or Convert cmdlets.

Adding files and filegroups

Filegroups in SQL Server allow for a group of files to be managed together. It is almost akin to having folders on your desktop to allow you to manage, move, and save files together. To add a filegroup, you have to use the Microsoft.SqlServer.Management.Smo.FileGroup class.
Assuming you already have variables that point to your server instance, you can create a variable that references the database you wish to work with, as shown in the following snippet:

$dbname = "Registration"
$db = $server.Databases[$dbname]

Instantiating a FileGroup variable requires the handle to the SMO database object and a filegroup name, as shown in the following snippet:

$fg = New-Object "Microsoft.SqlServer.Management.Smo.FileGroup" $db, "FG1"

When you're ready to create, invoke the Create() method:

$fg.Create()

Adding a datafile uses a similar approach. You need to identify which filegroup this new datafile belongs to. You will also need to identify the logical filename and actual file path of the new file. The following snippet will help you do that:

$datafile = New-Object "Microsoft.SqlServer.Management.Smo.DataFile" $fg, "data4"
$datafile.FileName = "C:\DATA\data4.ndf"
$datafile.Create()

You can verify the changes visually in SQL Server Management Studio when you go to the database's properties. Under Files, you will see that the new secondary file, data4.ndf, has been added. If, at a later time, you need to increase any of the files' sizes, you can use SMO to create a handle to the file and change the Size property. The Size property is expressed in KB, so you will need to calculate accordingly. After the Size property is changed, invoke the Alter() method to persist the changes. The following is an example snippet to do this:

$db = $server.Databases[$dbname]
$fg = $db.FileGroups["FG1"]
$file = $fg.Files["data4"]
$file.Size = 2 * 1024 #2MB
$file.Alter()

Listing the processes

SQL Server has a number of processes in the background that are needed for normal operation. The SMO Server class can access the list of processes by using the EnumProcesses() method.
The following is an example script to pull current non-system processes, the programs that are using them, the databases that are using them, and the account that's configured to use/run them: Import-Module SQLPS -DisableNameChecking   #current server name $servername = "ROGUE"   $server = New-Object "Microsoft.SqlServer.Management.Smo.Server" $servername   $server.EnumProcesses() | Where-Object IsSystem -eq $false | Select-Object Spid, Database, IsSystem, Login, Status, Cpu, MemUsage, Program | Format-Table -AutoSize The result that you will get looks like the following screenshot: You can adjust this script based on your needs. For example, if you only need running queries, you can pipe it to the Where-Object cmdlet and filter by status. You can also sort the result based on the highest CPU or memory usage by piping this to the Sort-Object cmdlet. Should you need to kill any process, for example when some processes are blocked, you can use the KillProcess() method of the SMO server object. You will need to pass the SQL Server session ID (or SPID) to this method: $server.KillProcess($blockingSpid) If you want to kill all processes in a specific database, you can use the KillAllProcesses() method and pass the database name: $server.KillAllProcesses($dbname) Be careful though. Killing processes should not be done lightly. Before you kill a process, investigate what the process does, why you need to kill it, and what potential effects killing it will have on your database. Otherwise, killing processes could result in varying levels of system instability. Checking enabled features SQL has many features. We can find out if certain features are enabled by using SMO and PowerShell. To determine this, you need to access the object that owns that feature. 
For example, some features are available to be queried once you create an SMO server object:

Import-Module SQLPS -DisableNameChecking

#current server name
$servername = "ROGUE"

$server = New-Object "Microsoft.SqlServer.Management.Smo.Server" $servername

$server | Select-Object IsClustered, ClusterName, FilestreamLevel, IsFullTextInstalled, LinkedServers, IsHadrEnabled, AvailabilityGroups

With the preceding script, we can easily answer the following questions:

Is the server clustered (IsClustered)?
Does it support FileStream and to what level (FilestreamLevel)?
Is FullText installed (IsFullTextInstalled)?
Are there any configured linked servers in the system (LinkedServers)?
Is AlwaysOn enabled (IsHadrEnabled) and are any availability groups configured (AvailabilityGroups)?

There are also a number of cmdlets available with the SQLPS module that allow you to manage AlwaysOn:

Replication can also be managed programmatically using the Replication Management Objects assembly. More information can be found at http://msdn.microsoft.com/en-us/library/ms146869.aspx.

Summary

In this article, we looked at some of the commands that can be used to perform basic SQL Server administration tasks in PowerShell.

Resources for Article: Further resources on this subject: Sql Server Analysis Services Administering and Monitoring Analysis Services? [article] Unleashing your Development Skills Powershell [article] The Arduino Mobile Robot [article]
Packt
03 Mar 2015
11 min read

MapReduce functions

In this article by John Zablocki, author of the book Couchbase Essentials, you will be introduced to MapReduce and how you'll use it to create secondary indexes for your documents. At its simplest, MapReduce is a programming pattern used to process large amounts of data that is typically distributed across several nodes in parallel. In the NoSQL world, MapReduce implementations may be found on many platforms from MongoDB to Hadoop, and of course, Couchbase. Even if you're new to the NoSQL landscape, it's quite possible that you've already worked with a form of MapReduce. The inspiration for MapReduce in distributed NoSQL systems was drawn from the functional programming concepts of map and reduce. While purely functional programming languages haven't quite reached mainstream status, languages such as Python, C#, and JavaScript all support map and reduce operations.

(For more resources related to this topic, see here.)

Map functions

Consider the following Python snippet:

numbers = [1, 2, 3, 4, 5]
doubled = map(lambda n: n * 2, numbers)
#doubled == [2, 4, 6, 8, 10]

These two lines of code demonstrate a very simple use of a map() function. In the first line, the numbers variable is created as a list of integers. The second line applies a function to the list to create a new mapped list. In this case, the function passed to map() is supplied as a Python lambda, which is just an inline, unnamed function. The body of the lambda multiplies each number by two. This example can be made slightly more complex by doubling only odd numbers, as shown in this code:

numbers = [1, 2, 3, 4, 5]

def double_odd(num):
    if num % 2 == 0:
        return num
    else:
        return num * 2

doubled = map(double_odd, numbers)
#doubled == [2, 2, 6, 4, 10]

Map functions are implemented differently in each language or platform that supports them, but all follow the same pattern. An iterable collection of objects is passed to a map function.
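The result comments in the snippets above assume that map() returns a list, which is how Python 2 behaves. As a minimal sketch for readers on Python 3, where map() returns a lazy iterator, the same two examples can be written as follows; the doubled_odd variable name is introduced here only for illustration.

```python
def double_odd(num):
    """Return odd numbers doubled and even numbers unchanged."""
    return num if num % 2 == 0 else num * 2


numbers = [1, 2, 3, 4, 5]

# In Python 3, map() is lazy, so wrap it in list() to materialize the values.
doubled = list(map(lambda n: n * 2, numbers))
doubled_odd = list(map(double_odd, numbers))

print(doubled)      # [2, 4, 6, 8, 10]
print(doubled_odd)  # [2, 2, 6, 4, 10]
```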
Each item of the collection is then iterated over with the map function being applied to that iteration. The final result is a new collection where each of the original items is transformed by the map. Reduce functions Like maps, the reduce functions also work by applying a provided function to an iterable data structure. The key difference between the two is that the reduce function works to produce a single value from the input iterable. Using Python's built-in reduce() function, we can see how to produce a sum of integers, as follows: numbers = [1, 2, 3, 4, 5] sum = reduce(lambda x, y: x + y, numbers) #sum == 15 You probably noticed that unlike our map operation, the reduce lambda has two parameters (x and y in this case). The argument passed to x will be the accumulated value of all applications of the function so far, and y will receive the next value to be added to the accumulation. Parenthetically, the order of operations can be seen as ((((1 + 2) + 3) + 4) + 5). Alternatively, the steps are shown in the following list: x = 1, y = 2 x = 3, y = 3 x = 6, y = 4 x = 10, y = 5 x = 15 As this list demonstrates, the value of x is the cumulative sum of previous x and y values. As such, reduce functions are sometimes termed accumulate or fold functions. Regardless of their name, reduce functions serve the common purpose of combining pieces of a recursive data structure to produce a single value. Couchbase MapReduce Creating an index (or view) in Couchbase requires creating a map function written in JavaScript. When the view is created for the first time, the map function is applied to each document in the bucket containing the view. When you update a view, only new or modified documents are indexed. This behavior is known as incremental MapReduce. You can think of a basic map function in Couchbase as being similar to a SQL CREATE INDEX statement. Effectively, you are defining a column or a set of columns, to be indexed by the server. 
Of course, these are not columns, but rather properties of the documents to be indexed.

Basic mapping

To illustrate the process of creating a view, first imagine that we have a set of JSON documents as shown here:

var books = [
    { "id": 1, "title": "The Bourne Identity", "author": "Robert Ludlow" },
    { "id": 2, "title": "The Godfather", "author": "Mario Puzzo" },
    { "id": 3, "title": "Wiseguy", "author": "Nicholas Pileggi" }
];

Each document contains title and author properties. In Couchbase, to query these documents by either title or author, we'd first need to write a map function. Without considering how map functions are written in Couchbase, we're able to understand the process with vanilla JavaScript:

books.map(function(book) {
    return book.author;
});

In the preceding snippet, we're making use of the built-in JavaScript array's map() function. Similar to the Python snippets we saw earlier, JavaScript's map() function takes a function as a parameter and returns a new array with mapped objects. In this case, we'll have an array with each book's author, as follows:

["Robert Ludlow", "Mario Puzzo", "Nicholas Pileggi"]

At this point, we have a mapped collection that will be the basis for our author index. However, we haven't provided a means for the index to be able to refer back to its original document. If we were using a relational database, we'd have effectively created an index on the Author column with no way to get back to the row that contained it. With a slight modification to our map function, we are able to provide the key (the id property) of the document as well in our index:

books.map(function(book) {
    return [book.author, book.id];
});

In this slightly modified version, we're including the ID with the output of each author. In this way, the index has its document's key stored with its author:

[["Robert Ludlow", 1], ["Mario Puzzo", 2], ["Nicholas Pileggi", 3]]

We'll soon see how this structure more closely resembles the values stored in a Couchbase index.

Basic reducing

Not every Couchbase index requires a reduce component. In fact, we'll see that Couchbase already comes with built-in reduce functions that will provide you with most of the reduce behavior you need. However, before relying on only those functions, it's important to understand why you'd use a reduce function in the first place. Returning to the preceding example of the map, let's imagine we have a few more documents in our set, as follows:

var books = [
    { "id": 1, "title": "The Bourne Identity", "author": "Robert Ludlow" },
    { "id": 2, "title": "The Bourne Ultimatum", "author": "Robert Ludlow" },
    { "id": 3, "title": "The Godfather", "author": "Mario Puzzo" },
    { "id": 4, "title": "The Bourne Supremacy", "author": "Robert Ludlow" },
    { "id": 5, "title": "The Family", "author": "Mario Puzzo" },
    { "id": 6, "title": "Wiseguy", "author": "Nicholas Pileggi" }
];

We'll still create our index using the same map function because it provides a way of accessing a book by its author. Now imagine that we want to know how many books an author has written, or (assuming we had more data) the average number of pages written by an author. These questions are not possible to answer with a map function alone. Each application of the map function knows nothing about the previous application. In other words, there is no way for you to compare or accumulate information about one author's book to another book by the same author. Fortunately, there is a solution to this problem. As you've probably guessed, it's the use of a reduce function. As a somewhat contrived example, consider this JavaScript:

var mapped = books.map(function (book) {
    return [book.id, book.author];
});

var counts = mapped.reduce(function (acc, cur) {
    var key = cur[1]; // the author
    if (!acc[key]) acc[key] = 0;
    ++acc[key];
    return acc; // carry the running counts into the next step
}, {});

This code doesn't quite accurately reflect the way you would count books with Couchbase, but it illustrates the basic idea. You look for each occurrence of a key (author) and increment a counter when it is found. With Couchbase MapReduce, the mapped structure is supplied to the reduce() function in a better format. You won't need to keep track of items in a dictionary.

Couchbase views

At this point, you should have a general sense of what MapReduce is, where it came from, and how it will affect the creation of a Couchbase Server view. So without further ado, let's see how to write our first Couchbase view. The bucket we'll use is beer-sample, one of the sample buckets offered when Couchbase Server is installed. If you didn't install it, don't worry. You can add it by opening the Couchbase Console and navigating to the Settings tab. Here, you'll find the option to install the bucket, as shown next:

First, you need to understand the document structures with which you're working. The following JSON object is a beer document (abbreviated for brevity):

{
    "name": "Sundog",
    "type": "beer",
    "brewery_id": "new_holland_brewing_company",
    "description": "Sundog is an amber ale...",
    "style": "American-Style Amber/Red Ale",
    "category": "North American Ale"
}

As you can see, the beer documents have several properties. We're going to create an index to let us query these documents by name. In SQL, the query would look like this:

SELECT Id FROM Beers WHERE Name = ?

You might be wondering why the SQL example includes only the Id column in its projection. For now, just know that to query a document using a view with Couchbase, the property by which you're querying must be included in an index. To create that index, we'll write a map function. The simplest example of a map function to query beer documents by name is as follows:

function(doc) {
    emit(doc.name);
}

The body of this map function has only one line.
It calls the built-in Couchbase emit() function. This function is used to signal that a value should be indexed. The output of this map function will be an array of names. The beer-sample bucket includes brewery data as well. These documents look like the following code (abbreviated for brevity): {   "name": "Thomas Hooker Brewing",   "city": "Bloomfield",   "state": "Connecticut",   "website": "http://www.hookerbeer.com/",   "type": "brewery" } If we reexamine our map function, we'll see an obvious problem; both the brewery and beer documents have a name property. When this map function is applied to the documents in the bucket, it will create an index with documents from either the brewery or beer documents. The problem is that Couchbase documents exist in a single container—the bucket. There is no namespace for a set of related documents. The solution has typically involved including a type or docType property on each document. The value of this property is used to distinguish one document from another. In the case of the beer-sample database, beer documents have type = "beer" and brewery documents have type = "brewery". Therefore, we are easily able to modify our map function to create an index only on beer documents: function(doc) {   if (doc.type == "beer") {     emit(doc.name);   } } The emit() function actually takes two arguments. The first, as we've seen, emits a value to be indexed. The second argument is an optional value and is used by the reduce function. Imagine that we want to count the number of beer types in a particular category. In SQL, we would write the following query: SELECT Category, COUNT(*) FROM Beers GROUP BY Category To achieve the same functionality with Couchbase Server, we'll need to use both map and reduce functions. First, let's write the map. 
It will create an index on the category property:

    function(doc) {
        if (doc.type == "beer") {
            emit(doc.category, 1);
        }
    }

The only real difference between our category index and our name index is that we're including an argument for the value parameter of the emit() function. What we'll do with those values is simply count them. This counting will be done in our reduce function:

    function(keys, values) {
        return values.length;
    }

In this example, the values parameter will be given to the reduce function as a list of all values associated with a particular key. In our case, for each beer category, there will be a list of ones (that is, [1, 1, 1, 1, 1, 1]). Couchbase also provides a built-in _count function. It can be used in place of the entire reduce function in the preceding example. Now that we've seen the basic requirements when creating an actual Couchbase view, it's time to add a view to our bucket. The easiest way to do so is to use the Couchbase Console.

Summary

In this article, you learned the purpose of secondary indexes in a key/value store. We dug deep into MapReduce, both in terms of its history in functional languages and as a tool for NoSQL and big data systems.
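The category-counting view from the Couchbase views section can also be tried out in plain JavaScript, in the same spirit as the earlier vanilla examples. The following is only a sketch under stated assumptions: runView is a hypothetical helper that stands in for the view engine, emit is passed to the map function as a second argument rather than being a built-in, and two of the sample documents are invented for illustration. It is not how Couchbase implements views.

```javascript
// Hypothetical stand-in for a view engine; NOT Couchbase code.
// It runs mapFn over every document, groups emitted values by key,
// and then calls reduceFn once per key.
function runView(docs, mapFn, reduceFn) {
  const groups = new Map(); // key -> array of emitted values

  // Assumption: emit is handed to mapFn explicitly instead of being global.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };

  docs.forEach((doc) => mapFn(doc, emit));

  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduceFn(key, values);
  }
  return result;
}

// Sample documents; "Sample Stout" and "Sample Amber" are invented.
const docs = [
  { type: "beer", name: "Sundog", category: "North American Ale" },
  { type: "beer", name: "Sample Stout", category: "Irish Ale" },
  { type: "beer", name: "Sample Amber", category: "North American Ale" },
  { type: "brewery", name: "Thomas Hooker Brewing" },
];

// Same logic as the map and reduce functions shown in the article.
const categoryCounts = runView(
  docs,
  (doc, emit) => {
    if (doc.type === "beer") emit(doc.category, 1);
  },
  (key, values) => values.length
);

console.log(categoryCounts);
```

With these four documents, the brewery is skipped by the type check, and each beer category ends up with the length of its list of ones as its count.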
Packt
03 Mar 2015
28 min read

Elasticsearch Administration

In this article by Rafał Kuć and Marek Rogoziński, authors of the book Mastering Elasticsearch, Second Edition, we will talk more about the Elasticsearch configuration and the new features introduced in Elasticsearch 1.0 and higher. By the end of this article, you will have learned about:

  • Configuring the discovery and recovery modules
  • Using the Cat API, which gives a human-readable insight into the cluster status
  • The backup and restore functionality
  • Federated search

Discovery and recovery modules

When starting your Elasticsearch node, one of the first things that Elasticsearch does is look for a master node that has the same cluster name and is visible in the network. If a master node is found, the starting node joins the already formed cluster. If no master is found, then the node itself is selected as a master (of course, if the configuration allows such behavior). The process of forming a cluster and finding nodes is called discovery. The module responsible for discovery has two main purposes: electing a master and discovering new nodes within a cluster. After the cluster is formed, a process called recovery is started. During the recovery process, Elasticsearch reads the metadata and the indices from the gateway, and prepares the shards that are stored there to be used. After the recovery of the primary shards is done, Elasticsearch should be ready for work and should continue with the recovery of all the replicas (if they are present). In this section, we will take a deeper look at these two modules and discuss the configuration possibilities Elasticsearch gives us and the consequences of changing them. Note that the information provided in the Discovery and recovery modules section is an extension of what we already wrote in Elasticsearch Server Second Edition, published by Packt Publishing.
Discovery configuration As we have already mentioned multiple times, Elasticsearch was designed to work in a distributed environment. This is the main difference when comparing Elasticsearch to other open source search and analytics solutions available. With such assumptions, Elasticsearch is very easy to set up in a distributed environment, and we are not forced to set up additional software to make it work like this. By default, Elasticsearch assumes that the cluster is automatically formed by the nodes that declare the same cluster.name setting and can communicate with each other using multicast requests. This allows us to have several independent clusters in the same network. There are a few implementations of the discovery module that we can use, so let's see what the options are. Zen discovery Zen discovery is the default mechanism that's responsible for discovery in Elasticsearch and is available by default. The default Zen discovery configuration uses multicast to find other nodes. This is a very convenient solution: just start a new Elasticsearch node and everything works—this node will be joined to the cluster if it has the same cluster name and is visible by other nodes in that cluster. This discovery method is perfectly suited for development time, because you don't need to care about the configuration; however, it is not advised that you use it in production environments. Relying only on the cluster name is handy but can also lead to potential problems and mistakes, such as the accidental joining of nodes. Sometimes, multicast is not available for various reasons or you don't want to use it for these mentioned reasons. In the case of bigger clusters, the multicast discovery may generate too much unnecessary traffic, and this is another valid reason why it shouldn't be used for production. For these cases, Zen discovery allows us to use the unicast mode. 
When using the unicast Zen discovery, a node that is not a part of the cluster will send a ping request to all the addresses specified in the configuration. By doing this, it informs all the specified nodes that it is ready to be a part of the cluster and can either be joined to an existing cluster or form a new one. Of course, after the node joins the cluster, it gets the cluster topology information, but the initial connection is only made to the specified list of hosts. Remember that even when using unicast Zen discovery, the Elasticsearch node still needs to have the same cluster name as the other nodes. If you want to know more about the differences between multicast and unicast ping methods, refer to these URLs: http://en.wikipedia.org/wiki/Multicast and http://en.wikipedia.org/wiki/Unicast. If you still want to learn about the configuration properties of multicast Zen discovery, let's look at them.

Multicast Zen discovery configuration

The multicast part of the Zen discovery module exposes the following settings:

  • discovery.zen.ping.multicast.address (the default: all available interfaces): This is the interface used for the communication, given as an address or an interface name.
  • discovery.zen.ping.multicast.port (the default: 54328): This port is used for communication.
  • discovery.zen.ping.multicast.group (the default: 224.2.2.4): This is the multicast address to send messages to.
  • discovery.zen.ping.multicast.buffer_size (the default: 2048): This is the size of the buffer used for multicast messages.
  • discovery.zen.ping.multicast.ttl (the default: 3): This is the time to live of a multicast message. Every time a packet crosses a router, the TTL is decreased, which limits the area in which the transmission can be received. Note that routers can have threshold values assigned that are compared to the TTL, so the TTL value may not exactly match the number of routers that a packet can cross.
  • discovery.zen.ping.multicast.enabled (the default: true): Setting this property to false turns off the multicast. You should disable multicast if you are planning to use the unicast discovery method.

The unicast Zen discovery configuration

The unicast part of Zen discovery provides the following configuration options:

  • discovery.zen.ping.unicast.hosts: This is the initial list of nodes in the cluster. The list can be defined as a comma-separated list or as an array of hosts. Every host can be given a name (or an IP address) or have a port or port range added. For example, the value of this property can look like this: ["master1", "master2:8181", "master3[80000-81000]"]. The hosts list for the unicast discovery doesn't need to be a complete list of Elasticsearch nodes in your cluster, because once the node is connected to one of the mentioned nodes, it will be informed about all the others that form the cluster.
  • discovery.zen.ping.unicast.concurrent_connects (the default: 10): This is the maximum number of concurrent connections unicast discoveries will use. If you have a lot of nodes that the initial connection should be made to, it is advised that you increase the default value.

Master node

One of the main purposes of discovery, apart from connecting to other nodes, is to choose a master node, a node that will take care of and manage all the other nodes. This process is called master election and is a part of the discovery module. No matter how many master eligible nodes there are, each cluster will only have a single master node active at a given time. If there is more than one master eligible node present in the cluster, one of them can be elected as the master when the original master fails and is removed from the cluster.

Configuring master and data nodes

By default, Elasticsearch allows every node to be a master node and a data node.
However, in certain situations, you may want to have worker nodes, which will only hold the data or process the queries, and master nodes, which will only be used to manage the cluster. One of these situations is handling a massive amount of data, where data nodes should be as performant as possible, and there shouldn't be any delay in master nodes' responses.

Configuring data-only nodes

To set the node to only hold data, we need to instruct Elasticsearch that we don't want such a node to be a master node. In order to do this, we add the following properties to the elasticsearch.yml configuration file:

    node.master: false
    node.data: true

Configuring master-only nodes

To set the node not to hold data and only to be a master node, we need to instruct Elasticsearch that we don't want such a node to hold data. In order to do that, we add the following properties to the elasticsearch.yml configuration file:

    node.master: true
    node.data: false

Configuring the query processing-only nodes

For large enough deployments, it is also wise to have nodes that are only responsible for aggregating query results from other nodes. Such nodes should be configured as nonmaster and nondata, so they should have the following properties in the elasticsearch.yml configuration file:

    node.master: false
    node.data: false

Please note that the node.master and the node.data properties are set to true by default, but we tend to include them for configuration clarity.

The master election configuration

We already wrote about the master election configuration in Elasticsearch Server Second Edition, but this topic is very important, so we decided to refresh our knowledge about it. Imagine that you have a cluster that is built of 10 nodes. Everything is working fine until, one day, your network fails and three of your nodes are disconnected from the cluster, but they still see each other.
Because of the Zen discovery and the master election process, the nodes that got disconnected elect a new master and you end up with two clusters with the same name with two master nodes. Such a situation is called a split-brain and you must avoid it as much as possible. When a split-brain happens, you end up with two (or more) clusters that won't join each other until the network (or any other) problems are fixed. If you index your data during this time, you may end up with data loss and unrecoverable situations when the nodes get joined together after the network split. In order to prevent split-brain situations or at least minimize the possibility of their occurrences, Elasticsearch provides a discovery.zen.minimum_master_nodes property. This property defines a minimum amount of master eligible nodes that should be connected to each other in order to form a cluster. So now, let's get back to our cluster; if we set the discovery.zen.minimum_master_nodes property to 50 percent of the total nodes available plus one (which is six, in our case), we would end up with a single cluster. Why is that? Before the network failure, we would have 10 nodes, which is more than six nodes, and these nodes would form a cluster. After the disconnections of the three nodes, we would still have the first cluster up and running. However, because only three nodes disconnected and three is less than six, these three nodes wouldn't be allowed to elect a new master and they would wait for reconnection with the original cluster. Zen discovery fault detection and configuration Elasticsearch runs two detection processes while it is working. The first process is to send ping requests from the current master node to all the other nodes in the cluster to check whether they are operational. The second process is a reverse of that—each of the nodes sends ping requests to the master in order to verify that it is still up and running and performing its duties. 
However, if we have a slow network or our nodes are in different hosting locations, the default configuration may not be sufficient. Because of this, the Elasticsearch discovery module exposes three properties that we can change:

  • discovery.zen.fd.ping_interval: This defaults to 1s and specifies how often the node will send ping requests to the target node.
  • discovery.zen.fd.ping_timeout: This defaults to 30s and specifies how long the node will wait for a response to the sent ping request. If your nodes are 100 percent utilized or your network is slow, you may consider increasing this value.
  • discovery.zen.fd.ping_retries: This defaults to 3 and specifies the number of ping request retries before the target node will be considered not operational. You can increase this value if your network has a high number of lost packets (or you can fix your network).

There is one more thing that we would like to mention. The master node is the only node that can change the state of the cluster. To keep cluster state updates in a proper sequence, the Elasticsearch master node processes cluster state update requests one at a time, makes the changes locally, and sends the request to all the other nodes so that they can synchronize their state. The master node waits for a given time for the nodes to respond, and once the time passes or all the nodes have responded with acknowledgment information, it proceeds with processing the next cluster state update request. To change the time the master node waits for all the other nodes to respond, modify the default of 30 seconds by setting the discovery.zen.publish_timeout property. Increasing the value may be needed for huge clusters working in an overloaded network.

The Amazon EC2 discovery

Amazon, in addition to selling goods, has a few popular services such as selling storage or computing power in a pay-as-you-go model.
The so-called Amazon Elastic Compute Cloud (EC2) provides server instances and, of course, they can be used to install and run Elasticsearch clusters (among many other things, as these are normal Linux machines). This is convenient: you pay for instances that are needed in order to handle the current traffic or to speed up calculations, and you shut down unnecessary instances when the traffic is lower. Elasticsearch works well on EC2, but due to the nature of the environment, some features may work slightly differently. One of the features that works differently is discovery, because Amazon EC2 doesn't support multicast discovery. Of course, we can switch to unicast discovery, but sometimes, we want to be able to automatically discover nodes and, with unicast, we need to at least provide the initial list of hosts. However, there is an alternative: we can use the Amazon EC2 plugin, a plugin that combines the multicast and unicast discovery methods using the Amazon EC2 API. Make sure that during the setup of EC2 instances, you set up communication between them (on ports 9200 and 9300 by default). This is crucial for Elasticsearch nodes to communicate with each other and is thus required for the cluster to function. Of course, this communication depends on the network.bind_host and network.publish_host (or network.host) settings.

The EC2 plugin installation

The installation of this plugin is as simple as that of most plugins.
In order to install it, we should run the following command:

    bin/plugin install elasticsearch/elasticsearch-cloud-aws/2.4.0

The EC2 plugin's generic configuration

This plugin has several configuration settings that we need to provide in order for the EC2 discovery to work:

  • cloud.aws.access_key: This is the Amazon access key, one of the credential values you can find in the Amazon configuration panel.
  • cloud.aws.secret_key: This is the Amazon secret key. Similar to the previously mentioned access_key setting, it can be found in the EC2 configuration panel.

The last thing is to inform Elasticsearch that we want to use a new discovery type by setting the discovery.type property to ec2 and turning off multicast.

Optional EC2 discovery configuration options

The previously mentioned settings are sufficient to run the EC2 discovery, but in order to control the EC2 discovery plugin behavior, Elasticsearch exposes additional settings:

  • cloud.aws.region: This region will be used to connect with Amazon EC2 web services. You can choose a region that's adequate for the region where your instance resides, for example, eu-west-1 for Ireland. The possible values include eu-west, sa-east, us-east, us-west-1, us-west-2, ap-southeast-1, and ap-southeast-2.
  • cloud.aws.ec2.endpoint: If you are using EC2 API services, instead of defining a region, you can provide an address of the AWS endpoint, for example, ec2.eu-west-1.amazonaws.com.
  • cloud.aws.protocol: This is the protocol that should be used by the plugin to connect to Amazon Web Services endpoints. By default, Elasticsearch will use the HTTPS protocol (which means setting the value of the property to https). We can also change this behavior and set the property to http for the plugin to use HTTP without encryption. We are also allowed to overwrite the cloud.aws.protocol settings for each service by using the cloud.aws.ec2.protocol and cloud.aws.s3.protocol properties (the possible values are the same: https and http).
  • cloud.aws.proxy_host: Elasticsearch allows us to define a proxy that will be used to connect to AWS endpoints. The cloud.aws.proxy_host property should be set to the address of the proxy that should be used.
  • cloud.aws.proxy_port: The second property related to the AWS endpoints proxy allows us to specify the port on which the proxy is listening. The cloud.aws.proxy_port property should be set to the port on which the proxy listens.
  • discovery.ec2.ping_timeout (the default: 3s): This is the time to wait for the response to the ping message sent to the other node. After this time, the nonresponsive node will be considered dead and removed from the cluster. Increasing this value makes sense when dealing with network issues or when we have a lot of EC2 nodes.

The EC2 nodes scanning configuration

The last group of settings we want to mention allows us to configure a very important thing when building a cluster that works inside the EC2 environment: the ability to filter available Elasticsearch nodes in our Amazon Elastic Compute Cloud network. The Elasticsearch EC2 plugin exposes the following properties that can help us configure its behavior:

  • discovery.ec2.host_type: This allows us to choose the host type that will be used to communicate with other nodes in the cluster. The values we can use are private_ip (the default one; the private IP address will be used for communication), public_ip (the public IP address will be used for communication), private_dns (the private hostname will be used for communication), and public_dns (the public hostname will be used for communication).
  • discovery.ec2.groups: This is a comma-separated list of security groups. Only nodes that fall within these groups can be discovered and included in the cluster.
  • discovery.ec2.availability_zones: This is an array or comma-separated list of availability zones. Only nodes with the specified availability zones will be discovered and included in the cluster.
discovery.ec2.any_group (this defaults to true): Setting this property to false will force the EC2 discovery plugin to discover only those nodes that reside in an Amazon instance that falls into all of the defined security groups. The default value requires only a single group to be matched. discovery.ec2.tag: This is a prefix for a group of EC2-related settings. When you launch your Amazon EC2 instances, you can define tags, which can describe the purpose of the instance, such as the customer name or environment type. Then, you use these defined settings to limit discovery nodes. Let's say you define a tag named environment with a value of qa. In the configuration, you can now specify the following: discovery.ec2.tag.environment: qa and only nodes running on instances with this tag will be considered for discovery. cloud.node.auto_attributes: When this is set to true, Elasticsearch will add EC2-related node attributes (such as the availability zone or group) to the node properties and will allow us to use them, adjusting the Elasticsearch shard allocation and configuring the shard placement. Other discovery implementations The Zen discovery and EC2 discovery are not the only discovery types that are available. There are two more discovery types that are developed and maintained by the Elasticsearch team, and these are: Azure discovery: https://github.com/elasticsearch/elasticsearch-cloud-azure Google Compute Engine discovery: https://github.com/elasticsearch/elasticsearch-cloud-gce In addition to these, there are a few discovery implementations provided by the community, such as the ZooKeeper discovery for older versions of Elasticsearch (https://github.com/sonian/elasticsearch-zookeeper). The gateway and recovery configuration The gateway module allows us to store all the data that is needed for Elasticsearch to work properly. 
This means that not only is the data in Apache Lucene indices stored, but also all the metadata (for example, index allocation settings), along with the mappings configuration for each index. Whenever the cluster state is changed, for example, when the allocation properties are changed, the cluster state will be persisted by using the gateway module. When the cluster is started up, its state will be loaded using the gateway module and applied. One should remember that when configuring different nodes and different gateway types, indices will use the gateway type configuration present on the given node. If an index state should not be stored using the gateway module, one should explicitly set the index gateway type to none. The gateway recovery process Let's say explicitly that the recovery process is used by Elasticsearch to load the data stored with the use of the gateway module in order for Elasticsearch to work. Whenever a full cluster restart occurs, the gateway process kicks in to load all the relevant information we've mentioned—the metadata, the mappings, and of course, all the indices. When the recovery process starts, the primary shards are initialized first, and then, depending on the replica state, they are initialized using the gateway data, or the data is copied from the primary shards if the replicas are out of sync. Elasticsearch allows us to configure when the cluster data should be recovered using the gateway module. We can tell Elasticsearch to wait for a certain number of master eligible or data nodes to be present in the cluster before starting the recovery process. However, one should remember that when the cluster is not recovered, all the operations performed on it will not be allowed. This is done in order to avoid modification conflicts. Configuration properties Before we continue with the configuration, we would like to say one more thing. 
As you know, Elasticsearch nodes can play different roles: they can be data nodes (the ones that hold data), they can have a master role, or they can be used only for request handling, which means not holding data and not being master eligible. Remembering all this, let's now look at the gateway configuration properties that we are allowed to modify:

  • gateway.recover_after_nodes: This is an integer number that specifies how many nodes should be present in the cluster for the recovery to happen. For example, when set to 5, at least 5 nodes (it doesn't matter whether they are data or master eligible nodes) must be present for the recovery process to start.
  • gateway.recover_after_data_nodes: This is an integer number that allows us to set how many data nodes should be present in the cluster for the recovery process to start.
  • gateway.recover_after_master_nodes: This is another gateway configuration option that allows us to set how many master eligible nodes should be present in the cluster for the recovery to start.
  • gateway.recover_after_time: This allows us to set how much time to wait before the recovery process starts after the conditions defined by the preceding properties are met. If we set this property to 5m, we tell Elasticsearch to start the recovery process 5 minutes after all the defined conditions are met. The default value for this property is 5m, starting from Elasticsearch 1.3.0.

Let's imagine that we have six nodes in our cluster, out of which four are data eligible. We also have an index that is built of three shards, which are spread across the cluster. The last two nodes are master eligible and they don't hold the data. What we would like to configure is the recovery process to be delayed for 3 minutes after the four data nodes are present.
Our gateway configuration could look like this:

    gateway.recover_after_data_nodes: 4
    gateway.recover_after_time: 3m

Expectations on nodes

In addition to the already mentioned properties, we can also specify properties that will force the recovery process of Elasticsearch to start. These properties are:

  • gateway.expected_nodes: This is the number of nodes expected to be present in the cluster for the recovery to start immediately. If you don't need the recovery to be delayed, it is advised that you set this property to the number of nodes (or at least most of them) from which the cluster will be formed, because that will guarantee that the latest cluster state will be recovered.
  • gateway.expected_data_nodes: This is the number of data eligible nodes expected to be present in the cluster for the recovery process to start immediately.
  • gateway.expected_master_nodes: This is the number of master eligible nodes expected to be present in the cluster for the recovery process to start immediately.

Now, let's get back to our previous example. We know that when all six nodes are connected and are in the cluster, we want the recovery to start. So, in addition to the preceding configuration, we would add the following property:

    gateway.expected_nodes: 6

So the whole configuration would look like this:

    gateway.recover_after_data_nodes: 4
    gateway.recover_after_time: 3m
    gateway.expected_nodes: 6

The preceding configuration says that the recovery process will be delayed for 3 minutes once four data nodes join the cluster and will begin immediately after six nodes are in the cluster (it doesn't matter whether they are data nodes or master eligible nodes).

The local gateway

With the release of Elasticsearch 0.20 (and some of the releases from 0.19 versions), all the gateway types, apart from the default local gateway type, were deprecated. It is advised that you do not use them, because they will be removed in future versions of Elasticsearch.
This is still not the case, but if you want to avoid fully reindexing your data, you should only use the local gateway type, and this is why we won't discuss all the other types. The local gateway type uses the local storage available on a node to store the metadata, mappings, and indices. In order to use this gateway type and the local storage available on the node, there needs to be enough disk space to hold the data with no memory caching. The persistence to the local gateway is different from the other gateways that are currently present (but deprecated). The writes to this gateway are done in a synchronous manner in order to ensure that no data will be lost during the write process. In order to set the type of gateway that should be used, one should use the gateway.type property, which is set to local by default. There is one additional thing regarding the local gateway of Elasticsearch that we didn't talk about: dangling indices. When a node joins a cluster, all the shards and indices that are present on the node, but are not present in the cluster, will be included in the cluster state. Such indices are called dangling indices, and we are allowed to choose how Elasticsearch should treat them. Elasticsearch exposes the gateway.local.auto_import_dangling property, which can take the value of yes (the default value that results in importing all dangling indices into the cluster), close (results in importing the dangling indices into the cluster state but keeps them closed by default), and no (results in removing the dangling indices). When setting the gateway.local.auto_import_dangling property to no, we can also set the gateway.local.dangling_timeout property (defaults to 2h) to specify how long Elasticsearch will wait before deleting the dangling indices. The dangling indices feature can be nice when we restart old Elasticsearch nodes, and we don't want old indices to be included in the cluster.
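As a rough sketch of the local gateway settings described in this section, an elasticsearch.yml fragment might look like the following. The property names are the ones named above; the chosen values are only an illustration (here, dangling indices are imported but kept closed), not a recommendation.

```yaml
# The default gateway type, stated explicitly for clarity
gateway.type: local

# Import dangling indices into the cluster state, but keep them closed
gateway.local.auto_import_dangling: close
```

If auto_import_dangling were set to no instead, the gateway.local.dangling_timeout property mentioned above could be added to control how long Elasticsearch waits before deleting the dangling indices.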
Low-level recovery configuration

We discussed that we can use the gateway to configure the behavior of the Elasticsearch recovery process, but in addition to that, Elasticsearch allows us to configure the recovery process itself. We decided that it would be good to mention the properties we can use here, in the section dedicated to gateway and recovery.

Cluster-level recovery configuration

The recovery configuration is specified mostly on the cluster level and allows us to set general rules for the recovery module to work with. These settings are:

indices.recovery.concurrent_streams: This defaults to 3 and specifies the number of concurrent streams that are allowed to be opened in order to recover a shard from its source. The higher the value of this property, the more pressure will be put on the networking layer; however, the recovery may be faster, depending on your network usage and throughput.
indices.recovery.max_bytes_per_sec: By default, this is set to 20MB and specifies the maximum amount of data that can be transferred per second during shard recovery. To disable data transfer limiting, set this property to 0. Similar to the number of concurrent streams, this property allows us to control the network usage of the recovery process. Setting this property to higher values may result in higher network utilization and a faster recovery process.
indices.recovery.compress: This is set to true by default and allows us to define whether Elasticsearch should compress the data that is transferred during the recovery process. Setting this to false may lower the pressure on the CPU, but it will also result in more data being transferred over the network.
indices.recovery.file_chunk_size: This is the chunk size used to copy the shard data from the source shard. By default, it is set to 512KB and is compressed if the indices.recovery.compress property is set to true.
indices.recovery.translog_ops: This defaults to 1000 and specifies how many transaction log lines should be transferred between shards in a single request during the recovery process.
indices.recovery.translog_size: This is the chunk size used to copy the shard transaction log data from the source shard. By default, it is set to 512KB and is compressed if the indices.recovery.compress property is set to true.

In the versions prior to Elasticsearch 0.90.0, there was the indices.recovery.max_size_per_sec property, but it was deprecated, and it is suggested that you use the indices.recovery.max_bytes_per_sec property instead. However, if you are using an Elasticsearch version older than 0.90.0, it may be worth remembering this.

All the previously mentioned settings can be updated using the Cluster Update Settings API, or they can be set in the elasticsearch.yml file.

Index-level recovery settings

In addition to the values mentioned previously, there is a single property that can be set on a per-index basis. The property is called index.recovery.initial_shards, and it can be set both in the elasticsearch.yml file and using the indices Update Settings API. In general, Elasticsearch will only recover a particular shard when there is a quorum of shards present and if that quorum can be allocated. A quorum is 50 percent of the shards for the given index plus one. By using the index.recovery.initial_shards property, we can change what Elasticsearch will take as a quorum. This property can be set to one of the following values:

quorum: 50 percent plus one of the shards for the given index need to be present and allocable. This is the default value.
quorum-1: 50 percent of the shards for the given index need to be present and allocable.
full: All of the shards for the given index need to be present and allocable.
full-1: All but one of the shards for the given index need to be present and allocable.
integer value: Any integer such as 1, 2, or 5 specifies the number of shards that need to be present and allocable. For example, setting this value to 2 means that at least two shards need to be present and allocable.

It is good to know about this property, but the default value will be sufficient for most deployments.

Summary

In this article, we focused on the Elasticsearch configuration and the new features that were introduced in Elasticsearch 1.0. We configured discovery and recovery, and we used the human-friendly Cat API. In addition to that, we used the backup and restore functionality, which allowed easy backup and recovery of our indices. Finally, we looked at what federated search is and how to search and index data to multiple clusters, while still using all the functionalities of Elasticsearch and being connected to a single node.

If you want to dig deeper, buy the book Mastering Elasticsearch, Second Edition, and learn in a simple step-by-step fashion how to use Elasticsearch to enhance your knowledge further.

Resources for Article:

Further resources on this subject:
Downloading and Setting Up ElasticSearch [Article]
Indexing the Data [Article]
Driving Visual Analyses with Automobile Data (Python) [Article]