
How-To Tutorials

Working with WebStart and the Browser Plugin

Packt
06 Feb 2015
12 min read
In this article by Alex Kasko, Stanislav Kobylyanskiy, and Alexey Mironchenko, authors of the book OpenJDK Cookbook, we will cover the following topics:

- Building the IcedTea browser plugin on Linux
- Using the IcedTea Java WebStart implementation on Linux
- Preparing the IcedTea Java WebStart implementation for Mac OS X
- Preparing the IcedTea Java WebStart implementation for Windows

Introduction

For a long time, for end users, the Java applets technology was the face of the whole Java world. For a lot of non-developers, the word Java itself is a synonym for the Java browser plugin that allows running Java applets inside web browsers. The Java WebStart technology is similar to the Java browser plugin, but it runs remotely loaded Java applications as separate applications outside of web browsers.

The OpenJDK open source project does not contain implementations of either the browser plugin or the WebStart technology. The Oracle Java distribution, otherwise matching the OpenJDK codebase closely, provides its own closed source implementations of these technologies. The IcedTea-Web project contains free and open source implementations of the browser plugin and WebStart technologies. The IcedTea-Web browser plugin supports only GNU/Linux operating systems, whereas the WebStart implementation is cross-platform.

While the IcedTea implementation of WebStart is well-tested and production-ready, it has numerous incompatibilities with the Oracle WebStart implementation. These differences can be seen as corner cases; some of them are:

- Different behavior when parsing not-well-formed JNLP descriptor files: The Oracle implementation is generally more lenient toward malformed descriptors.
- Differences in JAR (re)downloading and caching behavior: The Oracle implementation uses caching more aggressively.
- Differences in sound support: This is due to differences in sound support between Oracle Java and IcedTea on Linux. Linux historically has multiple different sound providers (ALSA, PulseAudio, and so on), and IcedTea has wider support for different providers, which can lead to sound misconfiguration.

The IcedTea-Web browser plugin (as it is built on WebStart) has these incompatibilities too. On top of them, it can have more incompatibilities related to browser integration. User interface forms and general browser-related operations, such as access from/to JavaScript code, should work fine with both implementations. But historically, the browser plugin was widely used for security-critical applications such as online bank clients. Such applications usually require security facilities from browsers, such as access to certificate stores or hardware crypto-devices, and these can differ from browser to browser depending on the OS (a facility may, for example, be supported only on Windows), the browser version, the Java version, and so on. Because of that, many real-world applications can have problems running in the IcedTea-Web browser plugin on Linux.

Both WebStart and the browser plugin are built on the idea of downloading (possibly untrusted) code from remote locations, and proper privilege checking and sandboxed execution of that code is a notoriously complex task. Security issues reported in the Oracle browser plugin (the most widely known are the issues from the year 2012) are usually also fixed separately in IcedTea-Web.

Building the IcedTea browser plugin on Linux

The IcedTea-Web project is not inherently cross-platform; it is developed on Linux and for Linux, and so it can be built quite easily on popular Linux distributions.
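The recipe that follows relies on the prepackaged OpenJDK 7 binaries, which on Ubuntu 12.04 are IcedTea-based (the "How it works..." section explains why this matters). As a quick, optional sanity check, you can inspect the JVM version banner; the exact wording varies by distribution and update level, so treat the expected output below as an assumption rather than a guarantee:

java -version
# On Ubuntu 12.04 with openjdk-7-jdk installed, the output should mention
# something like "OpenJDK Runtime Environment (IcedTea ...)".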
The two main parts of the IcedTea-Web project (stored in corresponding directories in the source code repository) are netx and plugin. NetX is a pure Java implementation of the WebStart technology; we will look at it more thoroughly in the following recipes of this article. Plugin is an implementation of the browser plugin using the NPAPI plugin architecture, which is supported by multiple browsers. Plugin is written partly in Java and partly in native code (C++), and it officially supports only Linux-based operating systems.

There is a widespread opinion that the NPAPI architecture is dated, overcomplicated, and insecure, and that modern web browsers have enough built-in capabilities not to require external plugins; browsers have gradually been reducing their NPAPI support. Despite that, at the time of writing this book, the IcedTea-Web browser plugin worked on all major Linux browsers (Firefox and derivatives, Chromium and derivatives, and Konqueror).

We will build the IcedTea-Web browser plugin from sources using Ubuntu 12.04 LTS amd64.

Getting ready

For this recipe, we will need a clean Ubuntu 12.04 installation running with the Firefox web browser installed.

How to do it...

The following procedure will help you to build the IcedTea-Web browser plugin:

- Install the prepackaged binaries of OpenJDK 7:
  sudo apt-get install openjdk-7-jdk
- Install the GCC toolchain and build dependencies:
  sudo apt-get build-dep openjdk-7
- Install the specific dependency for the browser plugin:
  sudo apt-get install firefox-dev
- Download and decompress the IcedTea-Web source code tarball:
  wget http://icedtea.wildebeest.org/download/source/icedtea-web-1.4.2.tar.gz
  tar xzvf icedtea-web-1.4.2.tar.gz
- Run the configure script to set up the build environment:
  ./configure
- Run the build process:
  make
- Install the newly built plugin into the /usr/local directory:
  sudo make install
- Configure the Firefox web browser to use the newly built plugin library:
  mkdir ~/.mozilla/plugins
  cd ~/.mozilla/plugins
  ln -s /usr/local/IcedTeaPlugin.so libjavaplugin.so
- Check whether the IcedTea-Web plugin has appeared under Tools | Add-ons | Plugins.
- Open the http://java.com/en/download/installed.jsp web page to verify that the browser plugin works.

How it works...

The IcedTea browser plugin requires the IcedTea Java implementation to be compiled successfully. The prepackaged OpenJDK 7 binaries in Ubuntu 12.04 are based on IcedTea, so we installed them first. The plugin uses the GNU Autoconf build system, which is common among free software tools. The xulrunner-dev package is required to access the NPAPI headers. The built plugin may be installed into Firefox for the current user only, without requiring administrator privileges. For that, we created a symbolic link to our plugin in the place where Firefox expects to find the libjavaplugin.so plugin library.

There's more...

The plugin can also be installed into other browsers with NPAPI support, but the installation instructions can differ between browsers and between Linux distributions. As the NPAPI architecture does not depend on the operating system, in theory a plugin can be built for non-Linux operating systems, but currently no such ports are planned.

Using the IcedTea Java WebStart implementation on Linux

On the Java platform, the JVM needs to perform the class load process for each class it wants to use. This process is opaque to the JVM, and the actual bytecode for loaded classes may come from one of many sources.
For example, this mechanism allows Java applet classes to be loaded from a remote server into the Java process inside the web browser. Remote class loading may also be used to run remotely loaded Java applications in standalone mode, without integration with the web browser. This technique is called Java WebStart and was developed under Java Specification Request (JSR) number 56.

To run a Java application remotely, WebStart requires an application descriptor file written using the Java Network Launching Protocol (JNLP) syntax. This file is used to define the remote server to load the application from, along with some meta-information. The WebStart application may be launched from a web page by clicking on the JNLP link, or without the web browser, using a JNLP file obtained beforehand. In either case, the running application is completely separate from the web browser, but it uses a sandboxed security model similar to that of Java applets.

The OpenJDK project does not contain a WebStart implementation; the Oracle Java distribution provides its own closed-source WebStart implementation. An open source WebStart implementation exists as part of the IcedTea-Web project. It was initially based on the NETwork eXecute (NetX) project. Contrary to the applet technology, WebStart does not require any web browser integration. This allowed developers to implement the NetX module in pure Java, without native code. For integration with Linux-based operating systems, IcedTea-Web implements the javaws command as a shell script that launches the netx.jar file with the proper arguments.

In this recipe, we will build the NetX module from the official IcedTea-Web source tarball.

Getting ready

For this recipe, we will need a clean Ubuntu 12.04 installation running with the Firefox web browser installed.

How to do it...

The following procedure will help you to build the NetX module:

- Install the prepackaged binaries of OpenJDK 7:
  sudo apt-get install openjdk-7-jdk
- Install the GCC toolchain and build dependencies:
  sudo apt-get build-dep openjdk-7
- Download and decompress the IcedTea-Web source code tarball:
  wget http://icedtea.wildebeest.org/download/source/icedtea-web-1.4.2.tar.gz
  tar xzvf icedtea-web-1.4.2.tar.gz
- Run the configure script to set up a build environment, excluding the browser plugin from the build:
  ./configure --disable-plugin
- Run the build process:
  make
- Install the newly built files into the /usr/local directory:
  sudo make install
- Run the WebStart application example from the Java tutorial:
  javaws http://docs.oracle.com/javase/tutorialJWS/samples/deployment/dynamictree_webstartJWSProject/dynamictree_webstart.jnlp

How it works...

The javaws shell script is installed under the /usr/local directory. When launched with a path or a link to a JNLP file, javaws launches the netx.jar file, adding it to the boot classpath (for security reasons) and providing the JNLP link as an argument.

Preparing the IcedTea Java WebStart implementation for Mac OS X

The NetX WebStart implementation from the IcedTea-Web project is written in pure Java, so it can also be used on Mac OS X. IcedTea-Web provides the javaws launcher implementation only for Linux-based operating systems. In this recipe, we will create a simple implementation of the WebStart launcher script for Mac OS X.

Getting ready

For this recipe, we will need Mac OS X Lion with Java 7 (either the prebuilt OpenJDK or the Oracle one) installed. We will also need the netx.jar module from the IcedTea-Web project, which can be built using the instructions from the previous recipe.
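For orientation, the Linux javaws wrapper described in the previous recipe essentially boils down to the following shell sketch. The netx.jar path shown here is illustrative and may differ on your system, and the real installed script sets additional options; this is only a simplified view, not the actual IcedTea-Web script:

#!/bin/sh
# Simplified sketch of a javaws-style wrapper: put netx.jar on the boot
# classpath and pass the JNLP file or link through to the NetX main class.
NETX_JAR=/usr/local/share/icedtea-web/netx.jar   # illustrative path, adjust to your install
exec java -Xbootclasspath/a:"$NETX_JAR" net.sourceforge.jnlp.runtime.Boot "$@"

The launcher scripts we write for Mac OS X and Windows in the next recipes follow the same pattern.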
How to do it...

The following procedure will help you to run WebStart applications on Mac OS X:

- Download the JNLP descriptor example from the Java tutorials at http://docs.oracle.com/javase/tutorialJWS/samples/deployment/dynamictree_webstartJWSProject/dynamictree_webstart.jnlp.
- Test that this application can be run from the terminal using netx.jar:
  java -Xbootclasspath/a:netx.jar net.sourceforge.jnlp.runtime.Boot dynamictree_webstart.jnlp
- Create the wslauncher.sh bash script with the following contents:
  #!/bin/bash
  if [ "x$JAVA_HOME" = "x" ] ; then
      JAVA="$( which java 2>/dev/null )"
  else
      JAVA="$JAVA_HOME"/bin/java
  fi
  if [ "x$JAVA" = "x" ] ; then
      echo "Java executable not found"
      exit 1
  fi
  if [ "x$1" = "x" ] ; then
      echo "Please provide JNLP file as first argument"
      exit 1
  fi
  $JAVA -Xbootclasspath/a:netx.jar net.sourceforge.jnlp.runtime.Boot $1
- Mark the launcher script as executable:
  chmod 755 wslauncher.sh
- Run the application using the launcher script:
  ./wslauncher.sh dynamictree_webstart.jnlp

How it works...

The netx.jar file contains a Java application that can read JNLP files and download and run the classes described in the JNLP. But for security reasons, netx.jar cannot be launched directly as an application (using the java -jar netx.jar syntax). Instead, netx.jar is added to the privileged boot classpath and is run by specifying the main class directly. This allows netx.jar to run downloaded applications in sandbox mode. The wslauncher.sh script tries to find the Java executable using the PATH and JAVA_HOME environment variables and then launches the specified JNLP file through netx.jar.

There's more...

The wslauncher.sh script provides a basic solution for running WebStart applications from the terminal. To integrate netx.jar into your operating system environment properly (to be able to launch WebStart applications using JNLP links from the web browser), a native launcher or a custom platform scripting solution may be used. Such solutions lie outside the scope of this book.

Preparing the IcedTea Java WebStart implementation for Windows

The NetX WebStart implementation from the IcedTea-Web project is written in pure Java, so it can also be used on Windows; we have already used it on Linux and Mac OS X in the previous recipes in this article. In this recipe, we will create a simple implementation of the WebStart launcher script for Windows.

Getting ready

For this recipe, we will need a version of Windows running with Java 7 (either the prebuilt OpenJDK or the Oracle one) installed. We will also need the netx.jar module from the IcedTea-Web project, which can be built using the instructions from the previous recipe in this article.

How to do it...

The following procedure will help you to run WebStart applications on Windows:

- Download the JNLP descriptor example from the Java tutorials at http://docs.oracle.com/javase/tutorialJWS/samples/deployment/dynamictree_webstartJWSProject/dynamictree_webstart.jnlp.
- Test that this application can be run from the terminal using netx.jar:
  java -Xbootclasspath/a:netx.jar net.sourceforge.jnlp.runtime.Boot dynamictree_webstart.jnlp
- Create the launcher script with the following contents (the listing below repeats the wslauncher.sh bash variant from the previous recipe; on Windows, the same logic goes into the wslauncher.bat batch file discussed in the next section):
  #!/bin/bash
  if [ "x$JAVA_HOME" = "x" ] ; then
      JAVA="$( which java 2>/dev/null )"
  else
      JAVA="$JAVA_HOME"/bin/java
  fi
  if [ "x$JAVA" = "x" ] ; then
      echo "Java executable not found"
      exit 1
  fi
  if [ "x$1" = "x" ] ; then
      echo "Please provide JNLP file as first argument"
      exit 1
  fi
  $JAVA -Xbootclasspath/a:netx.jar net.sourceforge.jnlp.runtime.Boot $1
- Mark the launcher script as executable:
  chmod 755 wslauncher.sh
- Run the application using the launcher script:
  ./wslauncher.sh dynamictree_webstart.jnlp

How it works...

The netx.jar module must be added to the boot classpath, as it cannot be run directly for security reasons. The wslauncher.bat script tries to find the Java executable using the JAVA_HOME environment variable and then launches the specified JNLP file through netx.jar.

There's more...

The wslauncher.bat script may be registered as the default application for opening JNLP files. This will allow you to run WebStart applications from the web browser. However, the current script will show the batch window for a short period of time before launching the application. It also does not support looking for the Java executable in the Windows Registry. A more advanced script without those problems may be written using Visual Basic Script (or any other native scripting solution) or as a native executable launcher. Such solutions lie outside the scope of this book.

Summary

In this article, we covered the configuration and installation of the WebStart and browser plugin components, which are the biggest parts of the IcedTea-Web project.

Setting up our development environment and creating a game activity

Packt
06 Feb 2015
17 min read
In this article by John Horton, author of the book Learning Java by Building Android Games, we will learn how to set up our development environment by installing the JDK and Android Studio. We will also learn how to create a new game activity and lay out a game screen UI for it.

Setting up our development environment

The first thing we need to do is prepare our PC to develop for Android using Java. Fortunately, this is made quite simple for us. The next two tutorials have Windows-specific instructions and screenshots. However, it shouldn't be too difficult to vary the steps slightly to suit Mac or Linux. All we need to do is:

- Install a software package called the Java Development Kit (JDK), which allows us to develop in Java.
- Install Android Studio, a program designed to make Android development fast and easy. Android Studio uses the JDK and some other Android-specific tools that automatically get installed when we install Android Studio.

Installing the JDK

The first thing we need to do is get the latest version of the JDK. To complete this guide, perform the following steps:

- You need to be on the Java website, so visit http://www.oracle.com/technetwork/java/javase/downloads/index.html.
- Find the three buttons shown in the following screenshot and click on the one that says JDK (highlighted). They are on the right-hand side of the web page. Click on the DOWNLOAD button under the JDK option.
- You will be taken to a page that has multiple options to download the JDK. In the Product/File description column, you need to click on the option that matches your operating system. Windows, Mac, Linux, and some other less common options are all listed.
- A common question here is, "do I have 32- or 64-bit Windows?". To find out, right-click on your My Computer (This PC on Windows 8) icon, click on the Properties option, and look under the System heading at the System type entry, as shown in the following screenshot.
- Click on the somewhat hidden Accept License Agreement checkbox.
- Now click on the download option for your OS and system type as previously determined. Wait for the download to finish.
- In your Downloads folder, double-click on the file you just downloaded. The latest version at the time of writing, for a 64-bit Windows PC, was jdk-8u5-windows-x64. If you are using Mac/Linux or have a 32-bit OS, your filename will vary accordingly.
- In the first of several install dialogs, click on the Next button and you will see the next dialog box. Accept the defaults shown in the previous screenshot by clicking on Next.
- In the next dialog box, you can accept the default install location by clicking on Next.
- Next is the last dialog of the Java installer. Click on Close. The JDK is now installed.
- Next, we will make sure that Android Studio is able to use the JDK. Right-click on your My Computer (This PC on Windows 8) icon and navigate to Properties | Advanced system settings | Environment variables | New (under System variables, not under User variables). Now you can see the New System Variable dialog, as shown in the following screenshot.
- Type JAVA_HOME for Variable name and enter C:\Program Files\Java\jdk1.8.0_05 in the Variable value field. If you installed the JDK somewhere else, then the file path you enter in the Variable value field will need to point to wherever you put it. Your exact file path will likely have a different ending, matching the latest version of Java at the time you downloaded it.
- Click on OK to save your new settings.
- Now click on OK again to clear the Advanced system settings dialog.

Now we have the JDK installed on our PC. We are about halfway towards starting to learn Java programming, but we need a friendly way to interact with the JDK and to help us make Android games in Java.

Android Studio

We learned that Android Studio is a tool that simplifies Android development and uses the JDK to allow us to write and build Java programs. There are other tools you can use instead of Android Studio, and there are pros and cons to them all. For example, another extremely popular option is Eclipse. And, as with so many things in programming, a strong argument can be made as to why you should use Eclipse instead of Android Studio. I use both, but what I hope you will love about Android Studio are the following elements:

- It has a very neat and, despite still being under development, very refined and clean interface.
- It is much easier to get started with compared to Eclipse, because several Android tools that would otherwise need to be installed separately are already included in the package.
- Android Studio is being developed by Google, based on another product called IntelliJ IDEA. There is a chance it will be the standard way to develop Android in the not-too-distant future.

If you want to use Eclipse, that's fine. However, some of the keyboard shortcuts and user interface buttons will obviously be different. If you do not have Eclipse installed already and have no prior experience with Eclipse, then I recommend even more strongly that you go ahead with Android Studio.

Installing Android Studio

So, without any delay, let's get Android Studio installed and then we can begin our first game project:

- Visit https://developer.android.com/sdk/installing/studio.html.
- Click on the button labeled Download Android Studio to start the Android Studio download. This will take you to another web page with a very similar-looking button to the one you just clicked on.
- Accept the license by checking the checkbox, commence the download by clicking on the button labeled Download Android Studio for Windows, and wait for the download to complete. The exact text on the button will probably vary depending on the current latest version.
- In the folder in which you just downloaded Android Studio, right-click on the android-studio-bundle-135.12465-windows.exe file and click on Run as administrator. The end of your filename will vary depending on the version of Android Studio and your operating system.
- When asked whether you want to Allow the following program from an unknown publisher to make changes to your computer, click on Yes.
- On the next screen, click on Next.
- On the screen shown in the following screenshot, you can choose which users of your PC can use Android Studio. Choose whatever is right for you, as all options will work, and then click on Next.
- In the next dialog, leave the default settings and then click on Next. Then, on the Choose start menu folder dialog box, leave the defaults and click on Install.
- On the Installation complete dialog, click on Finish to run Android Studio for the first time.
- The next dialog is for users who have already used Android Studio, so assuming you are a first-time user, select the I do not have a previous version of Android Studio or I do not want to import my settings checkbox, and then click on OK.

That was the last piece of software we needed.

Math game – asking a question

Now that we have all that knowledge under our belts, we can use it to improve our math game.
First, we will create a new Android activity to be the actual game screen, as opposed to the start menu screen. We will then use the UI designer to lay out a simple game screen so that we can use our Java skills with variables, types, declaration, initialization, operators, and expressions to make our math game generate a question for the player. We can then link the start menu and game screens together with a push button.

Creating the new game activity

We will first need to create a new Java file for the game activity code and a related layout file to hold the game activity UI:

- Run Android Studio and select your Math Game Chapter 2 project. It might have been opened by default. Now we will create the new Android activity that will contain the actual game screen, which will run when the player taps the Play button on our main menu screen.
- To create a new activity, we need another layout file and another Java file. Fortunately, Android Studio will help us do this. To get started with creating all the files we need for a new activity, right-click on the src folder in the Project Explorer and then go to New | Activity. Now click on Blank Activity and then on Next.
- We now need to tell Android Studio a little bit about our new activity by entering information in the dialog box that appears. Change the Activity Name field to GameActivity. Notice how the Layout Name field is automatically changed for us to activity_game and the Title field is automatically changed to GameActivity.
- Click on Finish. Android Studio has created two files for us and has also registered our new activity in a manifest file, so we don't need to concern ourselves with it.
- If you look at the tabs at the top of the editor window, you will see that GameActivity.java has been opened up, ready for us to edit, as shown in the following screenshot. Ensure that GameActivity.java is active in the editor window by clicking on the GameActivity.java tab shown previously.
- Here, we can see some code that is unnecessary for us; removing it will make our working environment simpler and cleaner. We will simply use the code from MainActivity.java as a template for GameActivity.java and then make some minor changes.
- Click on the MainActivity.java tab in the editor window. Highlight all of the code in the editor window using Ctrl + A on the keyboard. Now copy all of the code in the editor window using Ctrl + C on the keyboard.
- Now click on the GameActivity.java tab. Highlight all of the code in the editor window using Ctrl + A on the keyboard. Now paste the copied code and overwrite the currently highlighted code using Ctrl + V on the keyboard.
- Notice that there is an error in our code, denoted by the red underlining shown in the following screenshot. This is because we pasted the code referring to MainActivity into our file that is called GameActivity. Simply change the text MainActivity to GameActivity and the error will disappear.
- Take a moment to see if you can work out what other minor change is necessary, before I tell you. Remember that setContentView loads our UI design. What we need to do is change setContentView to load the new design (which we will build next) instead of the home screen design. Change setContentView(R.layout.activity_main); to setContentView(R.layout.activity_game);. Save your work and we are ready to move on.
- Note the Project Explorer, where Android Studio puts the two new files it created for us. I have highlighted two folders in the next screenshot.
In future, I will simply refer to them as our Java code folder and our layout files folder.

You might wonder why we didn't simply copy and paste the MainActivity.java file to begin with and save ourselves the process of creating a new activity. The reason is that Android Studio does things behind the scenes. Firstly, it makes the layout template for us. It also registers the new activity for use through a file we will see later, called AndroidManifest.xml. This is necessary for the new activity to be able to work in the first place. All things considered, the way we did it is probably the quickest.

The code at this stage is exactly the same as the code for the home menu screen. We state the package name and import some useful classes provided by Android:

package com.packtpub.mathgamechapter3a.mathgamechapter3a;

import android.app.Activity;
import android.os.Bundle;

We create a new activity, this time called GameActivity:

public class GameActivity extends Activity {

Then we override the onCreate method and use the setContentView method to set our UI design as the contents of the player's screen. Currently, however, this UI is empty:

super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);

We can now think about the layout of our actual game screen.

Laying out the game screen UI

As we know, our math game will ask questions and offer the player some multiple choices to choose an answer from. There are lots of extra features we could add, such as difficulty levels, high scores, and much more. But for now, let's just stick to asking a simple, predefined question and offering a choice of three predefined possible answers. Keeping the UI design to the bare minimum suggests a layout. Our target UI will look somewhat like this:

The layout is hopefully self-explanatory, but let's ensure that we are really clear: when we come to building this layout in Android Studio, the section of the mock-up that displays 2 x 2 is the question and will be made up of three text views (one for each number and one for the operator); the = sign is also a separate view. Finally, the three options for the answer are made up of Button layout elements. This time, as we are going to be controlling them using our Java code, there are a few extra things we need to do to them. So let's go through it step by step:

- Open the file that will hold our game UI in the editor window. Do this by double-clicking on activity_game.xml. This is located in our layout files folder, which can be found in the Project Explorer.
- Delete the Hello World TextView, as it is not required.
- Find the Large Text element on the palette. It can be found under the Widgets section. Drag three elements onto the UI design area and arrange them near the top of the design as shown in the next screenshot. It does not have to be exact; just ensure that they are in a row and not overlapping.
- Notice in the Component Tree window that each of the three TextViews has been assigned a name automatically by Android Studio: textView, textView2, and textView3. Android Studio refers to these element names as ids. This is an important concept that we will be making use of. To confirm this, select any one of the TextViews by clicking on its name (id), either in the component tree as shown in the preceding screenshot or directly on it in the UI designer shown previously. Now look at the Properties window and find the id property. You might need to scroll a little to do this. Notice that the value of the id property is textView.
It is this id that we will use to interact with our UI from our Java code. So we want to change all the IDs of our TextViews to something useful and easy to remember:

- If you look back at our design, you will see that the UI element with the textView id is going to hold the number for the first part of our math question. So change the id to textPartA. Notice the lowercase t in text, the uppercase P in Part, and the uppercase A. You can use any combination of cases, and you can actually name the IDs anything you like. But, just as with naming conventions for Java variables, sticking to conventions here will make things less error-prone as our program gets more complicated.
- Now select textView2 and change its id to textOperator.
- Select the element currently with the id textView3 and change it to textPartB. This TextView will hold the later part of our question.
- Now add another Large Text element from the palette. Place it after the row of the three TextViews that we have just been editing. This Large Text element will simply hold our equals sign, and there is no plan to ever change it. So we don't need to interact with it in our Java code, and we don't even need to concern ourselves with changing the ID or knowing what it is. If this situation changed, we could always come back at a later time and edit its ID.
- However, this new TextView currently displays Large Text, and we want it to display an equals sign. So, in the Properties window, find the text property and enter the value =.
- We have changed the text property, and you might also like to change the text property for textPartA, textPartB, and textOperator. This is not absolutely essential, because we will soon see how we can change it via our Java code; however, if we change the text property to something more appropriate, then our UI designer will look more like it will when the game runs on a real device. So change the text property of textPartA to 2, textPartB to 2, and textOperator to x. Your UI design and Component tree should now look like this:
- For the buttons that will contain our multiple choice answers, drag three buttons in a row, below the = sign. Line them up neatly like our target design.
- Now, just as we did for the TextViews, find the id property of each button and, from left to right, change the id properties to buttonChoice1, buttonChoice2, and buttonChoice3.
- Why not enter some arbitrary numbers for the text property of each button so that the designer more accurately reflects what our game will look like, just as we did for our other TextViews? Again, this is not absolutely essential, as our Java code will control the button appearance.
- We are now actually ready to move on, but you probably agree that the UI elements look a little lost. It would look better if the buttons and text were bigger. All we need to do is find the textSize property for each TextView and for each Button and enter a number with the sp syntax. If you want your design to look just like our target design from earlier, enter 70sp for each of the TextView textSize properties and 40sp for each of the Button textSize properties. When you run the game on your real device, you might want to come back and adjust the sizes up or down a bit.
- We have a bit more to do before we can actually try out our game. Save the project and then we can move on.

As before, we have built our UI. This time, however, we have given all the important parts of our UI a unique, useful, and easy-to-identify ID.
As we will see, we are now able to communicate with our UI through our Java code.

Summary

In this article, we learned how to set up our development environment by installing the JDK and Android Studio. In addition to this, we also learned how to create a new game activity and lay out a game screen UI for it.

Creating Games with Cocos2d-x is Easy and 100-percent Free

Packt
06 Feb 2015
5 min read
This article, written by Raydelto Hernandez, the author of Cocos2d-x Android Game Development, looks at the history behind Cocos2d-x and explains why it is a beneficial choice for game development, not least because it is free and open source.

The launch of the Apple App Store back in 2008 boosted the reach of indie game developers, who since then have been able to reach millions of users and compete with large companies, outperforming them in some situations. This reality led to the trend of creating reusable game engines, such as Cocos2D-iPhone, written natively in Objective-C by the Argentine developer Ricardo Quesada; it allowed many independent developers to reach the top of the download charts.

Picking an existing game engine is a smart choice for indies and large companies alike, since it allows them to focus on the game logic rather than rewriting core features over and over again; thus, there are many game engines out there with all kinds of licenses and characteristics. The most popular game engines for mobile systems right now are Unity, Marmalade, and Cocos2d-x; all three have the capabilities to create 2D and 3D games. Determining which one is the best in terms of ease of use and available tools may be arguable, but there is one objective fact that can easily be verified: among these three engines, Cocos2d-x is the only one that you can use for free, no matter how much money you make using it.

We highlighted in this article's title that Cocos2d-x is completely free. We emphasize this because the other two frameworks also allow some forms of free usage; nevertheless, both at some point require payment for a usage license. In order to understand why Cocos2d-x is still free and open source, we need to understand how this tool was born.

Ricardo, an enthusiastic Python programmer, often participated in challenges to create a game from scratch in only one week. Back in those days, Ricardo and his team rewrote the core engine for each game, until they came up with the idea of creating a framework encapsulating core game capabilities that could be used in any two-dimensional game, and making it open source so that contributions could be received worldwide. That is how Cocos2d came to be originally written for fun.

With the launch of the first iPhone in 2007, Ricardo led the development of the port of the Cocos2d Python framework to the iPhone platform using its native language, Objective-C. Cocos2D-iPhone quickly became popular among indie game developers, some of whom turned into "appillionaires", as Chris Stevens called the individuals and enterprises that made millions of dollars during the App Store bubble period. This phenomenon made game development companies look at this framework, created by hobbyists, as a tool for creating their products. Zynga was one of the first big companies to adopt Cocos2d as their framework for delivering their famous Farmville game to the iPhone in 2009; the company has traded on NASDAQ since 2011 and has more than 2,000 employees.

In July 2010, a C++ port of Cocos2d-iPhone called Cocos2d-x was written in China with the objective of taking the power of the framework to other platforms, such as the Android operating system, which by that time was gaining market share at a spectacular rate.
In 2011, this Cocos2d port was acquired by Chukong Technologies, the third largest mobile game development company in China, which later hired the original Cocos2d-iPhone author to join their team. Today, Cocos2d-x-based games dominate the top grossing charts of Google Play and the App Store, especially in Asia. Recognized companies and leading studios such as Konami, Zynga, BANDAI NAMCO, Wooga, Disney Mobile, and Square Enix are using Cocos2d-x in their games.

Currently, there are 400,000 developers working on adding new functionality and making this framework as stable as possible, including engineers from Google, ARM, Intel, BlackBerry, and Microsoft, who officially support the ports to their products, such as Windows Phone, Windows, and the Windows Metro interface, and they are planning to support Cocos2d-x for the Xbox during this year.

Cocos2d-x is a very straightforward engine that takes only a small learning curve to grasp. I teach game development courses at many universities using this framework, and during the first week the students are capable of creating a game with the complexity of the famous title Doodle Jump. This can be achieved easily because the framework provides us with all the components required for a game, such as physics, audio handling, collision detection, animation, networking, data storage, user input, map rendering, scene transitions, 3D rendering, particle system rendering, font handling, menu creation, form display, thread handling, and so on, abstracting away the low-level logic and allowing us to focus on the game logic.

In conclusion, if you want to learn how to develop games for mobile platforms, I strongly recommend learning and using the Cocos2d-x framework: it is easy to use, totally free, and open source, which means that you can understand it better by reading its source, modify it if needed, and be sure that you will never be forced to pay a license fee if your game becomes a hit. Another big advantage of this framework is its widely available documentation, including the Packt Publishing collection of Cocos2d-x game development books.

Summary

This article discussed the history of Cocos2d-x, explained how it is used worldwide for game development today, and highlighted its role as a free and open source platform for building games.

Getting Your Own Video and Feeds

Packt
06 Feb 2015
18 min read
"One server to satisfy them all" could have been the name of this article by David Lewin, the author of BeagleBone Media Center. We now have a great media server where we can share any media, but we would like to be more independent so that we can choose the functionalities the server can have. The goal of this article is to let you cross the bridge, where you are going to increase your knowledge by getting your hands dirty. After all, you want to build your own services, so why not create your own contents as well. (For more resources related to this topic, see here.) More specifically, here we will begin by building a webcam streaming service from scratch, and we will see how this can interact with what we have implemented previously in the server. We will also see how to set up a service to retrieve RSS feeds. We will discuss the services in the following sections: Installing and running MJPG-Streamer Detecting the hardware device and installing drivers and libraries for a webcam Configuring RSS feeds with Leed Detecting the hardware device and installing drivers and libraries for a webcam Even though today many webcams are provided with hardware encoding capabilities such as the Logitech HD Pro series, we will focus on those without this capability, as we want to have a low budget project. You will then learn how to reuse any webcam left somewhere in a box because it is not being used. At the end, you can then create a low cost video conference system as well. How to know your webcam As you plug in the webcam, the Linux kernel will detect it, so you can read every detail it's able to retrieve about the connected device. We are going to see two ways to retrieve the webcam we have plugged in: the easy one that is not complete and the harder one that is complete. "All magic comes with a price."                                                                                     –Rumpelstiltskin, Once Upon a Time Often, at a certain point in your installation, you have to choose between the easy or the hard way. Most of the time, powerful Linux commands or tools are not thought to be easy at first but after some experiments you'll discover that they really can make your life better. Let's start with the fast and easy way, which is lsusb : debian@arm:~$ lsusb Bus 001 Device 002: ID 046d:0802 Logitech, Inc. Webcam C200 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub This just confirms that the webcam is running well and is seen correctly from the USB. Most of the time we want more details, because a hardware installation is not exactly as described in books or documentations, so you might encounter slight differences. This is why the second solution comes in. Among some of the advantages, you are able to know each step that has taken place when the USB device was discovered by the board and Linux, such as in a hardware scenario: debian@arm:~$ dmesg A UVC device (here, a Logitech C200) has been used to obtain these messages Most probably, you won't exactly have the same outputs, but they should be close enough so that you can interpret them easily when they are referred to: New USB device found: This is the main message. In case of any issue, we will check its presence elsewhere. This message indicates that this is a hardware error and not a software or configuration error that you need to investigate. idVendor and idProduct: This message indicates that the device has been detected. 
This information is interesting because it lets you look up the manufacturer's details. Most recent webcams are compatible with the Linux USB Video Class (UVC) driver; you can check yours at http://www.ideasonboard.org/uvc/#devices. Among all the messages, you should also look for the one that says registered new interface driver, because failing to find it can be a clue that Linux detected the device but wasn't able to set up a driver for it. The new device will be detected as /dev/video0. Nevertheless, your webcam can appear under a different device name depending on your BeagleBone configuration, for example, if a video-capable cape is already plugged in.

Setting up your webcam

Now we know what is seen at the USB level. The next step is to use the crucial Video4Linux driver, which is like a Swiss army knife for anything related to video capture:

debian@arm:~$ sudo apt-get install v4l-utils

The primary use of this tool is to query what the webcam can provide and some of its capabilities:

debian@arm:~$ v4l2-ctl --all

There are four distinct sections that let you know how your webcam will be used according to the current settings:

- Driver info (1): This contains the name, vendor, and product IDs that we found in the system messages, the driver info (the kernel's version), and the capabilities (the device is able to provide video streaming).
- Video capture supported format(s) (2): This lists which resolution(s) can be used. As this example uses an old webcam, there is not much to choose from, but you can easily have a lot of choices with today's devices. The pixel format is all about how the data is encoded, but more details can be retrieved about format capabilities (see the next paragraph). The remaining information is relevant only if you want to know the precise details.
- Crop capabilities (3): This contains your current settings. You can define the video crop window that will be used. If needed, use the crop settings: --set-crop-output=top=<x>,left=<y>,width=<w>,height=<h>
- Video input (4): This contains the input number (here we have used 0, which is the one that we found previously), its current status, and the famous frames per second, which gives you a local rate. This is not what you will obtain when streaming through a server, as network latencies will lower this value.

You can grab the capabilities for each parameter. For instance, if you want to see all the video formats the webcam can provide, type this command:

debian@arm:~$ v4l2-ctl --list-formats

Here, we see that we can also use the MJPEG format provided directly by the cam. While this part is not mandatory, such a hardware tour is interesting because you know what you can do with your device. It is also a good habit to be able to retrieve diagnostics when the webcam shows some bad signs. If you would like to get more in-depth knowledge about your device, install the uvcdynctrl package, which lets you retrieve all the supported formats and frame rates.

Installing and running MJPG-Streamer

Now that we have checked the chain from the hardware level up to the driver, we can install the software that will make use of Video4Linux for video streaming. Here comes MJPG-Streamer. This application aims to provide you with a JPEG stream on the network, available for browsers and all video applications. Besides this, we are also interested in this solution because it is made for systems with a modest CPU, and we can start MJPG-Streamer as a service.
With this streamer, you can also use built-in hardware compression and even control webcam features such as pan, tilt, rotation, zoom, and so on.

Installing MJPG-Streamer

Before installing MJPG-Streamer, we will install all the necessary dependencies:

debian@arm:~$ sudo apt-get install subversion libjpeg8-dev imagemagick

Next, we will retrieve the code from the project:

debian@arm:~$ svn checkout http://svn.code.sf.net/p/mjpg-streamer/code/ mjpg-streamer-code

You can now build the executable from the sources you just downloaded by performing the following steps:

- Enter the local directory you just downloaded:
  debian@arm:~$ cd mjpg-streamer-code/mjpg-streamer
- Then run the build:
  debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ make

When the compilation is complete, we end up with some new files produced by the build: the executable and some plugins. That's all that is needed, so the application is now ready. We can try it out. Not so much to do after all, don't you think?

Starting the application

This section aims at getting you started quickly with MJPG-Streamer. At the end, we'll see how to start it as a service on boot. Before getting started, the server requires some plugins to be copied into the dedicated lib directory:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ sudo cp input_uvc.so output_http.so /usr/lib

The MJPG-Streamer application has to know the path where these files can be found, so we define the following environment variable:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ export LD_LIBRARY_PATH=/usr/lib:$LD_LIBRARY_PATH

Enough preparation! Time to start streaming:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ ./mjpg_streamer -i "input_uvc.so" -o "output_http.so -w www"

As the program starts, the input parameters that will be taken into consideration are displayed. You can now identify this information, as it has been explained previously:

- The detected device from V4L2
- The resolution that will be displayed, according to your settings
- Which port will be opened
- Some controls that depend on your camera capabilities (tilt, pan, and so on)

If you need to change the port used by MJPG-Streamer, add -p xxxx at the end of the command, as follows:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ ./mjpg_streamer -i "input_uvc.so" -o "output_http.so -w www -p 1234"

Let's add some security

If you want to add some security, then you should set the credentials:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ ./mjpg_streamer -o "output_http.so -w ./www -c debian:temppwd"

Credentials can always be stolen and used without your consent. The best way to ensure that your stream stays confidential all along would be to encrypt it. So, if you intend to use strong encryption for secured applications, the crypto cape is worth a look at http://datko.net/2013/10/03/howto_crypto_beaglebone_black/.

"I'm famous" – your first stream

That's it. The webcam is made accessible to everyone across the network from the BeagleBone; you can access the video from your browser by connecting to http://192.168.0.15:8080/. You will then see the default welcome screen, bravo! This is your first contact with the MJPG-Streamer server.

You might wonder how to find out which ports are already assigned, so that you can pick a free one; a quick check is shown below.
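One quick way to answer that question is to list the ports already in use on the BeagleBone before choosing one for MJPG-Streamer. This is a small sketch rather than part of the original procedure, and it assumes the net-tools or iproute2 utilities are present on your Debian image:

debian@arm:~$ sudo netstat -tlnp    # lists listening TCP ports and the owning processes
debian@arm:~$ sudo ss -tlnp         # same idea with the newer iproute2 tool

Any port that does not appear in this list (and is above 1024, so no special privileges are required) is a reasonable candidate for the -p option shown earlier.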
Using our stream across the network

Now that the webcam is available across the network, you have several options to handle this:

- You can use the direct flow available from the home page. On the left-hand side menu, just click on the stream tab.
- Using VLC, you can open the stream with the direct link available at http://192.168.0.15:8080/?action=stream. The VideoLAN menu tab is an M3U playlist link generator that you can click on; this will generate a playlist file you can open thereafter. In this case, VLC is efficient, as you can transcode the webcam stream to any format you need. Although it's not mandatory, this solution is the most efficient, as it frees the BeagleBone's CPU so that your server can focus on providing services.
- Using MediaDrop, we can integrate this new stream into our shiny MediaDrop server, knowing that MediaDrop doesn't currently support direct local streams. You can create a new post with the related URL link in the message body, as shown in the following screenshot.

Starting the streaming service automatically on boot

In the beginning, we saw that MJPG-Streamer needs only one command line to be started. We could put it in a bash script, but starting it as a service on boot is far better. For this, use a console text editor (nano or vim) and create a file dedicated to this service. Let's call it start_mjpgstreamer and add the following commands:

#! /bin/sh
# /etc/init.d/start_mjpgstreamer
export LD_LIBRARY_PATH="/home/debian/mjpg-streamer/mjpg-streamer-code/mjpg-streamer:$LD_LIBRARY_PATH"
EXEC_PATH="/home/debian/mjpg-streamer/mjpg-streamer-code/mjpg-streamer"
$EXEC_PATH/mjpg_streamer -i "input_uvc.so" -o "output_http.so -w $EXEC_PATH/www"

You can then use administrator rights to add it to the services:

debian@arm:~$ sudo /etc/init.d/start_mjpgstreamer start

On the next reboot, MJPG-Streamer will be started automatically.

Exploring new capabilities to install

For those about to explore, we salute you!

Plugins

Remember that at the beginning of this article, we began the demonstration with two plugins:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ ./mjpg_streamer -i "input_uvc.so" -o "output_http.so -w www"

If we take a moment to look at these plugins, we will see that the first plugin is responsible for handling the webcam directly from the driver. Simply ask for help and options as follows:

debian@beaglebone:~/mjpg-streamer-code/mjpg-streamer$ ./mjpg_streamer --input "input_uvc.so --help"

The second plugin is about the web server settings:

- The path to the directory containing the final web server HTML pages. This implies that you can modify the existing pages with a little effort or create new ones based on those provided.
- Forcing a specific port to be used. As mentioned previously, each server needs a dedicated port; here you define which one this service will use.

You can discover many others by asking:

debian@arm:~$ ./mjpg_streamer --output "output_http.so --help"

Apart from input_uvc and output_http, you have other available plugins to play with. Take a look at the plugins directory.

Another tool for the webcam

The MJPG-Streamer project is dedicated to streaming over the network, but it is not the only one. For instance, do you have any specific needs, such as monitoring your house/son/cat/Jon Snow figurine? Buuuuzzz: if you answered yes to the last one, you just defined yourself as a geek. Well, in that case the Motion project is for you; just install the motion package and start it with the default motion.conf configuration.
You will then record videos and pictures of any moving object or person that is detected. Like MJPG-Streamer, Motion aims to be a low CPU consumer, so it works very well on the BeagleBone Black.

Configuring RSS feeds with Leed

Our server can handle videos, pictures, and music from any source, and it would be cool to have another tool to retrieve news from some RSS providers. This can be done with Leed, an RSS project designed for servers. The final result looks like the following screenshot.

This project has a "quick and easy" installation spirit, so you can give it a try without hassle. Leed (for Light Feed) allows you to access RSS feeds from any browser, so no RSS reader application is needed, and every user on your network can read them as well. You install it on the server and the feeds are updated automatically; the truth behind the scenes is that a cron task does this for you. You will be guided through setting up synchronization after the installation.

Creating the environment for Leed in three steps

We already have Apache, MySQL, and PHP installed, and we need a few other prerequisites to run Leed:

- Create a database for Leed
- Download the project code and set permissions
- Install Leed itself

Creating a database for Leed

You will begin by opening a MySQL session:

debian@arm:~$ mysql -u root -p

What we need here is a dedicated Leed user with its own database. This user will be set up using the following:

create user 'debian_leed'@'localhost' IDENTIFIED BY 'temppwd';
create database leed_db;
use leed_db;
grant create, insert, update, select, delete on leed_db.* to debian_leed@localhost;
exit

Downloading the project code and setting permissions

We prepared our server to have its environment ready for Leed, so after getting the latest version, we'll get it working with Apache by performing the following steps:

- From your home directory, retrieve the latest project code. It will also create a dedicated directory:
  debian@arm:~$ git clone https://github.com/ldleman/Leed.git
  debian@arm:~$ ls
  mediadrop mjpg-streamer Leed music
- Now, we need to put this new directory where the Apache server can find it:
  debian@arm:~$ sudo mv Leed /var/www/
- Change the permissions for the application:
  debian@arm:~$ chmod 777 /var/www/Leed/ -R

Installing Leed

When you go to the server address (http://192.168.0.15/leed/install.php), you'll get the installation screen. We now need to fill in the database details that we previously defined and add the Administrator credentials as well. Now save and quit. Don't worry about the explanations; we'll discuss these settings afterwards. It's important that all items in the prerequisites list on the right are green. Otherwise, a warning message will be displayed about wrong permission settings, as shown in the following screenshot. After the configuration, the installation is complete: Leed is now ready for you.

Setting up a cron job for feed updates

If you want automatic updates for your feeds, you'll need to define a synchronization task with cron:

- Modify the cron jobs:
  debian@arm:~$ sudo crontab -e
- Add the following line:
  0 * * * * wget -q -O /var/www/Leed/logsCron "http://192.168.0.15/Leed/action.php?action=synchronize"
- Save it and your feeds will be refreshed every hour.
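Before relying on cron, you may want to trigger a synchronization by hand once to confirm that the URL used in the crontab entry actually works. This is just a sanity check reusing the same command, not an extra step from the original setup:

debian@arm:~$ wget -q -O - "http://192.168.0.15/Leed/action.php?action=synchronize"
debian@arm:~$ sudo crontab -l    # confirm that the new entry was recorded

If the wget call returns without errors and fresh items appear in the web interface, the hourly job will behave the same way.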
Finally, a little cleanup: remove install.php for security reasons:

debian@arm:~$ rm /var/www/Leed/install.php

Using Leed to add your RSS feed

When you need to add some feeds, go to the Manage menu, and in Feed Options (on the right-hand side) select Preferences; you then just have to paste the RSS link and add it with the button:

You might find it useful to organize your feeds into groups, as we did for movies in MediaDrop. The Rename button will serve to achieve this goal. For example, here a TV Shows category has been created, so every feed related to this type will be organized on the main screen.

Some Leed preferences settings in a server environment

You will be asked to choose between two synchronisation modes: Complete and Graduated.

- Complete: This is to be used on a regular computer, as it updates all your feeds in a row, which is a CPU-consuming task.
- Graduated: This looks for the 10 oldest feeds and updates them if required.

You also have the possibility of allowing anonymous people to read your feeds. Setting Allow anonymous readers to Yes will let your guests access your feeds but not add any.

Extending Leed with plugins

If you want to extend Leed's capabilities, you can use the Leed Market (as the author calls it) from Feed options in the Manage menu. There, you'll be directed to the Leed Market space. Installation is just a matter of downloading the ZIP file with all the plugins:

debian@arm:~/Leed$ wget https://github.com/ldleman/Leed-market/archive/master.zip
debian@arm:~/Leed$ sudo unzip master.zip

Let's use the AdBlock plugin for this example:

1. Copy the content of the AdBlock plugin directory where Leed can see it:

debian@arm:~/Leed$ sudo cp -r Leed-market-master/adblock /var/www/Leed/plugins

2. Log in and set up the plugin by navigating to Manage | Available Plugins, and then activate adblock with Enable, as follows:

In this article, we covered:

- Some words about the hardware
- How to know your webcam
- Configuring RSS feeds with Leed

Summary

In this article, we had some good experiments with the hardware part of the server "from the ground up," ending by successfully setting up the webcam service on boot. We discovered hardware detection and a way to "talk" with our local webcam, and thus were able to see what happens when we plug a device into the BeagleBone. Through the topics, we also discovered video4linux to retrieve information about the device, and learned about configuring devices. Along the way, we encountered MJPG-Streamer. It's better to be on our own instead of being dependent on some GUI interfaces, where you always wonder where you need to click. In the end, our efforts have been rewarded, as we ended up with a web page we can use and modify according to our tastes. RSS news can also be provided by our server so that you can manage all your feeds in one place, read them anywhere, and even organize dedicated groups.

Plenty of concepts have been covered for hardware and software, so think of this article as a concrete example you can use and adapt to understand how Linux works. I hope you enjoyed this freedom of choice, as you drag ideas and drop them into your BeagleBone as services. We entered the DIY area, showing you ways to explore further. You can argue that we can choose the software but still use off-the-shelf commercial devices.

Resources for Article: Further resources on this subject: Using PVR with Raspbmc [Article] Pulse width modulator [Article] Making the Unit Very Mobile - Controlling Legged Movement [Article]

Lync 2013 Hybrid and Lync Online

Packt
06 Feb 2015
27 min read
In this article, by the authors, Fabrizio Volpe, Alessio Giombini, Lasse Nordvik Wedø, and António Vargas of the book, Lync Server Cookbook, we will cover the following recipes: Introducing Lync Online Administering with the Lync Admin Center Using Lync Online Remote PowerShell Using Lync Online cmdlets Introducing Lync in a hybrid scenario Planning and configuring a hybrid deployment Moving users to the cloud Moving users back on-premises Debugging Lync Online issues (For more resources related to this topic, see here.) Introducing Lync Online Lync Online is part of the Office 365 offer and provides online users with the same Instant Messaging (IM), presence, and conferencing features that we would expect from an on-premises deployment of Lync Server 2013. Enterprise Voice, however, is not available on Office 365 tenants (or at least, it is available only with limitations regarding both specific Office 365 plans and geographical locations). There is no doubt that forthcoming versions of Lync and Office 365 will add what is needed to also support all the Enterprise Voice features in the cloud. Right now, the best that we are able to achieve is to move workloads, homing a part of our Lync users (the ones with no telephony requirements) in Office 365, while the remaining Lync users are homed on-premises. These solutions might be interesting for several reasons, including the fact that we can avoid the costs of expanding our existing on-premises resources by moving a part of our Lync-enabled users to Office 365. The previously mentioned configuration, which involves different kinds of Lync tenants, is called a hybrid deployment of Lync, and we will see how to configure it and move our users from online to on-premises and vice versa. In this Article, every time we talk about Lync Online and Office 365, we will assume that we have already configured an Office tenant. Administering with the Lync Admin Center Lync Online provides the Lync Admin Center (LAC), a dedicated control panel, to manage Lync settings. To open it, access the Office 365 portal and select Service settings, Lync, and Manage settings in the Lync admin center, as shown in the following screenshot: LAC, if you compare it with the on-premises Lync Control Panel (or with the Lync Management Shell), offers few options. For example, it is not possible to create or delete users directly inside Lync. We will see some of the tasks we are able to perform in LAC, and then, we will move to the (more powerful) Remote PowerShell. There is an alternative path to open LAC. From the Office 365 portal, navigate to Users & Groups | Active Users. Select a user, after which you will see a Quick Steps area with an Edit Lync Properties link that will open the user-editable part of LAC. How to do it... LAC is divided into five areas: users, organization, dial-in conferencing, meeting invitation, and tools, as you can see in the following screenshot: The Users panel will show us the configuration of the Lync Online enabled users. 
It is possible to modify the settings with the Edit option (the small pencil icon on the right): I have tried to summarize all the available options (inside the general, external communications, and dial-in conferencing tabs) in the following screenshot: Some of the user's settings are worth a mention; in the General tab, we have the following:    The Record Conversations and meetings option enables the Start recording option in the Lync client    The Allow anonymous attendees to dial-out option controls whether the anonymous users that are dialing-in to a conference are required to call the conferencing service directly or are authorized for callback    The For compliance, turn off non-archived features option disables Lync features that are not recorded by In-Place Hold for Exchange When you place an Exchange 2013 mailbox on In-Place Hold or Litigation Hold, the Microsoft Lync 2013 content (instant messaging conversations and files shared in an online meeting) is archived in the mailbox. In the dial-in conferencing tab, we have the configuration required for dial-in conferencing. The provider's drop-down menu shows a list of third parties that are able to deliver this kind of feature. The Organization tab manages privacy for presence information, push services, and external access (the equivalent of the Lync federation on-premises). If you enable external access, we will have the option to turn on Skype federation, as we can see in the following screenshot: The Dial-In Conferencing option is dedicated to the configuration of the external providers. The Meeting Invitation option allows the user to customize the Lync Meeting invitation. The Tools options offer a collection of troubleshooting resources. See also For details about Exchange In-Place Hold, see the TechNet post In-Place Hold and Litigation Hold at http://technet.microsoft.com/en-us/library/ff637980(v=exchg.150).aspx. Using Lync Online Remote PowerShell The possibility to manage Lync using Remote PowerShell on a distant deployment has been available since Lync 2010. This feature has always required a direct connection from the management station to the Remote Lync, and a series of steps that is not always simple to set up. Lync Online supports Remote PowerShell using a dedicated (64-bit only) PowerShell module, the Lync Online Connector. It is used to manage online users, and it is interesting because there are many settings and automation options that are available only through PowerShell. Getting ready Lync Online Connector requires one of the following operating systems: Windows 7 (with Service Pack 1), Windows Server 2008 R2, Windows Server 2012, Windows Server 2012 R2, Windows 8, or Windows 8.1. At least PowerShell 3.0 is needed. To check it, we can use the $PSVersionTable variable. The result will be like the one in the following screenshot (taken on Windows 8.1, which uses PowerShell 4.0): How to do it... Download Windows PowerShell Module for Lync Online from the Microsoft site at http://www.microsoft.com/en-us/download/details.aspx?id=39366 and install it. It is useful to store our Office 365 credentials in an object (it is possible to launch the cmdlets at step 3 anyway, and we will be required with the Office 365 administrator credentials, but using this method, we will have to insert the authentication information again every time it is required). We can use the $credential = Get-Credential cmdlet in a PowerShell session. 
We will be prompted for our username and password for Lync Online, as shown in the following screenshot: To use the Online Connector, open a PowerShell session and use the New-CsOnlineSession cmdlet. One of the ways to start a remote PowerShell session is $session = New-CsOnlineSession -Credential $credential. Now, we need to import the session that we have created with Lync Online inside PowerShell, with the Import-PSSession $session cmdlet. A temporary Windows PowerShell module will be created, which contains all the Lync Online cmdlets. The name of the temporary module will be similar to the one we can see in the following screenshot: Now, we will have the cmdlets of the Lync Online module loaded in memory, in addition to any command that we already have available in PowerShell. How it works... The feature is based on a PowerShell module, the LyncOnlineConnector, shown in the following screenshot: It contains only two cmdlets, the Set-WinRMNetworkDelayMS and New-CsOnlineSession cmdlets. The latter will load the required cmdlets in memory. As we have seen in the previous steps, the Online Connector adds the Lync Online PowerShell cmdlets to the ones already available. This is something we will use when talking about hybrid deployments, where we will start from the Lync Management Shell and then import the module for Lync Online. It is a good habit to verify (and close) your previous remote sessions. This can be done by selecting a specific session (using Get-PSSession and then pointing to a specific session with the Remove-PSSession statement) or closing all the existing ones with the Get-PSSession | Remove-PSSession cmdlet. In the previous versions of the module, Microsoft Online Services Sign-In Assistant was required. This prerequisite was removed from the latest version. There's more... There are some checks that we are able to perform when using the PowerShell module for Lync Online. By launching the New-CsOnlineSession cmdlet with the –verbose switch, we will see all the messages related to the opening of the session. The result should be similar to the one shown in the following screenshot: Another verification comes from the Get-Command -Module tmp_gffrkflr.ufz command, where the module name (in this example, tmp_gffrkflr.ufz) is the temporary module we saw during the Import-PSSession step. The output of the command will show all the Lync Online cmdlets that we have loaded in memory. The Import-PSSession cmdlet imports all commands except the ones that have the same name of a cmdlet that already exists in the current PowerShell session. To overwrite the existing cmdlets, we can use the -AllowClobber parameter. See also During the introduction of this section, we also discussed the possibility to administer on-premises, remote Lync Server 2013 deployment with a remote PowerShell session. John Weber has written a great post about it in his blog Lync 2013 Remote Admin with PowerShell at http://tsoorad.blogspot.it/2013/10/lync-2013-remote-admin-with-powershell.html, which is helpful if you want to use the previously mentioned feature. Using Lync Online cmdlets In the previous recipe, we outlined the steps required to establish a remote PowerShell session with Lync Online. 
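To recap the whole sequence in one place, a minimal session could look like the following sketch. The account name admin@contoso.onmicrosoft.com is only a placeholder, and the cmdlets are the ones introduced in this recipe; treat it as an illustration rather than a definitive script.

# Connect, import the Lync Online cmdlets, run a command, and clean up.
$credential = Get-Credential "admin@contoso.onmicrosoft.com"   # placeholder admin account
$session = New-CsOnlineSession -Credential $credential
Import-PSSession $session -AllowClobber

# Any of the imported cmdlets can now be used, for example:
Get-CsTenant

# Close the remote session when the work is done
Get-PSSession | Remove-PSSession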
We have less than 50 cmdlets, as shown in the result of the Get-Command -Module command in the following screenshot: Some of them are specific for Lync Online, such as the following: Get-CsAudioConferencingProvider Get-CsOnlineUser Get-CsTenant Get-CsTenantFederationConfiguration Get-CsTenantHybridConfiguration Get-CsTenantLicensingConfiguration Get-CsTenantPublicProvider New-CsEdgeAllowAllKnownDomains New-CsEdgeAllowList New-CsEdgeDomainPattern Set-CsTenantFederationConfiguration Set-CsTenantHybridConfiguration Set-CsTenantPublicProvider Update-CsTenantMeetingUrl All the remaining cmdlets can be used either with Lync Online or with the on-premises version of Lync Server 2013. We will see the use of some of the previously mentioned cmdlets. How to do it... The Get-CsTenant cmdlet will list Lync Online tenants configured for use in our organization. The output of the command includes information such as the preferred language, registrar pool, domains, and assigned plan. The Get-CsTenantHybridConfiguration cmdlet gathers information about the hybrid configuration of Lync. Management of the federation capability for Lync Online (the feature that enables Instant Messaging and Presence information exchange with users of other domains) is based on the allowed domain and blocked domain lists, as we can see in the organization and external communications screen of LAC, shown in the following screenshot: There are similar ways to manage federation from the Lync Online PowerShell, but it required to put together different statements as follows:     We can use an accept all domains excluding the ones in the exceptions list approach. To do this, we have put the New-CsEdgeAllowAllKnownDomains cmdlet inside a variable. Then, we can use the Set-CsTenantFederationConfiguration cmdlet to allow all the domains (except the ones in the block list) for one of our domains on a tenant. We can use the example on TechNet (http://technet.microsoft.com/en-us/library/jj994088.aspx) and integrate it with Get-CsTenant.     If we prefer, we can use a block all domains but permit the ones in the allow list approach. It is required to define a domain name (pattern) for every domain to allow the New-CsEdgeDomainPattern cmdlet, and each one of them will be saved in a variable. Then, the New-CsEdgeAllowList cmdlet will create a list of allowed domains from the variables. Finally, the Set-CsTenantFederationConfiguration cmdlet will be used. The domain we will work on will be (again) cc3b6a4e-3b6b-4ad4-90be-6faa45d05642. The example on Technet (http://technet.microsoft.com/en-us/library/jj994023.aspx) will be used: $x = New-CsEdgeDomainPattern -Domain "contoso.com" $y = New-CsEdgeDomainPattern -Domain "fabrikam.com" $newAllowList = New-CsEdgeAllowList -AllowedDomain $x,$y Set-CsTenantFederationConfiguration -Tenant " cc3b6a4e-3b6b-4ad4-90be-6faa45d05642" -AllowedDomains $newAllowList The Get-CsOnlineUser cmdlet provides information about users enabled on Office 365. The result will show both users synced with Active Directory and users homed in the cloud. The command supports filters to limit the output; for example, the Get-CsOnlineUser -identity fab will gather information about the user that has alias = fab. This is an account synced from the on-premises Directory Services, so the value of the DirSyncEnabled parameter will be True. See also All the cmdlets of the Remote PowerShell for Lync Online are listed in the TechNet post Lync Online cmdlets at http://technet.microsoft.com/en-us/library/jj994021.aspx. 
This is the main source of details on the single statement. Introducing Lync in a hybrid scenario In a Lync hybrid deployment, we have the following: User accounts and related information homed in the on-premises Directory Services and replicated to Office 365. A part of our Lync users that consume on-premises resources and a part of them that use online (Office 365 / Lync Online) resources. The same (public) domain name used both online and on-premises (Lync-split DNS). Other Office 365 services and integration with other applications available to all our users, irrespective of where their Lync is provisioned. One way to define Lync hybrid configuration is by using an on-premises Lync deployment federated with an Office 365 / Lync Online tenant subscription. While it is not a perfect explanation, it gives us an idea of the scenario we are talking about. Not all the features of Lync Server 2013 (especially the ones related to Enterprise Voice) are available to Lync Online users. The previously mentioned motivations, along with others (due to company policies, compliance requirements, and so on), might recommend a hybrid deployment of Lync as the best available solution. What we have to clarify now is how to make those users on different deployments talk to each other, see each other's presence status, and so on. What we will see in this section is a high-level overview of the required steps. The Planning and configuring a hybrid deployment recipe will provide more details about the individual steps. The list of steps here is the one required to configure a hybrid deployment, starting from Lync on-premises. In the following sections, we will also see the opposite scenario (with our initial deployment in the cloud). How to do it... It is required to have an available Office 365 tenant configuration. Our subscription has to include Lync Online. We have to configure an Active Directory Federation Services (AD FS) server in our domain and make it available to the Internet using a public FQDN and an SSL certificate released from a third-party certification authority. Office 365 must be enabled to synchronize with our company's Directory Services, using Active Directory Sync. Our Office 365 tenant must be federated. The last step is to configure Lync for a hybrid deployment. There's more... One of the requirements for a hybrid distribution of Lync is an on-premises deployment of Lync Server 2013 or Lync Server 2010. For Lync Server 2010, it is required to have the latest available updates installed, both on the Front Ends and on the Edge servers. It is also required to have the Lync Server 2013 administrative tools installed on a separate server. More details about supported configuration are available on the TechNet post Planning for Lync Server 2013 hybrid deployments at http://technet.microsoft.com/en-us/library/jj205403.aspx. DNS SRV records for hybrid deployments, _sipfederationtls._tcp.<domain> and _sip._tls.<domain>, should point to the on-premises deployment. The lyncdiscover. <domain> record will point to the FQDN of the on-premises reverse proxy server. The _sip._tls. <domain> SRV record will resolve to the public IP of the Access Edge service of Lync on-premises. Depending on the kind of service we are using for Lync, Exchange, and SharePoint, only a part of the features related to the integration with the additional services might be available. For example, skills search is available only if we are using Lync and SharePoint on-premises. 
The following TechNet post Supported Lync Server 2013 hybrid configurations at http://technet.microsoft.com/en-us/library/jj945633.aspx offers a matrix of features / service deployment combinations. See also Interesting information about Lync Hybrid configuration is presented in sessions available on Channel9 and coming from the Lync Conference 2014 (Lync Online Hybrid Deep Dive at http://channel9.msdn.com/Events/Lync-Conference/Lync-Conference-2014/ONLI302) and from TechEd North America 2014 (Microsoft Lync Online Hybrid Deep Dive at http://channel9.msdn.com/Events/TechEd/NorthAmerica/2014/OFC-B341#fbid=). Planning and configuring a hybrid deployment The planning phase for a hybrid deployment starts from a simple consideration: do we have an on-premises deployment of Lync Server? If the previously mentioned scenario is true, do we want to move users to the cloud or vice versa? Although the first situation is by far the most common, we have to also consider the case in which we have our first deployment in the cloud. How to do it... This step is all that is required for the scenario that starts from Lync Online. We have to completely deploy our Lync on-premises. Establish a remote PowerShell session with Office 365. Use the shared SIP address cmdlet Set-CsTenantFederationConfiguration -SharedSipAddressSpace $True to enable Office 365 to use a Shared Session Initiation Protocol (SIP) address space with our on-premises deployment. To verify this, we can use the Get-CsTenantFederationConfiguration command. The SharedSipAddressSpace value should be set to True. All the following steps are for the scenario that starts from the on-premises Lync deployment. After we have subscribed with a tenant, the first step is to add the public domain we use for our Lync users to Office 365 (so that we can split it on the two deployments). To access the Office 365 portal, select Domains. The next step is Specify a domain name and confirm ownership. We will be required to type a domain name. If our domain is hosted on some specific providers (such as GoDaddy), the verification process can be automated, or we have to proceed manually. The process requires to add one DNS record (TXT or MX), like the ones shown in the following screenshot: If we need to check our Office 365 and on-premises deployments before continuing with the hybrid deployment, we can use the Setup Assistant for Office 365. The tool is available inside the Office 365 portal, but we have to launch it from a domain-joined computer (the login must be performed with the domain administrative credentials). In the Setup menu, we have a Quick Start and an Extend Your Setup option (we have to select the second one). The process can continue installing an app or without software installation, as shown in the following screenshot: The app (which makes the assessment of the existing deployment easier) is installed by selecting Next in the previous screen (it requires at least Windows 7 with Service Pack 1, .NET Framework 3.5, and PowerShell 2.0). Synchronization with the on-premises Active Directory is required. This last step federates Lync Server 2013 with Lync Online to allow communication between our users. The first cmdlet to use is Set-CSAccessEdgeConfiguration -AllowOutsideUsers 1 -AllowFederatedUsers 1 -UseDnsSrvRouting -EnablePartnerDiscovery 1. Note that the -EnablePartnerDiscovery parameter is required. Setting it to 1 enables automatic discovery of federated partner domains. It is possible to set it to 0. 
The second required cmdlet is:

New-CsHostingProvider -Identity LyncOnline -ProxyFqdn "sipfed.online.lync.com" -Enabled $true -EnabledSharedAddressSpace $true -HostsOCSUsers $true -VerificationLevel UseSourceVerification -IsLocal $false -AutodiscoverUrl https://webdir.online.lync.com/Autodiscover/AutodiscoverService.svc/root

The result of the commands is shown in the following screenshot:

If Lync Online is already defined, we have to use the Set-CsHostingProvider cmdlet, or we can remove it (Remove-CsHostingProvider -Identity LyncOnline) and then create it using the previously mentioned cmdlet.

There's more...

In the Lync hybrid scenario, users created in the on-premises directory are replicated to the cloud, while users generated in the cloud will not be replicated on-premises. Lync Online users are managed using the Office 365 portal, while the users on-premises are managed using the usual tools (Lync Control Panel and Lync Management Shell).

Moving users to the cloud

By moving users from Lync on-premises to the cloud, we will lose some of the parameters. The operation requires the Lync administrative tools and the PowerShell module for Lync Online to be installed on the same computer. If we install the module for Lync Online before the administrative tools for Lync 2013 Server, the OCSCore.msi file overwrites the LyncOnlineConnector.ps1 file, and New-CsOnlineSession will require a -TargetServer parameter. In this situation, we have to reinstall the Lync Online module (see the following post on the Microsoft support site at http://support.microsoft.com/kb/2955287).

Getting ready

Remember that to move a user to Lync Online, they must be enabled for both Lync Server on-premises and Lync Online (so we have to assign the user a license for Lync Online by using the Office 365 portal). Users with no assigned license will show the error Move-CsUser : HostedMigration fault: Error=(507), Description=(User must has an assigned license to use Lync Online.). For more details, refer to the Microsoft support site at http://support.microsoft.com/kb/2829501.

How to do it...

1. Open a new Lync Management Shell session and launch the remote session on Office 365 with the cmdlets' sequence we saw earlier. We have to add the -AllowClobber parameter so that the Lync Online module's cmdlets are able to overwrite the corresponding Lync Management Shell cmdlets:

$credential = Get-Credential
$session = New-CsOnlineSession -Credential $credential
Import-PSSession $session -AllowClobber

2. Open the Lync Admin Center (as we have seen in the dedicated section) by going to Service settings | Lync | Manage settings in the Lync Admin Center, and copy the first part of the URL, for example, https://admin0e.online.lync.com. Add the string /HostedMigration/hostedmigrationservice.svc to the previous URL (in our example, the result will be https://admin0e.online.lync.com/HostedMigration/hostedmigrationservice.svc).

3. The following cmdlet will move users from Lync on-premises to Lync Online. The required parameters are the identity of the Lync user and the URL that we prepared in step 2.
The user identity is fabrizio.volpe@absoluteuc.biz: Move-CsUser -Identity fabrizio.volpe@absoluteuc.biz –Target sipfed.online.lync.com -Credential $creds -HostedMigrationOverrideUrl https://admin0e.online.lync.com/HostedMigration/hostedmigrationservice.sVc Usually, we are required to insert (again) the Office 365 administrative credentials, after which we will receive a warning about the fact that we are moving our user to a different version of the service, like the one in the following screenshot: See the There's more... section of this recipe for details about user information that is migrated to Lync Online. We are able to quickly verify whether the user has moved to Lync Online by using the Get-CsUser | fl DisplayName,HostingProvider,RegistrarPool,SipAddress command. On-premises HostingProvider is equal to SRV: and RegistrarPool is madhatter.wonderland.lab (the name of the internal Lync Front End). Lync Online values are HostingProvider : sipfed.online.lync.com, and leave RegistrarPool empty, as shown in the following screenshot (the user Fabrizio is homed on-premises, while the user Fabrizio volpe is homed on the cloud): There's more... If we plan to move more than one user, we have to add a selection and pipe it before the cmdlet we have already used, removing the –identity parameter. For example, to move all users from an Organizational Unit (OU), (for example, the LyncUsers in the Wonderland.Lab domain) to Lync Online, we can use Get-CsUser -OU "OU=LyncUsers,DC=wonderland,DC=lab"| Move-CsUser -Target sipfed.online.lync.com -Credential $creds -HostedMigrationOverrideUrl https://admin0e.online.lync.com/HostedMigration/hostedmigrationservice.sVc. We are also able to move users based on a parameter to match using the Get-CsUser –Filter cmdlet. As we mentioned earlier, not all the user information is migrated to Lync Online. Migration contact list, groups, and access control lists are migrated, while meetings, contents, and schedules are lost. We can use the Lync Meeting Update Tool to update the meeting links (which have changed when our user's home server has changed) and automatically send updated meeting invitations to participants. There is a 64-bit version (http://www.microsoft.com/en-us/download/details.aspx?id=41656) and a 32-bit version (http://www.microsoft.com/en-us/download/details.aspx?id=41657) of the previously mentioned tool. Moving users back on-premises It is possible to move back users that have been moved from the on-premises Lync deployment to the cloud, and it is also possible to move on-premises users that have been defined and enabled directly in Office 365. In the latter scenario, it is important to create the user also in the on-premises domain (Directory Service). How to do it… The Lync Online user must be created in the Active Directory (for example, I will define the BornOnCloud user that already exists in Office 365). The user must be enabled in the on-premises Lync deployment, for example, using the Lync Management Shell with the following cmdlet: Enable-CsUser -Identity "BornOnCloud" -SipAddress "SIP:BornOnCloud@absoluteuc.biz" -HostingProviderProxyFqdn "sipfed.online.lync.com" Sync the Directory Services. 
Now, we have to save our Office 365 administrative credentials in a $cred = Get-Credential variable and then move the user from Lync Online to the on-premises Front End using the Lync Management Shell (the -HostedMigrationOverrideURL parameter has the same value that we used in the previous section): Move-CsUser -Identity BornOnCloud@absoluteuc.biz -Target madhatter.wonderland.lab -Credential $cred -HostedMigrationOverrideURL https://admin0e.online.lync.com/HostedMigration/hostedmigrationservice.svc The Get-CsUser | fl DisplayName,HostingProvider,RegistrarPool,SipAddress cmdlet is used to verify whether the user has moved as expected. See also Guy Bachar has published an interesting post on his blog Moving Users back to Lync on-premises from Lync Online (http://guybachar.wordpress.com/2014/03/31/moving-users-back-to-lync-on-premises-from-lync-online/), where he shows how he solved some errors related to the user motion by modifying the HostedMigrationOverrideUrl parameter. Debugging Lync Online issues Getting ready When moving from an on-premises solution to a cloud tenant, the first aspect we have to accept is that we will not have the same level of control on the deployment we had before. The tools we will list are helpful in resolving issues related to Lync Online, but the level of understanding on an issue they give to a system administrator is not the same we have with tools such as Snooper or OCSLogger. Knowing this, the more users we will move to the cloud, the more we will have to use the online instruments. How to do it… The Set up Lync Online external communications site on Microsoft Support (http://support.microsoft.com/common/survey.aspx?scid=sw;en;3592&showpage=1) is a guided walk-through that helps in setting up communication between our Lync Online users and external domains. The tool provides guidelines to assist in the setup of Lync Online for small to enterprise businesses. As you can see in the following screenshot, every single task is well explained: The Remote Connectivity Analyzer (RCA) (https://testconnectivity.microsoft.com/) is an outstanding tool to troubleshoot both Lync on-premises and Lync Online. The web page includes tests to analyze common errors and misconfigurations related to Microsoft services such as Exchange, Lync, and Office 365. To test different scenarios, it is necessary to use various network protocols and ports. If we are working on a firewall-protected network, using the RCA, we are also able to test services that are not directly available to us. For Lync Online, there are some tests that are especially interesting; in the Office 365 tab, the Office 365 General Tests section includes the Office 365 Lync Domain Name Server (DNS) Connectivity Test and the Office 365 Single Sign-On Test, as shown in the following screenshot: The Single Sign-On test is really useful in a scenario. The test requires our domain username and password, both synced with the on-premises Directory Services. The steps include searching the FQDN of our AD FS server on an Internet DNS, verifying the certificate and connectivity, and then validating the token that contains the credentials. The Client tab offers to download the Microsoft Connectivity Analyzer Tool and the Microsoft Lync Connectivity Analyzer Tool, which we will see in the following two dedicated steps: The Microsoft Connectivity Analyzer Tool makes many of the tests we see in the RCA available on our desktop. 
The list of prerequisites is provided in the article Microsoft Connectivity Analyzer Tool (http://technet.microsoft.com/library/jj851141(v=exchg.80).aspx), and includes Windows Vista/Windows 2008 or later versions of the operating system, .NET Framework 4.5, and an Internet browser, such as Internet Explorer, Chrome, or Firefox. For the Lync tests, a 64-bit operating system is mandatory, and the UCMA runtime 4.0 is also required (it is part of Lync Server 2013 setup, and is also available for download at http://www.microsoft.com/en-us/download/details.aspx?id=34992). The tools propose ways to solve different issues, and then, they run the same tests available on the RCA site. We are able to save the results in an HTML file. The Microsoft Lync Connectivity Analyzer Tool is dedicated to troubleshooting the clients for mobile devices (the Lync Windows Store app and Lync apps). It tests all the required configurations, including autodiscover and webticket services. The 32-bit version is available at http://www.microsoft.com/en-us/download/details.aspx?id=36536, while the 64-bit version can be downloaded from http://www.microsoft.com/en-us/download/details.aspx?id=36535. .NET Framework 4.5 is required. The tool itself requires a few configuration parameters; we have to insert the user information that we usually add in the Lync app, and we have to use a couple of drop-down menus to describe the scenario we are testing (on-premises or Internet, and the kind of client we are going to test). The Show drop-down menu enables us to look not only at a summary of the test results but also at the detailed information. The detailed view includes all the information and requests sent and received during the test, with the FQDN included in the answer ticket from our services, and so on, as shown in the following screenshot: The Troubleshooting Lync Online sign-in post is a support page, available in two different versions (admins and users), and is a walk-through to help admins (or users) to troubleshoot login issues. The admin version is available at http://support.microsoft.com/common/survey.aspx?scid=sw;en;3695&showpage=1, while the user version is available at http://support.microsoft.com/common/survey.aspx?scid=sw;en;3719&showpage=1. Based on our answers to the different scenario questions, the site will propose to information or solution steps. The following screenshot is part of the resolution for the log-I issues of a company that has an enterprise subscription with a custom domain: The Office 365 portal includes some information to help us monitor our Lync subscription. In the Service Health menu, navigate to Service Health; we have a list of all the incidents and service issues of the past days. In the Reports menu, we have statistics about our Office 365 consumption, including Lync. In the following screenshot, we can see the previously mentioned pages: There's more... One interesting aspect of the Microsoft Lync Connectivity Analyzer Tool that we have seen is that it enables testing for on-premises or Office 365 accounts (both testing from inside our network and from the Internet). The previously mentioned capability makes it a great tool to troubleshoot the configuration for Lync on the mobile devices that we have deployed in our internal network. This setup is usually complex, including hair-pinning and split DNS, so the diagnostic is important to quickly find misconfigured services. 
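Alongside the online tools above, a few quick client-side checks can be run straight from PowerShell. The following is only a sketch: Resolve-DnsName and Test-NetConnection are standard Windows cmdlets (available on Windows 8.1/Server 2012 R2 and later) rather than tools covered in this article, the record names are the ones discussed in the hybrid recipes, and absoluteuc.biz is the example SIP domain used earlier, to be replaced with your own.

# A minimal sketch of client-side DNS and port checks for a Lync domain.
$domain = "absoluteuc.biz"   # example SIP domain from this article; replace with yours

# The records discussed for hybrid deployments
Resolve-DnsName -Name "lyncdiscover.$domain"
Resolve-DnsName -Name "_sip._tls.$domain" -Type SRV
Resolve-DnsName -Name "_sipfederationtls._tcp.$domain" -Type SRV

# Confirm that HTTPS is reachable on the autodiscover name
Test-NetConnection -ComputerName "lyncdiscover.$domain" -Port 443

If the SRV queries return nothing, the public DNS (or split DNS) records are usually the first thing to review before moving on to the heavier diagnostic tools.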
See also

The Troubleshooting Lync Sign-in Errors (Administrators) page on Office.com at http://office.microsoft.com/en-001/communicator-help/troubleshooting-lync-sign-in-errors-administrators-HA102759022.aspx contains a list of messages related to sign-in errors with a suggested solution or a link to additional external resources.

Summary

In this article, we have learned about managing Lync 2013 and Lync Online and using Lync Online Remote PowerShell and Lync Online cmdlets.

Resources for Article: Further resources on this subject: Adding Dialogs [article] Innovation of Communication and Information Technologies [article] Choosing Lync 2013 Clients [article]

Upgrading the interface

Packt
06 Feb 2015
4 min read
In this article by Marco Schwartz and Oliver Manickum, authors of the book Programming Arduino with LabVIEW, we will see how to design an interface using LabVIEW. (For more resources related to this topic, see here.)

At this stage, we know that we have our two sensors working and that they were interfaced correctly with the LabVIEW interface. However, we can do better; for now, we simply have a text display of the measurements, which is not elegant to read. Also, the light-level measurement goes from 0 to 5, which doesn't mean anything to somebody who looks at the interface for the first time.

Therefore, we will modify the interface slightly. We will add a temperature gauge to display the data coming from the temperature sensor, and we will modify the output of the reading from the photocell to display the measurement from 0 (no light) to 100 percent (maximum brightness).

We first need to place the different display elements. To do this, perform the following steps:

1. Start with Front Panel. You can use a temperature gauge for the temperature and a simple slider indicator for Light Level. You will find both in the Indicators submenu of LabVIEW.
2. After that, simply place them on the right-hand side of the interface and delete the other indicators we used earlier. Also, name the new indicators accordingly so that we know which element we have to connect them to later.
3. Then, it is time to go back to Block Diagram to connect the new elements we just added in Front Panel.

For the temperature element, it is easy: you can simply connect the temperature gauge to the TMP36 output pin. For the light level, we will make slightly more complicated changes. We will divide the measured value coming from the Analog Read element by 5, thus obtaining an output value between 0 and 1. Then, we will multiply this value by 100 to end up with a value going from 0 to 100 percent of the ambient light level. To do so, perform the following steps:

1. The first step is to place two elements corresponding to the two mathematical operations we want to do: a divide operator and a multiply operator. You can find both of them in the Functions panel of LabVIEW. Simply place them close to the Analog Read element in your program.
2. After that, right-click on one of the inputs of each operator element, and go to Create | Constant to create a constant input for each block. Add a value of 5 for the division block, and add a value of 100 for the multiply block.
3. Finally, connect the output of the Analog Read element to the input of the division block, the output of this block to the input of the multiply block, and the output of the multiply block to the input of the Light Level indicator.

You can now go back to Front Panel to see the new interface in action. You can run the program again by clicking on the little arrow on the toolbar. You should immediately see that Temperature is now indicated by the gauge on the right and that Light Level changes immediately on the slider, depending on how much you cover the sensor with your hand.

Summary

In this article, we connected a temperature sensor and a light-level sensor to Arduino and built a simple LabVIEW program to read data from these sensors. Then, we built a nice graphical interface to visualize the data coming from these sensors.

There are many ways you can build other projects based on what you learned in this article. You can, for example, connect more temperature and/or light-level sensors to the Arduino board and display these measurements in the interface. You can also connect other kinds of sensors that are supported by LabVIEW, such as other analog sensors. For example, you can add a barometric pressure sensor or a humidity sensor to the project to build an even more complete weather-measurement station. Another interesting extension of this article would be to use the storage and plotting capabilities of LabVIEW to dynamically plot the history of the measured data inside the LabVIEW interface.

Resources for Article: Further resources on this subject: The Arduino Mobile Robot [article] Using the Leap Motion Controller with Arduino [article] Avoiding Obstacles Using Sensors [article]

NSB and Security

Packt
06 Feb 2015
14 min read
This article by Rich Helton, the author of Learning NServiceBus Sagas, delves into the details of NSB and its security. In this article, we will cover the following: Introducing web security Cloud vendors Using .NET 4 Adding NServiceBus Benefits of NSB (For more resources related to this topic, see here.) Introducing web security According to the Top 10 list of 2013 by the Open Web Application Security Project (OWASP), found at https://www.owasp.org/index.php/Top10#OWASP_Top_10_for_2013, injection flaws still remain at the top among the ways to penetrate a web site. This is shown in the following screenshot: An injection flaw is a means of being able to access information or the site by injecting data into the input fields. This is normally used to bypass proper authentication and authorization. Normally, this is the data that the website has not seen in the testing efforts or considered during development. For references, I will consider some slides found at http://www.slideshare.net/rhelton_1/cweb-sec-oct27-2010-final. An instance of an injection flaw is to put SQL commands in form fields and even URL fields to try to get SQL errors and returns with further information. If the error is not generic, and a SQL exception occurs, it will sometimes return with table names. It may deny authorization for sa under the password table in SQL Server 2008. Knowing this gives a person knowledge of the SQL Server version, the sa user is being used, and the existence of a password table. There are many tools and websites for people on the Internet to practice their web security testing skills, rather than them literally being in IT security as a professional or amateur. Many of these websites are well-known and posted at places such as https://www.owasp.org/index.php/Phoenix/Tools. General disclaimer I do not endorse or encourage others to practice on websites without written permission from the website owner. Some of the live sites are as follows, and most are used to test web scanners: http://zero.webappsecurity.com/: This is developed by SPI Dynamics (now HP Security) for Web Inspect. It is an ASP site. http://crackme.cenzic.com/Kelev/view/home.php: This PHP site is from Cenzic. http://demo.testfire.net/: This is developed by WatchFire (now IBM Rational AppScan). It is an ASP site. http://testaspnet.vulnweb.com/: This is developed by Acunetix. It is a PHP site. http://webscantest.com/: This is developed by NT OBJECTives NTOSpider. It is a PHP site. There are many more sites and tools, and one would have to research them themselves. There are tools that will only look for SQL Injection. Hacking professionals who are very gifted and spend their days looking for only SQL injection would find these useful. We will start with SQL injection, as it is one of the most popular ways to enter a website. But before we start an analysis report on a website hack, we will document the website. Our target site will be http://zero.webappsecurity.com/. 
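Before any active scanning, some of this documentation can be gathered passively from a console. The sketch below is only an illustration and is not part of the toolset discussed in this article: Resolve-DnsName and Invoke-WebRequest are standard PowerShell cmdlets (PowerShell 3.0+ on Windows 8/Server 2012 or later for the DNS cmdlet), the target is the practice site mentioned above, and, as with everything else here, it should only ever be pointed at sites you have written permission to test.

# A hedged example of basic, passive information gathering in PowerShell.
$target = "zero.webappsecurity.com"

# Resolve the public DNS records for the site
Resolve-DnsName -Name $target

# Fetch the home page and note the response status and headers (server banner, cookies, and so on)
$response = Invoke-WebRequest -Uri "http://$target/"
$response.StatusCode
$response.Headers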
We will start with the EC-Council's Certified Ethical Hacker program, where they divide footprinting and scanning into seven basic steps: Information gathering Determining the network range Identifying active machines Finding open ports and access points OS fingerprinting Fingerprinting services Mapping the network We could also follow the OWASP Web Testing checklist, which includes: Information gathering Configuration testing Identity management testing Authentication testing Session management testing Data validation testing Error handling Cryptography Business logic testing Client-side testing The idea is to gather as much information on the website as possible before launching an attack, as there is no information gathered so far. To gather information on the website, you don't actually have to scan the website yourself at the start. There are many scanners that scan the website before you start. There are Google Bots gathering search information about the site, the Netcraft search engine gathering statistics about the site, as well as many domain search engines with contact information. If another person has hacked the site, there are sites and blogs where hackers talk about hacking a specific site, including what tools they used. They may even post security scans on the Internet, which could be found by googling. There is even a site (https://archive.org/) that is called the WayBack Machine as it keeps previous versions of websites that it scans for in archive. These are just some basic pieces, and any person who has studied for their Certified Ethical Hacker's exam should have all of this on their fingertips. We will discuss some of the benefits that Microsoft and Particular.net have taken into consideration to assist those who develop solutions in C#. We can search at http://web.archive.org/web/ or http://zero.webappsecurity.com/ for changes from the WayBack Machine, and we will see something like this: From this search engine, we look at what the screens looked like 2003, and walk through various changes to the present 2014. Actually, there were errors on archive copying the site in 2003, so this machine directed us to the first best copy on May 11, 2006, as shown in the following screenshot: Looking with Netcraft, we can see that it was first started in 2004, last rebooted in 2014, and is running Ubuntu, as shown in this screenshot: Next, we can try to see what Google tells us. There are many Google Hacking Databases that keep track of keywords in the Google Search Engine API. These keywords are expressions such as file: passwd to search for password files in Ubuntu, and many more. This is not a hacking book, and this site is well-known, so we will just search for webappsecurity.com file:passwd. This gives me more information than needed. On the first item, I get a sample web scan report of the available vulnerabilities in the site from 2008, as shown in the following screenshot: We can also see which links Google has already found by running http://zero.webappsecurity.com/, as shown in this screenshot: In these few steps, I have enough information to bring a targeted website attack to check whether these vulnerabilities are still active or not. I know the operating system of the website and have details of the history of the website. This is before I have even considered running tools to approach the website. To scan the website, for which permission is always needed ahead of time, there are multiple web scanners available. 
For a list of web scanners, one website is http://sectools.org/tag/web-scanners/. One of the favorites is built by the famed Googler Michal Zalewski, and is called skipfish. Skipfish is an open source tool written in the C language, and it can be used in Windows by compiling it in Cygwin libraries, which are Linux virtual libraries and tools for Windows. Skipfish has its own man pages at http://dev.man-online.org/man1/skipfish/, and it can be downloaded from https://code.google.com/p/skipfish/. Skipfish performs web crawling, fuzzing, and tests for many issues such as XSS and SQL Injection. In Skipfish's case, its fussing uses dictionaries to add more paths to websites, extensions, and keywords that are normally found as attack vectors through the experience of hackers, to apply to the website being scanned. For instance, it may not be apparent from the pages being scanned that there is an admin/index.html page available, but the dictionary will try to check whether the page is available. Skipfish results will appear as follows: The issue with Skipfish is that it is noisy, because of its fuzzer. Skipfish will try many scans and checks for links that might not exist, which will take some time and can be a little noisy out of the box. There are many configurations, and there is throttling of the scanning to try to hide the noise. An associated scan in HP's WebInspect scanner will appear like this: These are just automated means to inspect a website. These steps are common, and much of this material is known in web security. After an initial inspection of a website, a person may start making decisions on how to check their information further. Manually checking websites An experienced web security person may now start proceeding through more manual checks and less automated checking of websites after taking an initial look at the website. For instance, type Admin as the user ID and password, or type Guest instead of Admin, and the list progresses based on experience. Then try the Admin and password combination, then the Admin and password123 combination, and so on. A person inspecting a website might have a lot of time to try to perform penetration testing, and might try hundreds of scenarios. There are many tools and scripts to automate the process. As security analysts, we find many sites that give admin access just by using Admin and Admin as the user ID and password, respectively. To enhance personal skills, there are many tutorials to walk through. One thing to do is to pull down a live website that you can set up for practice, such as WebGoat, and go through the steps outlined in the tutorials from sites such as http://webappsecmovies.sourceforge.net/webgoat/. These sites will show a person how to perform SQL Injection testing through the WebGoat site. As part of these tutorials, there are plugins of Firefox to test security scripts, HTML, debug pieces and tamper with the website through the browser, as shown in this screenshot: Using .NET 4 can help Every page that is deployed to the Internet (and in many cases, the Intranet as well), constantly gets probed and prodded by scans, viruses, and network noise. There are so many pokes, probes, and prods on networks these days that most of them are seen as noise. By default, .NET 4 offers some validation and out-of-the-box support for Web requests. Using .NET 4, you may discover that some input types such as double quotes, single quotes, and even < are blocked in some form fields. 
You will get an error like what is shown in the following screenshot when trying to pass some of the values: This is very basic validation, and it will reside in the .NET version 4 framework's pooling pieces of Internet Information Services (IIS) for Windows. To further offer security following Microsoft's best enterprise practices, we may also consider using Model-View-Controller (MVC) and Entity Frameworks (EF). To get this information, we can review Microsoft Application Architecture Guide at http://msdn.microsoft.com/en-us/library/ff650706.aspx. The MVC design pattern is the most commonly used pattern in software and is designed as follows: This is a very common design pattern, so why is this important in security? What is helpful is that we can validate data requests and responses through the controllers, as well as provide data annotations for each data element for more validation. A common attack that appeared through viruses through the years is the buffer overflow. A buffer overflow is used to send a lot of data to the data elements. Validation can check whether there is sufficient data to counteract the buffer overflow. EF is a Microsoft framework used to provide an object-relationship mapper. Not only can it easily generate objects to and from the SQL Server through Visual Studio, but it can also use objects instead of SQL scripting. Since it does not use SQL, SQL Injection, which is an attack involving injecting SQL commands through input fields, can be counteracted. Even though some of these techniques will help mitigate many attack vectors, the gateway to backend processes is usually through the website. There are many more injection attack vectors. If stored procedures are used for SQL Server, a scan be tried to access any stored procedures that the website may be calling, as well as for any default stored procedures that may be lingering from default installations from SQL Server. So how do we add further validation and decouple the backend processes in an organization from the website? NServiceBus to the rescue NServiceBus is the most popular C# platform framework used to implement an Enterprise Service Bus (ESB) for service-oriented architecture (SOA). Basically, NSB hosts Windows services through its NServiceBus.Host.exe program, and interfaces these services through different message queuing components. A C# MVC-EF program can call web services directly, and when the web service receives an error, the website will receive the error directly in the MVC program. This creates a coupling of the web service and the website, where changes in the website can affect the web services and actions in the web services can affect the website. Because of this coupling, websites may have a Please do not refresh the page until the process is finished warning. Normally, it is wise to step away from the phone, tablet, or computer until the website is loaded. It could be that even though you may not touch the website, another process running on the machine may. A virus scanner, update, or multiple other processes running on the device could cause any glitch in the refreshing of anything on the device. With all the scans that could be happening on a website and that others on the Internet could be doing, it seems quite odd that a page would say Please don't' touch me, I am busy. In order to decouple the website from the web services, a service needs to be deployed between the website and web service. 
It helps if that service has a lot of out-of-the-box security features as well, to help protect the interaction between the website and web service. For this reason, a product such as NServiceBus is most helpful, where others have already laid the groundwork to have advanced security features in services tested through the industry by their use. Being the most common C# ESB platform has its advantages, as developers and architects ensure the integrity of the framework well before a new design starts using it. Benefits of NSB NSB provides many components needed for automation that are only found in ESBs. ESBs provide the following: Separation of duties: There is separation of duties from the frontend to the backend, allowing the frontend to fire a message to a service and continue in its processing, and not worrying about the results until it needs an update. Also, separation of workflow responsibility exists through separating out NSB services. One service could be used to send payments to a bank, and another service could be used to provide feedback of the current status of payment to the MVC-EF database so that a user may see their payment status. Message durability: Messages are saved in queues between services so that in case services are stopped, they can start from the messages in the queues when they restart, and the messages will persist until told otherwise. Workflow retries: Messages, or endpoints, can be told to retry a number of times until they completely fail and send an error. The error is automated to return to an error queue. For instance, a web service message can be sent to a bank, and it can be set to retry the web service every 5 minutes for 20 minutes before giving up completely. This is useful during any network or server issues. Monitoring: NSB ServicePulse can keep a heartbeat on its services. Other monitoring can easily be done on the NSB queues to report on the number of messages. Encryption: Messages between services and endpoints can be easily encrypted. High availability: Multiple services or subscribers could be processing the same or similar messages from various services that are living on different servers. When one server or service goes down, others could be made available to take over those that are already running. Summary If any website is on the Internet, it is being scanned by a multitude of means, from websites and others. It is wise to decouple external websites from backend processes through a means such as NServiceBus. Websites that are not decoupled from the backend can be acted upon by the external processes that it may be accomplishing, such as a web service to validate a credit card. These websites may say Do not refresh this page. Other conditions might occur to the website and be beyond your reach, refreshing the page to affect that interaction. The best solution is to decouple the website from these processes through NServiceBus. Resources for Article: Further resources on this subject: Mobile Game Design [Article] CryENGINE 3: Breaking Ground with Sandbox [Article] CryENGINE 3: Fun Physics [Article]

Qlik Sense's Vision

Packt
06 Feb 2015
12 min read
In this article by Christopher Ilacqua, Henric Cronström, and James Richardson, authors of the book Learning Qlik® Sense, we will look at the evolving requirements that compel organizations to readdress how they deliver business intelligence and support data-driven decision-making. This is important as it supplies some of the reasons as to why Qlik® Sense is relevant and important to their success. The purpose of covering these factors is so that you can consider and plan for them in your organization. Among other things, in this article, we will cover the following topics: The ongoing data explosion The rise of in-memory processing Barrierless BI through Human-Computer Interaction The consumerization of BI and the rise of self-service The use of information as an asset The changing role of IT (For more resources related to this topic, see here.) Evolving market factors Technologies are developed and evolved in response to the needs of the environment they are created and used within. The most successful new technologies anticipate upcoming changes in order to help people take advantage of altered circumstances or reimagine how things are done. Any market is defined by both the suppliers—in this case, Qlik®—and the buyers, that is, the people who want to get more use and value from their information. Buyers' wants and needs are driven by a variety of macro and micro factors, and these are always in flux in some markets more than others. This is obviously and apparently the case in the world of data, BI, and analytics, which has been changing at a great pace due to a number of factors discussed further in the rest of this article. Qlik Sense has been designed to be the means through which organizations and the people that are a part of them thrive in a changed environment. Big, big, and even bigger data A key factor is that there's simply much more data in many forms to analyze than before. We're in the middle of an ongoing, accelerating data boom. According to Science Daily, 90 percent of the world's data was generated over the past two years. The fact is that with technologies such as Hadoop and NoSQL databases, we now have unprecedented access to cost-effective data storage. With vast amounts of data now storable and available for analysis, people need a way to sort the signal from the noise. People from a wider variety of roles—not all of them BI users or business analysts—are demanding better, greater access to data, regardless of where it comes from. Qlik Sense's fundamental design centers on bringing varied data together for exploration in an easy and powerful way. The slow spinning down of the disk At the same time, we are seeing a shift in how computation occurs and potentially, how information is managed. Fundamentals of the computing architectures that we've used for decades, the spinning disk and moving read head, are becoming outmoded. This means storing and accessing data has been around since Edison invented the cylinder phonograph in 1877. It's about time this changed. This technology has served us very well; it was elegant and reliable, but it has limitations. Speed limitations primarily. Fundamentals that we take for granted today in BI, such as relational and multidimensional storage models, were built around these limitations. So were our IT skills, whether we realized it at the time. With the use of in-memory processing and 64-bit addressable memory spaces, these limitations are gone! This means a complete change in how we think about analysis. 
Processing data in memory means we can do analysis that was impractical or impossible before with the old approach. With in-memory computing, analysis that would've taken days before, now takes just seconds (or much less). However, why does it matter? Because it allows us to use the time more effectively; after all, time is the most finite resource of all. In-memory computing enables us to ask more questions, test more scenarios, do more experiments, debunk more hypotheses, explore more data, and run more simulations in the short window available to us. For IT, it means no longer trying to second-guess what users will do months or years in advance and trying to premodel it in order to achieve acceptable response times. People hate watching the hourglass spin. Qlik Sense's predecessor QlikView® was built on the exploitation of in-memory processing; Qlik Sense has it at its core too. Ubiquitous computing and the Internet of Things You may know that more than a billion people use Facebook, but did you know that the majority of those people do so from a mobile device? The growth in the number of devices connected to the Internet is absolutely astonishing. According to Cisco's Zettabyte Era report, Internet traffic from wireless devices will exceed traffic from wired devices in 2014. If we were writing this article even as recently as a year ago, we'd probably be talking about mobile BI as a separate thing from desktop or laptop delivered analytics. The fact of the matter is that we've quickly gone beyond that. For many people now, the most common way to use technology is on a mobile device, and they expect the kind of experience they've become used to on their iOS or Android device to be mirrored in complex software, such as the technology they use for visual discovery and analytics. From its inception, Qlik Sense has had mobile usage in the center of its design ethos. It's the first data discovery software to be built for mobiles, and that's evident in how it uses HTML5 to automatically render output for the device being used, whatever it is. Plug in a laptop running Qlik Sense to a 70-inch OLED TV and the visual output is resized and re-expressed to optimize the new form factor. So mobile is the new normal. This may be astonishing but it's just the beginning. Mobile technology isn't just a medium to deliver information to people, but an acceleration of data production for analysis too. By 2020, pretty much everyone and an increasing number of things will be connected to the Internet. There are 7 billion people on the planet today. Intel predicts that by 2020, more than 31 billion devices will be connected to the Internet. So, that's not just devices used by people directly to consume or share information. More and more things will be put online and communicate their state: cars, fridges, lampposts, shoes, rubbish bins, pets, plants, heating systems—you name it. These devices will generate a huge amount of data from sensors that monitor all kinds of measurable attributes: temperature, velocity, direction, orientation, and time. This means an increasing opportunity to understand a huge gamut of data, but without the right technology and approaches it will be complex to analyze what is going on. Old methods of analysis won't work, as they don't move quickly enough. The variety and volume of information that can be analyzed will explode at an exponential rate. The rise of this type of big data makes us redefine how we build, deliver, and even promote analytics. 
It is an opportunity for those organizations that can exploit it through analysis; this can sort the signals from the noise and make sense of the patterns in the data. Qlik Sense is designed as just such a signal booster; it takes how users can zoom and pan through information too large for them to easily understand the product. Unbound Human-Computer Interaction We touched on the boundary between the computing power and the humans using it in the previous section. Increasingly, we're removing barriers between humans and technology. Take the rise of touch devices. Users don't want to just view data presented to them in a static form. Instead, they want to "feel" the data and interact with it. The same is increasingly true of BI. The adoption of BI tools has been too low because the technology has been hard to use. Adoption has been low because in the past BI tools often required people to conform to the tool's way of working, rather than reflecting the user's way of thinking. The aspiration for Qlik Sense (when part of the QlikView.Next project) was that the software should be both "gorgeous and genius". The genius part obviously refers to the built-in intelligence, the smarts, the software will have. The gorgeous part is misunderstood or at least oversimplified. Yes, it means cosmetically attractive (which is important) but much more importantly, it means enjoyable to use and experience. In other words, Qlik Sense should never be jarring to users but seamless, perhaps almost transparent to them, inducing a state of mental flow that encourages thinking about the question being considered rather than the tool used to answer it. The aim was to be of most value to people. Qlik Sense will empower users to explore their data and uncover hidden insights, naturally. Evolving customer requirements It is not only the external market drivers that impact how we use information. Our organizations and the people that work within them are also changing in their attitude towards technology, how they express ideas through data, and how increasingly they make use of data as a competitive weapon. Consumerization of BI and the rise of self-service The consumerization of any technology space is all about how enterprises are affected by, and can take advantage of, new technologies and models that originate and develop in the consumer marker, rather than in the enterprise IT sector. The reality is that individuals react quicker than enterprises to changes in technology. As such, consumerization cannot be stopped, nor is it something to be adopted. It can be embraced. While it's not viable to build a BI strategy around consumerization alone, its impact must be considered. Consumerization makes itself felt in three areas: Technology: Most investment in innovation occurs in the consumer space first, with enterprise vendors incorporating consumer-derived features after the fact. (Think about how vendors added the browser as a UI for business software applications.) Economics: Consumer offerings are often less expensive or free (to try) with a low barrier of entry. This drives prices down, including enterprise sectors, and alters selection behavior. People: Demographics, which is the flow of Millennial Generation into the workplace, and the blurring of home/work boundaries and roles, which may be seen from a traditional IT perspective as rogue users, with demands to BYOPC or device. 
In line with consumerization, BI users want to be able to pick up and just use the technology to create and share engaging solutions; they don't want to read the manual. This places a high degree of importance on the Human-Computer Interaction (HCI) aspects of a BI product (refer to the preceding list) and governed access to information and deployment design. Add mobility to this and you get a brand new sourcing and adoption dynamic in BI, one that Qlik engendered, and Qlik Sense is designed to take advantage of. Think about how Qlik Sense Desktop was made available as a freemium offer. Information as an asset and differentiator As times change, so do differentiators. For example, car manufacturers in the 1980s differentiated themselves based on reliability, making sure their cars started every single time. Today, we expect that our cars will start; reliability is now a commodity. The same is true for ERP systems. Originally, companies implemented ERPs to improve reliability, but in today's post-ERP world, companies are shifting to differentiating their businesses based on information. This means our focus changes from apps to analytics. And analytics apps, like those delivered by Qlik Sense, help companies access the data they need to set themselves apart from the competition. However, to get maximum return from information, the analysis must be delivered fast enough, and in sync with the operational tempo people need. Things are speeding up all the time. For example, take the fashion industry. Large mainstream fashion retailers used to work two seasons per year. Those that stuck to that were destroyed by fast fashion retailers. The same is true for old style, system-of-record BI tools; they just can't cope with today's demands for speed and agility. The rise of information activism A new, tech-savvy generation is entering the workforce, and their expectations are different than those of past generations. The Beloit College Mindset List for the entering class of 2017 gives the perspective of students entering college this year, how they see the world, and the reality they've known all their lives. For this year's freshman class, Java has never been just a cup of coffee and a tablet is no longer something you take in the morning. This new generation of workers grew up with the Internet and is less likely to be passive with data. They bring their own devices everywhere they go, and expect it to be easy to mash-up data, communicate, and collaborate with their peers. The evolution and elevation of the role of IT We've all read about how the role of IT is changing, and the question CIOs today must ask themselves is: "How do we drive innovation?". IT must transform from being gatekeepers (doers) to storekeepers (enablers), providing business users with self-service tools they need to be successful. However, to achieve this transformation, they need to stock helpful tools and provide consumable information products or apps. Qlik Sense is a key part of the armory that IT needs to provide to be successful in this transformation. Summary In this article, we looked at the factors that provide the wider context for the use of Qlik Sense. The factors covered arise out of both increasing technical capability and demands to compete in a globalized, information-centric world, where out-analyzing your competitors is a key success factor. Resources for Article: Further resources on this subject: Securing QlikView Documents [article] Conozca QlikView [article] Introducing QlikView elements [article]

Visualforce Development with Apex

Packt
06 Feb 2015
12 min read
In this article by Matt Kaufman and Michael Wicherski, authors of the book Learning Apex Programming, we will see how we can use Apex to extend the Salesforce1 Platform. We will also see how to create a customized Force.com page. (For more resources related to this topic, see here.) Apex on its own is a powerful tool to extend the Salesforce1 Platform. It allows you to define your own database logic and fully customize the behavior of the platform. Sometimes, controlling "what happens behind the scenes isn't enough. You might have a complex process that needs to step users through a wizard or need to present data in a format that isn't native to the Salesforce1 Platform, or maybe even make things look like your corporate website. Anytime you need to go beyond custom logic and implement a custom interface, you can turn to Visualforce. Visualforce is the user interface framework for the Salesforce1 Platform. It supports the use of HTML, JavaScript, CSS, and Flash—all of which enable you to build your own custom web pages. These web pages are stored and hosted by the Salesforce1 Platform and can be exposed to just your internal users, your external community users, or publicly to the world. But wait, there's more! Also included with Visualforce is a robust markup language. This markup language (which is also referred to as Visualforce) allows you to bind your web pages to data and actions stored on the platform. It also allows you to leverage Apex for code-based objects and actions. Like the rest of the platform, the markup portion of Visualforce is upgraded three times a year with new tags and features. All of these features mean that Visualforce is very powerful. s-con-what? Before the "introduction of Visualforce, the Salesforce1 Platform had a feature called s-controls. These were simple files where you could write HTML, CSS, and JavaScript. There was no custom markup language included. In order to make things look like the Force.com GUI, a lot of HTML was required. If you wanted to create just a simple input form for a new Account record, so much HTML code was required. 
The following is just a" small, condensed excerpt of what the HTML would look like if you wanted to recreate such a screen from scratch: <div class="bPageTitle"><div class="ptBody"><div class="content"> <img src="/s.gif" class="pageTitleIcon" title="Account" /> <h1 class="pageType">    Account Edit<span class="titleSeparatingColon">:</span> </h1> <h2 class="pageDescription"> New Account</h2> <div class="blank">&nbsp;</div> </div> <div class="links"></div></div><div   class="ptBreadcrumb"></div></div> <form action="/001/e" method="post" onsubmit="if   (window.ffInAlert) { return false; }if (window.sfdcPage   &amp;&amp; window.sfdcPage.disableSaveButtons) { return   window.sfdcPage.disableSaveButtons(); }"> <div class="bPageBlock brandSecondaryBrd bEditBlock   secondaryPalette"> <div class="pbHeader">    <table border="0" cellpadding="0" cellspacing="0"><tbody>      <tr>      <td class="pbTitle">      <img src="/s.gif" width="12" height="1" class="minWidth"         style="margin-right: 0.25em;margin-right: 0.25em;margin-       right: 0.25em;">      <h2 class="mainTitle">Account Edit</h2>      </td>      <td class="pbButton" id="topButtonRow">      <input value="Save" class="btn" type="submit">      <input value="Cancel" class="btn" type="submit">      </td>      </tr>    </tbody></table> </div> <div class="pbBody">    <div class="pbSubheader brandTertiaryBgr first       tertiaryPalette" >    <span class="pbSubExtra"><span class="requiredLegend       brandTertiaryFgr"><span class="requiredExampleOuter"><span       class="requiredExample">&nbsp;</span></span>      <span class="requiredMark">*</span>      <span class="requiredText"> = Required Information</span>      </span></span>      <h3>Account Information<span         class="titleSeparatingColon">:</span> </h3>    </div>    <div class="pbSubsection">    <table class="detailList" border="0" cellpadding="0"     cellspacing="0"><tbody>      <tr>        <td class="labelCol requiredInput">        <label><span class="requiredMark">*</span>Account         Name</label>      </td>      <td class="dataCol col02">        <div class="requiredInput"><div         class="requiredBlock"></div>        <input id="acc2" name="acc2" size="20" type="text">        </div>      </td>      <td class="labelCol">        <label>Website</label>      </td>      <td class="dataCol">        <span>        <input id="acc12" name="acc12" size="20" type="text">        </span>      </td>      </tr>    </tbody></table>    </div> </div> <div class="pbBottomButtons">    <table border="0" cellpadding="0" cellspacing="0"><tbody>    <tr>      <td class="pbTitle"><img src="/s.gif" width="12" height="1"       class="minWidth" style="margin-right: 0.25em;margin-right:       0.25em;margin-right: 0.25em;">&nbsp;</td>      <td class="pbButtonb" id="bottomButtonRow">      <input value=" Save " class="btn" title="Save"         type="submit">      <input value="Cancel" class="btn" type="submit">      </td>    </tr>    </tbody></table> </div> <div class="pbFooter secondaryPalette"><div class="bg"> </div></div> </div> </form> We did our best to trim down this HTML to as little as possible. Despite all of our efforts, it still "took up more space than we wanted. The really sad part is that all of that code only results in the following screenshot: Not only was it time consuming to write all this HTML, but odds were that we wouldn't get it exactly right the first time. 
Worse still, every time the business requirements changed, we had to go through the exhausting effort of modifying the HTML code. Something had to change in order to provide us relief. That something was the introduction of Visualforce and its markup language. Your own personal Force.com The markup "tags in Visualforce correspond to various parts of the Force.com GUI. These tags allow you to quickly generate HTML markup without actually writing any HTML. It's really one of the greatest tricks of the Salesforce1 Platform. You can easily create your own custom screens that look just like the built-in ones with less effort than it would take you to create a web page for your corporate website. Take a look at the Visualforce markup that corresponds to the HTML and screenshot we showed you earlier: <apex:page standardController="Account" > <apex:sectionHeader title="Account Edit" subtitle="New Account"     /> <apex:form>    <apex:pageBlock title="Account Edit" mode="edit" >      <apex:pageBlockButtons>        <apex:commandButton value="Save" action="{!save}" />        <apex:commandButton value="Cancel" action="{!cancel}" />      </apex:pageBlockButtons>      <apex:pageBlockSection title="Account Information" >        <apex:inputField value="{!account.Name}" />        <apex:inputField value="{!account.Website}" />      </apex:pageBlockSection>    </apex:pageBlock> </apex:form> </apex:page> Impressive! With "merely these 15 lines of markup, we can render nearly 100 lines of earlier HTML. Don't believe us, you can try it out yourself. Creating a Visualforce page Just like" triggers and classes, Visualforce pages can "be created and edited using the Force.com IDE. The Force.com GUI also includes a web-based editor to work with Visualforce pages. To create a new Visualforce page, perform these simple steps: Right-click on your project and navigate to New | Visualforce Page. The Create New Visualforce Page window appears as shown: Enter" the label and name for your "new page in the Label and Name fields, respectively. For this example, use myTestPage. Select the API version for the page. For this example, keep it at the default value. Click on Finish. A progress bar will appear followed by your new Visualforce page. Remember that you always want to create your code in a Sandbox or Developer Edition org, not directly in Production. It is technically possible to edit Visualforce pages in Production, but you're breaking all sorts of best practices when you do. Similar to other markup languages, every tag in a Visualforce page must be closed. Tags and their corresponding closing tags must also occur in a proper order. The values of tag attributes are enclosed by double quotes; however, single quotes can be used inside the value to denote text values. Every Visualforce page starts with the <apex:page> tag and ends with </apex:page> as shown: <apex:page> <!-- Your content goes here --> </apex:page> Within "the <apex:page> tags, you can paste "your existing HTML as long as it is properly ordered and closed. The result will be a web page hosted by the Salesforce1 Platform. Not much to see here If you are" a web developer, then there's a lot you can "do with Visualforce pages. Using HTML, CSS, and images, you can create really pretty web pages that educate your users. If you have some programming skills, you can also use JavaScript in your pages to allow for interaction. If you have access to web services, you can use JavaScript to call the web services and make a really powerful application. 
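Pages are not limited to static content, though. As a small hedged sketch (the GreetingController class, its property, and the page name are invented for illustration and are not taken from this article's examples), a page can also bind to a custom Apex controller:

// GreetingController.cls - a minimal custom Apex controller
public class GreetingController {
    public String greeting { get; set; }

    public GreetingController() {
        greeting = 'Hello from Apex';
    }
}

<!-- greetingPage - a Visualforce page bound to the controller above -->
<apex:page controller="GreetingController">
    <apex:outputText value="{!greeting}" />
</apex:page>

Any property exposed with get and set on the controller becomes available to the markup through the {!greeting} expression syntax.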
Check out the following Visualforce page for an example of what you can do: <apex:page> <script type="text/javascript"> function doStuff(){    var x = document.getElementById("myId");    console.log(x); } </script> <img src="http://www.thisbook.com/logo.png" /> <h1>This is my title</h1> <h2>This is my subtitle</h2> <p>In a world where books are full of code, there was only one     that taught you everything you needed to know about Apex!</p> <ol>    <li>My first item</li>    <li>Etc.</li> </ol> <span id="myId"></span> <iframe src="http://www.thisbook.com/mypage.html" /> <form action="http://thisbook.com/submit.html" >    <input type="text" name="yoursecret" /> </form> </apex:page> All of this code is standalone and really has nothing to do with the Salesforce1 Platform other than being hosted by it. However, what really makes Visualforce powerful is its ability to interact with your data, which allows your pages to be more dynamic. Even better, you" can write Apex code to control how "your pages behave, so instead of relying on client-side JavaScript, your logic can run server side. Summary In this article we learned how a few features of Apex and how we can use it to extend the SalesForce1 Platform. We also created a custom Force.com page. Well, you've made a lot of progress. Not only can you write code to control how the database behaves, but you can create beautiful-looking pages too. You're an Apex rock star and nothing is going to hold you back. It's time to show your skills to the world. If you want to dig deeper, buy the book and read Learning Apex Programming in a simple step-by-step fashion by using Apex, the language for extension of the Salesforce1 Platform. Resources for Article: Further resources on this subject: Learning to Fly with Force.com [article] Building, Publishing, and Supporting Your Force.com Application [article] Adding a Geolocation Trigger to the Salesforce Account Object [article]

Mobile Administration

Packt
06 Feb 2015
17 min read
In this article by Paul Goodey, author of the book Salesforce CRM – The Definitive Admin Handbook - Third Edition, we will look at the administration of Salesforce Mobile solutions that can significantly improve productivity and user satisfaction and help them access data and application functionality out of the office. (For more resources related to this topic, see here.) In the past, mobile devices that were capable of accessing software applications were very expensive. Often, these devices were regarded as a nice to have accessory by management and were seen as a company perk by field-based teams. Today, mobile devices are far more prevalent within the business environment, and organizations are increasingly realizing the benefits of using mobile phones and devices to access business applications. Salesforce has taken the lead in recognizing how mobiles have become the new standard for being connected in people's personal and professional lives. It has also highlighted how increasingly, the users of their apps are living lives connected to the Internet, but rather than sitting at a desk in the office, they are in between meetings, on the road, in planes, in trains, in cabs, or even in the queue for lunch. As a result, Salesforce has developed innovative mobile solutions that help you and your users embrace this mobile-first world in Salesforce CRM. Accessing Salesforce Mobile products Salesforce offers two varieties of mobile solutions, namely mobile browser apps and downloadable apps. Mobile browser apps, as the name suggests, are accessed using a web browser that is available on a mobile device. Downloadable apps are accessed by first downloading the client software from, say, the Apple App Store or Google Play and then installing it onto the mobile device. Mobile browser apps and downloadable apps offer various features and benefits and, as we'll see, are available for various Salesforce mobile products and device combinations. Most mobile devices these days have some degree of web browser capability, which can be used to access Salesforce CRM; however, some Salesforce mobile products are optimized for use with certain devices. By accessing a Salesforce mobile browser app, your users do not require anything to be installed. Supported mobile browsers for Salesforce are generally available on Android, Apple, BlackBerry, and Microsoft Windows 8.1 devices. Downloadable apps, on the other hand, will require the app to be first downloaded from the App Store for Apple® devices or from Google Play™ for Android™ devices and then installed on the mobile device. Salesforce mobile products' overview Salesforce has provided certain mobile products as downloadable apps only, while others have been provided as both downloadable and mobile browser-based. The following list outlines the various mobile app products, features, and capabilities used to access Salesforce CRM on mobile devices: SalesforceA Salesforce Touch Salesforce1 Salesforce Classic Salesforce Touch is no longer available and is mentioned here for completeness as this product has been recently incorporated into the Salesforce1 product. SalesforceA SalesforceA is a downloadable system administration app that allows you to manage your organization's users and view certain information for your Salesforce organization from your mobile device. Salesforce A is intended to be used by system administrators, as it is restricted to users with the Manage Users permission. 
The SalesforceA app provides the facilities to carry out user tasks, such as deactivating or freezing users, resetting passwords, unlocking users, editing user details, calling and emailing users, and assigning permission sets. These user task buttons are displayed as action icons, as shown in the following screenshot: These icons are presented in the action bar at the bottom of the mobile device screen, as shown in the following screenshot: In addition to the user tasks, you can view the system status and also switch between your user accounts in multiple organizations. This allows you to access different organizations and communities without having to log out and log back in to each user account. By staying logged in to multiple accounts in different organizations, you will save time by easily switching to the particular organization user account that you need to access. SalesforceA supported devices At the time of writing, the following devices are supported by Salesforce for use with the SalesforceA downloadable app: Android phones Apple iPhone Apple iPod Touch SalesforceA can be installed from Google Play™ for Android™ phones and the Apple® App Store for Apple devices. Salesforce Touch Salesforce Touch is the name of an earlier Salesforce mobile product and is no longer available. With the Spring 2014 release, Salesforce Touch was incorporated into the Salesforce1 app. Hence, both the Salesforce Touch mobile browser and Salesforce Touch downloadable apps are no longer available; however, the functionality that they once offered is available in Salesforce1, which is covered in this article. Salesforce1 Salesforce1 is Salesforce's next-generation mobile CRM platform that has been designed for Salesforce's customers, developers, and ISVs (independent software vendors) to connect mobile apps, browser apps, and third-party app services. Salesforce1 has been developed for a mobile-first environment and demonstrates how Salesforce's focus as a platform provider aims to connect enterprises with systems that can be programmed through APIs, along with mobile apps and services that can be utilized by marketing, sales, and customer service. There are two ways to use Salesforce1: either using a mobile browser app that users can access by logging into Salesforce from a supported mobile browser or downloadable apps that users can install from the App Store or Google Play. Either way, Salesforce1 allows users to access and update Salesforce data from an interface that has been optimized to navigate and work on their touchscreen mobile devices. Using Salesforce1, records can be viewed, edited, and created. Users can manage their activities, view their dashboards, and use Chatter. Salesforce1 also supports many standard objects and list views, all custom objects, plus the integration of other mobile apps and many of your organization's Salesforce customizations, including Visualforce tabs and pages. 
Salesforce1 supported devices At the time of writing this, the following devices are supported by Salesforce for the Salesforce1 mobile browser app: Android phones Apple iPad Apple iPhone BlackBerry Z10 Windows 8.1 phones (Beta support) Also, at the time of writing this, Salesforce specifies the following devices as being supported for the Salesforce1 downloadable app: Android phones Apple iPad Apple iPhone Salesforce1 data availability Your organization edition, the user's license type, along with the user's profile and any permission sets, determines the data that is available to the user within Salesforce1. Generally, users have the same visibility of objects, record types, fields, and page layouts that they have while accessing the full Salesforce browser app. However, at the time of writing this, not all data is available in the current release of the Salesforce1 app. In Winter 2015, these key objects are fully accessible from the Salesforce1 navigation menu: Accounts; Campaigns; Cases; Contacts; Contracts; Leads; Opportunities; Tasks; and Users. Dashboards and Events, however, are restricted to being viewable from only the Salesforce1 navigation menu. Custom objects are fully accessible if they have a tab that the user can access. For new users who are yet to build a history of recent objects, they initially see a set of default objects in the Recent section in the Salesforce1 navigation menu. The majority of standard and custom fields, and most of the related lists for the supported objects, are available on these records; however, at the time of writing this, the following exceptions exist: Rich text area field support varies (detailed shortly) Links on formula fields are not supported State and country picklist fields are not supported Related lists in Salesforce1 are restricted (detailed shortly) Rich text area field support varies Support for rich text area fields varies by the version of Salesforce1 and the type of device. For Android's downloadable apps, you can view and edit rich text area fields. However, for Android's mobile browser apps, you can only view rich text area fields; editing is not supported currently. For iOS's downloadable apps, you can view but not edit rich text area fields. However, for iOS's mobile browser apps, you can view and also edit rich text area fields. Finally, for both BlackBerry and Windows 8.1 mobile browser apps, you can neither view nor edit rich text area fields. Related lists in Salesforce1 Related lists in Salesforce1 are restricted and display the first four fields that are defined on the page layout for that object. The number of fields shown cannot be increased. If Chatter is enabled, users can also access feeds, people, groups, and Salesforce Files. When users are working with records in the full Salesforce app, it can take up to 15 days for this data to appear in the Recent section; thus, to make records appear under the Recent section sooner, ask users to pin them from their search results in the full Salesforce site. Salesforce1 administration You can manage your organization's access to Salesforce1 apps; there are two areas of administration: the mobile browser app that users can access by logging in to Salesforce from a supported mobile browser and the downloadable app that users can install from the App Store or Google Play. The upcoming sections describe the ways to control user access to each of these mobile apps. 
Salesforce1 mobile browser app access You can control whether users can access the Salesforce1 mobile browser app when they log into Salesforce from a mobile browser. To select or deselect this feature, navigate to Setup | Mobile Administration | Salesforce1 | Settings, as shown in the following screenshot: By selecting the Enable the Salesforce1 mobile browser app checkbox, all users are activated to access Salesforce1 from their mobile browsers. Deselecting this option turns off the mobile browser app, which means that users will automatically access the full Salesforce site from their mobile browser. By default, the mobile browser app is turned on in all Salesforce organizations. Salesforce1 desktop browser access Selecting the Enable the Salesforce1 mobile browser app checkbox, as described in the previous section, permits users who are activated to access Salesforce1 from their desktop browsers. Users can navigate to the Salesforce1 app within their desktop browser by appending “/one/one.app” to the end of the Salesforce URL. As an example, for the following Salesforce URL accessed from the server na10, you would enter the https://na10.salesforce.com/one/one.app desktop browser URL. Salesforce1 downloadable app access The Salesforce1 app is distributed as a managed package, and within Salesforce, it is implemented as a connected app. You might already see the Salesforce1 connected app in your list of installed apps as it might have been automatically installed in your organization. The list of included apps can change with each Salesforce release but, to simplify administration, each package is asynchronously installed in Salesforce organizations whenever any user in that organization first accesses Salesforce1. However, to manually install or reinstall the Salesforce1 package for connected apps, you can install it from the AppExchange. To view the details for the Salesforce1 app in the connected app settings, navigate to Setup | Manage Apps | Connected Apps. The apps that connect to your Salesforce organization are then listed as shown in the following screenshot: Salesforce1 notifications Notifications allow all users in your organization to receive mobile notifications in Salesforce1, for example, whenever they are mentioned in Chatter or whenever they receive approval requests. To activate mobile notifications, navigate to Setup | Mobile Administration | Notifications | Settings, as shown in the following screenshot: The settings for notifications can be set as follows: Enable in-app notifications: Set this option to keep users notified about relevant Salesforce activity while they are using Salesforce1. Enable push notifications: Set this option to keep users notified of relevant Salesforce activity when they are not using the Salesforce1 downloadable app. Include full content in push notifications: Keep this checkbox unchecked if you do not want users to receive full content in push notifications. This can prevent users from receiving potentially sensitive data that might be in comments, for example. If you set this option, a pop-up dialog appears, displaying terms and conditions where you must click on OK or Cancel. Salesforce1 branding This option allows you to customize the appearance of the Salesforce1 app so that it complies with any company branding requirements that might be in place. Salesforce1 branding is supported in downloadable apps' Version 5.2 or higher and also in the mobile browser app. 
To specify Salesforce1 branding, navigate to Setup | Mobile Administration | Salesforce1 | Branding, as shown in the following screenshot: Salesforce1 compact layouts In Salesforce1, compact layouts are used to display the key fields on a record and are specifically designed to view records on touchscreen mobile devices. As space is limited on mobile devices and quick recognition of records is important, the first four fields that you assign to a compact layout are displayed. If a mobile user does not have the required access to one of the first four fields that have been assigned to a compact layout, the next field, if more than four fields have been set on the layout, is used. If you are yet to create custom compact layouts, the records will be displayed using a read-only, predefined system default compact layout, and after you have created a custom compact layout, you can then set it as the primary compact layout for that object. As with the full Salesforce CRM site, if you have record types associated with an object, you can alter the primary compact layout assignment and assign specific compact layouts to different record types. You can also clone a compact layout from its detail page. The upcoming field types cannot be included on compact layouts: text area, long text area, rich text area, and multiselect picklists. Salesforce1 offline access In Salesforce1, the mechanism to handle offline access is determined by users' most recently used records. These records are cached for offline access; at the time of writing this, they are read-only. The cached data is encrypted and secured through persistent storage by Salesforce1's downloadable apps. Offline access is available in Salesforce1's downloadable apps Version 6.0 and higher and was first released in Summer 2014. Offline access is enabled by default when Salesforce1's downloadable app is installed. To manage these settings, navigate to Setup | Mobile Administration | Offline. Now, check or uncheck Enable Offline Sync for Salesforce1, as shown in the following screenshot: When offline access is enabled, data based on the objects is downloaded to each user's mobile device and presented in the Recent section of the Salesforce1 navigation menu and on the user's most recently viewed records. The data is encrypted and stored in a secure, persistent cache on the mobile device. Setting up Salesforce1 with the Salesforce1 Wizard The Salesforce1 Wizard simplifies the setting up of the Salesforce1 mobile app. The wizard offers a visual tour of the key setup steps and is useful if you are new to Salesforce1 or need to quickly set up the core Salesforce1 settings. The Salesforce1 Wizard guides you through the setting up of the following Salesforce1 configuration steps: Choose which items appear in the navigation menu Configure global actions Create a contact custom compact layout Optionally, invite users to start using the Salesforce1 app To access the Salesforce1 Wizard, navigate to Setup | Salesforce1 Setup. Now, click on Launch Quick Start Wizard within the Salesforce1 Setup page, as shown in the following screenshot: Upon clicking on the Let's Get Started section link (shown in the following screenshot), you will be presented with the Salesforce1 Setup visual tour, as shown in the next section. The Quick Start Wizard The Quick Start Wizard guides you through the minimum configuration steps required to set up Salesforce1. 
By clicking on the Launch Quick Start Wizard button, the process to complete the essential setup tasks for Salesforce1 is initiated and provides a step-by-step wizard guide. The five steps are: Customize the Navigation Menu: This step results in the setup of the navigation menu for all users in your organization. To reorder items, drag them up and down. To remove items, drag them to the Available Items list, as shown in the following screenshot: Arrange Global Actions: Global actions provide users with quick access to Salesforce functions and in this step, you will choose and arrange the Salesforce1 global actions, as shown in the following screenshot: Actions might might have a different appearance, depending upon your version of Salesforce1. Create a Custom Compact Layout for Contacts: Compact layouts are used to show the key fields on a record in the highlights area at the top of the record detail. In this step, you are able to create a custom compact layout for contacts to set, for example, a contact's name, e-mail, and phone number, as shown in the following screenshot: However, after you have completed the Quick Start Wizard, you can create compact layouts for other objects as required. Review: In this step, you are given the chance to preview the changes to verify the results of the changes, as shown in the following screenshot: The review step screen gives you a live preview that uses your current access as the logged-in user. Send Invitations: This is the final step of the Quick Start Wizard, which will provide you with a basic setup of Salesforce1 and allow you to get feedback on what you have implemented. In this step, you can invite your users to start using the Salesforce1 app, as shown in the following screenshot: This step can be skipped and you can always send invitations later from the Salesforce1 setup page. You can also implement additional options to customize the app, such as incorporating your own branding. Differences between Salesforce1 and the full Salesforce CRM browser app In the Winter 2015 release and at the time of writing this, Salesforce1 does not have all of the features of the full Salesforce CRM site; moreover, in some areas, it includes functionality that is not available in, or is different from, the complete Salesforce site. As an example, on the full Salesforce CRM site, compact layouts determine which fields appear in the Chatter feed item and which appear after a user creates a record via a publisher action. However, compact layouts in Salesforce1 are used to display the key fields on a record. For details about the features that differ between the full Salesforce CRM site and Salesforce1, refer to Salesforce1 Limits and Differences from the Full Salesforce Site within the Salesforce Help menu sections. Summary In this article, we looked at ways in which mobile has become the new normal way to stay connected in both our personal and professional lives. Salesforce has recognized this well; we are all spending time being connected to the cloud and using business applications. However, instead of sitting at a desk, users are often on the go. To try and help their customers become successful businesses of this mobile-first world, Salesforce has produced mobile solutions that can help user get things done regardless of where they are and what they are doing. We looked at SalesforceA, which is an admin specific app that can help you manage users and monitor the status of Salesforce while on the move. 
We discussed Salesforce Touch, which is being replaced with Salesforce1, and we also spoke about the features and benefits of Salesforce1, which is available as a downloadable app and a browser app.

Resources for Article:

Further resources on this subject:

Customization in Microsoft Dynamics CRM [Article]
Getting Started with Microsoft Dynamics CRM 2013 Marketing [Article]
Diagnostic leveraging of the Accelerated POC with the CRM Online service [Article]

Basic and Interactive Plots

Packt
06 Feb 2015
19 min read
In this article by Atmajitsinh Gohil, author of the book R Data Visualization Cookbook, we will cover the following topics: A simple bar plot A simple line plot Line plot to tell an effective story Merging histograms Making an interactive bubble plot (For more resources related to this topic, see here.) The main motivation behind this article is to introduce the basics of plotting in R and an element of interactivity via the googleVis package. The basic plots are important as many packages developed in R use basic plot arguments and hence understanding them creates a good foundation for new R users. We will start by exploring the scatter plots in R, which are the most basic plots for exploratory data analysis, and then delve into interactive plots. Every section will start with an introduction to basic R plots and we will build interactive plots thereafter. We will utilize the power of R analytics and implement them using the googleVis package to introduce the element of interactivity. The googleVis package is developed by Google and it uses the Google Chart API to create interactive plots. There are a range of plots available with the googleVis package and this provides us with an advantage to plot the same data on various plots and select the one that delivers an effective message. The package undergoes regular updates and releases, and new charts are implemented with every release. The readers should note that there are other alternatives available to create interactive plots in R, but it is not possible to explore all of them and hence I have selected googleVis to display interactive elements in a chart. I have selected these purely based on my experience with interactivity in plots. The other good interactive package is offered by GGobi. A simple bar plot A bar plot can often be confused with histograms. Histograms are used to study the distribution of data whereas bar plots are used to study categorical data. Both the plots may look similar to the naked eye but the main difference is that the width of a bar plot is not of significance, whereas in histograms the width of the bars signifies the frequency of data. In this recipe, I have made use of the infant mortality rate in India. The data is made available by the Government of India. The main objective is to study the basics of a bar plot in R as shown in the following screenshot: How to do it… We start the recipe by importing our data in R using the read.csv() function. R will search for the data under the current directory, and hence we use the setwd() function to set our working directory: setwd("D:/book/scatter_Area/chapter2") data = read.csv("infant.csv", header = TRUE) Once we import the data, we would like to process the data by ordering it. We order the data using the order() function in R. We would like R to order the column Total2011 in a decreasing order: data = data[order(data$Total2011, decreasing = TRUE),] We use the ifelse() function to create a new column. We would utilize this new column to add different colors to bars in our plot. We could also write a loop in R to do this task but we will keep this for later. The ifelse() function is quick and easy. We instruct R to assign yes if values in the column Total2011 are more than 12.2 and no otherwise. The 12.2 value is not randomly chosen but is the average infant mortality rate of India: new = ifelse(data$Total2011>12.2,"yes","no") Next, we would like to join the vector of yes and no to our original dataset. In R, we can join columns using the cbind() function. 
Rows can be combined using rbind(): data = cbind(data,new) When we initially plot the bar plot, we observe that we need more space at the bottom of the plot. We adjust the margins of a plot in R by passing the mar() argument within the par() function. The mar() function uses four arguments: bottom, left, top, and right spacing: par(mar = c(10,5,5,5)) Next, we generate a bar plot in R using the barplot() function. The abline() function is used to add a horizontal line on the bar plot: barplot(data$Total2011, las = 2, names.arg= data$India,width =0.80, border = NA,ylim=c(0,20), col = "#e34a33", main = "InfantMortality Rate of India in 2011")abline(h = 12.2, lwd =2, col = "white", lty =2) How it works… The order() function uses permutation to rearrange (decreasing or increasing) the rows based on the variable. We would like to plot the bars from highest to lowest, and hence we require to arrange the data. The ifelse() function is used to generate a new column. We would use this column under the There's more… section of this recipe. The first argument under the ifelse() function is the logical test to be performed. The second argument is the value to be assigned if the test is true, and the third argument is the value to be assigned if the logical test fails. The first argument in the barplot() function defines the height of the bars and horiz = TRUE (not used in our code) instructs R to plot the bars horizontally. The default setting in R will plot the bars vertically. The names.arg argument is used to label the bars. We also specify border = NA to remove the borders and las = 2 is specified to apply the direction to our labels. Try replacing the las values with 1,2,3, or 4 and observe how the orientation of our labels change.. The first argument in the abline() function assigns the position where the line is drawn, that is, vertical or horizontal. The lwd, lty, and col arguments are used to define the width, line type, and color of the line. There's more… While plotting a bar plot, it's a good practice to order the data in ascending or descending order. An unordered bar plot does not convey the right message and the plot is hard to read when there are more bars involved. When we observe a plot, we are interested to get the most information out, and ordering the data is the first step toward achieving this objective. We have not specified how we can use the ifelse() and cbind() functions in the plot. If we would like to color the plot with different colors to let the readers know which states have high infant mortality above the country level, we can do this by pasting col = (data$new) in place of col = "#e34a33". See also New York Times has a very interesting implementation of an interactive bar chart and can be accessed at http://www.nytimes.com/interactive/2007/09/28/business/20070930_SAFETY_GRAPHIC.html A simple line plot Line plots are simply lines connecting all the x and y dots. They are very easy to interpret and are widely used to display an upward or downward trend in data. In this recipe, we will use the googleVis package and create an interactive R line plot. We will learn how we can emphasize on certain variables in our data. The following line plot shows fertility rate: Getting ready We will use the googleVis package to generate a line plot. How to do it… In order to construct a line chart, we will install and load the googleVis package in R. 
We would also import the fertility data using the read.csv() function: install.packages("googleVis") library(googleVis) frt = read.csv("fertility.csv", header = TRUE, sep =",") The fertility data is downloaded from the OECD website. We can construct our line object using the gvisLineChart() function: gvisLineChart(frt, xvar = "Year","yvar=c("Australia","Austria","Belgium","Canada","Chile","OECD34"), options = list( width = 1100, height= 500, backgroundColor = " "#FFFF99",title ="Fertility Rate in OECD countries" , vAxis = "{title : 'Total Fertility " Rate',gridlines:{color:'#DEDECE',count : 4}, ticks : "   [0,1,2,3,4]}", series = "{0:{color:'black', visibleInLegend :false},        1:{color:'BDBD9D', visibleInLegend :false},        2:{color:'BDBD9D', visibleInLegend :false},            3:{color:'BDBD9D', visibleInLegend :false},           4:{color:'BDBD9D', visibleInLegend :false},          34:{color:'3333FF', visibleInLegend :true}}")) We can construct the visualization using the plot() function in R: plot(line) How it works… The first three arguments of the gvisLineChart() function are the data and the name of the columns to be plotted on the x-axis and y-axis. The options argument lists the chart API options to add and modify elements of a chart. For the purpose of this recipe, we will use part of the dataset. Hence, while we assign the series to be plotted under yvar = c(), we will specify the column names that we would like to be plotted in our chart. Note that the series starts at 0, and hence Australia, which is the first column, is in fact series 0 and not 1. For the purpose of this exercise, let's assume that we would like to demonstrate the mean fertility rate among all OECD economies to our audience. We can achieve this using series {} under option = list(). The series argument will allow us to specify or customize a specific series in our dataset. Under the gvisLineChart() function, we instruct the Google Chart API to color OECD series (series 34) and Australia (series 0) with a different color and also make the legend visible only for OECD and not the entire series. It would be best to display all the legends but we use this to show the flexibility that comes with the Google Chart API. Finally, we can use the plot() function to plot the chart in a browser. The following screenshot displays a part of the data. The dim() function gives us a general idea about the dimensions of the fertility data: New York Times Visualization often combines line plots with bar chart and pie charts. Readers should try constructing such visualization. We can use the gvisMerge() function to merge plots. The function allows merging of just two plots and hence the readers would have to use multiple gvisMerge() functions to create a very similar visualization. The same can also be constructed in R but we will lose the interactive element. See also The OECD website provides economic data related to OECD member countries. The data can be freely downloaded from the website http://www.oecd.org/statistics/. New York Times Visualization combines bar charts and line charts and can be accessed at http://www.nytimes.com/imagepages/2009/10/16/business/20091017_CHARTS_GRAPHIC.html. Line plot to tell an effective story In the previous recipe, we learned how to plot a very basic line plot and use some of the options. In this recipe, we will go a step further and make use of specific visual cues such as color and line width for easy interpretation. Line charts are a great tool to visualize time series data. 
The fertility data is discrete, but connecting the points over time provides our audience with a direction. The visualization shows the amazing progress countries such as Mexico and Turkey have achieved in reducing their fertility rate. The OECD defines the fertility rate as the number of children that would be born per woman, assuming no female mortality at child-bearing ages and the age-specific fertility rates of a specified country and reference period.

Line plots have been widely used by New York Times to create very interesting infographics, and this recipe is inspired by one of the New York Times visualizations. It is very important to understand that many of the infographics created by professionals are built using D3.js or Processing. We will not go into the details of these tools, but it is good to know how they work and how they can be used to create visualizations.

Getting ready

We need to install and load the googleVis package to construct a line chart.

How to do it…

To generate an interactive plot, we will load the fertility data in R using the read.csv() function. To generate a line chart that plots the entire dataset, we will use the gvisLineChart() function:

line = gvisLineChart(frt, xvar = "Year",
  yvar = c("Australia","Austria","Belgium","Canada","Chile","Czech.Republic",
    "Denmark","Estonia","Finland","France","Germany","Greece","Hungary",
    "Iceland","Ireland","Israel","Italy","Japan","Korea","Luxembourg","Mexico",
    "Netherlands","New.Zealand","Norway","Poland","Portugal","Slovakia","Slovenia",
    "Spain","Sweden","Switzerland","Turkey","United.Kingdom","United.States","OECD34"),
  options = list(width = 1200, backgroundColor = "#ADAD85",
    title = "Fertility Rate in OECD countries",
    vAxis = "{gridlines : {color:'#DEDECE', count : 3}, ticks : [0,1,2,3,4]}",
    series = "{0:{color:'BDBD9D', visibleInLegend :false},
      20:{color:'009933', visibleInLegend :true},
      31:{color:'996600', visibleInLegend :true},
      34:{color:'3333FF', visibleInLegend :true}}"))

To display our visualization in a new browser, we use the generic R plot() function:

plot(line)

How it works…

The arguments passed to the gvisLineChart() function are exactly the same as those discussed under the simple line plot, with some minor changes. We would like to plot the entire data for this exercise, and hence we have to state all the column names in yvar = c(). Also, we would like to color all the series with the same color but highlight Mexico, Turkey, and the OECD average. We achieve this in the previous code using the series option, where we further specify and customize colors and legend visibility for specific countries.

In this particular plot, we have made use of the same color for all the economies but have highlighted Mexico and Turkey to signify the development and growth that took place in the 5-year period. It would also be effective if our audience could compare the OECD average with Mexico and Turkey; this provides the audience with a benchmark they can compare against. If we plot all the legends, it may make the plot too crowded, and 34 legends may not make a very attractive plot. We avoid this by making only specific legends visible.

See also

D3 is a great tool to develop interactive visualizations and can be accessed at http://d3js.org/.

Processing is open source software developed at MIT and can be downloaded from https://processing.org/.

A good resource to pick colors and use them in our plots is http://www.w3schools.com/tags/ref_colorpicker.asp.
I have used New York Times infographics as an inspiration for this plot. You can find a collection of visualizations put out by New York Times in 2011 at http://www.smallmeans.com/new-york-times-infographics/.

Merging histograms

Histograms help in studying the underlying distribution. They are even more useful when we are trying to compare more than one histogram on the same plot; this provides us with greater insight into the skewness and the overall distribution. In this recipe, we will study how to plot a histogram using the googleVis package and how to merge more than one histogram on the same page. We will only merge two plots, but we can merge more plots and adjust the width of each plot. This makes it easier to compare all the plots on the same page. The following plot shows two merged histograms:

How to do it…

In order to generate a histogram, we will install the googleVis package as well as load the same in R:

install.packages("googleVis")
library(googleVis)

We have downloaded the prices of two different stocks and have calculated their daily returns over the entire period. We can load the data in R using the read.csv() function. Our main aim in this recipe is to plot two different histograms and place them side by side in a browser. Hence, we need to divide our data into three different data frames. For the purpose of this recipe, we will plot the aapl and msft data frames:

stk = read.csv("stock_cor.csv", header = TRUE, sep = ",")
aapl = data.frame(stk$AAPL)
msft = data.frame(stk$MSFT)
googl = data.frame(stk$GOOGL)

To generate the histograms, we implement the gvisHistogram() function:

al = gvisHistogram(aapl, options = list(histogram = "{bucketSize : 1}",
  legend = "none", title = "Distribution of AAPL Returns", width = 500,
  hAxis = "{showTextEvery: 5, title: 'Returns'}",
  vAxis = "{gridlines : {count:4}, title : 'Frequency'}"))

mft = gvisHistogram(msft, options = list(histogram = "{bucketSize : 1}",
  legend = "none", title = "Distribution of MSFT Returns", width = 500,
  hAxis = "{showTextEvery: 5, title: 'Returns'}",
  vAxis = "{gridlines : {count:4}, title : 'Frequency'}"))

We combine the two gvis objects in one browser using the gvisMerge() function:

mrg = gvisMerge(al, mft, horizontal = TRUE)
plot(mrg)

How it works…

The data.frame() function is used to construct a data frame in R. We require this step as we do not want to plot all three histograms on the same plot. Note the use of the $ notation in the data.frame() function.

The first argument in the gvisHistogram() function is our data stored as a data frame. We can display the individual histograms using the plot(al) and plot(mft) functions, but in this recipe, we will plot the final merged output. We observe that most of the attributes of the histogram function are the same as those discussed in previous recipes. The histogram functionality uses an algorithm to create buckets, but we can control this using bucketSize, as in histogram = "{bucketSize : 1}". Try using different bucket sizes and observe how the buckets in the histograms change. More options related to histograms can be found at the following link under the Controlling Buckets section:

https://developers.google.com/chart/interactive/docs/gallery/histogram#Buckets

We have utilized showTextEvery, which is also very specific to histograms. This option allows us to specify how many horizontal axis labels we would like to show. We have used 5 to make the histogram more compact.
Our main objective is to observe the distribution, and the plot serves our purpose. Finally, we implement plot() to display the chart in our favorite browser. We follow the same steps to plot the return distribution of Microsoft (MSFT). Now, we would like to place both plots side by side and view the differences in the distributions. We use the gvisMerge() function to generate the histograms side by side. In our recipe, we have two plots, for AAPL and MSFT. The default setting plots each chart vertically, but we can specify horizontal = TRUE to plot the charts horizontally.

Making an interactive bubble plot

My first encounter with a bubble plot was while watching a TED video of Hans Rosling. The video led me to search for ways of creating bubble plots in R; a very good introduction to this is available on the Flowing Data website. The advantage of a bubble plot is that it allows us to visualize a third variable, which in our case is the size of the bubble. In this recipe, I have made use of the googleVis package to plot a bubble plot, but you can also implement this in base R. The advantage of the Google Chart API is the interactivity and the ease with which the charts can be attached to a web page. Also note that we could use squares instead of circles, but this is not implemented in the Google Chart API yet. In order to implement a bubble plot, I have downloaded the crime dataset by state. The details regarding the link and definition of the crime data are available in the crime.txt file and are shown in the following screenshot:

How to do it…

As with all the plots in this article, we will install and load the googleVis package. We will also import our data file in R using the read.csv() function:

crm = read.csv("crimeusa.csv", header = TRUE, sep = ",")

We can construct our bubble chart using the gvisBubbleChart() function in R:

bub1 = gvisBubbleChart(crm, idvar = "States", xvar = "Robbery", yvar = "Burglary",
  sizevar = "Population", colorvar = "Year",
  options = list(legend = "none", width = 900, height = 600,
    title = "Crime per State in 2012",
    sizeAxis = "{maxSize : 40, minSize : 0.5}",
    vAxis = "{title : 'Burglary'}", hAxis = "{title : 'Robbery'}"))

bub2 = gvisBubbleChart(crm, idvar = "States", xvar = "Robbery", yvar = "Burglary",
  sizevar = "Population",
  options = list(legend = "none", width = 900, height = 600,
    title = "Crime per State in 2012",
    sizeAxis = "{maxSize : 40, minSize : 0.5}",
    vAxis = "{title : 'Burglary'}", hAxis = "{title : 'Robbery'}"))

How it works…

The gvisBubbleChart() function uses six attributes to create a bubble chart, which are as follows:

data: This is the data defined as a data frame, in our example, crm
idvar: This is the vector that is used to assign IDs to the bubbles, in our example, States
xvar: This is the column in the data to plot on the x-axis, in our example, Robbery
yvar: This is the column in the data to plot on the y-axis, in our example, Burglary
sizevar: This is the column used to define the size of the bubble
colorvar: This is the column used to define the color

We can define the minimum and maximum sizes of each bubble using minSize and maxSize, respectively, under options(). Note that we have used gvisMerge to portray the differences between the two bubble plots. In the plot on the right, we have not made use of colorvar, and hence all the bubbles are of the same color.

There's more…

The Google Chart API makes it easier for us to plot a bubble chart, but the same can be achieved using the basic R plot functions. We can make use of the symbols() function to create such a plot; a minimal base-R sketch is given at the end of this article.
The symbols need not be a bubble; it can be a square as well. By this time, you should have watched Hans' TED lecture and would be wondering how you could create a motion chart with bubbles floating around. The Google Charts API has the ability to create motion charts and the readers can definitely use the googleVis reference manual to learn about this. See also TED video by Hans Rosling can be accessed at http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen The Flowing Data website generates bubble charts using the basic R plot function and can be accessed at http://flowingdata.com/2010/11/23/how-to-make-bubble-charts/ Animated Bubble Chart by New York Times can be accessed at http://2010games.nytimes.com/medals/map.html Summary This article introduces some of the basic R plots, such as line and bar charts. It also discusses the basic elements of interactive plots using the googleVis package in R. This article is a great resource for understanding the basic R plotting techniques. Resources for Article: Further resources on this subject: Using R for Statistics, Research, and Graphics [article] Data visualization [article] Visualization as a Tool to Understand Data [article]
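As a follow-up to the There's more… note above, here is a minimal base-R sketch of the symbols() approach. The column names (States, Robbery, Burglary, and Population) mirror the crime recipe, but the few rows of data below are made up purely for illustration, so treat this as a sketch rather than a reproduction of the actual crime dataset:

# hypothetical stand-in for the crimeusa.csv data used in the recipe
crm = data.frame(States = c("StateA", "StateB", "StateC"),
                 Robbery = c(120, 300, 80),
                 Burglary = c(400, 900, 250),
                 Population = c(5e6, 12e6, 2e6))
# scale the circles by the square root of population so that the area of
# each bubble, rather than its radius, reflects the third variable
symbols(crm$Robbery, crm$Burglary,
        circles = sqrt(crm$Population / pi), inches = 0.35,
        fg = "white", bg = "#e34a33",
        xlab = "Robbery", ylab = "Burglary", main = "Crime per State")
# label each bubble with the state name
text(crm$Robbery, crm$Burglary, labels = crm$States, cex = 0.7)

Replacing circles with squares = sqrt(crm$Population) would draw squares instead, which is the variation mentioned in the recipe.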
Warming Up

Packt
06 Feb 2015
11 min read
In this article by Bater Makhabel, author of Learning Data Mining with R, you will learn basic data mining terms such as data definition, preprocessing, and so on. (For more resources related to this topic, see here.) The most important data mining algorithms will be illustrated with R to help you grasp the principles quickly, including but not limited to, classification, clustering, and outlier detection. Before diving right into data mining, let's have a look at the topics we'll cover: Data mining Social network mining In the history of humankind, the results of data from every aspect is extensive, for example websites, social networks by user's e-mail or name or account, search terms, locations on map, companies, IP addresses, books, films, music, and products. Data mining techniques can be applied to any kind of old or emerging data; each data type can be best dealt with using certain, but not all, techniques. In other words, the data mining techniques are constrained by data type, size of the dataset, context of the tasks applied, and so on. Every dataset has its own appropriate data mining solutions. New data mining techniques always need to be researched along with new data types once the old techniques cannot be applied to it or if the new data type cannot be transformed onto the traditional data types. The evolution of stream mining algorithms applied to Twitter's huge source set is one typical example. The graph mining algorithms developed for social networks is another example. The most popular and basic forms of data are from databases, data warehouses, ordered/sequence data, graph data, text data, and so on. In other words, they are federated data, high dimensional data, longitudinal data, streaming data, web data, numeric, categorical, or text data. Big data Big data is large amount of data that does not fit in the memory of a single machine. In other words, the size of data itself becomes a part of the issue when studying it. Besides volume, two other major characteristics of big data are variety and velocity; these are the famous three Vs of big data. Velocity means data process rate or how fast the data is being processed. Variety denotes various data source types. Noises arise more frequently in big data source sets and affect the mining results, which require efficient data preprocessing algorithms. As a result, distributed filesystems are used as tools for successful implementation of parallel algorithms on large amounts of data; it is a certainty that we will get even more data with each passing second. Data analytics and visualization techniques are the primary factors of the data mining tasks related to massive data. Some data types that are important to big data are as follows: The data from the camera video, which includes more metadata for analysis to expedite crime investigations, enhanced retail analysis, military intelligence, and so on. The second data type is from embedded sensors, such as medical sensors, to monitor any potential outbreaks of virus. The third data type is from entertainment, information freely published through social media by anyone. The last data type is consumer images, aggregated from social media, and tagging on these like images are important. Here is a table illustrating the history of data size growth. It shows that information will be more than double every two years, changing the way researchers or companies manage and extract value through data mining techniques from data, revealing new data mining studies. 
Year        Data size    Comments
N/A                      1 MB (megabyte) = 2^20. The human brain holds about 200 MB of information.
N/A                      1 PB (petabyte) = 2^50. It is similar to the size of three years' observation data of Earth by NASA and is equivalent to 70.8 times the books in America's Library of Congress.
1999        1 EB         1 EB (exabyte) = 2^60. The world produced 1.5 EB of unique information.
2007        281 EB       The world produced about 281 exabytes of unique information.
2011        1.8 ZB       1 ZB (zettabyte) = 2^70. This is all the data gathered by human beings in 2011.
Very soon                1 YB (yottabyte) = 2^80.

Scalability and efficiency

Efficiency, scalability, performance, optimization, and the ability to perform in real time are important issues for almost any algorithm, and it is the same for data mining. These are always necessary metrics or benchmark factors of data mining algorithms. As the amount of data continues to grow, keeping data mining algorithms effective and scalable is necessary to extract information effectively from massive datasets in many data repositories or data streams. The shift of data storage from a single machine to wide distribution, the huge size of many datasets, and the computational complexity of the data mining methods are all factors that drive the development of parallel and distributed data-intensive mining algorithms.

Data source

Data serves as the input for the data mining system, and so data repositories are important. In an enterprise environment, databases and logfiles are common sources. In web data mining, web pages are the source of data. Data continuously fetched from various sensors is also a typical data source. Here are some free online data sources that are particularly helpful for learning about data mining:

Frequent Itemset Mining Dataset Repository: A repository with datasets for methods that find frequent itemsets (http://fimi.ua.ac.be/data/).

UCI Machine Learning Repository: This is a collection of datasets, suitable for classification tasks (http://archive.ics.uci.edu/ml/).

The Data and Story Library at StatLib: DASL (pronounced "dazzle") is an online library of data files and stories that illustrate the use of basic statistical methods. It aims to provide data on a wide variety of topics so that statistics teachers can find real-world examples that will be interesting to their students, and it offers a powerful search engine to locate the story or data file of interest (http://lib.stat.cmu.edu/DASL/).

WordNet: This is a lexical database for English (http://wordnet.princeton.edu).

Data mining

Data mining is the discovery of a model in data; it is also called exploratory data analysis, and it discovers useful, valid, unexpected, and understandable knowledge from the data. Some goals are shared with other sciences, such as statistics, artificial intelligence, machine learning, and pattern recognition. Data mining has frequently been treated as an algorithmic problem in most cases. Clustering, classification, association rule learning, anomaly detection, regression, and summarization are all tasks belonging to data mining.

The data mining methods can be summarized into two main categories of data mining problems: feature extraction and summarization.

Feature extraction

This is to extract the most prominent features of the data and ignore the rest. Here are some examples:

Frequent itemsets: This model makes sense for data that consists of baskets of small sets of items.
Similar items: Sometimes your data looks like a collection of sets and the objective is to find pairs of sets that have a relatively large fraction of their elements in common. It's a fundamental problem of data mining. Summarization The target is to summarize the dataset succinctly and approximately, such as clustering, which is the process of examining a collection of points (data) and grouping the points into clusters according to some measure. The goal is that points in the same cluster have a small distance from one another, while points in different clusters are at a large distance from one another. The data mining process There are two popular processes to define the data mining process in different perspectives, and the more widely adopted one is CRISP-DM: Cross-Industry Standard Process for Data Mining (CRISP-DM) Sample, Explore, Modify, Model, Assess (SEMMA), which was developed by the SAS Institute, USA CRISP-DM There are six phases in this process that are shown in the following figure; it is not rigid, but often has a great deal of backtracking: Let's look at the phases in detail: Business understanding: This task includes determining business objectives, assessing the current situation, establishing data mining goals, and developing a plan. Data understanding: This task evaluates data requirements and includes initial data collection, data description, data exploration, and the verification of data quality. Data preparation: Once available, data resources are identified in the last step. Then, the data needs to be selected, cleaned, and then built into the desired form and format. Modeling: Visualization and cluster analysis are useful for initial analysis. The initial association rules can be developed by applying tools such as generalized rule induction. This is a data mining technique to discover knowledge represented as rules to illustrate the data in the view of causal relationship between conditional factors and a given decision/outcome. The models appropriate to the data type can also be applied. Evaluation :The results should be evaluated in the context specified by the business objectives in the first step. This leads to the identification of new needs and in turn reverts to the prior phases in most cases. Deployment: Data mining can be used to both verify previously held hypotheses or for knowledge. SEMMA Here is an overview of the process for SEMMA: Let's look at these processes in detail: Sample: In this step, a portion of a large dataset is extracted Explore: To gain a better understanding of the dataset, unanticipated trends and anomalies are searched in this step Modify: The variables are created, selected, and transformed to focus on the model construction process Model: A variable combination of models is searched to predict a desired outcome Assess: The findings from the data mining process are evaluated by its usefulness and reliability Social network mining As we mentioned before, data mining finds a model on data and the mining of social network finds the model on graph data in which the social network is represented. Social network mining is one application of web data mining; the popular applications are social sciences and bibliometry, PageRank and HITS, shortcomings of the coarse-grained graph model, enhanced models and techniques, evaluation of topic distillation, and measuring and modeling the Web. Social network When it comes to the discussion of social networks, you will think of Facebook, Google+, LinkedIn, and so on. 
The essential characteristics of a social network are as follows: There is a collection of entities that participate in the network. Typically, these entities are people, but they could be something else entirely. There is at least one relationship between the entities of the network. On Facebook, this relationship is called friends. Sometimes, the relationship is all-or-nothing; two people are either friends or they are not. However, in other examples of social networks, the relationship has a degree. This degree could be discrete, for example, friends, family, acquaintances, or none as in Google+. It could be a real number; an example would be the fraction of the average day that two people spend talking to each other. There is an assumption of nonrandomness or locality. This condition is the hardest to formalize, but the intuition is that relationships tend to cluster. That is, if entity A is related to both B and C, then there is a higher probability than average that B and C are related. Here are some varieties of social networks: Telephone networks: The nodes in this network are phone numbers and represent individuals E-mail networks: The nodes represent e-mail addresses, which represent individuals Collaboration networks: The nodes here represent individuals who published research papers; the edge connecting two nodes represent two individuals who published one or more papers jointly Social networks are modeled as undirected graphs. The entities are the nodes, and an edge connects two nodes if the nodes are related by the relationship that characterizes the network. If there is a degree associated with the relationship, this degree is represented by labeling the edges. Here is an example in which Coleman's High School Friendship Data from the sna R package is used for analysis. The data is from a research on friendship ties between 73 boys in a high school in one chosen academic year; reported ties for all informants are provided for two time points (fall and spring). The dataset's name is coleman, which is an array type in R language. The node denotes a specific student and the line represents the tie between two students. Summary The book has, as showcased in this article, a lot more interesting coverage with regard to data mining and R. Deep diving into the algorithms associated with data mining and efficient methods to implement them using R. Resources for Article: Further resources on this subject: Multiplying Performance with Parallel Computing [article] Supervised learning [article] Using R for Statistics, Research, and Graphics [article]
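Since the coleman friendship data mentioned above ships with the sna package, a few lines of R are enough to load and draw it. This is a minimal sketch that assumes the sna package is installed and exposes the coleman dataset and the gplot() function as described in the text:

# load the sna package and Coleman's high-school friendship data
library(sna)
data(coleman)          # a 2 x 73 x 73 array: fall and spring adjacency matrices
fall = coleman[1, , ]  # ties reported in the fall
# draw the fall friendship network; isolated boys are hidden to reduce clutter
gplot(fall, gmode = "digraph", displayisolates = FALSE,
      vertex.col = "grey30", edge.col = "grey70")

Each node is a student and each edge is a reported friendship tie, which matches the description of the dataset given earlier in this section.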
PostgreSQL Cookbook - High Availability and Replication

Packt
06 Feb 2015
26 min read
In this article by Chitij Chauhan, author of the book PostgreSQL Cookbook, we will talk about various high availability and replication solutions, including some popular third-party replication tools such as Slony-I and Londiste. In this article, we will cover the following recipes: Setting up hot streaming replication Replication using Slony-I Replication using Londiste The important components for any production database is to achieve fault tolerance, 24/7 availability, and redundancy. It is for this purpose that we have different high availability and replication solutions available for PostgreSQL. From a business perspective, it is important to ensure 24/7 data availability in the event of a disaster situation or a database crash due to disk or hardware failure. In such situations, it becomes critical to ensure that a duplicate copy of the data is available on a different server or a different database, so that seamless failover can be achieved even when the primary server/database is unavailable. Setting up hot streaming replication In this recipe, we are going to set up a master-slave streaming replication. Getting ready For this exercise, you will need two Linux machines, each with the latest version of PostgreSQL installed. We will be using the following IP addresses for the master and slave servers: Master IP address: 192.168.0.4 Slave IP address: 192.168.0.5 Before you start with the master-slave streaming setup, it is important that the SSH connectivity between the master and slave is setup. How to do it... Perform the following sequence of steps to set up a master-slave streaming replication: First, we are going to create a user on the master, which will be used by the slave server to connect to the PostgreSQL database on the master server: psql -c "CREATE USER repuser REPLICATION LOGIN ENCRYPTED PASSWORD 'charlie';" Next, we will allow the replication user that was created in the previous step to allow access to the master PostgreSQL server. This is done by making the necessary changes as mentioned in the pg_hba.conf file: Vi pg_hba.conf host   replication   repuser   192.168.0.5/32   md5 In the next step, we are going to configure parameters in the postgresql.conf file. These parameters need to be set in order to get the streaming replication working: Vi /var/lib/pgsql/9.3/data/postgresql.conf listen_addresses = '*' wal_level = hot_standby max_wal_senders = 3 wal_keep_segments = 8 archive_mode = on       archive_command = 'cp %p /var/lib/pgsql/archive/%f && scp %p postgres@192.168.0.5:/var/lib/pgsql/archive/%f' checkpoint_segments = 8 Once the parameter changes have been made in the postgresql.conf file in the previous step, the next step will be to restart the PostgreSQL server on the master server, in order to let the changes take effect: pg_ctl -D /var/lib/pgsql/9.3/data restart Before the slave can replicate the master, we will need to give it the initial database to build off. For this purpose, we will make a base backup by copying the primary server's data directory to the standby. 
The rsync command needs to be run as the root user:

psql -U postgres -h 192.168.0.4 -c "SELECT pg_start_backup('label', true)"
rsync -a /var/lib/pgsql/9.3/data/ 192.168.0.5:/var/lib/pgsql/9.3/data/ --exclude postmaster.pid
psql -U postgres -h 192.168.0.4 -c "SELECT pg_stop_backup()"

Once the data directory, mentioned in the previous step, is populated, the next step is to enable the following parameter in the postgresql.conf file on the slave server:

hot_standby = on

The next step will be to copy the recovery.conf.sample file to the $PGDATA location on the slave server and then configure the following parameters:

cp /usr/pgsql-9.3/share/recovery.conf.sample /var/lib/pgsql/9.3/data/recovery.conf

standby_mode = on
primary_conninfo = 'host=192.168.0.4 port=5432 user=repuser password=charlie'
trigger_file = '/tmp/trigger.replication'
restore_command = 'cp /var/lib/pgsql/archive/%f "%p"'

The next step will be to start the slave server:

service postgresql-9.3 start

Now that the aforementioned replication steps are set up, we will test for replication. On the master server, log in and issue the following SQL commands:

psql -h 192.168.0.4 -d postgres -U postgres -W

postgres=# create database test;

postgres=# \c test

test=# create table testtable ( testint int, testchar varchar(40) );
CREATE TABLE

test=# insert into testtable values ( 1, 'What A Sight.' );
INSERT 0 1

On the slave server, we will now check whether the newly created database and the corresponding table, created in the previous step, have been replicated:

psql -h 192.168.0.5 -d test -U postgres -W

test=# select * from testtable;
 testint |   testchar
---------+---------------------------
       1 | What A Sight.
(1 row)

How it works...

The following is the explanation for the steps performed in the preceding section.

In the initial step of the preceding section, we create a user called repuser, which will be used by the slave server to make a connection to the primary server. In the second step of the preceding section, we make the necessary changes in the pg_hba.conf file to allow the master server to be accessed by the slave server using the repuser user ID that was created in step 1. We then make the necessary parameter changes on the master in step 3 of the preceding section to configure streaming replication. The following is a description of these parameters:

listen_addresses: This parameter is used to provide the IP address associated with the interface that you want to have PostgreSQL listen to. A value of * indicates all available IP addresses.

wal_level: This parameter determines the level of WAL logging done. Specify hot_standby for streaming replication.

wal_keep_segments: This parameter specifies the number of 16 MB WAL files to be retained in the pg_xlog directory. The rule of thumb is that more such files might be required to handle a large checkpoint.

archive_mode: Setting this parameter enables completed WAL segments to be sent to the archive storage.

archive_command: This parameter is basically a shell command that is executed whenever a WAL segment is completed. In our case, we are copying the file to the local machine and then using the secure copy command to send it across to the slave.

max_wal_senders: This parameter specifies the total number of concurrent connections allowed from the slave servers.

checkpoint_segments: This parameter specifies the maximum number of logfile segments between automatic WAL checkpoints.
Once the necessary configuration changes have been made on the master server, we then restart the PostgreSQL server on the master in order to let the new configuration changes take effect. This is done in step 4 of the preceding section. In step 5 of the preceding section, we are basically building the slave by copying the primary server's data directory to the slave. Now, with the data directory available on the slave, the next step is to configure it. We will now make the necessary parameter replication related parameter changes on the slave in the postgresql.conf directory on the slave server. We set the following parameters on the slave: hot_standby: This parameter determines whether you can connect and run queries when the server is in the archive recovery or standby mode. In the next step, we are configuring the recovery.conf file. This is required to be set up so that the slave can start receiving logs from the master. The parameters explained next are configured in the recovery.conf file on the slave. standby_mode: This parameter, when enabled, causes PostgreSQL to work as a standby in a replication configuration. primary_conninfo: This parameter specifies the connection information used by the slave to connect to the master. For our scenario, our master server is set as 192.168.0.4 on port 5432 and we are using the repuser userid with the password charlie to make a connection to the master. Remember that repuser was the userid which was created in the initial step of the preceding section for this purpose, that is, connecting to the master from the slave. trigger_file: When a slave is configured as a standby, it will continue to restore the XLOG records from the master. The trigger_file parameter specifies what is used to trigger a slave, in order to switch over its duties from standby and take over as master or primary server. At this stage, the slave is fully configured now and we can start the slave server; then, the replication process begins. This is shown in step 8 of the preceding section. In steps 9 and 10 of the preceding section, we are simply testing our replication. We first begin by creating a test database, then we log in to the test database and create a table by the name testtable, and then we begin inserting some records into the testtable table. Now, our purpose is to see whether these changes are replicated across the slave. To test this, we log in to the slave on the test database and then query the records from the testtable table, as seen in step 10 of the preceding section. The final result that we see is that all the records that are changed/inserted on the primary server are visible on the slave. This completes our streaming replication's setup and configuration. Replication using Slony-I Here, we are going to set up replication using Slony-I. We will be setting up the replication of table data between two databases on the same server. Getting ready The steps performed in this recipe are carried out on a CentOS Version 6 machine. It is also important to remove the directives related to hot streaming replication prior to setting up replication using Slony-I. We will first need to install Slony-I. The following steps need to be performed in order to install Slony-I: First, go to http://slony.info/downloads/2.2/source/ and download the given software. Once you have downloaded the Slony-I software, the next step is to unzip the .tar file and then go the newly created directory. 
Before doing this, please ensure that you have the postgresql-devel package for the corresponding PostgreSQL version installed before you install Slony-I: tar xvfj slony1-2.2.3.tar.bz2  cd slony1-2.2.3 In the next step, we are going to configure, compile, and build the software: ./configure --with-pgconfigdir=/usr/pgsql-9.3/bin/ make make install How to do it... You need to perform the following sequence of steps, in order to replicate data between two tables using Slony-I replication: First, start the PostgreSQL server if you have not already started it: pg_ctl -D $PGDATA start In the next step, we will be creating two databases, test1 and test2, which will be used as the source and target databases respectively: createdb test1 createdb test2 In the next step, we will create the t_test table on the source database, test1, and insert some records into it: psql -d test1 test1=# create table t_test (id numeric primary key, name varchar);   test1=# insert into t_test values(1,'A'),(2,'B'), (3,'C'); We will now set up the target database by copying the table definitions from the test1 source database: pg_dump -s -p 5432 -h localhost test1 | psql -h localhost -p 5432 test2 We will now connect to the target database, test2, and verify that there is no data in the tables of the test2 database: psql -d test2 test2=# select * from t_test; We will now set up a slonik script for the master-slave, that is source/target, setup. In this scenario, since we are replicating between two different databases on the same server, the only different connection string option will be the database name: cd /usr/pgsql-9.3/bin vi init_master.slonik   #!/bin/sh cluster name = mycluster; node 1 admin conninfo = 'dbname=test1 host=localhost port=5432 user=postgres password=postgres'; node 2 admin conninfo = 'dbname=test2 host=localhost port=5432 user=postgres password=postgres'; init cluster ( id=1); create set (id=1, origin=1); set add table(set id=1, origin=1, id=1, fully qualified name = 'public.t_test'); store node (id=2, event node = 1); store path (server=1, client=2, conninfo='dbname=test1 host=localhost port=5432 user=postgres password=postgres'); store path (server=2, client=1, conninfo='dbname=test2 host=localhost port=5432 user=postgres password=postgres'); store listen (origin=1, provider = 1, receiver = 2);  store listen (origin=2, provider = 2, receiver = 1); We will now create a slonik script for subscription to the slave, that is, target: cd /usr/pgsql-9.3/bin vi init_slave.slonik #!/bin/sh cluster name = mycluster; node 1 admin conninfo = 'dbname=test1 host=localhost port=5432 user=postgres password=postgres'; node 2 admin conninfo = 'dbname=test2 host=localhost port=5432 user=postgres password=postgres'; subscribe set ( id = 1, provider = 1, receiver = 2, forward = no); We will now run the init_master.slonik script created in step 6 and run this on the master, as follows: cd /usr/pgsql-9.3/bin   slonik init_master.slonik We will now run the init_slave.slonik script created in step 7 and run this on the slave, that is, target: cd /usr/pgsql-9.3/bin   slonik init_slave.slonik In the next step, we will start the master slon daemon: nohup slon mycluster "dbname=test1 host=localhost port=5432 user=postgres password=postgres" & In the next step, we will start the slave slon daemon: nohup slon mycluster "dbname=test2 host=localhost port=5432 user=postgres password=postgres" & Next, we will connect to the master, that is, the test1 source database, and insert some records in the t_test table: psql -d test1 
test1=# insert into t_test values (5,'E'); We will now test for the replication by logging on to the slave, that is, the test2 target database, and see whether the inserted records in the t_test table are visible: psql -d test2   test2=# select * from t_test; id | name ----+------ 1 | A 2 | B 3 | C 5 | E (4 rows) How it works... We will now discuss the steps performed in the preceding section: In step 1, we first start the PostgreSQL server if it is not already started. In step 2, we create two databases, namely test1 and test2, that will serve as our source (master) and target (slave) databases. In step 3, we log in to the test1 source database, create a t_test table, and insert some records into the table. In step 4, we set up the target database, test2, by copying the table definitions present in the source database and loading them into test2 using the pg_dump utility. In step 5, we log in to the target database, test2, and verify that there are no records present in the t_test table because in step 4, we only extracted the table definitions into the test2 database from the test1 database. In step 6, we set up a slonik script for the master-slave replication setup. In the init_master.slonik file, we first define the cluster name as mycluster. We then define the nodes in the cluster. Each node will have a number associated with a connection string, which contains database connection information. The node entry is defined both for the source and target databases. The store_path commands are necessary, so that each node knows how to communicate with the other. In step 7, we set up a slonik script for the subscription of the slave, that is, the test2 target database. Once again, the script contains information such as the cluster name and the node entries that are designated a unique number related to connection string information. It also contains a subscriber set. In step 8, we run the init_master.slonik file on the master. Similarly, in step 9, we run the init_slave.slonik file on the slave. In step 10, we start the master slon daemon. In step 11, we start the slave slon daemon. The subsequent steps, 12 and 13, are used to test for replication. For this purpose, in step 12 of the preceding section, we first log in to the test1 source database and insert some records into the t_test table. To check whether the newly inserted records have been replicated in the target database, test2, we log in to the test2 database in step 13. The result set obtained from the output of the query confirms that the changed/inserted records on the t_test table in the test1 database are successfully replicated across the target database, test2. For more information on Slony-I replication, go to http://slony.info/documentation/tutorial.html. There's more... If you are using Slony-I for replication between two different servers, in addition to the steps mentioned in the How to do it… section, you will also have to enable authentication information in the pg_hba.conf file existing on both the source and target servers. For example, let's assume that the source server's IP is 192.168.16.44 and the target server's IP is 192.168.16.56 and we are using a user named super to replicate the data. 
If this is the situation, then in the source server's pg_hba.conf file, we will have to enter the information, as follows: host         postgres         super     192.168.16.44/32           md5 Similarly, in the target server's pg_hba.conf file, we will have to enter the authentication information, as follows: host         postgres         super     192.168.16.56/32           md5 Also, in the shell scripts that were used for Slony-I, wherever the connection information for the host is localhost that entry will need to be replaced by the source and target server's IP addresses. Replication using Londiste In this recipe, we are going to show you how to replicate data using Londiste. Getting ready For this setup, we are using the same host CentOS Linux machine to replicate data between two databases. This can also be set up using two separate Linux machines running on VMware, VirtualBox, or any other virtualization software. It is assumed that the latest version of PostgreSQL, version 9.3, is installed. We used CentOS Version 6 as the Linux operating system for this exercise. To set up Londiste replication on the Linux machine, perform the following steps: Go to http://pgfoundry.org/projects/skytools/ and download the latest version of Skytools 3.2, that is, tarball skytools-3.2.tar.gz. Extract the tarball file, as follows: tar -xvzf skytools-3.2.tar.gz Go to the new location and build and compile the software: cd skytools-3.2 ./configure --prefix=/var/lib/pgsql/9.3/Sky –with-pgconfig=/usr/pgsql-9.3/bin/pg_config   make   make install Also, set the PYTHONPATH environment variable, as shown here. Alternatively, you can also set it in the .bash_profile script: export PYTHONPATH=/opt/PostgreSQL/9.2/Sky/lib64/python2.6/site-packages/ How to do it... We are going to perform the following sequence of steps to set up replication between two different databases using Londiste. First, create the two databases between which replication has to occur: createdb node1 createdb node2 Populate the node1 database with data using the pgbench utility: pgbench -i -s 2 -F 80 node1 Add any primary key and foreign keys to the tables in the node1 database that are needed for replication. Create the following .sql file and add the following lines to it: Vi /tmp/prepare_pgbenchdb_for_londiste.sql -- add primary key to history table ALTER TABLE pgbench_history ADD COLUMN hid SERIAL PRIMARY KEY;   -- add foreign keys ALTER TABLE pgbench_tellers ADD CONSTRAINT pgbench_tellers_branches_fk FOREIGN KEY(bid) REFERENCES pgbench_branches; ALTER TABLE pgbench_accounts ADD CONSTRAINT pgbench_accounts_branches_fk FOREIGN KEY(bid) REFERENCES pgbench_branches; ALTER TABLE pgbench_history ADD CONSTRAINT pgbench_history_branches_fk FOREIGN KEY(bid) REFERENCES pgbench_branches; ALTER TABLE pgbench_history ADD CONSTRAINT pgbench_history_tellers_fk FOREIGN KEY(tid) REFERENCES pgbench_tellers; ALTER TABLE pgbench_history ADD CONSTRAINT pgbench_history_accounts_fk FOREIGN KEY(aid) REFERENCES pgbench_accounts; We will now load the .sql file created in the previous step and load it into the database: psql node1 -f /tmp/prepare_pgbenchdb_for_londiste.sql We will now populate the node2 database with table definitions from the tables in the node1 database: pg_dump -s -t 'pgbench*' node1 > /tmp/tables.sql psql -f /tmp/tables.sql node2 Now starts the process of replication. 
We will first create the londiste.ini configuration file with the following parameters in order to set up the root node for the source database, node1: Vi londiste.ini   [londiste3] job_name = first_table db = dbname=node1 queue_name = replication_queue logfile = /home/postgres/log/londiste.log pidfile = /home/postgres/pid/londiste.pid In the next step, we are going to use the londiste.ini configuration file created in the previous step to set up the root node for the node1 database, as shown here: [postgres@localhost bin]$ ./londiste3 londiste3.ini create-root node1 dbname=node1   2014-12-09 18:54:34,723 2335 WARNING No host= in public connect string, bad idea 2014-12-09 18:54:35,210 2335 INFO plpgsql is installed 2014-12-09 18:54:35,217 2335 INFO pgq is installed 2014-12-09 18:54:35,225 2335 INFO pgq.get_batch_cursor is installed 2014-12-09 18:54:35,227 2335 INFO pgq_ext is installed 2014-12-09 18:54:35,228 2335 INFO pgq_node is installed 2014-12-09 18:54:35,230 2335 INFO londiste is installed 2014-12-09 18:54:35,232 2335 INFO londiste.global_add_table is installed 2014-12-09 18:54:35,281 2335 INFO Initializing node 2014-12-09 18:54:35,285 2335 INFO Location registered 2014-12-09 18:54:35,447 2335 INFO Node "node1" initialized for queue "replication_queue" with type "root" 2014-12-09 18:54:35,465 2335 INFO Don We will now run the worker daemon for the root node: [postgres@localhost bin]$ ./londiste3 londiste3.ini worker 2014-12-09 18:55:17,008 2342 INFO Consumer uptodate = 1 In the next step, we will create a slave.ini configuration file in order to make a leaf node for the node2 target database: Vi slave.ini [londiste3] job_name = first_table_slave db = dbname=node2 queue_name = replication_queue logfile = /home/postgres/log/londiste_slave.log pidfile = /home/postgres/pid/londiste_slave.pid We will now initialize the node in the target database: ./londiste3 slave.ini create-leaf node2 dbname=node2 –provider=dbname=node1 2014-12-09 18:57:22,769 2408 WARNING No host= in public connect string, bad idea 2014-12-09 18:57:22,778 2408 INFO plpgsql is installed 2014-12-09 18:57:22,778 2408 INFO Installing pgq 2014-12-09 18:57:22,778 2408 INFO   Reading from /var/lib/pgsql/9.3/Sky/share/skytools3/pgq.sql 2014-12-09 18:57:23,211 2408 INFO pgq.get_batch_cursor is installed 2014-12-09 18:57:23,212 2408 INFO Installing pgq_ext 2014-12-09 18:57:23,213 2408 INFO   Reading from /var/lib/pgsql/9.3/Sky/share/skytools3/pgq_ext.sql 2014-12-09 18:57:23,454 2408 INFO Installing pgq_node 2014-12-09 18:57:23,455 2408 INFO   Reading from /var/lib/pgsql/9.3/Sky/share/skytools3/pgq_node.sql 2014-12-09 18:57:23,729 2408 INFO Installing londiste 2014-12-09 18:57:23,730 2408 INFO   Reading from /var/lib/pgsql/9.3/Sky/share/skytools3/londiste.sql 2014-12-09 18:57:24,391 2408 INFO londiste.global_add_table is installed 2014-12-09 18:57:24,575 2408 INFO Initializing node 2014-12-09 18:57:24,705 2408 INFO Location registered 2014-12-09 18:57:24,715 2408 INFO Location registered 2014-12-09 18:57:24,744 2408 INFO Subscriber registered: node2 2014-12-09 18:57:24,748 2408 INFO Location registered 2014-12-09 18:57:24,750 2408 INFO Location registered 2014-12-09 18:57:24,757 2408 INFO Node "node2" initialized for queue "replication_queue" with type "leaf" 2014-12-09 18:57:24,761 2408 INFO Done We will now launch the worker daemon for the target database, that is, node2: [postgres@localhost bin]$ ./londiste3 slave.ini worker 2014-12-09 18:58:53,411 2423 INFO Consumer uptodate = 1 We will now create the configuration file, that 
is pgqd.ini, for the ticker daemon: vi pgqd.ini   [pgqd] logfile = /home/postgres/log/pgqd.log pidfile = /home/postgres/pid/pgqd.pid Using the configuration file created in the previous step, we will launch the ticker daemon: [postgres@localhost bin]$ ./pgqd pgqd.ini 2014-12-09 19:05:56.843 2542 LOG Starting pgqd 3.2 2014-12-09 19:05:56.844 2542 LOG auto-detecting dbs ... 2014-12-09 19:05:57.257 2542 LOG node1: pgq version ok: 3.2 2014-12-09 19:05:58.130 2542 LOG node2: pgq version ok: 3.2 We will now add all the tables to the replication on the root node: [postgres@localhost bin]$ ./londiste3 londiste3.ini add-table --all 2014-12-09 19:07:26,064 2614 INFO Table added: public.pgbench_accounts 2014-12-09 19:07:26,161 2614 INFO Table added: public.pgbench_branches 2014-12-09 19:07:26,238 2614 INFO Table added: public.pgbench_history 2014-12-09 19:07:26,287 2614 INFO Table added: public.pgbench_tellers Similarly, add all the tables to the replication on the leaf node: [postgres@localhost bin]$ ./londiste3 slave.ini add-table –all We will now generate some traffic on the node1 source database: pgbench -T 10 -c 5 node1 We will now use the compare utility available with the londiste3 command to check the tables in both the nodes; that is, both the source database (node1) and destination database (node2) have the same amount of data: [postgres@localhost bin]$ ./londiste3 slave.ini compare   2014-12-09 19:26:16,421 2982 INFO Checking if node1 can be used for copy 2014-12-09 19:26:16,424 2982 INFO Node node1 seems good source, using it 2014-12-09 19:26:16,425 2982 INFO public.pgbench_accounts: Using node node1 as provider 2014-12-09 19:26:16,441 2982 INFO Provider: node1 (root) 2014-12-09 19:26:16,446 2982 INFO Locking public.pgbench_accounts 2014-12-09 19:26:16,447 2982 INFO Syncing public.pgbench_accounts 2014-12-09 19:26:18,975 2982 INFO Counting public.pgbench_accounts 2014-12-09 19:26:19,401 2982 INFO srcdb: 200000 rows, checksum=167607238449 2014-12-09 19:26:19,706 2982 INFO dstdb: 200000 rows, checksum=167607238449 2014-12-09 19:26:19,715 2982 INFO Checking if node1 can be used for copy 2014-12-09 19:26:19,716 2982 INFO Node node1 seems good source, using it 2014-12-09 19:26:19,716 2982 INFO public.pgbench_branches: Using node node1 as provider 2014-12-09 19:26:19,730 2982 INFO Provider: node1 (root) 2014-12-09 19:26:19,734 2982 INFO Locking public.pgbench_branches 2014-12-09 19:26:19,734 2982 INFO Syncing public.pgbench_branches 2014-12-09 19:26:22,772 2982 INFO Counting public.pgbench_branches 2014-12-09 19:26:22,804 2982 INFO srcdb: 2 rows, checksum=-3078609798 2014-12-09 19:26:22,812 2982 INFO dstdb: 2 rows, checksum=-3078609798 2014-12-09 19:26:22,866 2982 INFO Checking if node1 can be used for copy 2014-12-09 19:26:22,877 2982 INFO Node node1 seems good source, using it 2014-12-09 19:26:22,878 2982 INFO public.pgbench_history: Using node node1 as provider 2014-12-09 19:26:22,919 2982 INFO Provider: node1 (root) 2014-12-09 19:26:22,931 2982 INFO Locking public.pgbench_history 2014-12-09 19:26:22,932 2982 INFO Syncing public.pgbench_history 2014-12-09 19:26:25,963 2982 INFO Counting public.pgbench_history 2014-12-09 19:26:26,008 2982 INFO srcdb: 715 rows, checksum=9467587272 2014-12-09 19:26:26,020 2982 INFO dstdb: 715 rows, checksum=9467587272 2014-12-09 19:26:26,056 2982 INFO Checking if node1 can be used for copy 2014-12-09 19:26:26,063 2982 INFO Node node1 seems good source, using it 2014-12-09 19:26:26,064 2982 INFO public.pgbench_tellers: Using node node1 as provider 2014-12-09 
19:26:26,100 2982 INFO Provider: node1 (root) 2014-12-09 19:26:26,108 2982 INFO Locking public.pgbench_tellers 2014-12-09 19:26:26,109 2982 INFO Syncing public.pgbench_tellers 2014-12-09 19:26:29,144 2982 INFO Counting public.pgbench_tellers 2014-12-09 19:26:29,176 2982 INFO srcdb: 20 rows, checksum=4814381032 2014-12-09 19:26:29,182 2982 INFO dstdb: 20 rows, checksum=4814381032 How it works... The following is an explanation of the steps performed in the preceding section: Initially, in step 1, we create two databases, that is node1 and node2, that are used as the source and target databases, respectively, from a replication perspective. In step 2, we populate the node1 database using the pgbench utility. In step 3 of the preceding section, we add and define the respective primary key and foreign key relationships on different tables and put these DDL commands in a .sql file. In step 4, we execute these DDL commands stated in step 3 on the node1 database; thus, in this way, we force the primary key and foreign key definitions on the tables in the pgbench schema in the node1 database. In step 5, we extract the table definitions from the tables in the pgbench schema in the node1 database and load these definitions in the node2 database. We will now discuss steps 6 to 8 of the preceding section. In step 6, we create the configuration file, which is then used in step 7 to create the root node for the node1 source database. In step 8, we will launch the worker daemon for the root node. Regarding the entries mentioned in the configuration file in step 6, we first define a job that must have a name, so that distinguished processes can be easily identified. Then, we define a connect string with information to connect to the source database, that is node1, and then we define the name of the replication queue involved. Finally, we define the location of the log and pid files. We will now discuss steps 9 to 11 of the preceding section. In step 9, we define the configuration file, which is then used in step 10 to create the leaf node for the target database, that is node2. In step 11, we launch the worker daemon for the leaf node. The entries in the configuration file in step 9 contain the job_name connect string in order to connect to the target database, that is node2, the name of the replication queue involved, and the location of log and pid involved. The key part in step 11 is played by the slave, that is the target database—to find the master or provider, that is source database node1. We will now talk about steps 12 and 13 of the preceding section. In step 12, we define the ticker configuration, with the help of which we launch the ticker process mentioned in step 13. Once the ticker daemon has started successfully, we have all the components and processes setup and needed for replication; however, we have not yet defined what the system needs to replicate. In step 14 and 15, we define the tables to the replication that is set on both the source and target databases, that is node1 and node2, respectively. Finally, we will talk about steps 16 and 17 of the preceding section. Here, at this stage, we are testing the replication that was set up between the node1 source database and the node2 target database. In step 16, we generate some traffic on the node1 source database by running pgbench with five parallel database connections and generating traffic for 10 seconds. In step 17, we check whether the tables on both the source and target databases have the same data. 
For this purpose, we use the compare command on the provider and subscriber nodes and then count and checksum the rows on both sides. A partial output from the preceding section tells you that the data has been successfully replicated between all the tables that are part of the replication set up between the node1 source database and the node2 destination database, as the count and checksum of rows for all the tables on the source and target destination databases are matching: 2014-12-09 19:26:18,975 2982 INFO Counting public.pgbench_accounts 2014-12-09 19:26:19,401 2982 INFO srcdb: 200000 rows, checksum=167607238449 2014-12-09 19:26:19,706 2982 INFO dstdb: 200000 rows, checksum=167607238449   2014-12-09 19:26:22,772 2982 INFO Counting public.pgbench_branches 2014-12-09 19:26:22,804 2982 INFO srcdb: 2 rows, checksum=-3078609798 2014-12-09 19:26:22,812 2982 INFO dstdb: 2 rows, checksum=-3078609798   2014-12-09 19:26:25,963 2982 INFO Counting public.pgbench_history 2014-12-09 19:26:26,008 2982 INFO srcdb: 715 rows, checksum=9467587272 2014-12-09 19:26:26,020 2982 INFO dstdb: 715 rows, checksum=9467587272   2014-12-09 19:26:29,144 2982 INFO Counting public.pgbench_tellers 2014-12-09 19:26:29,176 2982 INFO srcdb: 20 rows, checksum=4814381032 2014-12-09 19:26:29,182 2982 INFO dstdb: 20 rows, checksum=4814381032 Summary This article demonstrates the high availability and replication concepts in PostgreSQL. After reading this chapter, you will be able to implement high availability and replication options using different techniques including streaming replication, Slony-I replication and replication using Longdiste. Resources for Article: Further resources on this subject: Running a PostgreSQL Database Server [article] Securing the WAL Stream [article] Recursive queries [article]

Multiplying Performance with Parallel Computing

Packt
06 Feb 2015
22 min read
In this article, by Aloysius Lim and William Tjhi, authors of the book R High Performance Programming, we will learn how to write and execute parallel R code, where different parts of the code run simultaneously. So far, we have learned various ways to optimize the performance of R programs running serially, that is, in a single process. This does not take full advantage of the computing power of modern CPUs with multiple cores. Parallel computing allows us to tap into all the computational resources available and to speed up the execution of R programs by many times. We will examine the different types of parallelism and how to implement them in R, and we will take a closer look at a few performance considerations when designing the parallel architecture of R programs.

(For more resources related to this topic, see here.)

Data parallelism versus task parallelism

Many modern software applications are designed to run computations in parallel in order to take advantage of the multiple CPU cores available on almost any computer today. Many R programs can similarly be written in order to run in parallel. However, the extent of possible parallelism depends on the computing task involved. On one side of the scale are embarrassingly parallel tasks, where there are no dependencies between the parallel subtasks; such tasks can be made to run in parallel very easily. An example of this is building an ensemble of decision trees in a random forest algorithm—randomized decision trees can be built independently from one another and in parallel across tens or hundreds of CPUs, and can be combined to form the random forest. On the other end of the scale are tasks that cannot be parallelized, as each step of the task depends on the results of the previous step. One such example is a depth-first search of a tree, where the subtree to search at each step depends on the path taken in previous steps. Most algorithms fall somewhere in between, with some steps that must run serially and some that can run in parallel. With this in mind, careful thought must be given when designing parallel code that works correctly and efficiently.

Often an R program has some parts that have to be run serially and other parts that can run in parallel. Before making the effort to parallelize any of the R code, it is useful to have an estimate of the potential performance gains that can be achieved. Amdahl's law provides a way to estimate the best attainable performance gain when you convert code from serial to parallel execution. It divides a computing task into its serial and potentially-parallel parts and states that the time needed to execute the task in parallel will be no less than this formula:

T(n) = T(1)(P + (1-P)/n), where:

- T(n) is the time taken to execute the task using n parallel processes
- P is the proportion of the whole task that is strictly serial

The theoretical best possible speedup of the parallel algorithm is thus:

S(n) = T(1) / T(n) = 1 / (P + (1-P)/n)

For example, given a task that takes 10 seconds to execute on one processor, where half of the task can be run in parallel, the best possible time to run it on four processors is T(4) = 10(0.5 + (1-0.5)/4) = 6.25 seconds. The theoretical best possible speedup of the parallel algorithm with four processors is 1 / (0.5 + (1-0.5)/4) = 1.6x.
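If you want to tabulate these limits for your own workloads, Amdahl's law is easy to wrap in a few lines of R. The helper below is not from the book, just a quick sketch of the formula given above:

# A quick sketch of Amdahl's law: T(n) = T(1) * (P + (1 - P)/n), S(n) = T(1)/T(n)
amdahl <- function(T1, P, n) {
  time <- T1 * (P + (1 - P) / n)                  # best possible parallel runtime
  data.frame(processes = n, time = time, speedup = T1 / time)
}

# Reproduces the worked example: a 10-second task, half of which is strictly serial
amdahl(T1 = 10, P = 0.5, n = c(1, 2, 4, 8, 16))
# With 4 processes the best case is 6.25 seconds (a 1.6x speedup); even with 16
# processes the runtime can never drop below the 5 seconds of serial work.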
The following figure shows you how the theoretical best possible execution time decreases as more CPU cores are added. Notice that the execution time reaches a limit that is just above five seconds. This corresponds to the half of the task that must be run serially, where parallelism does not help.

Best possible execution time versus number of CPU cores

In general, Amdahl's law means that the fastest execution time for any parallelized algorithm is limited by the time needed for the serial portions of the algorithm. Bear in mind that Amdahl's law provides only a theoretical estimate. It does not account for the overheads of parallel computing (such as starting and coordinating tasks) and assumes that the parallel portions of the algorithm are infinitely scalable. In practice, these factors might significantly limit the performance gains of parallelism, so use Amdahl's law only to get a rough estimate of the maximum speedup possible.

There are two main classes of parallelism: data parallelism and task parallelism. Understanding these concepts helps to determine what types of tasks can be modified to run in parallel.

In data parallelism, a dataset is divided into multiple partitions. Different partitions are distributed to multiple processors, and the same task is executed on each partition of data. Take, for example, the task of finding the maximum value in a vector dataset, say one that has one billion numeric data points. A serial algorithm to do this would look like the following code, which iterates over every element of the data in sequence to search for the largest value. (This code is intentionally verbose to illustrate how the algorithm works; in practice, the max() function in R, though also serial in nature, is much faster.)

serialmax <- function(data) {
    max <- -Inf
    for (i in data) {
        if (i > max)
            max <- i
    }
    return(max)
}

One way to parallelize this algorithm is to split the data into partitions. If we have a computer with eight CPU cores, we can split the data into eight partitions of 125 million numbers each. Here is the pseudocode for how to perform the same task in parallel:

# Run this in parallel across 8 CPU cores
part.results <- run.in.parallel(serialmax(data.part))
# Compute global max
global.max <- serialmax(part.results)

This pseudocode runs eight instances of serialmax() in parallel—one for each data partition—to find the local maximum value in each partition. Once all the partitions have been processed, the algorithm finds the global maximum value by finding the largest value among the local maxima. This parallel algorithm works because the global maximum of a dataset must be the largest of the local maxima from all the partitions. The following figure depicts data parallelism pictorially.

The key behind data parallel algorithms is that each partition of data can be processed independently of the other partitions, and the results from all the partitions can be combined to compute the final results. This is similar to the mechanism of the MapReduce framework from Hadoop. Data parallelism allows algorithms to scale up easily as data volume increases—as more data is added to the dataset, more computing nodes can be added to a cluster to process new partitions of data.

Data parallelism
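To make the parallel maximum pseudocode concrete, here is a minimal runnable sketch using the parallel package, which is covered in detail later in this article. It is not part of the original example; it uses a small vector so that it finishes quickly, and it splits the data across however many cores detectCores() reports rather than assuming eight:

library(parallel)

data <- runif(1e6)                                   # stand-in for the large vector
ncores <- detectCores()
parts <- split(data, cut(seq_along(data), ncores, labels = FALSE))

cl <- makeCluster(ncores)
part.results <- unlist(parLapply(cl, parts, max))    # local maximum of each partition
stopCluster(cl)

global.max <- max(part.results)                      # largest of the local maxima
identical(global.max, max(data))                     # TRUE: matches the serial answer
# serialmax() defined above would give the same result as max(), only more slowly.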
Other examples of computations and algorithms that can be run in a data parallel way include:

- Element-wise matrix operations such as addition and subtraction: The matrices can be partitioned and the operations are applied to each pair of partitions.
- Means: The sums and number of elements in each partition can be added to find the global sum and number of elements, from which the mean can be computed.
- K-means clustering: After data partitioning, the K centroids are distributed to all the partitions. Finding the closest centroid is performed in parallel and independently across the partitions. The centroids are updated by first calculating the sums and the counts of their respective members in parallel, and then consolidating them in a single process to get the global means.
- Frequent itemset mining using the Partition algorithm: In the first pass, the frequent itemsets are mined from each partition of data to generate a global set of candidate itemsets; in the second pass, the supports of the candidate itemsets are summed from each partition to filter out the globally infrequent ones.

The other main class of parallelism is task parallelism, where tasks are distributed to and executed on different processors in parallel. The tasks on each processor might be the same or different, and the data that they act on might also be the same or different. The key difference between task parallelism and data parallelism is that the data is not divided into partitions.

An example of a task parallel algorithm performing the same task on the same data is the training of a random forest model. A random forest is a collection of decision trees built independently on the same data. During the training process for a particular tree, a random subset of the data is chosen as the training set, and the variables to consider at each branch of the tree are also selected randomly. Hence, even though the same data is used, the trees are different from one another. In order to train a random forest of say 100 decision trees, the workload could be distributed to a computing cluster with 100 processors, with each processor building one tree. All the processors perform the same task on the same data (or exact copies of the data), but the data is not partitioned.

The parallel tasks can also be different. For example, computing a set of summary statistics on the same set of data can be done in a task parallel way. Each process can be assigned to compute a different statistic—the mean, standard deviation, percentiles, and so on. Pseudocode of a task parallel algorithm might look like this:

# Run 4 tasks in parallel across 4 cores
for (task in tasks)
    run.in.parallel(task)
# Collect the results of the 4 tasks
results <- collect.parallel.output()
# Continue processing after all 4 tasks are complete

Implementing data parallel algorithms

Several R packages allow code to be executed in parallel. The parallel package that comes with R provides the foundation for most parallel computing capabilities in other packages. Let's see how it works with an example.

This example involves finding documents that match a regular expression. Regular expression matching is a fairly computationally expensive task, depending on the complexity of the regular expression. The corpus, or set of documents, for this example is a sample of the Reuters-21578 dataset for the topic corporate acquisitions (acq) from the tm package. Because this dataset contains only 50 documents, they are replicated 100,000 times to form a corpus of 5 million documents so that parallelizing the code will lead to meaningful savings in execution times.

library(tm)
data("acq")
textdata <- rep(sapply(content(acq), content), 1e5)

The task is to find documents that match the regular expression \d+(,\d+)? mln dlrs, which represents monetary amounts in millions of dollars. In this regular expression, \d+ matches a string of one or more digits, and (,\d+)? optionally matches a comma followed by one or more digits. For example, the strings 12 mln dlrs, 1,234 mln dlrs and 123,456,789 mln dlrs will match the regular expression.
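Before running the pattern over five million documents, it is worth checking interactively that it matches what you expect. This quick test is not part of the original example; note that the backslashes have to be doubled inside an R string literal:

pattern <- "\\d+(,\\d+)? mln dlrs"
grepl(pattern, c("12 mln dlrs", "1,234 mln dlrs", "profit rose 10 pct"))
## [1]  TRUE  TRUE FALSE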
First, we will measure the execution time to find these documents serially with grepl():

pattern <- "\\d+(,\\d+)? mln dlrs"
system.time(res1 <- grepl(pattern, textdata))
##   user  system elapsed
## 65.601   0.114  65.721

Next, we will modify the code to run in parallel and measure the execution time on a computer with four CPU cores:

library(parallel)
detectCores()
## [1] 4
cl <- makeCluster(detectCores())
part <- clusterSplit(cl, seq_along(textdata))
text.partitioned <- lapply(part, function(p) textdata[p])
system.time(res2 <- unlist(
    parSapply(cl, text.partitioned, grepl, pattern = pattern)))
##   user  system elapsed
##  3.708   8.007  50.806
stopCluster(cl)

In this code, the detectCores() function reveals how many CPU cores are available on the machine where this code is executed. Before running any parallel code, makeCluster() is called to create a local cluster of processing nodes with all four CPU cores. The corpus is then split into four partitions using the clusterSplit() function to determine the ideal split of the corpus such that each partition has roughly the same number of documents. The actual parallel execution of grepl() on each partition of the corpus is carried out by the parSapply() function. Each processing node in the cluster is given a copy of the partition of data that it is supposed to process, along with the code to be executed and other variables that are needed to run the code (in this case, the pattern argument). When all four processing nodes have completed their tasks, the results are combined in a similar fashion to sapply(). Finally, the cluster is destroyed by calling stopCluster().

It is good practice to ensure that stopCluster() is always called in production code, even if an error occurs during execution. This can be done as follows:

doSomethingInParallel <- function(...) {
    cl <- makeCluster(...)
    on.exit(stopCluster(cl))
    # do something
}

In this example, running the task in parallel on four processors resulted in a 23 percent reduction in the execution time. This is not in proportion to the amount of compute resources used to perform the task; with four times as many CPU cores working on it, a perfectly parallelizable task might experience as much as a 75 percent runtime reduction. However, remember Amdahl's law—the speed of parallel code is limited by the serial parts, which includes the overheads of parallelization. In this case, calling makeCluster() with the default arguments creates a socket-based cluster. When such a cluster is created, additional copies of R are run as workers. The workers communicate with the master R process using network sockets, hence the name. The worker R processes are initialized with the relevant packages loaded, and data partitions are serialized and sent to each worker process. These overheads can be significant, especially in data parallel algorithms where large volumes of data need to be transferred to the worker processes.

Besides parSapply(), parallel also provides the parApply() and parLapply() functions; together with parSapply(), these are analogous to the standard sapply(), apply(), and lapply() functions, respectively. In addition, the parLapplyLB() and parSapplyLB() functions provide load balancing, which is useful when the execution of each parallel task takes variable amounts of time. Finally, parRapply() and parCapply() are parallel row and column apply() functions for matrices.
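As a rough illustration (not from the book), these variants follow the same calling pattern as parSapply(), so trying them out is mostly a matter of swapping the function name; text.partitioned and pattern are reused from the example above, and the matrix here is only a throwaway example:

cl <- makeCluster(detectCores())

# Load-balanced variant: helpful when some partitions take much longer than others
res.lb <- parSapplyLB(cl, text.partitioned, grepl, pattern = pattern)

# Row-wise and column-wise apply over a matrix
m <- matrix(runif(1e6), nrow = 1000)
row.max <- parRapply(cl, m, max)
col.max <- parCapply(cl, m, max)

stopCluster(cl)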
On non-Windows systems, parallel supports another type of cluster that often incurs less overhead: forked clusters. In these clusters, new worker processes are forked from the parent R process with a copy of the data. However, the data is not actually copied in memory unless it is modified by a child process. This means that, compared to socket-based clusters, initializing child processes is quicker and the memory usage is often lower. Another advantage of using forked clusters is that parallel provides a convenient and concise way to run tasks on them via the mclapply(), mcmapply(), and mcMap() functions. (These functions start with mc because they were originally a part of the multicore package.) There is no need to explicitly create and destroy the cluster, as these functions do this automatically. We can simply call mclapply() and state the number of worker processes to fork via the mc.cores argument:

system.time(res3 <- unlist(
    mclapply(text.partitioned, grepl, pattern = pattern,
             mc.cores = detectCores())))
##    user  system elapsed
## 127.012   0.350  33.264

This shows a 49 percent reduction in execution time compared to the serial version, and a 35 percent reduction compared to parallelizing using a socket-based cluster. For this example, forked clusters provide the best performance. Due to differences in system configuration, you might see very different results when you try the examples in your own environment. When you develop parallel code, it is important to test the code in an environment that is similar to the one that it will eventually run in.

Implementing task parallel algorithms

Let's now see how to implement a task parallel algorithm using both socket-based and forked clusters. We will look at how to run the same task and different tasks on workers in a cluster.

Running the same task on workers in a cluster

To demonstrate how to run the same task on a cluster, the task for this example is to generate 500 million Poisson random numbers. We will do this by using L'Ecuyer's combined multiple-recursive generator, which is the only random number generator in base R that supports multiple streams to generate random numbers in parallel. The random number generator is selected by calling the RNGkind() function. We cannot just use any random number generator in parallel because the randomness of the data depends on the algorithm used to generate random data and the seed value given to each parallel task. Most other algorithms were not designed to produce random numbers in multiple parallel streams, and might produce multiple highly correlated streams of numbers, or worse, multiple identical streams!

First, we will measure the execution time of the serial algorithm:

RNGkind("L'Ecuyer-CMRG")
nsamples <- 5e8
lambda <- 10
system.time(random1 <- rpois(nsamples, lambda))
##   user  system elapsed
## 51.905   0.636  52.544

To generate the random numbers on a cluster, we will first distribute the task evenly among the workers. In the following code, the integer vector samples.per.process contains the number of random numbers that each worker needs to generate on a four-core CPU. The seq() function produces ncores+1 numbers evenly distributed between 0 and nsamples, with the first number being 0 and the next ncores numbers indicating the approximate cumulative number of samples across the worker processes.
The round() function rounds off these numbers into integers, and diff() computes the difference between them to give the number of random numbers that each worker process should generate.

ncores <- detectCores()
cl <- makeCluster(ncores)
samples.per.process <-
    diff(round(seq(0, nsamples, length.out = ncores + 1)))

Before we can generate the random numbers on a cluster, each worker needs a different seed from which it can generate a stream of random numbers. The seeds need to be set on all the workers before running the task, to ensure that all the workers generate different random numbers. For a socket-based cluster, we can call clusterSetRNGStream() to set the seeds for the workers, then run the random number generation task on the cluster. When the task is completed, we call stopCluster() to shut down the cluster:

clusterSetRNGStream(cl)
system.time(random2 <- unlist(
    parLapply(cl, samples.per.process, rpois,
              lambda = lambda)))
##   user  system elapsed
##  5.006   3.000  27.436
stopCluster(cl)

Using four parallel processes in a socket-based cluster reduces the execution time by 48 percent. The performance of this type of cluster for this example is better than that of the data parallel example because there is less data to copy to the worker processes—only an integer that indicates how many random numbers to generate.

Next, we run the same task on a forked cluster (again, this is not supported on Windows). The mclapply() function can set the random number seeds for each worker for us when the mc.set.seed argument is set to TRUE; we do not need to call clusterSetRNGStream(). Otherwise, the code is similar to that of the socket-based cluster:

system.time(random3 <- unlist(
    mclapply(samples.per.process, rpois,
             lambda = lambda,
             mc.set.seed = TRUE, mc.cores = ncores)))
##   user  system elapsed
## 76.283   7.272  25.052

On our test machine, the execution time of the forked cluster is slightly lower than, but close to, that of the socket-based cluster, indicating that the overheads for this task are similar for both types of clusters.

Running different tasks on workers in a cluster

So far, we have executed the same tasks on each parallel process. The parallel package also allows different tasks to be executed on different workers. For this example, the task is to generate not only Poisson random numbers, but also uniform, normal, and exponential random numbers. As before, we start by measuring the time to perform this task serially:

RNGkind("L'Ecuyer-CMRG")
nsamples <- 5e7
pois.lambda <- 10
system.time(random1 <- list(pois = rpois(nsamples,
                                         pois.lambda),
                            unif = runif(nsamples),
                            norm = rnorm(nsamples),
                            exp = rexp(nsamples)))
##   user  system elapsed
## 14.180   0.384  14.570

In order to run different tasks on different workers on socket-based clusters, a list of function calls and their associated arguments must be passed to parLapply(). This is a bit cumbersome, but parallel unfortunately does not provide an easier interface to run different tasks on a socket-based cluster. In the following code, the function calls are represented as a list of lists, where the first element of each sublist is the name of the function that runs on a worker, and the second element contains the function arguments. The function do.call() is used to call the given function with the given arguments.
ncores <- detectCores()
cl <- makeCluster(ncores)
calls <- list(pois = list("rpois", list(n = nsamples,
                                        lambda = pois.lambda)),
              unif = list("runif", list(n = nsamples)),
              norm = list("rnorm", list(n = nsamples)),
              exp = list("rexp", list(n = nsamples)))
clusterSetRNGStream(cl)
system.time(
    random2 <- parLapply(cl, calls,
                         function(call) {
                             do.call(call[[1]], call[[2]])
                         }))
##   user  system elapsed
##  2.185   1.629  10.403
stopCluster(cl)

On forked clusters on non-Windows machines, the mcparallel() and mccollect() functions offer a more intuitive way to run different tasks on different workers. For each task, mcparallel() sends the given task to an available worker. Once all the workers have been assigned their tasks, mccollect() waits for the workers to complete their tasks and collects the results from all the workers.

mc.reset.stream()
system.time({
    jobs <- list()
    jobs[[1]] <- mcparallel(rpois(nsamples, pois.lambda),
                            "pois", mc.set.seed = TRUE)
    jobs[[2]] <- mcparallel(runif(nsamples),
                            "unif", mc.set.seed = TRUE)
    jobs[[3]] <- mcparallel(rnorm(nsamples),
                            "norm", mc.set.seed = TRUE)
    jobs[[4]] <- mcparallel(rexp(nsamples),
                            "exp", mc.set.seed = TRUE)
    random3 <- mccollect(jobs)
})
##   user  system elapsed
## 14.535   3.569    7.97

Notice that we also had to call mc.reset.stream() to set the seeds for random number generation in each worker. This was not necessary when we used mclapply(), which calls mc.reset.stream() for us. However, mcparallel() does not, so we need to call it ourselves.

Summary

In this article, we learned about two classes of parallelism: data parallelism and task parallelism. Data parallelism is good for tasks that can be performed in parallel on partitions of a dataset. The dataset to be processed is split into partitions, and each partition is processed on a different worker process. Task parallelism, on the other hand, divides a set of similar or different tasks amongst the worker processes. In either case, Amdahl's law states that the maximum improvement in speed that can be achieved by parallelizing code is limited by the proportion of that code that can be parallelized.

Resources for Article:

Further resources on this subject:

- Using R for Statistics, Research, and Graphics [article]
- Learning Data Analytics with R and Hadoop [article]
- Aspects of Data Manipulation in R [article]

Remote Access

Packt
06 Feb 2015
32 min read
In this article by Jordan Krause, author of the book Windows Server 2012 R2 Administrator Cookbook, we will see how Windows Server 2012 R2 by Microsoft brings a whole new way of looking at remote access. Companies have historically relied on third-party tools to connect remote users into the network, such as traditional and SSL VPN provided by appliances from large networking vendors. I'm here to tell you those days are gone. Those of us running Microsoft-centric shops can now rely on Microsoft technologies to connect our remote workforce. Better yet is that these technologies are included with the Server 2012 R2 operating system, and have functionality that is much improved over anything that a traditional VPN can provide. Regular VPN does still have a place in the remote access space, and the great news is that you can also provide it with Server 2012 R2.

Our primary focus for this article will be DirectAccess (DA). DA is kind of like automatic VPN. There is nothing the user needs to do in order to be connected to work. Whenever they are on the Internet, they are also connected automatically to the corporate network. DirectAccess is an amazing way to have your Windows 7 and Windows 8 domain joined systems connected back to the network for data access and for management of those traveling machines. DirectAccess has actually been around since 2008, but the first version came with some steep infrastructure requirements and was not widely used. Server 2012 R2 brings a whole new set of advantages and makes implementation much easier than in the past. I still find many server and networking admins who have never heard of DirectAccess, so let's spend some time together exploring some of the common tasks associated with it.

In this article, we will cover the following recipes:

- Configuring DirectAccess, VPN, or a combination of the two
- Pre-staging Group Policy Objects (GPOs) to be used by DirectAccess
- Enhancing the security of DirectAccess by requiring certificate authentication
- Building your Network Location Server (NLS) on its own system

(For more resources related to this topic, see here.)

There are two "flavors" of remote access available in Windows Server 2012 R2. The most common way to implement the Remote Access role is to provide DirectAccess for your Windows 7 and Windows 8 domain joined client computers, and VPN for the rest. The DirectAccess machines are typically your company-owned corporate assets. One of the primary reasons that DirectAccess is usually only for company assets is that the client machines must be joined to your domain, because the DirectAccess configuration settings are brought down to the client through a GPO. I doubt you want home and personal computers joining your domain. VPN is therefore used for down level clients such as Windows XP, and for home and personal devices that want to access the network. Since this is a traditional VPN listener with all regular protocols available such as PPTP, L2TP, SSTP, it can even work to connect devices such as smartphones.

There is a third function available within the Server 2012 R2 Remote Access role, called the Web Application Proxy (WAP). This function is not used for connecting remote computers fully into the network as DirectAccess and VPN are; rather, WAP is used for publishing internal web resources out to the internet.
For example, if you are running Exchange and Lync Server inside your network and want to publish access to these web-based resources to the internet for external users to connect to, WAP would be a mechanism that could publish access to these resources. The term for publishing out to the internet like this is Reverse Proxy, and WAP can act as such. It can also behave as an ADFS Proxy. For further information on the WAP role, please visit: http://technet.microsoft.com/en-us/library/dn584107.aspx One of the most confusing parts about setting up DirectAccess is that there are many different ways to do it. Some are good ideas, while others are not. Before we get rolling with recipes, we are going to cover a series of questions and answers to help guide you toward a successful DA deployment. The first question that always presents itself when setting up DA is "How do I assign IP addresses to my DirectAccess server?". This is quite a loaded question, because the answer depends on how you plan to implement DA, which features you plan to utilize, and even upon how secure you believe your DirectAccess server to be. Let me ask you some questions, pose potential answers to those questions, and discuss the effects of making each decision. DirectAccess Planning Q&A Which client operating systems can connect using DirectAccess? Answer: Windows 7 Ultimate, Windows 7 Enterprise, and Windows 8.x Enterprise. You'll notice that the Professional SKU is missing from this list. That is correct, Windows 7 and Windows 8 Pro do not contain the DirectAccess connectivity components. Yes, this does mean that Surface Pro tablets cannot utilize DirectAccess out of the box. However, I have seen many companies now install Windows 8 Enterprise onto their Surface tablets, effectively turning them into "Surface Enterprises." This works fine and does indeed enable them to be DirectAccess clients. In fact, I am currently typing this text on a DirectAccess-connected Surface "Pro turned Enterprise" tablet. Do I need one or two NICs on my DirectAccess server? Answer: Technically, you could set it up either way. In practice however, it really is designed for dual-NIC implementation. Single NIC DirectAccess works okay sometimes to establish a proof-of-concept to test out the technology. But I have seen too many problems with single NIC implementation in the field to ever recommend it for production use. Stick with two network cards, one facing the internal network and one facing the Internet. Do my DirectAccess servers have to be joined to the domain? Answer: Yes. Does DirectAccess have site-to-site failover capabilities? Answer: Yes, though only Windows 8.x client computers can take advantage of it. This functionality is called Multi-Site DirectAccess. Multiple DA servers that are spread out geographically can be joined together in a multi-site array. Windows 8 client computers keep track of each individual entry point and are able to swing between them as needed or at user preference. Windows 7 clients do not have this capability and will always connect through their primary site. What are these things called 6to4, Teredo, and IP-HTTPS I have seen in the Microsoft documentation? Answer: 6to4, Teredo, and IP-HTTPS are all IPv6 transition tunneling protocols. All DirectAccess packets that are moving across the internet between DA client and DA server are IPv6 packets. 
If your internal network is IPv4, then when those packets reach the DirectAccess server they get turned down into IPv4 packets, by some special components called DNS64 and NAT64. While these functions handle the translation of packets from IPv6 into IPv4 when necessary inside the corporate network, the key point here is that all DirectAccess packets that are traveling over the Internet part of the connection are always IPv6. Since the majority of the Internet is still IPv4, this means that we must tunnel those IPv6 packets inside something to get them across the Internet. That is the job of 6to4, Teredo, and IP-HTTPS. 6to4 encapsulates IPv6 packets into IPv4 headers and shuttles them around the internet using protocol 41. Teredo similarly encapsulates IPv6 packets inside IPv4 headers, but then uses UDP port 3544 to transport them. IP-HTTPS encapsulates IPv6 inside IPv4 and then inside HTTP encrypted with TLS, essentially creating an HTTPS stream across the Internet. This, like any HTTPS traffic, utilizes TCP port 443. The DirectAccess traffic traveling inside either kind of tunnel is always encrypted, since DirectAccess itself is protected by IPsec. Do I want to enable my clients to connect using Teredo? Answer: Most of the time, the answer here is yes. Probably the biggest factor that weighs on this decision is whether or not you are still running Windows 7 clients. When Teredo is enabled in an environment, this gives the client computers an opportunity to connect using Teredo, rather than all clients connecting in over the IP-HTTPS protocol. IP-HTTPS is sort of the "catchall" for connections, but Teredo will be preferred by clients if it is available. For Windows 7 clients, Teredo is quite a bit faster than IP-HTTPS. So enabling Teredo on the server side means your Windows 7 clients (the ones connecting via Teredo) will have quicker response times, and the load on your DirectAccess server will be lessened. This is because Windows 7 clients who are connecting over IP-HTTPS are encrypting all of the traffic twice. This also means that the DA server is encrypting/decrypting everything that comes and goes twice. In Windows 8, there is an enhancement that brings IP-HTTPS performance almost on par with Teredo, and so environments that are fully cut over to Windows 8 will receive less benefit from the extra work that goes into making sure Teredo works. Can I place my DirectAccess server behind a NAT? Answer: Yes, though there is a downside. Teredo cannot work if the DirectAccess server is sitting behind a NAT. For Teredo to be available, the DA server must have an External NIC that has two consecutive public IP addresses. True public addresses. If you place your DA server behind any kind of NAT, Teredo will not be available and all clients will connect using the IP-HTTPS protocol. Again, if you are using Windows 7 clients, this will decrease their speed and increase the load on your DirectAccess server. How many IP addresses do I need on a standalone DirectAccess server? Answer: I am going to leave single NIC implementation out of this answer since I don't recommend it anyway. For scenarios where you are sitting the External NIC behind a NAT or, for any other reason, are limiting your DA to IP-HTTPS only, then we need one external address and one internal address. The external address can be a true public address or a private NATed DMZ address. Same with the internal; it could be a true internal IP or a DMZ IP. Make sure both NICs are not plugged into the same DMZ, however. 
For a better installation scenario that allows Teredo connections to be possible, you would need two consecutive public IP addresses on the External NIC and a single internal IP on the Internal NIC. This internal IP could be either true internal or DMZ. But the public IPs would really have to be public for Teredo to work. Do I need an internal PKI? Answer: Maybe. If you want to connect Windows 7 clients, then the answer is yes. If you are completely Windows 8, then technically you do not need internal PKI. But you really should use it anyway. Using an internal PKI, which can be a single, simple Windows CA server, increases the security of your DirectAccess infrastructure. You'll find out during this article just how easy it is to require certificates as part of the tunnel building authentication process. Configuring DirectAccess, VPN, or a combination of the two Now that we have some general ideas about how we want to implement our remote access technologies, where do we begin? Most services that you want to run on a Windows Server begin with a role installation, but the implementation of remote access begins before that. Let's walk through the process of taking a new server and turning it into a Microsoft Remote Access server. Getting ready All of our work will be accomplished on a new Windows Server 2012 R2. We are taking the two-NIC approach to networking, and so we have two NICs installed on this server. The Internal NIC is plugged into the corporate network and the External NIC is plugged into the Internet for the sake of simplicity. The External NIC could just as well be plugged into a DMZ. How to do it... Follow these steps to turn your new server into a Remote Access server: Assign IP addresses to your server. Remember, the most important part is making sure that the Default Gateway goes on the External NIC only. Join the new server to your domain. Install an SSL certificate onto your DirectAccess server that you plan to use for the IP-HTTPS listener. This is typically a certificate purchased from a public CA. If you're planning to use client certificates for authentication, make sure to pull down a copy of the certificate to your DirectAccess server. You want to make sure certificates are in place before you start with the configuration of DirectAccess. This way the wizards will be able to automatically pull in information about those certificates in the first run. If you don't, DA will set itself up to use self-signed certificates, which are a security no-no. Use Server Manager to install the Remote Access role. You should only do this after completing the steps listed earlier. If you plan to load balance multiple DirectAccess servers together at a later time, make sure to also install the feature called Network Load Balancing . After selecting your role and feature, you will be asked which Remote Access role services you want to install. For our purposes in getting the remote workforce connected back into the corporate network, we want to choose DirectAccess and VPN (RAS) .  Now that the role has been successfully installed, you will see a yellow exclamation mark notification near the top of Server Manager indicating that you have some Post-deployment Configuration that needs to be done. Do not click on Open the Getting Started Wizard ! Unfortunately, Server Manager leads you to believe that launching the Getting Started Wizard (GSW) is the logical next step. 
However, using the GSW as the mechanism for configuring your DirectAccess settings is kind of like roasting a marshmallow with a pair of tweezers. In order to ensure you have the full range of options available to you as you configure your remote access settings, and that you don't get burned later, make sure to launch the configuration this way: Click on the Tools menu from inside Server Manager and launch the Remote Access Management Console . In the left window pane, click on Configuration | DirectAccess and VPN . Click on the second link, the one that says Run the Remote Access Setup Wizard . Please note that once again the top option is to run that pesky Getting Started Wizard. Don't do it! I'll explain why in the How it works… section of this recipe. Now you have a choice that you will have to answer for yourself. Are you configuring only DirectAccess, only VPN, or a combination of the two? Simply click on the option that you want to deploy. Following your choice, you will see a series of steps (steps 1 through 4) that need to be accomplished. This series of mini-wizards will guide you through the remainder of the DirectAccess and VPN particulars. This recipe isn't large enough to cover every specific option included in those wizards, but at least you now know the correct way to bring a DirectAccess/VPN server into operation. How it works... The remote access technologies included in Server 2012 R2 have great functionality, but their initial configuration can be confusing. Following the procedure listed in this recipe will set you on the right path to be successful in your deployment, and prevent you from running into issues down the road. The reason that I absolutely recommend you stay away from using the "shortcut" deployment method provided by the Getting Started Wizard is twofold: GSW skips a lot of options as it sets up DirectAccess, so you don't really have any understanding of how it works after finishing. You may have DA up and running, but have no idea how it's authenticating or working under the hood. This holds so much potential for problems later, should anything suddenly stop working. GSW employs a number of bad security practices in order to save time and effort in the setup process. For example, using the GSW usually means that your DirectAccess server will be authenticating users without client certificates, which is not a best practice. Also, it will co-host something called the NLS website on itself, which is also not a best practice. Those who utilize the GSW to configure DirectAccess will find that their GPO, which contains the client connectivity settings, will be security-filtered to the Domain Computers group. Even though it also contains a WMI filter that is supposed to limit that policy application to mobile hardware such as laptops, this is a terribly scary thing to see inside GPO filtering settings. You probably don't want all of your laptops to immediately start getting DA connectivity settings, but that is exactly what the GSW does for you. Perhaps worst, the GSW will create and make use of self-signed SSL certificates to validate its web traffic, even the traffic coming in from the Internet! This is a terrible practice and is the number one reason that should convince you that clicking on the Getting Started Wizard is not in your best interests. 
Pre-staging Group Policy Objects (GPOs) to be used by DirectAccess

One of the great things about DirectAccess is that all of the connectivity settings the client computers need in order to connect are contained within a Group Policy Object (GPO). This means that you can turn new client computers into DirectAccess-connected clients without ever touching that system. Once configured properly, all you need to do is add the new computer account to an Active Directory security group, and during the next automatic Group Policy refresh cycle (usually within 90 minutes), that new laptop will be connecting via DirectAccess whenever outside the corporate network.

You can certainly choose not to pre-stage anything with the GPOs and DirectAccess will still work. When you get to the end of the DA configuration wizards, it will inform you that two new GPOs are about to be created inside Active Directory. One GPO is used to contain the DirectAccess server settings and the other GPO is used to contain the DirectAccess client settings. If you allow the wizard to handle the generation of these GPOs, it will create them, link them, filter them, and populate them with settings automatically. About half of the time I see folks do it this way and they are forever happy with letting the wizard manage those GPOs now and in the future. The other half of the time, it is desired that we maintain a little more personal control over the GPOs.

If you are setting up a new DA environment but your credentials don't have permission to create GPOs, the wizard is not going to be able to create them either. In this case, you will need to work with someone on your Active Directory team to get them created. Another reason to manage the GPOs manually is to have better control over placement of these policies. When you let the DirectAccess wizard create the GPOs, it will link them to the top level of your domain. It also sets Security Filtering on those GPOs so they are not going to be applied to everything in your domain, but when you open up the Group Policy Management Console you will always see those DirectAccess policies listed right up there at the top level of the domain. Sometimes this is simply not desirable. So for this reason also, you may want to choose to create and manage the GPOs by hand, so that we can secure placement and links where we specifically want them to be located.

The key factors here are to make sure your DirectAccess Server Settings GPO applies to only the DirectAccess server or servers in your environment, and that the DirectAccess Client Settings GPO applies to only the DA client computers that you plan to enable in your network. The best practice here is to specify this GPO to only apply to a specific Active Directory security group so that you have full control over which computer accounts are in that group. I have seen some folks do it based only on the OU links and include whole OUs in the filtering for the clients GPO (foregoing the use of an AD group at all), but doing it this way makes it quite a bit more difficult to add or remove machines from the access list in the future.
Getting ready

While the DirectAccess wizards themselves are run from the DirectAccess server, our work with this recipe is not. The Group Policy settings that we will be configuring are all accomplished within Active Directory, and we will be doing the work from a Domain Controller in our environment.

How to do it...

To pre-stage Group Policy Objects (GPOs) for use with DirectAccess:

1. On your Domain Controller, launch the Group Policy Management Console.
2. Expand Forest | Domains | Your Domain Name. There should be a listing here called Group Policy Objects. Right-click on that and choose New.
3. Name your new GPO something like DirectAccess Server Settings.
4. Click on the new DirectAccess Server Settings GPO and it should open up automatically to the Scope tab. We need to adjust the Security Filtering section so that this GPO only applies to our DirectAccess server. This is a critical step for each GPO to ensure the settings that are going to be placed here do not get applied to the wrong computers.
5. Remove Authenticated Users, which is prepopulated in that list. The list should now be empty.
6. Click the Add… button and search for the computer account of your DirectAccess server. Mine is called RA-01. By default this window will only search user accounts, so you will need to adjust Object Types to include Computers before it will allow you to add your server into this filtering list. Your Security Filtering list should now look like this:
7. Now click on the Details tab of your GPO. Change the GPO Status to be User configuration settings disabled. We do this because our GPO is only going to contain computer-level settings, nothing at the user level.
8. The last thing to do is link your GPO to an appropriate container. Since we have Security Filtering enabled, our GPO is only ever going to apply its settings to the RA-01 server; however, without creating a link, the GPO will not even attempt to apply itself to anything. My RA-01 server is sitting inside the OU called Remote Access Servers. So I will right-click on my Remote Access Servers OU and choose Link an Existing GPO….
9. Choose the new DirectAccess Server Settings from the list of available GPOs and click on the OK button. This creates the link and puts the GPO into action. Since there are not yet any settings inside the GPO, it won't actually make any changes on the server. The DirectAccess configuration wizards take care of populating the GPO with the settings that are needed.
10. Now we simply need to rinse and repeat all of these steps to create another GPO, something like DirectAccess Client Settings. You want to set up the client settings GPO in the same way. Make sure that it is filtering to only the Active Directory Security Group that you created to contain your DirectAccess client computers. And make sure to link it to an appropriate container that will include those computer accounts. So maybe your clients GPO will look something like this:

How it works...

Creating GPOs in Active Directory is a simple enough task, but it is critical that you configure the Links and Security Filtering correctly.
If you do not take care to ensure that these DirectAccess connection settings are only going to apply to the machines that actually need the settings, you could create a world of trouble by internal servers getting remote access connection settings and cause them issues with connection while inside the network. Enhancing the security of DirectAccess by requiring certificate authentication When a DirectAccess client computer builds its IPsec tunnels back to the corporate network, it has the ability to require a certificate as part of that authentication process. In earlier versions of DirectAccess, the one in Server 2008 R2 and the one provided by Unified Access Gateway ( UAG ), these certificates were required in order to make DirectAccess work. Setting up the certificates really isn't a big deal at all; as long as there is a CA server in your network you are already prepared to issue the certs needed at no cost. Unfortunately, though, there must have been enough complaints back to Microsoft in order for them to make these certificates "recommended" instead of "required" and they created a new mechanism in Windows 8 and Server 2012 called KerberosProxy that can be used to authenticate the tunnels instead. This allows the DirectAccess tunnels to build without the computer certificate, making that authentication process less secure. I'm here to strongly recommend that you still utilize certificates in your installs! They are not difficult to set up, and using them makes your tunnel authentication stronger. Further, many of you may not have a choice and will still be required to install these certificates. Only simple DirectAccess scenarios that are all Windows 8 on the client side can get away with the shortcut method of foregoing certs. Anybody who still wants to connect Windows 7 via DirectAccess will need to use certificates on all of their client computers, both Windows 7 and Windows 8. In addition to Windows 7 access, anyone who intends to use the advanced features of DirectAccess such as load balancing, multi-site, or two-factor authentication will also need to utilize these certificates. With any of these scenarios, certificates become a requirement again, not a recommendation. In my experience, almost everyone still has Windows 7 clients that would benefit from being DirectAccess connected, and it's always a good idea to make your DA environment redundant by having load balanced servers. This further emphasizes the point that you should just set up certificate authentication right out of the gate, whether or not you need it initially. You might decide to make a change later that would require certificates and it would be easier to have them installed from the get-go rather than trying to incorporate them later into a running DA environment. Getting ready In order to distribute certificates, you will need a CA server running in your network. Once certificates are distributed to the appropriate places, the rest of our work will be accomplished from our Server 2012 R2 DirectAccess server. How to do it... Follow these steps to make use of certificates as part of the DirectAccess tunnel authentication process: The first thing that you need to do is distribute certificates to your DirectAccess servers and all DirectAccess client computers. The easiest way to do this is by using the built-in Computer template provided by default in a Windows CA server. If you desire to build a custom certificate template for this purpose, you can certainly do so. 
I recommend that you duplicate the Computer template and build it from there. Whenever I create a custom template for use with DirectAccess, I try to make sure that it meets the following criterias: The Subject Name of the certificate should match the Common Name of the computer (which is also the FQDN of the computer). The Subject Alternative Name ( SAN ) of the certificate should match the DNS Name of the computer (which is also the FQDN of the computer). The certificate should serve the Intended Purposes of both Client Authentication and Server Authentication . You can issue the certificates manually using Microsoft Management Console (MMC). Otherwise, you can lessen your hands-on administrative duties by enabling Autoenrollment. Now that we have certificates distributed to our DirectAccess clients and servers, log in to your primary DirectAccess server and open up the Remote Access Management Console . Click on Configuration in the top-left corner. You should now see steps 1 through 4 listed. Click Edit… listed under Step 2 . Now you can either click Next twice or click on the word Authentication to jump directly to the authentication screen. Check the box that says Use computer certificates . Now we have to specify the Certification Authority server that issued our client certificates. If you used an intermediary CA to issue your certs, make sure to check the appropriate checkbox. Otherwise, most of the time, certificates are issued from a root CA and in this case you would simply click on the Browse… button and look for your CA in the list. This screen is sometimes confusing because people expect to have to choose the certificate itself from the list. This is not the case. What you are actually choosing from this list is the Certificate Authority server that issued the certificates. Make any other appropriate selections on the Authentication screen. For example, many times when we require client certificates for authentication, it is because we have Windows 7 computers that we want to connect via DirectAccess. If that is the case for you, select the checkbox for Enable Windows 7 client computers to connect via DirectAccess .  How it works... Requiring certificates as part of your DirectAccess tunnel authentication process is a good idea in any environment. It makes the solution more secure, and enables advanced functionality. The primary driver for most companies to require these certificates is the enablement of Windows 7 clients to connect via DirectAccess, but I suggest that anyone using DirectAccess in any capacity make use of these certs. They are simple to deploy, easy to configure, and give you some extra peace of mind that only computers who have a certificate issued directly to them from your own internal CA server are going to be able to connect through your DirectAccess entry point. Building your Network Location Server (NLS) on its own system If you zipped through the default settings when configuring DirectAccess, or worse used the Getting Started Wizard, chances are that your Network Location Server ( NLS ) is running right on the DirectAccess server itself. This is not the recommended method for using NLS, it really should be running on a separate web server. In fact, if you later want to do something more advanced such as setting up load balanced DirectAccess servers, you're going to have to move NLS off onto a different server anyway. So you might as well do it right the first time. NLS is a very simple requirement, yet a critical one. 
It is just a website, it doesn't matter what content the site has, and it only has to run inside your network. Nothing has to be externally available. In fact, nothing should be externally available, because you only want this site being accessed internally. This NLS website is a large part of the mechanism by which DirectAccess client computers figure out when they are inside the office and when they are outside. If they can see the NLS website, they know they are inside the network and will disable DirectAccess name resolution, effectively turning off DA. If they do not see the NLS website, they will assume they are outside the corporate network and enable DirectAccess name resolution. There are two gotchas with setting up an NLS website: The first is that it must be HTTPS, so it does need a valid SSL certificate. Since this website is only running inside the network and being accessed from domain-joined computers, this SSL certificate can easily be one that has been issued from your internal CA server. So no cost associated there. The second catch that I have encountered a number of times is that for some reason the default IIS splash screen page doesn't make for a very good NLS website. If you set up a standard IIS web server and use the default site as NLS, sometimes it works to validate the connections and sometimes it doesn't. Given that, I always set up a specific site that I create myself, just to be on the safe side. So let's work together to follow the exact process I always take when setting up NLS websites in a new DirectAccess environment. Getting ready Our NLS website will be hosted on an IIS server we have that runs Server 2012 R2. Most of the work will be accomplished from this web server, but we will also be creating a DNS record and will utilize a Domain Controller for that task. How to do it... Let's work together to set up our new Network Location Server website: First decide on an internal DNS name to use for this website and set it up in DNS of your domain. I am going to use nls.mydomain.local and am creating a regular Host (A) record that points nls.mydomain.local at the IP address of my web server. Now log in to that web server and let's create some simple content for this new website. Create a new folder called C:NLS. Inside your new folder, create a new Default.htm file. Edit this file and throw some simple text in there. I usually say something like This is the NLS website used by DirectAccess. Please do not delete or modify me!.  Remember, this needs to be an HTTPS website, so before we try setting up the actual website, we should acquire the SSL certificate that we need to use with this site. Since this certificate is coming from my internal CA server, I'm going to open up MMC on my web server to accomplish this task. Once MMC is opened, snap-in the Certificates module. Make sure to choose Computer account and then Local computer when it prompts you for which certificate store you want to open. Expand Certificates (Local Computer) | Personal | Certificates . Right-click on this Certificates folder and choose All Tasks | Request New Certificate… . Click Next twice and you should see your list of certificate templates that are available on your internal CA server. If you do not see one that looks appropriate for requesting a website certificate, you may need to check over the settings on your CA server to make sure the correct templates are configured for issuing. My template is called Custom Web Server . 
Since this is a web server certificate, there is some additional information that I need to provide in my request in order to successfully issue a certificate. So I go ahead and click on that link that says More information is required to enroll for this certificate. Click here to configure settings. .  Drop-down the Subject name | Type menu and choose the option Common name . Enter a common name for our website into the Value field, which in my case is nls.mydomain.local. Click the Add button and your CN should move over to the right side of the screen like this:  Click on OK then click on the Enroll button. You should now have an SSL certificate sitting in your certificates store that can be used to authenticate traffic moving to our nls.mydomain.local name. Open up Internet Information Services (IIS) Manager , and browse to the Sites folder. Go ahead and remove the default website that IIS automatically set up, so that we can create our own NLS website without any fear of conflict. Click on the Add Website… action. Populate the information as shown in the following screenshot. Make sure to choose your own IP address and SSL certificate from the lists, of course:  Click the OK button and you now have an NLS website running successfully in your network. You should be able to open up a browser on a client computer sitting inside the network and successfully browse to https://nls.mydomain.local. How it works... In this recipe, we configured a basic Network Location Server website for use with our DirectAccess environment. This site will do exactly what we need it to when our DA client computers try to validate whether they are inside or outside the corporate network. While this recipe meets our requirements for NLS, and in fact puts us into a good practice of installing DirectAccess with NLS being hosted on its own web server, there is yet another step you could take to make it even better. Currently this web server is a single point of failure for NLS. If this web server goes down or has a problem, we would have DirectAccess client computers inside the office who would think they are outside, and they would have some major name resolution problems until we sorted out the NLS problem. Given that, it is a great idea to make NLS redundant. You could cluster servers together, use Microsoft Network Load Balancing ( NLB ), or even use some kind of hardware load balancer if you have one available in your network. This way you could run the same NLS website on multiple web servers and know that your clients will still work properly in the event of a web server failure. Summary This article encourages you to use Windows Server 2012 R2 as the connectivity platform that brings your remote computers into the corporate network. We discussed DirectAccess and VPN in this article. We also saw how to configure DirectAccess and VPN, and how to secure DirectAccess using certificate authentication. Resources for Article: Further resources on this subject: Cross-premise Connectivity [article] Setting Up and Managing E-mails and Batch Processing [article] Upgrading from Previous Versions [article]
Modal Close icon