
How-To Tutorials

LabVIEW Basics

Packt
02 Nov 2016
8 min read
In this article by Behzad Ehsani, author of the book Data Acquisition using LabVIEW, after a brief introduction and a short note on installation, we will go over the most widely used palettes and objects in the icon toolbar of a standard LabVIEW installation, with a brief explanation of what each object does. (For more resources related to this topic, see here.)

Introduction to LabVIEW

LabVIEW is a graphical development and testing environment unlike any other test and development tool available in the industry. LabVIEW sets itself apart from traditional programming environments by its completely graphical approach to programming. As an example, while a while loop in a text-based language such as C consists of several predefined, extremely compact, and sometimes cryptic lines of text, a while loop in LabVIEW is actually a graphical loop. The environment is extremely intuitive and powerful, which makes for a short learning curve for the beginner. LabVIEW is based on what is called the G language, but other languages, especially C, are still at work under the hood. However, the ease of use and power of LabVIEW can be somewhat deceiving to a novice user. Many people have attempted to start projects in LabVIEW only because, at first glance, the graphical nature of the interface and the drag-and-drop approach used in LabVIEW appear to do away with the required basics of programming concepts and a classical education in programming science and engineering. This is far from the reality of using LabVIEW as the predominant development environment. While it is true that in many higher-level development and testing environments, especially when using complicated test equipment, performing complex mathematical calculations, or even creating embedded software, LabVIEW's approach will be much more time-efficient and less bug-prone than the many lines of code required in a traditional text-based programming environment, one must be aware of LabVIEW's strengths and possible weaknesses. LabVIEW does not completely replace the need for traditional text-based languages and, depending on the nature of a project, either LabVIEW or a traditional text-based language such as C may be the most suitable programming or test environment.

Installing LabVIEW

Installation of LabVIEW is very simple and just as routine as any modern program installation: insert DVD 1 and follow the on-screen guided installation steps. LabVIEW comes on one DVD for the Mac and Linux versions but on four or more DVDs for the Windows edition (depending on the additional software, licensing, and extra libraries and packages purchased). In this article we will use the LabVIEW 2013 Professional Development version for Windows. Given the target audience of this article, we assume the user is well capable of installing the program. Installation is also well documented by National Instruments, and the mandatory one-year support purchase that comes with each copy of LabVIEW is a valuable source of live and email help. Also, the NI website (www.ni.com) hosts many user support groups that are a great source of support, example code, discussion groups, local group events, and meetings of fellow LabVIEW developers. One worthy note for those who are new to installing LabVIEW is that the installation DVDs include much more than what an average user would need and pay for. We strongly suggest that you install this additional software (beyond what has been purchased and licensed or is of immediate need!)
This additional software is fully functional (in demo mode for 7 days), and the demo period may be extended for about a month with online registration. This is a very good opportunity to get hands-on experience with even more of the power and functionality that LabVIEW is capable of offering. The additional knowledge gained by installing the other software available on the DVDs may help in the further development of a given project. Just imagine: if the current development of a robot only encompasses mechanical movements and sensors today, optical recognition is probably going to follow sooner than one may think. If data acquisition using expensive hardware and software is possible in one location, the need for web sharing and remote control of the setup is just around the corner. It is very helpful to at least be aware of what packages are currently available and to be able to install and test them prior to a full purchase and implementation. The following screenshot shows what may be installed if almost all the software on all the DVDs is selected.

When installing a fresh version of LabVIEW, if you do decide to follow the advice above, make sure to click on the + sign next to each package you decide to install, and prevent any installation of LabWindows/CVI... and Measurement Studio... for Visual Studio. LabWindows, according to National Instruments, is an ANSI C integrated development environment. Also note that, by default, NI device drivers are not selected for installation. Device drivers are an essential part of any data acquisition setup, and the appropriate drivers for communications and instrument control must be installed before LabVIEW can interact with external equipment. Note, too, that device drivers (on Windows installations) come on a separate DVD, which means that you do not have to install device drivers at the same time as the main application and other modules; they can be installed at any later time. Almost all well-established vendors package their products with LabVIEW drivers and example code. If a driver is not readily available, National Instruments has programmers who will write one, but this comes at a cost to the user. VI Package Manager, now installed as part of the standard installation, is also a must these days. National Instruments distributes third-party software, drivers, and public domain packages via VI Package Manager, and the appropriate software and drivers for supported microcontrollers are installed through it as well. You can install many public domain packages that add further useful LabVIEW toolkits to a LabVIEW installation, and these can be used just like the toolkits delivered professionally by National Instruments. Finally, note that the more modules, packages, and software you select, the longer the installation will take. This may sound like an obvious point, but surprisingly enough, installing all the software on the three DVDs (for Windows) took over five hours on the standard laptop we used. Obviously, a more powerful PC (such as one with a solid-state hard drive) may not take such a long time.

LabVIEW Basics

Once the LabVIEW application is launched, by default two blank windows open simultaneously, a Front Panel and a Block Diagram window, and a VI is created. VIs, or Virtual Instruments, are the heart and soul of LabVIEW. They are what separate LabVIEW from all other text-based development environments. In LabVIEW, everything is an object which is represented graphically.
A VI may consist of only a few objects or of hundreds of objects embedded in many subVIs. Everything, be it a simple while loop, a complex mathematical concept such as polynomial interpolation, or simply a Boolean constant, is represented graphically. To use an object, right-click inside the Block Diagram or Front Panel window and a palette list appears. Follow the arrow, pick an object from the subsequent palette, and place it on the appropriate window. The selected object can now be dragged to a different location on that window and is ready to be wired. Depending on what kind of object is selected, a graphical representation of the object appears on both windows. Of course, there are many exceptions to this rule. For example, a while loop can only be selected in the Block Diagram, and by itself a while loop has no graphical representation on the Front Panel window. Needless to say, LabVIEW also has keyboard combinations that expedite selecting and placing any given toolkit object onto the appropriate window. Each object has one (or several) wire connections going in as its input(s) and coming out as its output(s). A VI becomes functional when a minimum number of wires are appropriately connected to the inputs and outputs of one or more objects. Later, we will use an example to illustrate how a basic LabVIEW VI is created and executed.

Highlights

LabVIEW is a complete object-oriented development and test environment based on the G language. As such, it is a very powerful and complex environment. In this article we went through an introduction to LabVIEW and the main functionality of each of its icons by way of an actual interactive example. Accompanied by appropriate hardware (both NI products and many industry-standard test, measurement, and development hardware products), LabVIEW can cover everything from developing embedded systems to fuzzy logic and almost everything in between!

Summary

In this article we covered the basics of LabVIEW, from installation to an in-depth explanation of each element in the toolbar.

Resources for Article:

Further resources on this subject: Python Data Analysis Utilities [article], Data mining [article], PostgreSQL in Action [article]

Setting Up the Environment for ASP.NET MVC 6

Packt
02 Nov 2016
9 min read
In this article, Mugilan TS Raghupathi, author of the book Learning ASP.NET Core MVC Programming, explains the setup for getting started with programming in ASP.NET MVC 6. In any development project, it is vital to set up the right kind of development environment so that you can concentrate on developing the solution rather than solving environment or configuration problems. With respect to .NET, Visual Studio is the de facto standard IDE (Integrated Development Environment) for building web applications. In this article, you'll be learning about the following topics:

Purpose of an IDE
Different offerings of Visual Studio
Installation of Visual Studio Community 2015
Creating your first ASP.NET 5 project and the project structure

(For more resources related to this topic, see here.)

Purpose of an IDE

First of all, let us see why we need an IDE when you could type the code in Notepad, compile it, and execute it. When you develop a web application, you might need the following things in order to be productive:

Code editor: This is the text editor where you type your code. Your code editor should be able to recognize the constructs of your programming language, such as if conditions and for loops. In Visual Studio, all of your keywords are highlighted in blue.
IntelliSense: IntelliSense is a context-aware code-completion feature available in most modern IDEs, including Visual Studio. One example is that when you type a dot after an object, IntelliSense lists all the methods available on that object. This helps developers write code faster and more easily.
Build/Publish: It is helpful if you can build or publish the application using a single click or a single command. Visual Studio provides several options out of the box to build a separate project or to build the complete solution with a single click. This makes the build and deployment of your application easier.
Templates: Depending on the type of application, you might have to create different folders and files along with boilerplate code. So, it is very helpful if your IDE supports the creation of different kinds of templates. Visual Studio generates different kinds of templates with the code for ASP.NET Web Forms, MVC, and Web API to get you up and running.
Ease of adding items: Your IDE should allow you to add different kinds of items with ease. For example, you should be able to add an XML file without any issues. And if there is any problem with the structure of your XML file, it should be able to highlight the issue, along with information to help you fix it.

Visual Studio offerings

There are different versions of Visual Studio 2015 available to satisfy the various needs of developers and organizations. Primarily, there are four versions of Visual Studio 2015:

Visual Studio Community
Visual Studio Professional
Visual Studio Enterprise
Visual Studio Test Professional

System requirements

Visual Studio can be installed on computers running Windows 7 Service Pack 1 and above. You can find the complete list of requirements at the following URL: https://www.visualstudio.com/en-us/downloads/visual-studio-2015-system-requirements-vs.aspx

Visual Studio Community 2015

This is a fully featured IDE available for building desktop applications, web applications, and cloud services. It is available free of cost for individual users.
You can download Visual Studio Community from the following URL: https://www.visualstudio.com/en-us/products/visual-studio-community-vs.aspx

Throughout this book, we will be using the Visual Studio Community version for development, as it is available free of cost to individual developers.

Visual Studio Professional

As the name implies, Visual Studio Professional is targeted at professional developers and contains features such as CodeLens for improving your team's productivity. It also has features for greater collaboration within the team.

Visual Studio Enterprise

Visual Studio Enterprise is a full-blown version of Visual Studio with a complete set of features for collaboration, including Team Foundation Server, modeling, and testing.

Visual Studio Test Professional

Visual Studio Test Professional is primarily aimed at the testing team, or the people involved in testing, which might include developers. In any software development methodology (either the waterfall model or agile), developers need to execute the test cases for the code they are developing.

Installation of Visual Studio Community

Follow these steps to install Visual Studio Community 2015:

Visit the following link to download Visual Studio Community 2015: https://www.visualstudio.com/en-us/products/visual-studio-community-vs.aspx
Click on the Download Community 2015 button.
Save the file in a folder where you can retrieve it easily later.
Run the downloaded executable file.
Click on Run and the installation screen will appear. There are two types of installation: default and custom. Default installation installs the most commonly used features, and this will cover most developer use cases. Custom installation lets you choose the components you want to install.
Click on the Install button after selecting the installation type. Depending on your memory and processor speed, it will take 1 to 2 hours to install.
Once all the components are installed, you will see the Setup completed screen.

Installation of ASP.NET 5

When we install the Visual Studio Community 2015 edition, ASP.NET 5 is not installed by default. As an ASP.NET MVC 6 application runs on top of ASP.NET 5, we need to install ASP.NET 5. There are a couple of ways to install ASP.NET 5:

Get ASP.NET 5 from https://get.asp.net/
Install it from the New Project template in Visual Studio

The second option is a bit easier, as you don't need to search for the installer. The following are the detailed steps:

Create a new project by selecting File | New Project or using the shortcut Ctrl + Shift + N.
Select ASP.NET Web Application, enter the project name, and click on OK.
A window will appear for selecting the template. Select the Get ASP.NET 5 RC option.
When you click on OK in the preceding screen, a download dialog will appear.
When you click on the Run or Save button in that dialog, you will get a screen asking you to run the ASP.NET 5 Setup. Select the checkbox I agree to the license terms and conditions and click on the Install button.
Installation of ASP.NET 5 might take a couple of hours, and once it is completed you'll see a confirmation screen. During the installation of ASP.NET 5 RC1 Update 1, it might ask you to close Visual Studio. If asked, please do so.
Project structure in an ASP.NET 5 application

Once ASP.NET 5 RC1 is successfully installed, open Visual Studio, create a new project, and select the ASP.NET 5 Web Application template. A new project will be created with the structure described below.

File-based project

Whenever you add a file or folder in your file system (inside the ASP.NET 5 project folder), the changes will be automatically reflected in your project structure.

Support for full .NET and .NET Core

You will see a couple of references in the project: DNX 4.5.1 and DNX Core 5.0. DNX 4.5.1 provides the functionality of the full-blown .NET Framework, whereas DNX Core 5.0 supports only the core functionality, which would be used if you are deploying the application across platforms such as Apple OS X and Linux. The development and deployment of an ASP.NET MVC 6 application on a Linux machine will be explained in the book.

The Project.json package

Usually, in an ASP.NET web application, we have the assemblies as references and the list of references in a C# project file. But in an ASP.NET 5 application, we have a JSON file named Project.json, which contains all the necessary configuration along with its .NET dependencies in the form of NuGet packages. This makes dependency management easier. NuGet is a package manager provided by Microsoft, which makes package installation and uninstallation easier. Prior to NuGet, all the dependencies had to be installed manually. The dependencies section identifies the list of dependent packages available for the application. The frameworks section indicates the frameworks supported by the application. The scripts section identifies the scripts to be executed during the build process of the application. Include and exclude properties can be used in any section to include or exclude any item. (A minimal illustrative sketch of such a file appears after the Summary below.)

Controllers

This folder contains all of your controller files. Controllers are responsible for handling requests, communicating with the models, and generating the views for them.

Models

All of your classes representing the domain data will be present in this folder.

Views

Views are files that contain your frontend components and are presented to the end users of the application. This folder contains all of your Razor view files.

Migrations

Any database-related migrations will be available in this folder. Database migrations are C# files that contain the history of any database changes done through Entity Framework (an ORM framework). This will be explained in detail in the book.

The wwwroot folder

This folder acts as a root folder and is the ideal container for all of your static files, such as CSS and JavaScript files. All the files placed in the wwwroot folder can be accessed directly from the path without going through a controller.

Other files

The appsettings.json file is the config file where you can configure application-level settings. Bower, npm (Node Package Manager), and gulpfile.js are client-side technologies that are supported by ASP.NET 5 applications.

Summary

In this article, you have learnt about the different offerings of Visual Studio. Step-by-step instructions were provided for the installation of the Visual Studio Community version, which is freely available to individual developers. We have also discussed the new project structure of an ASP.NET 5 application and the changes compared to previous versions.
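Referring back to the Project.json section above, the following is a minimal, illustrative sketch of what such a file might contain. The exact package names and version numbers vary with the template and RC release; treat everything below as an assumption rather than the file Visual Studio generates:

    {
      "version": "1.0.0-*",
      "dependencies": {
        "Microsoft.AspNet.Mvc": "6.0.0-rc1-final",
        "Microsoft.AspNet.StaticFiles": "1.0.0-rc1-final"
      },
      "frameworks": {
        "dnx451": { },
        "dnxcore50": { }
      },
      "scripts": {
        "prepublish": [ "npm install", "bower install", "gulp clean", "gulp min" ]
      },
      "exclude": [ "wwwroot", "node_modules" ]
    }

The dependencies, frameworks, and scripts sections here correspond directly to the sections described in the text above.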
In this book, we are going to discuss controllers and their roles and functionalities. We'll also build a controller and its associated action methods and see how they work.

Resources for Article:

Further resources on this subject: Designing your very own ASP.NET MVC Application [article], Debugging Your .NET Application [article], Using ASP.NET Controls in SharePoint [article]

Magento Theme Distribution

Packt
02 Nov 2016
8 min read
"Invention is not enough. Tesla invented the electric power we use, but he struggled to get it out to people. You have to combine both things: invention and innovation focus, plus the company that can commercialize things and get them to people" – Larry Page In this article written by Fernando J Miguel, author of the book Magento 2 Theme Design Second Edition, you will learn the process of sharing, code hosting, validating, and publishing your subject as well as future components (extensions/modules) that you develop for Magento 2. (For more resources related to this topic, see here.) The following topics will be covered in this article: The packaging process Packaging your theme Hosting your theme The Magento marketplace The packaging process For every theme you develop for distribution in marketplaces and repositories through the sale and delivery of projects to clients and contractors of the service, you must follow some mandatory requirements for the theme to be packaged properly and consequently distributed to different Magento instances. Magento uses the composer.json file to define dependencies and information relevant to the developed component. Remember how the composer.json file is declared in the Bookstore theme: { "name": "packt/bookstore", "description": "BookStore theme", "require": { "php": "~5.5.0|~5.6.0|~7.0.0", "magento/theme-frontend-luma": "~100.0", "magento/framework": "~100.0" }, "type": "magento2-theme", "version": "1.0.0", "license": [ "OSL-3.0", "AFL-3.0" ], "autoload": { "files": [ "registration.php" ], "psr-4": { "Packt\BookStore\": "" } } } The main fields of the declaration components in the composer.json file are as follows: Name: A fully qualified component name Type: This declares the component type Autoload: This specifies the information necessary to be loaded in the component The three main types of Magento 2 component declarations can be described as follows: Module: Use the magento2-module type to declare modules that add to and/or modify functionalities in the Magento 2 system Theme: Use the magento2-theme type to declare themes in Magento 2 storefronts Language package: Use the magento2-language type to declare translations in the Magento 2 system Besides the composer.json file that must be declared in the root directory of your theme, you should follow these steps to meet the minimum requirements for packaging your new theme: Register the theme by declaring the registration.php file. Package the theme, following the standards set by Magento. Validate the theme before distribution. Publish the theme. From the minimum requirements mentioned, you already are familiar with the composer.json and registration.php files. Now we will look at the packaging process, validation, and publication in sequence. Packaging your theme By default, all themes should be compressed in ZIP format and contain only the root directory of the component developed, excluding any file and directory that is not part of the standard structure. 
The following command shows the compression standard used for Magento 2 components:

    zip -r vendor-name_package-name-1.0.0.zip package-path/* -x 'package-path/.git/*'

Here, the name of the ZIP file has the following components:

vendor-name: The vendor by which the theme was developed
package-name: The name of the package/component
1.0.0: The component version

After the component name, the command defines which directory will be compressed, followed by the -x parameter, which excludes the .git directory from the theme compression. How about applying ZIP compression to the Bookstore theme? To do this, follow these steps:

Using a terminal or Command Prompt, access the theme's root directory: <magento_root>/app/design/frontend/Packt/bookstore.
Run the zip -r packt-bookstore-bookstore.1.0.0.zip * -x '.git/*' command.

Upon successfully executing this command, you will have packed your theme. After this, you will validate your new Magento theme using a verification tool.

Magento component validation

The Magento developer community created the validate_m2_package script to perform validation of components developed for Magento 2. This script is available in the marketplace-tools directory of the Magento GitHub repository. According to the description, the idea behind Marketplace Tools is to house standalone tools that developers can use to validate and verify their extensions before submitting them to the Marketplace. Here's how to use the validation tool:

Download the validate_m2_package.php script, available at https://github.com/magento/marketplace-tools.
Move the script to the root directory of the Bookstore theme: <magento_root>/app/design/frontend/Packt/bookstore.
Open a terminal or Command Prompt.
Run the php validate_m2_package.php packt-bookstore-bookstore.1.0.0.zip command.

This command will validate the package you previously created with the ZIP command. If all goes well, you will get no response from the command line, which means that your package meets the minimum requirements for publication. If you wish, you can use the -d parameter, which enables you to debug your component by printing messages during verification. To use this option, run the following command:

    php validate_m2_package.php -d packt-bookstore-bookstore.1.0.0.zip

Hosting your theme

You can share your Magento theme and host your code on different services to achieve greater interaction with your team or even with the Magento development community. Remember that the standard version control software used by the Magento development community is Git. There are several widely used options on the market for distributing your code and sharing your work. Let's look at some of these options.

Hosting your project on GitHub and Packagist

The most common method of hosting your code/theme is to use GitHub. Once you have created a repository, you can get help from the Magento developer community if you are working on an open source project or even one for learning purposes. A major benefit of using GitHub is your portfolio and the publication of the Magento 2 projects you have developed, which will certainly make a difference when you are looking for employment opportunities and trying to get selected for new projects.
GitHub has a specific help area for users that provides a collection of documentation that developers may find useful. GitHub Help can be accessed directly at https://help.github.com/. To create a GitHub repository, you can consult the official documentation, available at https://help.github.com/articles/create-a-repo/. Once you have your project published on GitHub, you can use the Packagist (https://packagist.org/) service by creating a new account and entering the link of your GitHub package on Packagist. Packagist automatically collects information from the composer.json file available in the GitHub repository, creating your reference for use in other projects.

Hosting your project in a private repository

In some cases, you will be developing your project for private clients and companies. If you want to keep your version control private, you can use the following procedure:

Create your own Composer package repository using the Toran service (https://toranproxy.com/).
Create your package as previously described.
Send your package to your private repository.
Add the following to your composer.json file:

    {
      "repositories": [
        {
          "type": "composer",
          "url": [repository url here]
        }
      ]
    }

Magento Marketplace

According to Magento, Marketplace (https://marketplace.magento.com/) is the largest global e-commerce resource for applications and services that extend Magento solutions with powerful new features and functionality. Once you have completed developing the first version of your theme, you can upload your project to become part of the official Magento Marketplace. In addition to theme uploads, Magento Marketplace also allows you to upload shared packages and extensions (modules). To learn more about shared packages, visit http://docs.magento.com/marketplace/user_guide/extensions/shared-package-submit.html.

Submitting your theme

After the compression and validation processes, you can send your project for distribution on Magento Marketplace. For this, you should confirm an account on the developer portal (https://developer.magento.com/customer/account/) with a valid e-mail and personal information about the scope of your activities. After this confirmation, you will have access to the extensions area at https://developer.magento.com/extension/extension/list/, where you will find options to submit themes and extensions. After clicking on the Add Theme button, you will need to answer a questionnaire covering the following:

Which Magento platform your theme will work on
The name of your theme
Whether your theme will have additional services
Additional functionalities your theme has
What makes your theme unique

After the questionnaire, you will need to fill in the details of your extension, as follows:

Extension title
Public version
Package file (upload)

The submitted theme will be evaluated by a technical review, and you will be able to follow the evaluation progress through your e-mail and the control panel of the Magento developer area. You can find more information about Magento Marketplace at the following link: http://docs.magento.com/marketplace/user_guide/getting-started.html

Summary

In this article, you learned about the theme-packaging process as well as validation against the minimum requirements for publication on Magento Marketplace. You are now ready to develop your solutions! There is still a lot of work left, but I encourage you to find your way as a Magento theme developer by putting a lot of study, research, and application into the area.
Participate in events, be collaborative, and count on the community's support. Good luck and success in your career path!

Resources for Article:

Further resources on this subject: Installing Magento [article], Social Media and Magento [article], Magento 2 – the New E-commerce Era [article]

Supervision and Monitoring

Packt
02 Nov 2016
8 min read
In this article by Piyush Mishra, author of the book Akka Cookbook, we will learn about supervision and monitoring of Akka actors. (For more resources related to this topic, see here.) Using supervision and monitoring, we can write fault-tolerant systems, which can run continuously for days, months, and years without stopping. Fault tolerance is a property of systems that are intended to remain responsive rather than fail completely in case of a failure. Such systems are known as fault-tolerant systems or resilient systems. In simple words, a fault-tolerant system is one which is destined to continue as more or less fully operational, with perhaps a reduction in throughput or an increase in response time, because of the partial failure of its components. Even if a component fails, the whole system never shuts down; instead, it remains operational and responsive with just a decreased throughput. Similarly, while designing a distributed system, we need to care about what would happen if one or more of its components go down, so the design itself should allow the system to take appropriate action to resolve the issue. In this article, we will cover the following recipes:

Creating child actors of a parent actor
Overriding the life cycle hooks of an actor
Sending messages to actors and collecting responses

Creating child actors of a parent actor

In this recipe, we will learn how to create child actors of an actor. Akka follows a tree-like structure to create actors, and it is also the recommended practice. By following such a practice, we can handle failures in child actors because the parent can take care of them. Let's see how to do it.

Getting ready

We need to import the Hello-Akka project into the IDE of our choice. The Akka actor dependency that we added in build.sbt is sufficient for most of the recipes in this article, so we will skip the Getting ready section in our further recipes.

How to do it…

Create a file named ParentChild.scala in the package com.packt.chapter2. Add the following imports to the top of the file:

    import akka.actor.{ActorSystem, Props, Actor}

Create messages for sending to actors:

    case object CreateChild
    case class Greet(msg: String)

Define a child actor as follows:

    class ChildActor extends Actor {
      def receive = {
        case Greet(msg) =>
          println(s"My parent[${self.path.parent}] greeted to me [${self.path}] $msg")
      }
    }

Define a parent actor as follows, and create a child actor in its context:

    class ParentActor extends Actor {
      def receive = {
        case CreateChild =>
          val child = context.actorOf(Props[ChildActor], "child")
          child ! Greet("Hello Child")
      }
    }

Create an application object as shown next:

    object ParentChild extends App {
      val actorSystem = ActorSystem("Supervision")
      val parent = actorSystem.actorOf(Props[ParentActor], "parent")
      parent ! CreateChild
    }

Run the preceding application, and you will get the following output:

    My parent[akka://Supervision/user/parent] greeted to me [akka://Supervision/user/parent/child] Hello Child

How it works…

In this recipe, we created a child actor, which receives a Greet message from the parent actor. We see the parent actor create a child actor using context.actorOf. This method creates a child actor under the parent actor. We can clearly see the path of the actor in the output.

Overriding life cycle hooks of an actor

Since we are talking about supervision and monitoring of actors, you should understand the life cycle hooks of an actor.
In this recipe, you will learn how to override the life cycle hooks of an actor: preStart, postStop, preRestart, and postRestart.

How to do it…

Create a file called ActorLifeCycle.scala in the package com.packt.chapter2. Add the following imports to the top of the file:

    import akka.actor._
    import akka.actor.SupervisorStrategy._
    import akka.pattern.ask
    import akka.util.Timeout
    import scala.concurrent.Await
    import scala.concurrent.duration._

Create the following messages to be sent to the actors:

    case object Error
    case class StopActor(actorRef: ActorRef)

Create an actor as follows, and override the life cycle methods:

    class LifeCycleActor extends Actor {
      var sum = 1

      override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
        println(s"sum in preRestart is $sum")
      }

      override def preStart(): Unit = println(s"sum in preStart is $sum")

      def receive = {
        case Error => throw new ArithmeticException()
        case _ => println("default msg")
      }

      override def postStop(): Unit = {
        println(s"sum in postStop is ${sum * 3}")
      }

      override def postRestart(reason: Throwable): Unit = {
        sum = sum * 2
        println(s"sum in postRestart is $sum")
      }
    }

Create a supervisor actor as follows:

    class Supervisor extends Actor {
      override val supervisorStrategy =
        OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1 minute) {
          case _: ArithmeticException => Restart
          case t => super.supervisorStrategy.decider.applyOrElse(t, (_: Any) => Escalate)
        }

      def receive = {
        case (props: Props, name: String) => sender ! context.actorOf(props, name)
        case StopActor(actorRef) => context.stop(actorRef)
      }
    }

Create a test application as shown next, and run it:

    object ActorLifeCycle extends App {
      implicit val timeout = Timeout(2 seconds)
      val actorSystem = ActorSystem("Supervision")
      val supervisor = actorSystem.actorOf(Props[Supervisor], "supervisor")
      val childFuture = supervisor ? (Props(new LifeCycleActor), "LifeCycleActor")
      val child = Await.result(childFuture.mapTo[ActorRef], 2 seconds)
      child ! Error
      Thread.sleep(1000)
      supervisor ! StopActor(child)
    }

On running the preceding test application, you will get the following output:

    sum in preStart is 1
    sum in preRestart is 1
    sum in postRestart is 2
    [ERROR] [07/01/2016 00:49:57.568] [Supervision-akka.actor.default-dispatcher-5] [akka://Supervision/user/supervisor/LifeCycleActor] null
    java.lang.ArithmeticException
        at com.packt.chapter2.LifeCycleActor$$anonfun$receive$2.applyOrElse(ActorLifeCycle.scala:51)
    sum in postStop is 6

How it works…

In the preceding recipe, we create an actor which maintains sum as its state, and we override its life cycle hooks. We create this actor under the parent supervisor, which handles the ArithmeticException thrown in the child actor. Let's see what happens in the life cycle hooks. When an actor starts, it calls the preStart method, so we see the output "sum in preStart is 1". When an actor throws an exception, it sends a message to the supervisor, and the supervisor handles the failure by restarting that actor.
Restarting clears out the accumulated state of the actor and creates a fresh new instance; the preRestart hook is called on the old instance before it is replaced. After that, the postRestart method is called on the new instance, and whenever the actor stops, its postStop hook is called.

Sending messages to actors and collecting responses

In this recipe, you will learn how a parent sends messages to its children and collects responses from them. To step through this recipe, we need to import the Hello-Akka project into the IDE.

How to do it…

Create a file, SendMesagesToChilds.scala, in the package com.packt.chapter2. Add the following imports to the top of the file:

    import akka.actor.{ Props, ActorSystem, Actor, ActorRef }

Create messages to be sent to the actors as follows:

    case class DoubleValue(x: Int)
    case object CreateChild
    case object Send
    case class Response(x: Int)

Define a child actor. It doubles the value sent to it:

    class DoubleActor extends Actor {
      def receive = {
        case DoubleValue(number) =>
          println(s"${self.path.name} Got the number $number")
          sender ! Response(number * 2)
      }
    }

Define a parent actor. It creates child actors in its context, sends messages to them, and collects responses from them:

    class ParentActor extends Actor {
      val random = new scala.util.Random
      var childs = scala.collection.mutable.ListBuffer[ActorRef]()

      def receive = {
        case CreateChild =>
          childs ++= List(context.actorOf(Props[DoubleActor]))
        case Send =>
          println(s"Sending messages to child")
          childs.zipWithIndex map {
            case (child, value) => child ! DoubleValue(random.nextInt(10))
          }
        case Response(x) =>
          println(s"Parent: Response from child ${sender.path.name} is $x")
      }
    }

Create a test application as follows, and run it:

    object SendMessagesToChild extends App {
      val actorSystem = ActorSystem("Hello-Akka")
      val parent = actorSystem.actorOf(Props[ParentActor], "parent")
      parent ! CreateChild
      parent ! CreateChild
      parent ! CreateChild
      parent ! Send
    }

On running the preceding test application, you will get the following output:

    $b Got the number 6
    $a Got the number 5
    $c Got the number 8
    Parent: Response from child $a is 10
    Parent: Response from child $b is 12
    Parent: Response from child $c is 16

How it works…

In this last recipe, we create a child actor called DoubleActor, which doubles the value it gets. We also create a parent actor, which creates a child actor when it receives a CreateChild message and maintains it in a list. When the parent actor receives the Send message, it sends a random number to each child, and each child, in turn, sends a response to the parent.

Summary

In this article, you learned how to supervise and monitor Akka actors as well as how to create child actors of an actor. We also discussed how to override the life cycle hooks of an actor. Lastly, you learned how a parent sends messages to its children and collects responses from them.

Resources for Article:

Further resources on this subject: Introduction to Akka [article], Creating First Akka Application [article], Making History with Event Sourcing [article]

Getting Started with Python Packages

Packt
02 Nov 2016
37 min read
In this article by Luca Massaron and Alberto Boschetti, the authors of the book Python Data Science Essentials - Second Edition, we will cover the steps for installing Python, the different installation packages, and have a glance at the essential packages that will constitute a complete data science toolbox. (For more resources related to this topic, see here.)

Whether you are an eager learner of data science or a well-grounded data science practitioner, you can take advantage of this essential introduction to Python for data science. You can use it to the fullest if you already have at least some previous experience in basic coding, in writing general-purpose computer programs in Python, or in some other data-analysis-specific language such as MATLAB or R.

Introducing data science and Python

Data science is a relatively new knowledge domain, though its core components have been studied and researched for many years by the computer science community. Its components include linear algebra, statistical modelling, visualization, computational linguistics, graph analysis, machine learning, business intelligence, and data storage and retrieval. Data science is a new domain and you have to take into consideration that currently its frontiers are still somewhat blurred and dynamic. Since data science is made up of various constituent disciplines, please also keep in mind that there are different profiles of data scientists depending on their competencies and areas of expertise. In such a situation, what can be the best tool of the trade that you can learn and effectively use in your career as a data scientist? We believe that the best tool is Python, and we intend to provide you with all the essential information that you will need for a quick start. In addition, other tools such as R and MATLAB provide data scientists with specialized tools to solve specific problems in statistical analysis and matrix manipulation in data science. However, only Python really completes your data scientist skill set. This multipurpose language is suitable for both development and production alike; it can handle small- to large-scale data problems and it is easy to learn and grasp no matter what your background or experience is. Created in 1991 as a general-purpose, interpreted, and object-oriented language, Python has slowly and steadily conquered the scientific community and grown into a mature ecosystem of specialized packages for data processing and analysis. It allows you to carry out countless fast experiments, develop theories easily, and promptly deploy scientific applications. At present, the core Python characteristics that render it an indispensable data science tool are as follows:

It offers a large, mature system of packages for data analysis and machine learning. It guarantees that you will get all that you may need in the course of a data analysis, and sometimes even more.
Python can easily integrate different tools and offers a truly unifying ground for different languages, data strategies, and learning algorithms that can be fitted together easily and which can concretely help data scientists forge powerful solutions. There are packages that allow you to call code in other languages (Java, C, FORTRAN, R, or Julia), outsourcing some of the computations to them and improving your script performance.
It is very versatile. No matter what your programming background or style is (object-oriented, procedural, or even functional), you will enjoy programming with Python.
It is cross-platform; your solutions will work perfectly and smoothly on Windows, Linux, and Mac OS systems. You won't have to worry all that much about portability.
Although interpreted, it is undoubtedly fast compared to other mainstream data analysis languages such as R and MATLAB (though it is not comparable to C, Java, and the newly emerged Julia language). Moreover, there are also static compilers such as Cython or just-in-time compilers such as PyPy that can transform Python code into C for higher performance.
It can work with large in-memory data because of its minimal memory footprint and excellent memory management. The memory garbage collector will often save the day when you load, transform, dice, slice, save, or discard data using various iterations and reiterations of data wrangling.
It is very simple to learn and use. After you grasp the basics, there's no better way to learn more than by immediately starting with the coding. Moreover, the number of data scientists using Python is continuously growing: new packages and improvements are released by the community every day, making the Python ecosystem an increasingly prolific and rich environment for data science.

Installing Python

First, let's proceed to introduce all the settings you need in order to create a fully working data science environment to test the examples and experiment with the code that we are going to provide you with. Python is an open source, object-oriented, and cross-platform programming language. Compared to some of its direct competitors (for instance, C++ or Java), Python is very concise. It allows you to build a working software prototype in a very short time. Yet it has become the most used language in the data scientist's toolbox not just because of that. It is also a general-purpose language, and it is very flexible due to the variety of available packages that solve a wide spectrum of problems and necessities.

Python 2 or Python 3?

There are two main branches of Python: 2.7.x and 3.x. At the time of writing this article, the Python Foundation (www.python.org) is offering downloads for Python versions 2.7.11 and 3.5.1. Although the 3.x branch is the newest, the older one is still the most used version in the scientific area, since a few packages (check the website py3readiness.org for a compatibility overview) won't run on it yet. In addition, there is no immediate backward compatibility between Python 3 and 2. In fact, if you try to run some code developed for Python 2 with a Python 3 interpreter, it may not work. Major changes have been made to the newest version, and that has affected past compatibility. Some data scientists, having built most of their work on Python 2 and its packages, are reluctant to switch to the new version. We intend to address a larger audience of data scientists, data analysts, and developers, who may not have such a strong legacy with Python 2. Thus, we agreed that it would be better to work with Python 3 rather than the older version. We suggest using a version such as Python 3.4 or above. After all, Python 3 is the present and the future of Python. It is the only version that will be further developed and improved by the Python Foundation, and it will be the default version of the future on many operating systems. Anyway, if you are currently working with version 2 and you prefer to keep on working with it, you can still run the examples.
In fact, for the most part, our code will simply work on Python 2 after the code itself is preceded by these imports:

    from __future__ import (absolute_import, division, print_function, unicode_literals)
    from builtins import *
    from future import standard_library
    standard_library.install_aliases()

The from __future__ import commands should always occur at the beginning of your scripts, or else you may experience Python reporting an error. As described on the Python-future website (python-future.org), these imports will help convert several Python 3-only constructs to a form compatible with both Python 3 and Python 2 (and in any case, most Python 3 code should simply work on Python 2 even without the aforementioned imports). In order to run the above commands successfully, if the future package is not already available on your system, you should install it (version >= 0.15.2) using the following command, to be executed from a shell:

    $> pip install -U future

If you're interested in understanding the differences between Python 2 and Python 3 further, we recommend reading the wiki page offered by the Python Foundation itself: wiki.python.org/moin/Python2orPython3.

Step-by-step installation

Novice data scientists who have never used Python (and who likely don't have the language readily installed on their machines) need to first download the installer from the main website of the project, www.python.org/downloads/, and then install it on their local machine. We will now cover the steps which will provide you with full control over what can be installed on your machine. This is very useful when you have to set up single machines to deal with different tasks in data science. Anyway, please be warned that a step-by-step installation really takes time and effort. Instead, installing a ready-made scientific distribution will lessen the burden of the installation procedures and may be well suited for getting started and learning, because it saves you time and sometimes even trouble, though it will put a large number of packages (most of which we won't use) on your computer all at once. Python being a multiplatform programming language, you'll find installers for machines that run either Windows or Unix-like operating systems. Please remember that some of the latest versions of most Linux distributions (such as CentOS, Fedora, Red Hat Enterprise, and Ubuntu) have Python 2 packaged in the repository. In such a case, or in the case that you already have a Python version on your computer (since our examples run on Python 3), you first have to check exactly which version you are running. To do such a check, just follow these instructions: Open a Python shell by typing python in the terminal, or click on any Python icon you find on your system. Then, after Python has started, test the installation by running the following code in the Python interactive shell or REPL:

    >>> import sys
    >>> print (sys.version_info)

If you can read that your Python version has the major=2 attribute, it means that you are running a Python 2 instance. Otherwise, if the attribute has the value 3, or if the print statement reports back something like v3.x.x (for instance v3.5.1), you are running the right version of Python and you are ready to move forward. To clarify the operations we have just mentioned, when a command is given in the terminal command line, we prefix the command with $>. Otherwise, if it's for the Python REPL, it's preceded by >>>.
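As a small illustration of the version check just described (this snippet is ours, not from the book), you can combine the test with the Python 3.4+ recommendation made earlier:

    >>> # Illustrative only: compare the running interpreter against the 3.4+ recommendation.
    >>> import sys
    >>> if sys.version_info >= (3, 4):
    ...     print("OK: running Python %d.%d.%d" % sys.version_info[:3])
    ... elif sys.version_info[0] == 2:
    ...     print("Python 2 detected; remember the __future__/builtins imports shown above.")
    ... else:
    ...     print("Python 3.4 or newer is recommended for these examples.")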
The installation of packages

Python won't come bundled with everything you need unless you get a specific premade distribution. Therefore, to install the packages you need, you can use either pip or easy_install. Both tools run in the command line and make the process of installation, upgrade, and removal of Python packages a breeze. To check which tools have been installed on your local machine, run the following command:

    $> pip

To install pip, follow the instructions given at pip.pypa.io/en/latest/installing.html. Alternatively, you can also run this command:

    $> easy_install

If both of these commands end with an error, you need to install one of them. We recommend that you use pip because it is thought of as an improvement over easy_install. Moreover, easy_install is going to be dropped in the future, and pip has important advantages over it. It is preferable to install everything using pip because:

It is the preferred package manager for Python 3. Starting with Python 2.7.9 and Python 3.4, it is included by default with the Python binary installers.
It provides an uninstall functionality.
It rolls back and leaves your system clean if, for whatever reason, the package installation fails.

Using easy_install in spite of pip's advantages makes sense if you are working on Windows, because pip won't always install pre-compiled binary packages. Sometimes it will try to build the package's extensions directly from C source, thus requiring a properly configured compiler (and that's not an easy task on Windows). This depends on whether the package is distributed as eggs (pip cannot directly use their binaries, but needs to build from their source code) or as wheels (in this case, pip can install binaries if available, as explained at pythonwheels.com/). Instead, easy_install will always install available binaries from both eggs and wheels. Therefore, if you are experiencing unexpected difficulties installing a package, easy_install can save your day (at some price anyway, as we just mentioned in the list).

The most recent versions of Python should already have pip installed by default, so you may already have it on your system. If not, the safest way is to download the get-pip.py script from bootstrap.pypa.io/get-pip.py and then run it using the following:

    $> python get-pip.py

The script will also install the setup tool from pypi.python.org/pypi/setuptools, which also contains easy_install. You're now ready to install the packages you need in order to run the examples provided in this article. To install the <package-name> generic package, you just need to run this command:

    $> pip install <package-name>

Alternatively, you can run the following command:

    $> easy_install <package-name>

Note that on some systems, pip might be named pip3 and easy_install might be named easy_install-3 to stress the fact that both operate on packages for Python 3. If you're unsure, check the version of Python pip is operating on with:

    $> pip -V

For easy_install, the command is slightly different:

    $> easy_install --version

After this, the <package-name> package and all its dependencies will be downloaded and installed. If you're not certain whether a library has been installed or not, just try to import a module from it. If the Python interpreter raises an ImportError, it can be concluded that the package has not been installed.
This is what happens when the NumPy library has been installed:

    >>> import numpy

This is what happens if it's not installed:

    >>> import numpy
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: No module named numpy

In the latter case, you'll need to install it first through pip or easy_install. Take care that you don't confuse packages with modules. With pip, you install a package; in Python, you import a module. Sometimes, the package and the module have the same name, but in many cases, they don't match. For example, the sklearn module is included in the package named Scikit-learn. Finally, to search and browse the packages available for Python, look at pypi.python.org.

Package upgrades

More often than not, you will find yourself in a situation where you have to upgrade a package because either the new version is required by a dependency or it has additional features that you would like to use. First, check the version of the library you have installed by glancing at the __version__ attribute, as shown in the following example with numpy:

    >>> import numpy
    >>> numpy.__version__ # 2 underscores before and after
    '1.9.2'

Now, if you want to update it to a newer release, say version 1.11.0, you can run the following command from the command line:

    $> pip install -U numpy==1.11.0

Alternatively, you can use the following command:

    $> easy_install --upgrade numpy==1.11.0

Finally, if you're interested in upgrading it to the latest available version, simply run this command:

    $> pip install -U numpy

You can alternatively run the following command:

    $> easy_install --upgrade numpy

Scientific distributions

As you've read so far, creating a working environment is a time-consuming operation for a data scientist. You first need to install Python and then, one by one, install all the libraries that you will need (sometimes, the installation procedures may not go as smoothly as you'd hoped). If you want to save time and effort and want to ensure that you have a fully working Python environment that is ready to use, you can just download, install, and use a scientific Python distribution. Apart from Python, these distributions also include a variety of preinstalled packages, and sometimes they even have additional tools and an IDE. A few of them are very well known among data scientists, and in the following content, you will find some of the key features of each of them. We suggest that you promptly download and install a scientific distribution, such as Anaconda (which is the most complete one).

Anaconda (continuum.io/downloads) is a Python distribution offered by Continuum Analytics that includes nearly 200 packages, comprising NumPy, SciPy, pandas, Jupyter, Matplotlib, Scikit-learn, and NLTK. It's a cross-platform distribution (Windows, Linux, and Mac OS X) that can be installed on machines with other existing Python distributions and versions. Its base version is free, while add-ons that contain advanced features are charged separately. Anaconda introduces conda, a binary package manager, as a command-line tool to manage your package installations. As stated on the website, Anaconda's goal is to provide an enterprise-ready Python distribution for large-scale processing, predictive analytics, and scientific computing.

Leveraging conda to install packages

If you've decided to install an Anaconda distribution, you can take advantage of the conda binary installer we mentioned previously.
Anyway, conda is an open source package management system, and consequently it can be installed separately from an Anaconda distribution. You can test immediately whether conda is available on your system. Open a shell and type: $> conda -V If conda is available, your conda version will be displayed; otherwise, an error will be reported. If conda is not available, you can quickly install it on your system by going to conda.pydata.org/miniconda.html and installing the Miniconda software suitable for your computer. Miniconda is a minimal installation that only includes conda and its dependencies. conda can help you manage two tasks: installing packages and creating virtual environments. In this paragraph, we will explore how conda can help you easily install most of the packages you may need in your data science projects. Before starting, please make sure that you have the latest version of conda at hand: $> conda update conda Now you can install any package you need. To install the <package-name> generic package, you just need to run the following command: $> conda install <package-name> You can also install a particular version of the package just by specifying it: $> conda install <package-name>=1.11.0 Similarly, you can install multiple packages at once by listing all their names: $> conda install <package-name-1> <package-name-2> If you just need to update a package that you previously installed, you can keep on using conda: $> conda update <package-name> You can update all the available packages simply by using the --all argument: $> conda update --all Finally, conda can also uninstall packages for you: $> conda remove <package-name> If you would like to know more about conda, you can read its documentation at conda.pydata.org/docs/index.html. In summary, as a main advantage, it handles binaries even better than easy_install (by always providing a successful installation on Windows without any need to compile the packages from source) but without its problems and limitations. With the use of conda, packages are easy to install (and installation is always successful), update, and even uninstall. On the other hand, conda cannot install directly from a git server (so it cannot access the latest version of many packages under development) and it doesn't cover all the packages available on PyPI as pip does.
Enthought Canopy
Enthought Canopy (enthought.com/products/canopy) is a Python distribution by Enthought Inc. It includes more than 200 preinstalled packages, such as NumPy, SciPy, Matplotlib, Jupyter, and pandas. This distribution is targeted at engineers, data scientists, quantitative and data analysts, and enterprises. Its base version is free (and is named Canopy Express), but if you need advanced features, you have to buy the full version. It's a multiplatform distribution and its command-line install tool is canopy_cli.
PythonXY
PythonXY (python-xy.github.io) is a free, open source Python distribution maintained by the community. It includes a number of packages, such as NumPy, SciPy, NetworkX, Jupyter, and Scikit-learn. It also includes Spyder, an interactive development environment inspired by the MATLAB IDE. The distribution is free. It works only on Microsoft Windows, and its command-line installation tool is pip.
WinPython
WinPython (winpython.sourceforge.net) is also a free, open-source Python distribution maintained by the community. It is designed for scientists, and includes many packages such as NumPy, SciPy, Matplotlib, and Jupyter.
It also includes Spyder as an IDE. It is free and portable. You can put WinPython into any directory, or even onto a USB flash drive, and at the same time maintain multiple copies and versions of it on your system. It works only on Microsoft Windows, and its command-line tool is the WinPython Package Manager (WPPM).
Explaining virtual environments
Whether you have chosen to install a stand-alone Python or a scientific distribution, you may have noticed that you are bound, on your system, to the version of Python you have installed. The only exception, for Windows users, is to use a WinPython distribution, since it is a portable installation and you can have as many different installations as you need. A simple solution to break free of such a limitation is to use virtualenv, a tool for creating isolated Python environments. That means, by using different Python environments, you can easily achieve these things: Testing any new package installation or doing experimentation on your Python environment without any fear of breaking anything in an irreparable way. In this case, you need a version of Python that acts as a sandbox. Having at hand multiple Python versions (both Python 2 and Python 3), geared with different versions of installed packages. This can help you in dealing with different versions of Python for different purposes (for instance, some of the packages we are going to present on Windows OS only work using Python 3.4, which is not the latest release). Taking a replicable snapshot of your Python environment easily and having your data science prototypes work smoothly on any other computer or in production. In this case, your main concern is the immutability and replicability of your working environment. You can find documentation about virtualenv at virtualenv.readthedocs.io/en/stable, though we are going to provide you with all the directions you need to start using it immediately. In order to take advantage of virtualenv, you first have to install it on your system: $> pip install virtualenv After the installation completes, you can start building your virtual environments. Before proceeding, you have to make a few decisions: If you have multiple versions of Python installed on your system, you have to decide which version to pick up. Otherwise, virtualenv will use the Python version it was installed with on your system. In order to set a different Python version, you have to provide the argument -p followed by the version of Python you want, or the path of the Python executable to be used (for instance, -p python2.7, or a direct path to a Python executable such as -p c:\Anaconda2\python.exe). When required to install a certain package, virtualenv will install it from scratch, even if it is already available at a system level (in the Python directory you created the virtual environment from). This default behavior makes sense because it allows you to create a completely separated empty environment. In order to save disk space and limit the installation time of all the packages, you may instead decide to take advantage of already available packages on your system by using the argument --system-site-packages. You may want to be able to later move your virtual environment across Python installations, even among different machines. Therefore, you may want to make the functioning of all of the environment's scripts relative to the path it is placed in by using the argument --relocatable. A combined invocation illustrating these options is sketched next.
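As an illustration only (the python3.4 interpreter name and the sandbox directory name are our own assumptions and will differ on your machine), creating an environment that reuses the system packages, and then making it relocatable once it exists, could look like this:
$> virtualenv -p python3.4 --system-site-packages sandbox
$> virtualenv --relocatable sandbox
Note that --relocatable is typically applied to an environment that has already been created.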
After deciding on the Python version, the linking to existing global packages, and the relocatability of the virtual environment, in order to start, you just launch the command from a shell. Declare the name you would like to assign to your new environment: $> virtualenv clone virtualenv will just create a new directory using the name you provided, in the path from which you actually launched the command. To start using it, you just enter the directory and type activate: $> cd clone $> activate At this point, you can start working on your separated Python environment, installing packages and working with code. If you need to install multiple packages at once, you may need a special pip command, pip freeze, which will list all the packages (and their versions) that you have installed on your system. You can record the entire list in a text file with this command: $> pip freeze > requirements.txt After saving the list in a text file, just take it into your virtual environment and install all the packages in a breeze with a single command: $> pip install -r requirements.txt Each package will be installed according to the order in the list (packages are listed in a case-insensitive sorted order). If a package requires other packages that are later in the list, that's not a big deal because pip automatically manages such situations. So if your package requires NumPy and NumPy is not yet installed, pip will install it first. When you're finished installing packages and using your environment for scripting and experimenting, in order to return to your system defaults, just issue this command: $> deactivate If you want to remove the virtual environment completely, after deactivating and getting out of the environment's directory, you just have to get rid of the environment's directory itself by a recursive deletion. For instance, on Windows you just do this: $> rd /s /q clone On Linux and Mac, the command will be: $> rm -r -f clone If you are working extensively with virtual environments, you should consider using virtualenvwrapper, which is a set of wrappers for virtualenv that helps you manage multiple virtual environments easily. It can be found at bitbucket.org/dhellmann/virtualenvwrapper. If you are operating on a Unix system (Linux or OS X), another solution worth mentioning is pyenv (which can be found at https://github.com/yyuu/pyenv). It lets you set your main Python version, allows the installation of multiple versions, and creates virtual environments. Its peculiarity is that it does not depend on Python being installed and works perfectly at the user level (no need for sudo commands).
conda for managing environments
If you have installed the Anaconda distribution, or you have tried conda using a Miniconda installation, you can also take advantage of the conda command to run virtual environments as an alternative to virtualenv. Let's see in practice how to use conda for that. We can check what environments we have available like this: $> conda info -e This command will report what environments you can use on your system based on conda. Most likely, your only environment will be just "root", pointing to your Anaconda distribution's folder. As an example, we can create an environment based on Python version 3.4, having all the necessary Anaconda-packaged libraries installed. That makes sense, for instance, for using the package Theano together with Python 3 on Windows (because of an issue we will explain in a few paragraphs).
In order to create such an environment, just do: $> conda create -n python34 python=3.4 anaconda The command asks for a particular python version (3.4) and requires the installation of all packages available on the anaconda distribution (the argument anaconda). It names the environment as python34 using the argument –n. The complete installation should take a while, given the large number of packages in the Anaconda installation. After having completed all of the installation, you can activate the environment: $> activate python34 If you need to install additional packages to your environment, when activated, you just do: $> conda install -n python34 <package-name1> <package-name2> That is, you make the list of the required packages follow the name of your environment. Naturally, you can also use pip install, as you would do in a virtualenv environment. You can also use a file instead of listing all the packages by name yourself. You can create a list in an environment using the list argument and piping the output to a file: $> conda list -e > requirements.txt Then, in your target environment, you can install the entire list using: $> conda install --file requirements.txt You can even create an environment, based on a requirements' list: $> conda create -n python34 python=3.4 --file requirements.txt Finally, after having used the environment, to close the session, you simply do this: $> deactivate Contrary to virtualenv, there is a specialized argument in order to completely remove an environment from your system: $> conda remove -n python34 --all A glance at the essential packages We mentioned that the two most relevant characteristics of Python are its ability to integrate with other languages and its mature package system, which is well embodied by PyPI (the Python Package Index: pypi.python.org/pypi), a common repository for the majority of Python open source packages that is constantly maintained and updated. The packages that we are now going to introduce are strongly analytical and they will constitute a complete Data Science Toolbox. All the packages are made up of extensively tested and highly optimized functions for both memory usage and performance, ready to achieve any scripting operation with successful execution. A walkthrough on how to install them is provided next. Partially inspired by similar tools present in R and MATLAB environments, we will together explore how a few selected Python commands can allow you to efficiently handle data and then explore, transform, experiment, and learn from the same without having to write too much code or reinvent the wheel. NumPy NumPy, which is Travis Oliphant's creation, is the true analytical workhorse of the Python language. It provides the user with multidimensional arrays, along with a large set of functions to operate a multiplicity of mathematical operations on these arrays. Arrays are blocks of data arranged along multiple dimensions, which implement mathematical vectors and matrices. 
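As a small taste of what this looks like in practice (a minimal sketch of ours, assuming NumPy is already installed), creating an array and applying a vectorized operation takes just a few lines:
import numpy

# A 2 x 3 array (matrix) built from a nested Python list
data = numpy.array([[1, 2, 3], [4, 5, 6]])
# Vectorized operations apply to every element at once, with no explicit loop
print(data * 2)       # doubles every element
print(data.mean())    # 3.5, the mean of all six elements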
Characterized by optimal memory allocation, arrays are useful not just for storing data, but also for fast matrix operations (vectorization), which are indispensable when you wish to solve ad hoc data science problems: Website: www.numpy.org Version at the time of print: 1.11.0 Suggested install command: pip install numpy As a convention largely adopted by the Python community, when importing NumPy, it is suggested that you alias it as np: import numpy as np
SciPy
An original project by Travis Oliphant, Pearu Peterson, and Eric Jones, SciPy completes NumPy's functionalities, offering a larger variety of scientific algorithms for linear algebra, sparse matrices, signal and image processing, optimization, fast Fourier transformation, and much more: Website: www.scipy.org Version at the time of print: 0.17.1 Suggested install command: pip install scipy
pandas
The pandas package deals with everything that NumPy and SciPy cannot do. Thanks to its specific data structures, namely DataFrames and Series, pandas allows you to handle complex tables of data of different types (which is something that NumPy's arrays cannot do) and time series. Thanks to Wes McKinney's creation, you will be able to easily and smoothly load data from a variety of sources. You can then slice, dice, handle missing elements, add, rename, aggregate, reshape, and finally visualize your data at will: Website: pandas.pydata.org Version at the time of print: 0.18.1 Suggested install command: pip install pandas Conventionally, pandas is imported as pd: import pandas as pd
Scikit-learn
Started as part of the SciKits (SciPy Toolkits), Scikit-learn is the core of data science operations on Python. It offers all that you may need in terms of data preprocessing, supervised and unsupervised learning, model selection, validation, and error metrics. Scikit-learn started in 2007 as a Google Summer of Code project by David Cournapeau. Since 2013, it has been taken over by the researchers at INRIA (the French Institute for Research in Computer Science and Automation): Website: scikit-learn.org/stable Version at the time of print: 0.17.1 Suggested install command: pip install scikit-learn Note that the imported module is named sklearn.
Jupyter
A scientific approach requires the fast experimentation of different hypotheses in a reproducible fashion. Initially named IPython and limited to working only with the Python language, Jupyter was created by Fernando Perez in order to address the need for an interactive Python command shell (which is based on shell, web browser, and the application interface), with graphical integration, customizable commands, rich history (in the JSON format), and computational parallelism for enhanced performance. Jupyter is our favoured choice; it is used to clearly and effectively illustrate operations with scripts and data, and the consequent results: Website: jupyter.org Version at the time of print: 1.0.0 (ipykernel = 4.3.1) Suggested install command: pip install jupyter
Matplotlib
Originally developed by John Hunter, matplotlib is a library that contains all the building blocks that are required to create quality plots from arrays and to visualize them interactively.
You can find all the MATLAB-like plotting frameworks inside the pylab module: Website: matplotlib.org Version at the time of print: 1.5.1 Suggested install command: pip install matplotlib You can simply import what you need for your visualization purposes with the following command: import matplotlib.pyplot as plt
Statsmodels
Previously part of SciKits, statsmodels was intended as a complement to SciPy's statistical functions. It features generalized linear models, discrete choice models, time series analysis, and a series of descriptive statistics as well as parametric and nonparametric tests: Website: statsmodels.sourceforge.net Version at the time of print: 0.6.1 Suggested install command: pip install statsmodels
Beautiful Soup
Beautiful Soup, a creation of Leonard Richardson, is a great tool for scraping data out of HTML and XML files retrieved from the Internet. It works incredibly well, even in the case of tag soups (hence the name), which are collections of malformed, contradictory, and incorrect tags. After choosing your parser (the HTML parser included in Python's standard library works fine), thanks to Beautiful Soup, you can navigate through the objects in the page and extract text, tables, and any other information that you may find useful: Website: www.crummy.com/software/BeautifulSoup Version at the time of print: 4.4.1 Suggested install command: pip install beautifulsoup4 Note that the imported module is named bs4.
NetworkX
Developed by the Los Alamos National Laboratory, NetworkX is a package specialized in the creation, manipulation, analysis, and graphical representation of real-life network data (it can easily operate with graphs made up of a million nodes and edges). Besides specialized data structures for graphs and fine visualization methods (2D and 3D), it provides the user with many standard graph measures and algorithms, such as the shortest path, centrality, components, communities, clustering, and PageRank. Website: networkx.github.io Version at the time of print: 1.11 Suggested install command: pip install networkx Conventionally, NetworkX is imported as nx: import networkx as nx
NLTK
The Natural Language Toolkit (NLTK) provides access to corpora and lexical resources and to a complete suite of functions for statistical Natural Language Processing (NLP), ranging from tokenizers to part-of-speech taggers and from tree models to named-entity recognition. Initially, Steven Bird and Edward Loper created the package as an NLP teaching infrastructure for their course at the University of Pennsylvania. Now, it is a fantastic tool that you can use to prototype and build NLP systems: Website: www.nltk.org Version at the time of print: 3.2.1 Suggested install command: pip install nltk
Gensim
Gensim, programmed by Radim Rehurek, is an open source package that is suitable for the analysis of large textual collections with the help of parallel distributable online algorithms. Among advanced functionalities, it implements Latent Semantic Analysis (LSA), topic modelling by Latent Dirichlet Allocation (LDA), and Google's word2vec, a powerful algorithm that transforms text into vector features that can be used in supervised and unsupervised machine learning. Website: radimrehurek.com/gensim Version at the time of print: 0.12.4 Suggested install command: pip install gensim
PyPy
PyPy is not a package; it is an alternative implementation of Python 2.7.8 that supports most of the commonly used Python standard packages (unfortunately, NumPy is currently not fully supported).
As an advantage, it offers enhanced speed and memory handling. Thus, it is very useful for heavy duty operations on large chunks of data and it should be part of your big data handling strategies: Website: pypy.org/ Version at the time of print: 5.1 Download page: pypy.org/download.html
XGBoost
XGBoost is a scalable, portable, and distributed gradient boosting library (a tree ensemble machine learning algorithm). Initially created by Tianqi Chen from the University of Washington, it has been enriched by a Python wrapper by Bing Xu and an R interface by Tong He (you can read the story behind XGBoost directly from its principal creator at homes.cs.washington.edu/~tqchen/2016/03/10/story-and-lessons-behind-the-evolution-of-xgboost.html). XGBoost is available for Python, R, Java, Scala, Julia, and C++, and it can work on a single machine (leveraging multithreading) as well as in Hadoop and Spark clusters: Website: xgboost.readthedocs.io/en/latest Version at the time of print: 0.4 Download page: github.com/dmlc/xgboost Detailed instructions for installing XGBoost on your system can be found at this page: github.com/dmlc/xgboost/blob/master/doc/build.md The installation of XGBoost on both Linux and MacOS is quite straightforward, whereas it is a little bit trickier for Windows users. For this reason, we provide specific installation steps to get XGBoost working on Windows: First, download and install Git for Windows (git-for-windows.github.io). Then you need a MinGW compiler present on your system. You can download it from www.mingw.org according to the characteristics of your system. From the command line, execute: $> git clone --recursive https://github.com/dmlc/xgboost $> cd xgboost $> git submodule init $> git submodule update Then, still from the command line, copy the configuration for 64-bit systems to be the default one: $> copy make\mingw64.mk config.mk Alternatively, you can copy the plain 32-bit version: $> copy make\mingw.mk config.mk After copying the configuration file, you can run the compiler, setting it to use four threads in order to speed up the compiling procedure: $> mingw32-make -j4 In MinGW, the make command comes with the name mingw32-make. If you are using a different compiler, the previous command may not work; then you can simply try: $> make -j4 Finally, if the compiler completes its work without errors, you can install the package into your Python installation with this: $> cd python-package $> python setup.py install After following all the preceding instructions, if you try to import XGBoost in Python and it doesn't load and results in an error, it may well be that Python cannot find MinGW's g++ runtime libraries. You just need to find the location of MinGW's binaries on your computer (in our case, it was in C:\mingw-w64\mingw64\bin; just modify the next code to use yours) and place the following code snippet before importing XGBoost:
import os
mingw_path = r'C:\mingw-w64\mingw64\bin'
os.environ['PATH'] = mingw_path + ';' + os.environ['PATH']
import xgboost as xgb
Depending on the state of the XGBoost project, similarly to many other projects under continuous development, the preceding installation commands may or may not temporarily work at the time you try them. Usually, waiting for an update of the project or opening an issue with the authors of the package may solve the problem.
Theano
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.
Basically, it provides you with all the building blocks you need to create deep neural networks. Created by academics (an entire development team; you can read their names on their most recent paper at arxiv.org/pdf/1605.02688.pdf), Theano has been used for large scale and intensive computations since 2007: Website: deeplearning.net/software/theano Release at the time of print: 0.8.2 In spite of the many installation problems experienced by users in the past (especially Windows users), the installation of Theano should be straightforward, the package now being available on PyPI: $> pip install Theano If you want the most updated version of the package, you can get it by cloning the GitHub repository: $> git clone git://github.com/Theano/Theano.git Then you can proceed with a direct Python installation: $> cd Theano $> python setup.py install To test your installation, you can run the following from the shell/CMD and verify the reports: $> pip install nose $> pip install nose-parameterized $> nosetests theano If you are working on a Windows OS and the previous instructions don't work, you can try these steps using the conda command provided by the Anaconda distribution: Install TDM GCC x64 (this can be found at tdm-gcc.tdragon.net) Open an Anaconda prompt interface and execute: $> conda update conda $> conda update --all $> conda install mingw libpython $> pip install git+git://github.com/Theano/Theano.git Theano needs libpython, which isn't yet compatible with version 3.5. So if your Windows installation is not working, this could be the likely cause. Anyway, Theano installs perfectly on Python version 3.4. Our suggestion in this case is to create a virtual Python environment based on version 3.4, and install and use Theano only on that specific version. Directions on how to create virtual environments are provided in the paragraph about virtualenv and conda create. In addition, Theano's website provides some information for Windows users; it could support you when everything else fails: deeplearning.net/software/theano/install_windows.html An important requirement for Theano to scale out on GPUs is to install the NVIDIA CUDA drivers and SDK for code generation and execution on the GPU. If you do not know too much about the CUDA Toolkit, you can actually start from this web page in order to understand more about the technology being used: developer.nvidia.com/cuda-toolkit Therefore, if your computer has an NVIDIA GPU, you can find all the necessary instructions in order to install CUDA using this tutorial page from NVIDIA itself: docs.nvidia.com/cuda/cuda-quick-start-guide/index.html
Keras
Keras is a minimalist and highly modular neural networks library, written in Python and capable of running on top of either Theano or TensorFlow (the open source software library for numerical computation released by Google). Keras was created by François Chollet, a machine learning researcher working at Google: Website: keras.io Version at the time of print: 1.0.3 Suggested installation from PyPI: $> pip install keras As an alternative, you can install the latest available version (which is advisable since the package is in continuous development) using the command: $> pip install git+git://github.com/fchollet/keras.git
Summary
In this article, we performed a lot of installations, from Python packages to examples. They were installed either directly or by using a scientific distribution. We also introduced Jupyter notebooks and demonstrated how you can have access to the data run in the tutorials.
Resources for Article: Further resources on this subject: Python for Driving Hardware [Article] Mining Twitter with Python – Influence and Engagement [Article] Python Data Structures [Article]

Learning the Basic Nature of F# Code

Packt
02 Nov 2016
6 min read
In this article by Eriawan Kusumawardhono, author of the book F# High Performance, we will see why F# has been a first-class citizen, a built-in part of the programming language support in Visual Studio, starting from Visual Studio 2010. F# has its own unique trait: it is a functional programming language, but at the same time it has OOP support. From the start, F# has run on .NET, although we can also run it cross-platform, for example on Android (using Mono). (For more resources related to this topic, see here.) Although F# mostly runs faster than C# or VB when doing computations, its own performance characteristics and some not so obvious bad practices and subtleties may lead to performance bottlenecks. The bottlenecks may or may not be faster than their C#/VB counterparts, although some of them may share the same performance characteristics, such as the use of .NET APIs. The main goal of this book is to identify performance problems in F#, measuring and also optimizing F# code to run more efficiently, while also maintaining the functional programming style as appropriately as possible. A basic knowledge of F# (including the functional programming concept and basic OOP) is required as a prerequisite to start understanding the performance problems and the optimization of F#. There are many ways to define F# performance characteristics and at the same time measure them, but understanding the mechanics of running F# code, especially on top of .NET, is crucial, and it is also part of the performance characteristics themselves. This includes other aspects and approaches to identifying concurrency problems and language constructs.
Understanding the nature of F# code
Understanding the nature of F# code is crucial and a definitive prerequisite before we begin to measure how long it runs and how effective it is. We can measure running F# code by its running time, but to fully understand why it may run slow or fast, there are some basic concepts we have to consider first. Before we dive more into this, we must meet the basic requirements and setup. After the requirements have been set, we need to put in place the environment settings of Visual Studio 2015. We have to set this because we need to maintain the consistency of the default settings of Visual Studio. The setting should be set to General. These are the steps: Select the Tools menu from Visual Studio's main menu. Select Import and Export Settings... and the Import and Export Settings Wizard screen is displayed. Select Reset all Settings and then Next to proceed. Select No, just reset my settings overwriting my current setting and then Next to proceed. Select General and then Next to proceed. After setting it up, we will have a consistent layout to be used throughout this book, including the menu locations and the look and feel of Visual Studio. Now we are going to scratch the surface of the F# runtime with an introductory overview, which will give us some insights into F# performance.
F# runtime characteristics
The release of Visual Studio 2015 occurred at the same time as the release of .NET 4.6 and the rest of the tools, including the F# compiler. The compiler version of F# in Visual Studio 2015 is F# 4.0. F# 4.0 has no large differences or notable new features compared to the previous version, F# 3.0 in Visual Studio 2013. Its runtime characteristics are essentially the same as those of F# 3.0, although there are some subtle performance improvements and bug fixes.
For more information on what's new in F# 4.0 (described as release notes) visit: https://github.com/Microsoft/visualfsharp/blob/fsharp4/CHANGELOG.md. At the time of writing this book, the online and offline MSDN Library of F# in Visual Studio does not have F# 4.0 release notes documentation, but can always go to the GitHub repository of F# to check the latest update. These are the common characteristics of F# as part of managed programming language: F# must conform to .NET CLR. This includes the compatibilities, the IL emitted after compile, and support for .NET BCL (the basic class library). Therefore, F# functions and libraries can be used by other CLR compliant languages such as C#, VB, and managed C++. The debug symbols (PDB) have the same format and semantic as other CLR compliant languages. This is important, because F# code must be able to be debugged from other CLR compliant languages as well. From the managed languages perspective, measuring performance of F# is similar when measured by tools such as the CLR profiler. But from a F# unique perspective, these are F#-only unique characteristics: By default, all types in F# are immutable. Therefore, it's safe to assume it is intrinsically thread safe. F# has a distinctive collection library, and it is immutable by default. It is also safe to assume it is intrinsically thread safe. F# has a strong type inference model, and when a generic type is inferred without any concrete type, it automatically performs generalizations. Default functions in F# are implemented internally by creating an internal class derived from F#’s FastFunc. This FastFunc is essentially a delegate that is used by F# to apply functional language constructs such as currying and partial application. With tail call recursive optimization in the IL, the F# compiler may emit .tail IL, and then the CLR will recognize this and perform optimization at runtime. F# has inline functions as option F# has a computation workflow that is used to compose functions F# async computation doesn't need Task<T> to implement it. Although F# async doesn't need the Task<T> object, it can operate well with the async-await model in C# and VB. The async-await model in C# and VB is inspired by F# async, but behaves semantically differently based on more things than just the usage of Task<T>. All of those characteristics are not only unique, but they can also have performance implications when used to interoperate with C# and VB. Summary This article explained the basic introduction to F# IDE, along with runtime characteristics of F#. Resources for Article: Further resources on this subject: Creating an F# Project [article] Unit Testing [article] Working with Windows Phone Controls [article]

Remote Access Management Console

Packt
02 Nov 2016
12 min read
In this article by Jordan Krause, the author of Mastering Windows Server 2016, we will explore the Remote Access Management Console of DirectAccess and we will also look at the differences between DirectAccess and VPN. (For more resources related to this topic, see here.) You are well on your way to giving users remote access capabilities on this new server. As with many networking devices, once you have established all of your configurations on a remote access server, it is pretty common for admins to walk away and let it run. There is no need for a lot of ongoing maintenance or changes to that configuration once you have it running well. However, Remote Access Management Console in Windows Server 2016 is useful not only for configuration of the remote access parts and pieces, but for monitoring and reporting as well. Let’s take a look inside this console so that you are familiar with the different screens you will be interacting with: Configuration The configuration screen is pretty self-explanatory, this is where you would visit in order to create your initial remote access configuration, and where you go to update any settings in the future. As you can see in the screenshot, you are able to configure DirectAccess, VPN, and the Web Application Proxy right from this Remote Access Management Console. There is not a lot to configure as far as the VPN goes, you really only have one screen of options where you define what kind of IP addresses are handed down to the VPN clients connecting in, and how to handle VPN authentication. It is not immediately obvious where this screen is, so I wanted to point it out. Inside the DirectAccess and VPN configuration section, if you click on the Edit… button listed under Step 2, this will launch the Step 2 mini-wizard. The last screen of this mini-wizard is called VPN Configuration. This is the screen where you can configure these IP address and authentication settings for your VPN connections: Dashboard The Remote Access Dashboard gives you a 30,000 foot view of the Remote Access server status. You are able to view a quick status of the components running on the server, whether or not the latest configuration changes have been rolled around, and some summary numbers near the bottom about how many DirectAccess and VPN connections are happening. Operations Status If you want to drill down further into what is happening on the server side of the connections, that is what the Operations Status page is all about. Here you can see a little more detail on each of the components that are running under the hood to make your DA and VPN connections happen. If any of them have an issue, you can click on the specific component to get a little more information. For example, as a test, I have turned off the NLS web server in my lab network, and I can now see in the Operations Status page that NLS is flagged with an error. Remote Client Status Next up is the Remote Client Status screen. As indicated, this is the screen where we can monitor the client computers who are connected. It will show us both DirectAccess and VPN connections here. We will be able to see computer names, usernames, and even the resources that they are utilizing during their connections. The information on this screen is able to be filtered by simply putting any criteria into the Search bar on the top of the window. It is important to note that the Remote Client Status screen only shows live, active connections. There is no historical information stored here. 
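If you prefer to pull this same live-connection data from a command line, the RemoteAccess PowerShell module that is installed alongside the role can report it as well. The one-liner below is only a sketch; we are assuming the cmdlet name here, so run Get-Command -Module RemoteAccess on your server to confirm exactly what is available in your build:
Get-RemoteAccessConnectionStatistics
This returns one entry per currently connected DirectAccess or VPN client, mirroring what the Remote Client Status screen displays.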
Reporting You guessed it, this is the window you need to visit if you want to see historical remote access information. This screen is almost exactly the same as the Remote Client Status screen, except that you have the ability to generate reports for historical data pulled from date ranges of your choosing. Once the data is displayed, you have the same search and filtering capabilities that you had on the Remote Client Status screen. Reporting is disabled by default, but you simply need to navigate to the Reporting page and click on Configure Accounting. Once that is enabled, you will be presented with options about storing the historical information. You can choose to store the data in the local WID, or on a remote RADIUS server. You also have options here for how long to store logging data, and a mechanism that can be used to clear out old data. Tasks The last window pane of Remote Access Management Console that I want to point out is the Tasks bar on the right side of your screen. The actions and options that are displayed in this taskbar change depending on what part of the console you are navigating through. Make sure to keep an eye on this side of your screen for setting up some of the more advanced functions. Some examples of available tasks are creating usage reports, refreshing the screen, and configuring network load balancing or Multi-Site configurations if you are running multiple remote access servers. DirectAccess versus VPN VPN has been around for a very long time, making it a pretty familiar idea to anyone working in IT, and we have discussed quite a bit about DirectAccess today in order to bring you up to speed on this evolution, so to speak, of corporate remote access. Now that you know there are two great solutions built into Windows Server 2016 for enabling your mobile workforce, which one is better? While DirectAccess is certainly the newer of the technologies, we cannot say that it is better in all circumstances. Each has its pros and cons, and the ways that you use each, or both, will depend upon many variables. Your users, your client computers, and your organization’s individual needs will need to factor into your decision-making process. Let’s discuss some of the differences between DirectAccess and VPN so that you can better determine which is right for you. Domain-joined versus non-domain-joined One of the biggest requirements for a DirectAccess client computer is that it must be domain joined. While this requirement by itself doesn’t seem so major, what it implies can be pretty vast. Trusting a computer enough to be joined to your domain more than likely means that the laptop is owned by the company. It also probably means that this laptop was first in IT’s hands in order to build and prep it. Companies that are in the habit of allowing employees to purchase their own computers to be used for work purposes may not find DirectAccess to fit well with that model. DA is also not ideal for situations where employees use their existing home computers to connect into work remotely. In these kinds of situations, such as home and personally-owned computers, VPN may be better suited to the task. You can connect to a VPN from a non-domain-joined machine, and you can even establish VPN connections from many non-Microsoft devices. IOS, Android, Windows Phone—these are all platforms that have a VPN client built into them that can be used to tap into a VPN listener on a Windows Server 2016 remote access server. 
If your only remote access solution was DirectAccess, you would not be able to provide non-domain-joined devices with a connectivity platform. Auto versus manual launch Here, DirectAccess takes the cake. It is completely seamless. DirectAccess components are baked right into the Windows operating system, no software VPN is going to be able to touch that level of integration. With VPN, users have to log in to their computers to unlock them, then launch their VPN, then log in again to that VPN software, all before they can start working on anything. With DirectAccess, all they need to do is log in to the computer to unlock the screen. DirectAccess activates itself in the background so that as soon as the desktop loads for the user, they simply open the applications that they need to access, just like when they are inside the office. Software versus built-in I’m a fan of Ikea furniture. They do a great job of supplying quality products at a low cost, all while packaging it up in incredibly small boxes. After you pay for the product, unbox the product, put the product together, and then test the product to make sure it works—it’s great. If you can’t see where this is going, I’ll give you a hint. It’s an analogy for VPN. As in, you typically pay a vendor for their VPN product, unbox the product, implement the product at more expense, then test the product. That VPN software then has the potential to break and need reinstallation or reconfiguration, and will certainly come with software updates that need to be accomplished down the road. Maintenance, maintenance, maintenance. Maybe I have been watching too many home improvement shows lately, but I am a fan of houses with built-ins. Built-ins are essentially furniture that is permanent to the house, built right into the walls, corners, or wherever it happens to be. It adds value, and it integrates into the overall house much better than furniture that was pieced together separately and then stuck against the wall in the corner. DirectAccess is like a built-in. It is inside the operating system. There is no software to install, no software to update, no software to reinstall when it breaks. Everything that DA needs is already in Windows today, you just aren’t using it. Oh, and it’s free, well, built into the cost of your Windows license anyway. There are no user CALs, no ongoing licensing costs related to implementing Microsoft DirectAccess. Password and login issues with VPN If you have ever worked on a helpdesk for a company that uses VPN, you know what I’m talking about. There are a series of common troubleshooting calls that happen in the VPN world related to passwords. Sometimes the user forgets their password. Perhaps their password has expired and needs to be changed—ugh, VPN doesn’t handle this scenario very well either. Or maybe the employee changed their expired password on their desktop before they left work for the day, but are now trying to log in remotely from their laptop and it isn’t working. What is the solution to password problems with VPN? Reset the user’s password and then make the user come into the office in order to make it work on their laptop. Yup, these kinds of phone calls still happen every day. This is unfortunate, but a real potential problem with VPN. What’s the good news? DirectAccess doesn’t have these kinds of problems! Since DA is part of the operating system, it has the capability to be connected anytime that Windows is online. This includes the login screen! 
Even if I am sitting on the login or lock screen, and the system is waiting for me to input my username and password, as long as I have Internet access I also have a DirectAccess tunnel. This means that I can actively do password management tasks. If my password expires and I need to update it, it works. If I forgot my password and I can’t get into my laptop, I can call the helpdesk and simply ask them to reset my password. I can then immediately log in to my DirectAccess laptop with the new password, right from my house. Another cool function that this seamlessness enables is the ability to login with new user accounts. Have you ever logged into your laptop as a different user account in order to test something? Yup, that works over DirectAccess as well. For example, I am sitting at home and I need to help one of the sales guys troubleshoot some sort of file permission problem. I suspect it’s got something to do with his user account, so I want to log in to my laptop as him in order to test it. The problem is that his user account has never logged into my laptop before. With VPN, not a chance. This would never work. With DirectAccess, piece of cake! I simply log off, type in his username and password, and bingo. I’m logged in, while still sitting at home in my pajamas. It is important to note that you can run both DirectAccess and VPN on the same Windows Server 2016 remote access server. If both technologies have capabilities that you could benefit from, use them both! Summary The technology of today demands for most companies to enable their employees to work from wherever they are. More and more organizations are hiring a work from home workforce, and need a secure, stable, and efficient way to provide access of corporate data and applications to these mobile workers. The Remote Access role in Windows Server 2016 is designed to do exactly that. With three different ways of providing remote access to corporate resources, IT departments have never had so much remote access technology available at their fingertips, built right into the Windows operating system that they already own. If you are still supporting a third-party or legacy VPN system, you should definitely explore the new capabilities provided here and discover how much they could save for your business. DirectAccess is particularly impressive and compelling; it’s a brand new way of looking at remote access. Automatic connectivity includes always-on machines that are constantly being patched and updated because they are always connected to your management servers. You can improve user productivity and network security at the same time. These two things are usually oxymorons in the IT world, but with DirectAccess they hold hands and sing songs together. Resources for Article: Further resources on this subject: Remote Authentication [article] Configuring a MySQL linked server on SQL Server 2008 [article] Configuring sipXecs Server Features [article]

Thinking Functionally

Packt
01 Nov 2016
10 min read
In this article by Kevin Ashton, the author of the book F# 4.0 Programming Cookbook, you will learn the following recipes: Working with optional values Working with tuples and pattern matching (For more resources related to this topic, see here.) Working with optional values In 2009, Tony Hoare, the creator of the ALGOL W programming language, called the concept of null reference his "billion-dollar mistake". Thousands of man-hours are lost each year due to bugs caused by null references. Many of these errors are avoidable, although not all developers are aware of it. The Option type, and working with it, is what this recipe is all about. This recipe covers a few different ways of using the option type, although all of the different methods shown can exist within the same script file. Getting ready Make sure that you have a text editor or IDE, and that the F# compiler and tools are installed on your system. In Visual Studio versions 2012 and 2013, F# is installed by default. If you are running Visual Studio 2015, you will need to make sure that you select Visual F# tools during the installation phase. The code in this example expects that you have a file named Alphabet.txt located in the same directory as your source code. This file consists of the letters a to z, each on a single line (with no lines before or after). How to do it… Open a new F# Script file. In Visual Studio, select File | New | File Select F# Script File Enter the following code: open System open System.IO let filePath = Path.Combine(__SOURCE_DIRECTORY__,"Alphabet.txt") let tryGet a = let lines = File.ReadAllLines(filePath) if a < 1 || a > lines.Length then None else Some(lines.[a-1]) let printResult res = let resultText = match res with | None -> "No valid letter found." | Some str -> sprintf "The letter found was %s" str printfn "%s" resultText Select the code that you have entered and press ALT + ENTER (in Visual Studio, this will send the highlighted lines of code to the F# Interactive (FSI) evaluation session). The FSI session should display: val filePath : string = "C:DevelopmentPacktBookChapter1Alphabet.txt" val tryGet : a:int -> string option To test this code, enter the following and send the result to FSI session: let badResult1, goodResult, badResult2 = tryGet 0, tryGet 5, tryGet 27 The FSI session should display: val goodResult : string option = Some "e" val badResult2 : string option = None val badResult1 : string option = None Now, enter the following code and send it to the FSI session: printResult badResult1 printResult goodResult The FSI session should display: No valid letter found The letter found was e Here, we show a different way of using the option type. In this case, it is used where there might not be a valid result from an operation (in the case of dividing by zero for instance). Enter the following code and send the result to the FSI session: let inline tryDivide numerator denominator = if denominator = LanguagePrimitives.GenericZero then None else Some(numerator / denominator) This method is defined as an inline method because that allows it take advantage of F#'s automatic generalization. If the inline keyword were to be omitted, the compiler would assume that this method only works on integer types. Notice also the use of LanguagePrimitives.GenericZero; this allows for this function to work on any type that defines a static get_Zero method. 
The FSI session should display: val inline tryDivide : numerator: ^a -> denominator: ^b -> ^c option when ( ^a or ^b) : (static member ( / ) : ^a * ^b -> ^c) and ^b : (static member get_Zero : -> ^b) and ^b : equality To test this function, enter the following code and send it to the FSI session: let goodDivideResult, badDivideResult = tryDivide 10 5, tryDivide 1 0 The FSI session should display: val goodDivideResult : int option = Some 2 val badDivideResult : int option = None Here is another way of using the Option type – to handle the results of parse expression where the parse may or may not succeed. In this case, the function we are writing takes advantage of an F# feature where output parameters (parameters that are marked with the out keyword in C#) can be returned as part of a tuple that includes the function result. Enter the following code and send it to the interactive session: let tryParse a f = let res,parsed = f a if res then Some parsed else None The FSI session should display: val tryParse : a:'a -> f:('a -> bool * 'b) -> 'b option To test this function, enter the following code and send it to the FSI session: let goodParseResult = tryParse "5" (Int32.TryParse) let badParseResult = tryParse "A" (Int32.TryParse) The FSI session should display: val goodParseResult : int option = Some 5 val badParseResult : int option = None How it works… This recipe shows three potential uses of the Option type. First, we show how the Option type can be used to return a safe result, where the expected value might not exist (this is particularly useful when working with databases or other IO operations, where the expected record might not be found). Then, we show a way of using the Option type to deal with operations that could potentially throw an exception (in this case, avoiding division by zero). Finally, we show a way of using the Option type to handle situations where it is valid for there to be no result, and to provide a strongly typed and explicit way of showing this. Working with tuples and pattern matching In F#, and in functional programming in general, a function is defined as taking one parameter and returning one result. F# makes it possible for developers to write functions that take more than a single parameter as input, and return more than a single value as the results. The way F# does this is by providing tuples. Tuples are heterogeneous collections of values. This recipe will show you how to work with tuples. As with the previous recipe, this recipe shows multiple ways of working with tuples, as well as ways of pattern matching over them. How to do it… Open a new F# script file. 
Enter the following code and send it to the FSI session: let divideWithRemainder numerator denominator = let divisionResult = numerator / denominator let remainder = numerator % denominator let inputs = (numerator, denominator) let outputs = (divisionResult, remainder) (inputs,outputs) let printResult input = let inputs = fst input let numerator, denominator = inputs let outputs = snd input let divisionResult, remainder = outputs printfn "%d divided by %d = %d (remainder: %d)" numerator denominator divisionResult remainder The FSI session should display: val divideWithRemainder :numerator:int -> denominator:int - > (int * int) * (int * int) val printResult : (int * int) * (int * int) -> unit To test the previous code, enter the following and send to the FSI session: divideWithRemainder 10 4 |> printResult The FSI session should display: 10 divided by 4 = 2 (remainder: 2) The printResult function could also be written in other ways with the same result. These ways are shown next. Send these to the interactive session to confirm that they display the same result. let printResultPartialPattern (inputs, outputs) = let numerator, denominator = inputs let divisionResult, remainder = outputs printfn "%d divided by %d = %d (remainder: %d)" numerator denominator divisionResult remainder let printResultFullPattern ((numerator, denominator), (divisionResult, remainder)) = printfn "%d divided by %d = %d (remainder: %d)" numerator denominator divisionResult remainder The FSI session should display: val printResultPartialPattern : inputs:(int * int) * outputs:(int * int) -> unit val printResultFullPattern : (int * int) * (int * int) -> unit Notice how for all of the various printResult functions, the signature displayed in the F# Interactive Session window are of the same type, (int * int) * (int * int). Whenever you see the asterisk (*) in a function signature, it is there to show that the type is a tuple, with a possible range consisting of all the possible values of the type on the left multiplied by all the possible values of the type on the right of the asterisk. This is a good way of displaying how tuples form part of the theory of algebraic data types. Now, we will show another way of using tuples. Tuples can be given type aliases so that their structure and intent is clear. This also allows for the defining of functions that have a clear purpose with regards to either their inputs or their outputs. Enter the following code and send it to the FSI session: type Gender = | Female | Male type Cat = (string * int * Gender) let printCat (cat:Cat) = let name,age,gender = cat let genderPronoun = match gender with | Female -> "She" | Male -> "He" printfn "%s is a cat. %s is %d years old." name genderPronoun age let cats: Cat list = ["Alice", 6, Female "Loki", 4, Male "Perun", 4, Male] The FSI session should display: type Gender =| Female | Male type Cat = string * int * Gender val printCat : string * int * Gender -> unit val cats : Cat list = [("Alice", 6, Female); ("Loki", 4, Male); ("Perun", 4, Male)] To test the preceding code, enter the following and send it to the FSI session: cats |> List.iter printCat The FSI session should display: How it works… This recipe shows two potential uses for tuples. Tuples can be used to allow multiple inputs to a function, or to allow a function to return multiple outputs. The values contained within a tuple do not have to be of the same type. First, we show a case for both returning multiple values from a function and providing multiple input values. 
The divideWithRemainder function accepts two inputs in curried form and returns a tupled output (which itself consists of two tuples). The tupled output from divideWithRemainder is then piped into the printResult function. Notice that the first printResult function accepts only a single input. The other ways of writing the printResult function are two different ways of taking advantage of F#'s pattern matching capabilities to simplify the process. Then, we show how to work with tuples that have more than two elements (the functions fst and snd do not work for tuples that consist of more than two elements). As in the first section, pattern matching is able to decompose the tuple into its constituent parts. We also show how you can provide a type alias for a tuple and get strongly typed feedback on whether a value matches the expected type. If you were to leave an element out of the cats list (the age or gender, for example), the IDE and compiler would give you an error.

Summary

This article introduces many of the features of F# that will be used throughout the book. We learned more about using the Option type, tuples, and Active Patterns for moving from different programming styles towards a more idiomatic, functional F# style.

Resources for Article:

Further resources on this subject:

Creating an F# Project [article]

Go Programming Control Flow [article]

Working with Windows Phone Controls [article]
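Before moving on, here is a short addendum to the How it works… discussion above. The sketch below assumes the Gender and Cat definitions from the recipe are still in your FSI session; it simply illustrates the point that fst and snd only apply to pairs, so a three-element tuple has to be decomposed with pattern matching instead:

// Assumes the Gender and Cat definitions from the recipe are in scope.
let alice: Cat = ("Alice", 6, Female)

// fst alice would not compile, but a pattern binding decomposes the
// triple directly; the underscore discards the element we do not need.
let describe (cat: Cat) =
    let name, age, _ = cat
    sprintf "%s is %d years old" name age

describe alice |> printfn "%s"

Sending this to the FSI session should print "Alice is 6 years old".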

Configuring Endpoint Protection in Configuration Manager

Packt
01 Nov 2016
5 min read
In this article by Nicolai Henriksen, the author of the book Microsoft System Center 1511 Endpoint Protection Cookbook, we will cover how to configure Endpoint Protection in Configuration Manager. (For more resources related to this topic, see here.) This is the part where you need to think through every setting you make so that it has the impact you want in your organization.

How to configure Endpoint Protection in Configuration Manager

In order to manage security and malware on your client computers with Endpoint Protection, there are a few steps you must set up and configure to get it working in System Center Configuration Manager (SCCM).

Getting ready

In this article we assume that you have SCCM in place and working, and that you have set up and installed the Software Update Point role with its prerequisites, such as Windows Server Update Services (WSUS). We also assume that you have planned for the impact this will have in your environment and have a good understanding of how it should work in your Configuration Manager hierarchy.

How to do it…

First, we start by installing the Endpoint Protection role from within the SCCM console. This role must be installed before you can use and configure Endpoint Protection. It must be installed on only one site system server, and it must be installed at the top of your hierarchy: if you have a Central Administration Site (CAS), you install it there; if you have a stand-alone primary site, you install it there instead.

Be aware that when you install the Endpoint Protection role on the site server, it will also install the Endpoint Protection client on that same server. This is the default behavior and cannot be changed. However, services and scans are disabled, so you can still run any other existing anti-malware solution that you may already have in place. No real-time scanning or any other form of scanning will be performed by Endpoint Protection before you enable it with a policy, so take care not to enable it accidentally while another anti-malware solution is installed.

Installing the Endpoint Protection role is straightforward; these are the steps to do it. To install and configure the Endpoint Protection role, open the Configuration Manager console and click Administration. In the Administration workspace, expand Site Configuration and click Servers and Site System Roles. Click Add Site System Roles, as shown in the picture below:

On the next screen, I choose the default settings, which will use the server's computer account to install the role on the chosen server. In my case, I have a single primary site server where all the roles reside, so this requires no other preparation. Keep in mind, however, that if you are adding roles to other site system servers, you will need to add the primary site server's computer account to the local Administrators group, or you can use an installation account, as shown in the following figure:

Let's click Next >. This is the page where we choose the Endpoint Protection role that we want to install. It will only list the roles that you have not already added to the chosen server. Note that it also warns you to have software updates and anti-malware definitions already in place and deployed. The warning will show regardless of whether you already have this in place, as shown in the next screenshot.

The next page of the wizard is about the Microsoft Active Protection Service membership.
I like to think of this as the cloud feature, and I encourage you to consider setting this to Advanced membership, as that gives you and Microsoft a better chance of dealing with unknown types of malware. It sends more information from the infected client about the surroundings of the malware, and Microsoft can investigate the bits and pieces more thoroughly in their cloud service. If it turns out to be infectious malware, such as a Trojan downloader, the client will receive removal instructions directly and try its best to remove it automatically.

This feature will work regardless of what you choose, but it works even better if you choose to share some more information. Most other anti-virus and anti-malware products don't ask about this; they simply enable it. Microsoft, however, has chosen to let you decide, because there could be situations in which you do not want to share this information at all. You can always choose Do not join MAPS on this page and decide individually in each Endpoint Protection policy how you want it. Setting it here simply makes this the default setting for every policy created afterwards.

Clicking Next > and then Finish will start the installation of the Endpoint Protection role, which finishes in a few minutes. Under Monitoring | Component Status, shown below, you can see two components starting with SMS_ENDPOINT_PROTECTION that will have a green icon on the left, which tells you that the role is installed.

How it works…

So the Endpoint Protection role is now installed in our SCCM hierarchy, as simply as that. There is more configuration to do, however, and that will be the next topic. Remember that the Endpoint Protection client is always installed on the site server that holds the Endpoint Protection role, but by default it has no scanning or real-time protection enabled; it shows up as a red icon in the notification area on the right side of the taskbar, as shown in the figure below.

Summary

In this article we learned that security is a key aspect for any organization, and that misconfiguration can have a very bad outcome since these settings directly affect security.

Resources for Article:

Further resources on this subject:

Managing Application Configuration [article]

CoreOS Networking and Flannel Internals [article]

Zabbix Configuration [article]

Introduction to Scala

Packt
01 Nov 2016
8 min read
In this article by Diego Pacheco, the author of the book, Building applications with Scala, we will see the following topics: Writing a program for Scala Hello World using the REPL Scala language – the basics Scala variables – var and val Creating immutable variables (For more resources related to this topic, see here.) Scala Hello World using the REPL Let's get started. Go ahead, open your terminal, and type $ scala in order to open the Scala REPL. Once the REPL is open, you can just type "Hello World". By doing this, you are performing two operations – eval and print. The Scala REPL will create a variable called res0 and store your string there, and then it will print the content of the res0 variable. Scala REPL Hello World program $ scala Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77). Type in expressions for evaluation. Or try :help. scala> "Hello World" res0: String = Hello World scala> Scala is a hybrid language, which means it is both object-oriented (OO) and functional. You can create classes and objects in Scala. Next, we will create a complete Hello World application using classes. Scala OO Hello World program $ scala Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77). Type in expressions for evaluation. Or try :help. scala> object HelloWorld { | def main(args:Array[String]) = println("Hello World") | } defined object HelloWorld scala> HelloWorld.main(null) Hello World scala> First things first, you need to realize that we use the word object instead of class. The Scala language has different constructs, compared with Java. Object is a Singleton in Scala. It's the same as you code the Singleton pattern in Java. Next, we see the word def that is used in Scala to create functions. In this program, we create the main function just as we do in Java, and we call the built-in function, println, in order to print the String Hello World. Scala imports some java objects and packages by default. Coding in Scala does not require you to type, for instance, System.out.println("Hello World"), but you can if you want to, as shown in the following:. $ scala Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77). Type in expressions for evaluation. Or try :help. scala> System.out.println("Hello World") Hello World scala> We can and we will do better. Scala has some abstractions for a console application. We can write this code with less lines of code. To accomplish this goal, we need to extend the Scala class App. When we extend from App, we are performing inheritance, and we don't need to define the main function. We can just put all the code on the body of the class, which is very convenient, and which makes the code clean and simple to read. Scala HelloWorld App in the Scala REPL $ scala Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77). Type in expressions for evaluation. Or try :help. scala> object HelloWorld extends App { | println("Hello World") | } defined object HelloWorld scala> HelloWorld object HelloWorld scala> HelloWorld.main(null) Hello World scala> After coding the HelloWorld object in the Scala REPL, we can ask the REPL what HelloWorld is and, as you might realize, the REPL answers that HelloWorld is an object. This is a very convenient Scala way to code console applications because we can have a Hello World application with just three lines of code. Sadly, the same program in Java requires way more code, as you will see in the next section. 
Java is a great language for performance, but it is a verbose language compared with Scala. Java Hello World application package scalabook.javacode.chap1; public class HelloWorld { public static void main(String args[]){ System.out.println("Hello World"); } } The Java application required six lines of code, while in Scala, we were able to do the same with 50% less code(three lines of code). This is a very simple application; when we are coding complex applications, the difference gets bigger as a Scala application ends up with way lesser code than that of Java. Remember that we use an object in Scala in order to have a Singleton(Design Pattern that makes sure you have just one instance of a class), and if we want to do the same in Java, the code would be something like this: package scalabook.javacode.chap1; public class HelloWorldSingleton { private HelloWorldSingleton(){} private static class SingletonHelper{ private static final HelloWorldSingleton INSTANCE = new HelloWorldSingleton(); } public static HelloWorldSingleton getInstance(){ return SingletonHelper.INSTANCE; } public void sayHello(){ System.out.println("Hello World"); } public static void main(String[] args) { getInstance().sayHello(); } } It's not just about the size of the code, but it is all about consistency and the language providing more abstractions for you. If you write less code, you will have less bugs in your software at the end of the day. Scala language – the basics Scala is a statically typed language with a very expressive type system, which enforces abstractions in a safe yet coherent manner. All values in Scala are Java objects (but primitives that are unboxed at runtime) because at the end of the day, Scala runs on the Java JVM. Scala enforces immutability as a core functional programing principle. This enforcement happens in multiple aspects of the Scala language, for instance, when you create a variable, you do it in an immutable way, and when you use a collection, you use an immutable collection. Scala also lets you use mutable variables and mutable structures, but it favors immutable ones by design. Scala variables – var and val When you are coding in Scala, you create variables using either the var operator or the val operator. The var operator allows you to create mutable states, which is fine as long as you make it local, stick to the core functional programing principles, and avoid mutable shared state. Using var in the Scala REPL $ scala Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77). Type in expressions for evaluation. Or try :help. scala> var x = 10 x: Int = 10 scala> x res0: Int = 10 scala> x = 11 x: Int = 11 scala> x res1: Int = 11 scala> However, Scala has a more interesting construct called val. Using the val operator makes your variables immutable, which means that you can't change their values after you set them. If you try to change the value of a val variable in Scala, the compiler will give you an error. As a Scala developer, you should use val as much as possible because that's a good functional programing mindset, and it will make your programs better and more correct. In Scala, everything is an object; there are no primitives – the var and val rules apply for everything, be it Int, String, or even a class. Using val in the Scala REPL $ scala Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77). Type in expressions for evaluation. Or try :help. 
scala> val x = 10 x: Int = 10 scala> x res0: Int = 10 scala> x = 11 <console>:12: error: reassignment to val x = 11 ^ scala> x res1: Int = 10 scala> Creating immutable variables Right. Now let's see how we can define the most common types in Scala, such as Int, Double, Boolean, and String. Remember that you can create these variables using val or var, depending on your requirement. Scala variable types at the Scala REPL $ scala Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77). Type in expressions for evaluation. Or try :help. scala> val x = 10 x: Int = 10 scala> val y = 11.1 y: Double = 11.1 scala> val b = true b: Boolean = true scala> val f = false f: Boolean = false scala> val s = "A Simple String" s: String = A Simple String scala> For these variables, we did not define the type. The Scala language figures it out for us. However, it is possible to specify the type if you want. In Scala, the type comes after the name of the variable, as shown in the following section. Scala variables with explicit typing at the Scala REPL $ scala Welcome to Scala 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_77). Type in expressions for evaluation. Or try :help. scala> val x:Int = 10 x: Int = 10 scala> val y:Double = 11.1 y: Double = 11.1 scala> val s:String = "My String " s: String = "My String " scala> val b:Boolean = true b: Boolean = true scala> Summary In this article, we learned about some basic constructs and concepts of the Scala language, with functions, collections, and OO in Scala. Resources for Article: Further resources on this subject: Making History with Event Sourcing [article] Creating Your First Plug-in [article] Content-based recommendation [article]
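As a closing sketch for this introduction, the following snippet (the object name TypedHello is just an illustration, not taken from the original text) combines the App trait from the Hello World section with explicitly typed, immutable val declarations. Pasting it into the REPL and then calling TypedHello.main(null) should print the greeting three times:

object TypedHello extends App {
  // Explicitly typed, immutable values; reassigning either of these
  // would produce the same "reassignment to val" error shown above.
  val greeting: String = "Hello World"
  val times: Int = 3
  (1 to times).foreach(_ => println(greeting))
}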

Thorium and Salt API

Packt
01 Nov 2016
17 min read
In this article by Joseph Hall, the author of the book Mastering SaltStack Second Edition, we learn about the basics of Thorium and how it helps us in a Salt ecosystem. We are also introduced to the Salt API and how to set up its components as well as creating security certificates. (For more resources related to this topic, see here.) Using Thorium The Thorium system is another component of Salt with the ability to watch the event bus and react based on what it sees there. But the ideas behind it are much different than with the reactor. A word on engines Thorium is one of the engines that started shipping with Salt in version 2016.3. Engines are a type of long-running process that can be written to work with the master or minion. Like other module types, they have access to the Salt configuration and certain Salt subsystems. Engines are separate processes that are managed by Salt. The event reactor runs inside the Salt processes themselves, which means that long-running reactor operations can affect the rest of Salt. Because Thorium is an engine, it does not suffer from this limitation. Looking at Thorium basics Like the reactor, Thorium watches the event bus. But unlike the reactor, which is configured entirely via SLS files, Thorium uses its own subsystem of modules (which are written in Python) and SLS files. Because these modules and SLS files use the state compiler, much of the functionality has been carried over. In order to use Thorium, there are a few steps that you must complete. These steps work together to form the basis of your Thorium setup, so be careful not to skip any. Enabling Thorium First, as an engine, you need to enable Thorium in the master configuration file using the engines directive: engines: - thorium: {} Because Thorium is so heavily configured using its own files, no configuration needs to be passed in at this point. However, engines do need a dictionary of some sort passed in, so we pass in an empty one. Setting up the Thorium directory tree With Thorium configured, we need to create a directory to store Thorium SLS files in. By default, this is /srv/thorium/. Go ahead and create that: # mkdir /srv/thorium/ If you’d like to change this directory, you may do so in the master configuration file: thorium_roots: base: - /srv/thorium-alt/ Like the state system, Thorium requires a top.sls file. This is the first of many similarities you’ll find between the two subsystems. As with /srv/salt/top.sls, you need to specify an environment, a target, and a list of SLS files: base: '*': - thorium_test To be honest, the environment and target really don’t mean much; they are artifacts from the state system, which weren’t designed to do anything special inside of Thorium. That said, the target does actually have some useful purposes. The target here doesn’t refer to any minions. Rather, it refers to the master that this top file applies to. For example, if your master’s ID is moe and you set a target of curly, then this top file won’t be evaluated for that master. If you were to check the /var/log/salt/master file, you would find this: Sound confusing? In a single-master, non-syndicated environment, it probably is. In such an environment, go ahead and set the target to *. But in an environment in which multiple masters are present, you may wish to divide the workload between them. 
Take a look at this top.sls file: base: 'monitoring-master': - alerts - graphs 'packaging-master': - redhat-pkgs - debian-pkgs In this multi-master environment, the masters may work in concert to manage jobs among the minions, but one master will also be tasked with looking for monitoring-related events and processing them, while the other will handle packaging-related events. There is a component of Thorium that we haven’t discussed yet called the register. We’ll get to it in a moment, but this is a good time to point out that the register is not shared. This means that if you assign two different masters to handle the same events, any actions performed by one master will be invisible to the other. The right hand won’t know what the left is doing, as it were. As you may expect, the list following each target specifies a set of SLS files to be evaluated. But whereas state SLS files are only evaluated when you kick off a state run (state.highstate, for instance), Thorium SLS files are evaluated at regular intervals. By default, these intervals are set to every half second. You can change that interval in the master configuration file: thorium_interval: 0.5 Once you have your top.sls file configured, it’s time to set up some SLS files. Writing Thorium SLS files Let’s go ahead and create /srv/thorium/thorium-test.sls with the following content in it: shell_test: local.cmd: - tgt: myminion - func: cmd.run - arg: - echo 'thorium success' > /tmp/thorium.txt I wouldn’t restart your master yet if I were you. First, let’s talk about what we’re looking at here. This should look very familiar to you, with a few differences. As you would expect, shell_test is the ID of this code block; local.cmd refers to the module and function that will be used, and everything that follows is arguments to that function. The local module is a Thorium-specific module. Execution, state, runner, and other modules are not available in Thorium without using a Thorium module that wraps them. The local module is one such wrapper, which provides access to execution modules. As such, tgt refers to the target that the module will be executed on, func is the module and function that will be executed, and arg is a list of ordered arguments. If you like, you may use kwarg instead to specify keyword arguments. Because state modules are accessed via the state execution module, local.cmd would also be used to kick those off; runner.cmd is also available to issue commands using the runner subsystem. Now, why did I tell you not to restart your master yet? Because if you did, this SLS file would run every half second, writing out to /tmp/thorium.txt over and over again. In order to keep it from running so often, we need to gate it somehow. Using requisites Because Thorium uses the state compiler, all state requisites are available, and they all function as you would expect. Let’s go ahead and add another code block and alter our first one a little bit: checker: check.event: - value: trigger - name: salt/thorium/*/test shell_test: local.cmd: - tgt: myminion - func: cmd.run - arg: - echo 'thorium success' > /tmp/thorium.txt - require: - check: checker The check module has a number of functions that are designed to compare a piece of data against an event and return True if the specified conditions are met. In this case, we’re using check.event, which looks at a given tag and returns True if an event comes in and matches it. 
The tag that we are looking for is salt/thorium/*/test, which is intended to match salt/thorium/<minion_id>/test. The event must also have a variable called checker in its payload, with a value of trigger. We have also added a require requisite to the shell_test code block, which will prevent that block from running unless the right event comes in. Now that we're set up, go ahead and restart the master and a minion called myminion, and issue the following command from the minion:

# salt-call event.fire_master '{"checker":"trigger"}' 'salt/thorium/myminion/test'
local:
    True

You may need to wait a second or two for the event to be processed and the command from shell_test to be sent. But then you should be able to see a file called /tmp/thorium.txt and read its contents:

# cat /tmp/thorium.txt
thorium success

This particular SLS, as it stands now, mimics the functionality of the reactor system, albeit with a slightly more complex setup. Let's take a moment now to go beyond the reactor.

Using the register

Thorium isn't just another reactor. Even if it were, just running in its own process space makes it more valuable than the old reactor. But the true value of Thorium comes with the register. The register is Thorium's own in-memory database, which persists across executions. A value that is placed in the register at one point in time will still be there a half hour later, unless the master is restarted.

Is the register really that fragile? At the moment, yes. And as I stated before, it's also not shared between systems. However, it is possible to make a copy of the register on disk by adding the following code block:

myregisterfile:
  file.save

This will cause the register to be written to a file called myregisterfile at the following location:

/var/cache/salt/master/thorium/saves/myregisterfile

At the time of writing this, that file will not be reloaded into memory when the master restarts. We're going to go ahead and alter our SLS file. The shell_test code block doesn't need to change, but the checker code block will. Remove the name field and change the function from check.event to check.contains:

checker:
  check.contains:
    - value: trigger

We're still looking for a payload with a variable called checker and a value of trigger, but the tag match is now specified somewhere else:

myregister:
  reg.set:
    - add: checker
    - match: salt/thorium/*/test

In this code block, myregister is the name of the register that you're going to write to. The reg.set function will add a variable to that register that contains the specified piece of the payload. In this case, it will grab the variable called checker from the payload and add its associated value. However, it will only add this information to the register if the tag on the event in question matches the match specification (salt/thorium/*/test). Go ahead and restart the master, and then fire the same event to the master:

# salt-call event.fire_master '{"checker":"trigger"}' 'salt/thorium/myminion/test'
local:
    True

If you've added the file.save code block from before, we can go ahead and take a look at the register:

# cat /var/cache/salt/master/thorium/saves/myregisterfile
{"myregister": {"val": "set(['trigger'])"}}

Looking forward

The Thorium system is pretty new, so it's still filling out. The value of the register is that data can be aggregated to it and analyzed in real time. Unfortunately, at the time of writing this, the functions to analyze that data do not yet exist.
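Putting the pieces of this section together, the final Thorium SLS file would look something like the following sketch. Note that the top.sls shown earlier references thorium_test, so the file should be saved as /srv/thorium/thorium_test.sls (with an underscore) to match that entry, even though it was first created with a hyphen in its name:

checker:
  check.contains:
    - value: trigger

myregister:
  reg.set:
    - add: checker
    - match: salt/thorium/*/test

shell_test:
  local.cmd:
    - tgt: myminion
    - func: cmd.run
    - arg:
      - echo 'thorium success' > /tmp/thorium.txt
    - require:
      - check: checker

myregisterfile:
  file.save

With this single file in place, restarting the master and firing the same salt-call event as before should both write /tmp/thorium.txt and populate the on-disk copy of the register.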
Understanding the Salt API We've spent some time looking at how to send requests, but many users would argue that receiving requests is just as important, if not more so. Let's take a moment to understand the Salt API. What is the Salt API? Very simply, the Salt API is a REST interface wrapped around Salt. But that doesn't tell you the whole story. The salt command is really just a command-line interface for Salt. In fact, each of the other Salt commands (salt-call, salt-cloud, and so on) is really just a way to access various parts of Salt from the command line. The Salt API provides a way to access Salt from a different interface: HTTP (or HTTPS, preferably). Because web protocols are so ubiquitous, the Salt API allows software, written in any language that has the capability of interacting with web servers, to take advantage of it. Setting up the Salt API Being a REST interface, the Salt API acts as a web server over and above Salt. But it doesn't actually provide the server interface itself. It uses other web frameworks to provide those services and then acts as more of a middleman between them and Salt. The modules that are supported for this are: CherryPy Tornado WSGI These modules are set up in the master configuration file. Each has its own set of configuration parameters and possible dependencies. Let's take a look at each one. CherryPy CherryPy is a minimalist web framework that is designed to be very Pythonic. Because it is based around creating web code in the same way that other Python code is created, it is said to result in code that is much smaller and more quickly developed. It has a mature codebase and a number of notable users. It has also been the de facto module of the Salt API for some time. This module does require that the CherryPy package (usually called python-cherrypy) be installed. The basic setup for CherryPy doesn't involve much configuration. At a minimum, you should have the following: rest_cherrypy: port: 8080 ssl_crt: /etc/pki/tls/certs/localhost.crt ssl_key: /etc/pki/tls/certs/localhost.key We'll discuss creating certificates in a moment, but first let's talk about configuration in general. There are a number of configuration parameters available for this module, but we'll focus on the more common ones here: port: This is required. It's the port for the Salt API to listen on. host: Normally, the Salt API listens on all available interfaces (0.0.0.0). If you are in an an environment where you need to provide services only to one interface, then provide the IP address (that is, 10.0.0.1) here. ssl_crt: This is the path to your SSL certificate. We'll cover this in a moment. ssl_key: This is the path to the private key for the SSL certificate. Again, we'll cover this in a moment. debug: If you are setting up the Salt API for the first time, setting this to True can be very helpful. But once you are up and running, make sure to remove this option or explicitly set it to False. disable_ssl: It is highly recommended that the default value of False be used here. Even when just getting started, self-signed certificates are better than setting this to True. Why? Because nothing is as permanent as temporary, and at least self-signed certificates will remind you each time that you need to get a real set of certificates in place. Don't be complacent for the sake of learning. 
root_prefix: Normally, the Salt API will serve from the root path of the server (that is, https://saltapi.example.com/), but if you have several applications that you're serving from the same host or you just want to be more specific, you can change this. The default is /, but you could set it to /exampleapi in order to serve REST services from https://saltapi.example.com/exampleapi, for example. webhook_url: If you are using webhooks, they need their own entry point. By default, this is set to /hook, which in our example would serve from https://saltapi.example.com/hook. webhook_disable_auth: Normally, the Salt API requires authentication, but this is quite commonly not possible with third-party applications that need to call it over a webhook. This allows webhooks to not require authentication. We'll go more in depth on this in a moment. Tornado Tornado is a somewhat newer framework that was written by Facebook. It is also newer than Salt but is quickly becoming the web framework of choice inside Salt itself. In fact, it is used so much inside Salt that it is now considered a hard dependency for Salt and will be available on all newer installations. Tornado doesn't have as many configuration options inside the Salt API as CherryPy. The ones that are supported (as defined in the CherryPy section) are: port ssl_crt ssl_key debug disable_ssl While the Tornado module doesn't support nearly as much functionality as the CherryPy module just yet, keep an eye on it; it may become the new de facto Salt API module. WSGI WSGI, or Web Server Gateway Interface, is a Python standard, defined in PEP 333. Direct support for it ships with Python itself, so no external dependencies are required, but this module is also pretty basic. The only configuration option to worry about here is port. However, this module is useful in that it allows the Salt API to be run under any WSGI-compliant web server, such as Apache with mod_wsgi or Nginx with FastCGI. Because this module does not provide any sort of SSL-based security, it is recommended that one of these options be used, with those third-party web servers being properly configured with the appropriate SSL settings. Creating SSL certificates It is highly advisable to use an SSL certificate for the Salt API even if you currently only plan to use it on a local, secured network. You should probably also purchase a certificate that is signed by a certificate authority (CA). When you get to this point, the CA will provide instructions on how to create one using their system. However, for now, we can get by with a self-signed certificate. There are a number of guides online for creating self-signed certificates, but finding one that is easy to understand is somewhat more difficult. The following steps will generate both an SSL certificate and the key to use it on a Linux system: First, we'll need to generate the key. Don't worry about the password—just enter one for now, take note of it, and we'll strip it out in a moment. # openssl genrsa -des3 -out server.key 2048 Generating RSA private key, 2048 bit long modulus ................++++++ ..............................................++++++ e is 65537 (0x10001) Enter pass phrase for server.key: Verifying - Enter pass phrase for server.key: Once you have the key, you need to use it to generate a certificate signing request, or CSR. You will be asked a number of questions about you that are important if you want a certificate signed by a CA. On your internal network, it's somewhat less important. 
# openssl req -new -key server.key -out server.csr
Enter pass phrase for server.key:
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:Utah
Locality Name (eg, city) []:Salt Lake City
Organization Name (eg, company) [Internet Widgits Pty Ltd]:My Company, LLC
Organizational Unit Name (eg, section) []:
Common Name (e.g. server FQDN or YOUR name) []:
Email Address []:me@example.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:

At this point, we can go ahead and strip the password from the key.

# cp server.key server.key.org
# openssl rsa -in server.key.org -out server.key
Enter pass phrase for server.key.org:
writing RSA key

Finally, we'll create a self-signed certificate.

# openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt
Signature ok
subject=/C=US/ST=Utah/L=Salt Lake City/O=My Company, LLC/emailAddress=me@example.com
Getting Private key

At this point, you will have four files:

server.crt
server.csr
server.key
server.key.org

Copy server.crt to the path specified for ssl_crt and server.key to the path specified for ssl_key.

Summary

In this article, we learned how to set up the Thorium directory tree, how to use requisites, and how to use the register. We also looked at how to set up the Salt API; the modules supported for the Salt API are CherryPy, Tornado, and WSGI. Finally, we walked through creating SSL certificates, since it is highly advisable to use an SSL certificate with the Salt API.

Resources for Article:

Further resources on this subject:

Salt Configuration [article]

Diving into Salt Internals [article]

Introducing Salt [article]
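One follow-up note on the Tornado module covered earlier in this article: it is configured in the master configuration file using the same pattern as the CherryPy module, just under its own key. The following is a minimal sketch reusing the certificate paths from the CherryPy example; the key name rest_tornado is used here by analogy with rest_cherrypy, so confirm it against the documentation for your Salt version:

rest_tornado:
  port: 8080
  ssl_crt: /etc/pki/tls/certs/localhost.crt
  ssl_key: /etc/pki/tls/certs/localhost.key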

Container Linking and Docker DNS

Packt
01 Nov 2016
29 min read
In this article by Jon Langemak, the author of the book Docker Networking Cookbook, has covered the following recipes: Verifying host based DNS configuration inside a container Overriding the default name resolution settings Configuring links for name and service resolution Leveraging Docker DNS (For more resources related to this topic, see here.) Verifying host based DNS configuration inside a container While you might not realize it but Docker, by default, is providing your containers a means to do basic name resolution. Docker passes name resolution from the Docker host, directly into the container. The result is that a spawned container can natively resolve anything that the Docker host itself can. The mechanics used by Docker to achieve name resolution in a container are elegantly simple. In this recipe, we'll walk through how this is done and how you can verify that it's working as expected. Getting Ready In this recipe we'll be demonstrating the configuration on a single Docker host. It is assumed that this host has Docker installed and that Docker is in its default configuration. We'll be altering name resolution settings on the host so you'll need root level access. How to do it… To start with, let's start a new container on our host docker1 and examine how the container handles name resolution: user@docker1:~$ docker run -d -P --name=web8 jonlangemak/web_server_8_dns d65baf205669c871d1216dc091edd1452a318b6522388e045c211344815c280a user@docker1:~$ user@docker1:~$ docker exec web8 host www.google.com www.google.com has address 216.58.216.196 www.google.com has IPv6 address 2607:f8b0:4009:80e::2004 user@docker1:~ $ It would appear that the container has the ability to resolve DNS names. If we look at our local Docker host and run the same test, we should get similar results: user@docker1:~$ host www.google.com www.google.com has address 216.58.216.196 www.google.com has IPv6 address 2607:f8b0:4009:80e::2004 user@docker1:~$ In addition, just like our Docker host, the container can also resolve local DNS records associated with the local domain lab.lab: user@docker1:~$ docker exec web8 host docker4 docker4.lab.lab has address 192.168.50.102 user@docker1:~$ You'll notice that we didn't need to specify a fully qualified domain name in order to resolve the host name docker4 in the domain lab.lab. At this point it's safe to assume that the container is receiving some sort of intelligent update from the Docker host which provides it relevant information about the local DNS configuration. In case you don't know, the resolv.conf file is generally where you define a Linux system's name resolution parameters. In many cases it is altered automatically by configuration information in other places. However – regardless of how it's altered, it should always be the source of truth for how the system handles name resolution. To see what the container is receiving, let's examine the containers resolv.conf file: user@docker1:~$ docker exec -t web8 more /etc/resolv.conf :::::::::::::: /etc/resolv.conf :::::::::::::: # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8) # DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN nameserver 10.20.30.13 search lab.lab user@docker1:~$ As you can see, the container has learned that the local DNS server is 10.20.30.13 and that the local DNS search domain is lab.lab. Where did it get this information? The solution is rather simple. 
When a container starts, Docker generates instances of the following three files for each container spawned and saves it with the container configuration: /etc/hostname /etc/hosts /etc/resolv.conf These files are stored as part of the container configuration and then mounted into the container. We can use findmnt tool from within the container to examine the source of the mounts: root@docker1:~# docker exec web8 findmnt -o SOURCE …<Additional output removed for brevity>… /dev/mapper/docker1--vg-root[/var/lib/docker/containers/c803f130b7a2450609672c23762bce3499dec9abcfdc540a43a7eb560adaf62a/resolv.conf /dev/mapper/docker1--vg-root[/var/lib/docker/containers/c803f130b7a2450609672c23762bce3499dec9abcfdc540a43a7eb560adaf62a/hostname] /dev/mapper/docker1--vg-root[/var/lib/docker/containers/c803f130b7a2450609672c23762bce3499dec9abcfdc540a43a7eb560adaf62a/hosts] root@docker1:~# So while the container thinks it has local copies of the hostname, hosts, and resolv.conf, file in its /etc/ directory, the real files are actually located in the containers configuration directory (/var/lib/docker/containers/) on the Docker host. When you tell Docker to run a container, it does 3 things: It examines the Docker hosts /etc/resolv.conf file and places a copy of it in the containers directory. It creates a hostname file in the containers directory and assigns the container a unique hostname. It creates a hosts file in the containers directory and adds relevant records including localhost and a record referencing the host itself. Each time the container is restarted, the container's resolv.conf file is updated based on the values found in the Docker hosts resolv.conf file. This means that any changes made to the resolv.conf file are lost each time the container is restarted. The hostname and hosts configuration files are also rewritten each time the container is restarted losing any changes made during the previous run. To validate the configuration files a given container is using we can inspect the containers configuration for these variables: user@docker1:~$ docker inspect web8 | grep HostsPath "HostsPath": "/var/lib/docker/containers/c803f130b7a2450609672c23762bce3499dec9abcfdc540a43a7eb560adaf62a/hosts", user@docker1:~$ docker inspect web8 | grep HostnamePath "HostnamePath": "/var/lib/docker/containers/c803f130b7a2450609672c23762bce3499dec9abcfdc540a43a7eb560adaf62a/hostname", user@docker1:~$ docker inspect web8 | grep ResolvConfPath "ResolvConfPath": "/var/lib/docker/containers/c803f130b7a2450609672c23762bce3499dec9abcfdc540a43a7eb560adaf62a/resolv.conf", user@docker1:~$ As expected, these are the same mount paths we saw when we ran the findmnt command from within the container itself. These represent the exact mount path for each file into the containers /etc/ directory for each respective file. Overriding the default name resolution settings The method Docker uses for providing name resolution to containers works very well in most cases. However, there could be some instances where you want Docker to provide the containers with a DNS server other than the one the Docker host is configured to use. In these cases, Docker offers you a couple of options. You can tell the Docker service to provide a different DNS server for all of the containers the service spawns. You can also manually override this setting at container runtime by providing a DNS server as an option to the docker run subcommand. 
In this recipe, we'll show you your options for changing the default name resolution behavior as well as how to verify the settings worked. Getting Ready In this recipe we'll be demonstrating the configuration on a single Docker host. It is assumed that this host has Docker installed and that Docker is in its default configuration. We'll be altering name resolution settings on the host so you'll need root level access. How to do it… As we saw in the first recipe in this article, by default, Docker provides containers with the DNS server that the Docker host itself uses. This comes in the form of copying the host's resolv.conf file and providing it to each spawned container. Along with the name server setting, this file also includes definitions for DNS search domains. Both of these options can be configured at the service level to cover any spawned containers as well as at the individual level. For the purpose of comparison, let's start by examining the Docker hosts DNS configuration: root@docker1:~# more /etc/resolv.conf # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8) # DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN nameserver 10.20.30.13 search lab.lab root@docker1:~# With this configuration, we would expect that any container spawned on this host would receive the same name server and DNS search domain. Let's spawn a container called web8 to verify this is working as expected: root@docker1:~# docker run -d -P --name=web8 jonlangemak/web_server_8_dns 156bc29d28a98e2fbccffc1352ec390bdc8b9b40b84e4c5f58cbebed6fb63474 root@docker1:~# root@docker1:~# docker exec -t web8 more /etc/resolv.conf :::::::::::::: /etc/resolv.conf :::::::::::::: # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8) # DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN nameserver 10.20.30.13 search lab.lab As expected, the container receives the same configuration. Let's now inspect the container and see if we see any DNS related options defined: user@docker1:~$ docker inspect web8 | grep Dns "Dns": [], "DnsOptions": [], "DnsSearch": [], user@docker1:~$ Since we're using the default configuration, there is no reason to configure anything specific within the container in regards to DNS server or search domain. Each time the container starts, Docker will apply the settings for the hosts resolv.conf file to the containers DNS configuration files. If we'd prefer to have Docker give containers a different DNS server or DNS search domain, we can do so through Docker options. In this case, the two we're interested in are: --dns=<DNS Server> - Specify a DNS server address that Docker should provide to the containers. --dns-search=<DNS Search Domain> - Specify a DNS search domain that Docker should provide to the containers. Let's configure Docker to provide containers with a public DNS server (4.2.2.2) and a search domain of lab.external. 
We can do so by passing the following options to the Docker systemd drop-in file: ExecStart=/usr/bin/dockerd --dns=4.2.2.2 --dns-search=lab.external Once the options are configured, reload the systemd configuration, restart the service to load the new options, and restart our container web8: user@docker1:~$ sudo systemctl daemon-reload user@docker1:~$ sudo systemctl restart docker user@docker1:~$ docker start web8 web8 user@docker1:~$ docker exec -t web8 more /etc/resolv.conf search lab.external nameserver 4.2.2.2 user@docker1:~$ You'll note that despite this container initially having the hosts DNS server (10.20.30.13) and search domain (lab.lab) it now has the service level DNS options we just specified. If you recall earlier, we saw that when we inspected this container, it didn't define a specific DNS server or search domain. Since none was specified, Docker now uses the settings from the Docker options which take priority. While this provides some level of flexibility, it's not yet truly flexible. At this point any and all containers spawned on this server will be provided the same DNS server and search domain. To be truly flexible we should be able to have Docker alter the name resolution configuration on a per container level. As luck would have it, these options can also be provided directly at container runtime. The preceding image defines the priority Docker uses when deciding what name resolution settings to apply to a container when it's started. Settings defined at container runtime always take priority. If the settings aren't defined there, Docker then looks to see if they are configured at the service level. If the settings aren't there, it falls back to the default method of relying on the Docker hosts DNS settings. For instance, we can launch a container called web2 and provide different options: root@docker1:~# docker run -d --dns=8.8.8.8 --dns-search=lab.dmz -P --name=web8-2 jonlangemak/web_server_8_dns 1e46d66a47b89d541fa6b022a84d702974414925f5e2dd56eeb840c2aed4880f root@docker1:~# If we inspect the container, we'll see that we now have dns and dns-search fields defined as part of the container configuration: root@docker1:~# docker inspect web8-2 …<output removed for brevity>… "Dns": [ "8.8.8.8" ], "DnsOptions": [], "DnsSearch": [ "lab.dmz" ], …<output removed for brevity>… root@docker1:~# This ensures that if the container is restarted, it will still have the same DNS settings that were initially provided the first time the container was run. Let's make some slight changes to the Docker service to verify the priority is working as expected. Let's change our Docker options to look like this: ExecStart=/usr/bin/dockerd --dns-search=lab.external Now restart the service and run the following container: user@docker1:~$ sudo systemctl daemon-reload user@docker1:~$ sudo systemctl restart docker root@docker1:~# root@docker1:~# docker run -d -P --name=web8-3 jonlangemak/web_server_8_dns 5e380f8da17a410eaf41b772fde4e955d113d10e2794512cd20aa5e551d9b24c root@docker1:~# Since we didn't provide any DNS related options at container run time the next place we'd check would be the service level options. Our Docker service level options include a DNS search domain of lab.external, we'd expect the container to receive that search domain. However, since we don't have a DNS server defined, we'll need to fall back to the one configured on the Docker host itself. 
And now examine its resolv.conf file to make sure things worked as expected: user@docker1:~$ docker exec -t web8-3 more /etc/resolv.conf search lab.external nameserver 10.20.30.13 user@docker1:~$ Configuring Links for name and service resolution Container linking provides a means for one container to easily communicate with another container on the same host. As we've seen in previous examples, most container to container communication has occurred through IP addresses. Container linking improves on this by allowing linked containers to communicate with each other by name. In addition to providing basic name resolution, it also provides a means to see what services a linked container is providing. In this recipe we'll review how to create container links as well as discuss some of their limitations. Getting Ready In this recipe we'll be demonstrating the configuration on a single Docker host. It is assumed that this host has Docker installed and that Docker is in its default configuration. We'll be altering name resolution settings on the host so you'll need root level access. How to do it… The phrase container linking might imply to some that it involves some kind of network configuration or modification. In reality, container linking has very little to do with container networking. In the default mode, container linking provides a means for one container to resolve the name of another. For instance, let's start two containers on our lab host docker1: root@docker1:~# docker run -d -P --name=web1 jonlangemak/web_server_1 88f9c862966874247c8e2ba90c18ac673828b5faac93ff08090adc070f6d2922 root@docker1:~# docker run -d -P --name=web2 --link=web1 jonlangemak/web_server_2 00066ea46367c07fc73f73bdcdff043bd4c2ac1d898f4354020cbcfefd408449 root@docker1:~# Notice how when I started the second container I used a new flag called --link and referenced the container web1. We would now say that web2 was linked to web1. However, they're not really linked in any sort of way. A better description might be to say that web2 is now aware of web1. Let's connect to the container web2 to show you what I mean: root@docker1:~# docker exec -it web2 /bin/bash root@00066ea46367:/# ping web1 -c 2 PING web1 (172.17.0.2): 48 data bytes 56 bytes from 172.17.0.2: icmp_seq=0 ttl=64 time=0.163 ms 56 bytes from 172.17.0.2: icmp_seq=1 ttl=64 time=0.092 ms --- web1 ping statistics --- 2 packets transmitted, 2 packets received, 0% packet loss round-trip min/avg/max/stddev = 0.092/0.128/0.163/0.036 ms root@00066ea46367:/# It appears that the web2 container is now able to resolve the container web1 by name. This is because the linking process inserted records into the web2 containers hosts file: root@00066ea46367:/# more /etc/hosts 127.0.0.1 localhost ::1 localhost ip6-localhost ip6-loopback fe00::0 ip6-localnet ff00::0 ip6-mcastprefix ff02::1 ip6-allnodes ff02::2 ip6-allrouters 172.17.0.2 web1 88f9c8629668 172.17.0.3 00066ea46367 root@00066ea46367:/# With this configuration, the web2 container can reach the web1 container either by the name we gave the container at runtime (web1) or the unique hostname Docker generated for the container (88f9c8629668). 
In addition to the hosts file being updated, web2 also generates some new environmental variables: root@00066ea46367:/# printenv WEB1_ENV_APACHE_LOG_DIR=/var/log/apache2 HOSTNAME=00066ea46367 APACHE_RUN_USER=www-data WEB1_PORT_80_TCP=tcp://172.17.0.2:80 WEB1_PORT_80_TCP_PORT=80 LS_COLORS= WEB1_PORT=tcp://172.17.0.2:80 WEB1_ENV_APACHE_RUN_GROUP=www-data APACHE_LOG_DIR=/var/log/apache2 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PWD=/ WEB1_PORT_80_TCP_PROTO=tcp APACHE_RUN_GROUP=www-data SHLVL=1 HOME=/root WEB1_PORT_80_TCP_ADDR=172.17.0.2 WEB1_ENV_APACHE_RUN_USER=www-data WEB1_NAME=/web2/web1 _=/usr/bin/printenv root@00066ea46367:/# You'll notice many new environmental variables. Docker will copy any environmental variables from the linked container that were defined as part of the container. This includes: Environmental variables described in the docker image. More specifically, any ENV variables from the images Dockerfile Environmental variables passed to the container at runtime through the --env or -e flag. In this case, these three variables were defined as ENV variables in the images Dockerfile: APACHE_RUN_USER=www-data APACHE_RUN_GROUP=www-data APACHE_LOG_DIR=/var/log/apache2 Since both container images have the same ENV variables defined we'll see the local variables as well as the same environmental variables from the container web1 prefixed with WEB1_ENV_: WEB1_ENV_APACHE_RUN_USER=www-data WEB1_ENV_APACHE_RUN_GROUP=www-data WEB1_ENV_APACHE_LOG_DIR=/var/log/apache2 In addition, Docker also created 6 other environmental variables that describe the web1 container as well as any of its exposed ports: WEB1_PORT=tcp://172.17.0.2:80 WEB1_PORT_80_TCP=tcp://172.17.0.2:80 WEB1_PORT_80_TCP_ADDR=172.17.0.2 WEB1_PORT_80_TCP_PORT=80 WEB1_PORT_80_TCP_PROTO=tcp WEB1_NAME=/web2/web1 Linking also allows you to specify aliases. For instance let's stop, remove, and respawn container web2 using a slightly different syntax for linking… user@docker1:~$ docker stop web2 web2 user@docker1:~$ docker rm web2 web2 user@docker1:~$ docker run -d -P --name=web2 --link=web1:webserver jonlangemak/web_server_2 e102fe52f8a08a02b01329605dcada3005208d9d63acea257b8d99b3ef78e71b user@docker1:~$ Notice that after the link definition we inserted a :webserver. The name after the colon represents the alias for the link. In this case, I've specified an alias for the container web1 as webserver. If we examine the web2 container, we'll see that the alias is now also listed in the hosts file: root@c258c7a0884d:/# more /etc/hosts …<Additional output removed for brevity>… 172.17.0.2 webserver 88f9c8629668 web1 172.17.0.3 c258c7a0884d root@c258c7a0884d:/# Aliases also impact the environmental variables created during linking. Rather than using the container name they'll instead use the alias: user@docker1:~$ docker exec web2 printenv …<Additional output removed for brevity>… WEBSERVER_PORT_80_TCP_ADDR=172.17.0.2 WEBSERVER_PORT_80_TCP_PORT=80 WEBSERVER_PORT_80_TCP_PROTO=tcp …<Additional output removed for brevity>… user@docker1:~$ At this point you might be wondering how dynamic this is. After all, Docker is providing this functionality by updating static files in each container. What happens if a container's IP address changes? 
For instance, let's stop the container web1 and start a new container called web3 using the same image: user@docker1:~$ docker stop web1 web1 user@docker1:~$ docker run -d -P --name=web3 jonlangemak/web_server_1 69fa80be8b113a079e19ca05c8be9e18eec97b7bbb871b700da4482770482715 user@docker1:~$ If you'll recall from earlier, the container web1 had an IP address of 172.17.0.2 allocated to it. Since I stopped the container, Docker will release that IP address reservation making it available to be reassigned to the next container we start. Let's check the IP address assigned to the container web3: user@docker1:~$ docker exec web3 ip addr show dev eth0 79: eth0@if80: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff inet 172.17.0.2/16 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::42:acff:fe11:2/64 scope link valid_lft forever preferred_lft forever user@docker1:~$ As expected, web3 took the now open IP address of 172.17.0.2 that previously belonged to the web1 container. We can also verify that the container web2 still believes that this IP address belongs to the web1 container: user@docker1:~$ docker exec –t web2 more /etc/hosts | grep 172.17.0.2 172.17.0.2 webserver 88f9c8629668 web1 user@docker1:~$ If we start the container web1 once again, we should see it will get a new IP address allocated to it: user@docker1:~$ docker start web1 web1 user@docker1:~$ docker exec web1 ip addr show dev eth0 81: eth0@if82: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP link/ether 02:42:ac:11:00:04 brd ff:ff:ff:ff:ff:ff inet 172.17.0.4/16 scope global eth0 valid_lft forever preferred_lft forever inet6 fe80::42:acff:fe11:4/64 scope link valid_lft forever preferred_lft forever user@docker1:~$ If we check the container web2 again, we should see that Docker has updated it to reference web1's new IP address… user@docker1:~$ docker exec web2 more /etc/hosts | grep web1 172.17.0.4 webserver 88f9c8629668 web1 user@docker1:~$ However, while Docker takes care of updating the host file with the new IP address, it will not take care of updating any of the environmental variables to reflect the new IP address: user@docker1:~$ docker exec web2 printenv …<Additional output removed for brevity>… WEBSERVER_PORT=tcp://172.17.0.2:80 WEBSERVER_PORT_80_TCP=tcp://172.17.0.2:80 WEBSERVER_PORT_80_TCP_ADDR=172.17.0.2 …<Additional output removed for brevity>… user@docker1:~$ Additionally it should be pointed out that the link is only one way. That is, this link does not cause the container web1 to become aware of the web2 container. The container web1 will not receive the host records or the environmental variables referencing the web2: user@docker1:~$ docker exec -it web1 ping web2 ping: unknown host user@docker1:~$ Another reason to provision links is when you use Docker Inter Container Connectivity (ICC) mode set to false. As we've discussed previously, ICC prevents any containers on the same bridge from talking directly to each other. This forces them to talk to each other only though published ports. Linking provides a mechanism to override the default ICC rules. 
To demonstrate, let's stop and remove all the containers on our host docker1 and then add the following Docker option to the systemd drop-in file:

ExecStart=/usr/bin/dockerd --icc=false

Now reload the systemd configuration, restart the service, and start the following containers:

docker run -d -P --name=web1 jonlangemak/web_server_1
docker run -d -P --name=web2 jonlangemak/web_server_2

With ICC disabled, you'll notice that the containers can't talk directly to each other:

user@docker1:~$ docker exec web1 ip addr show dev eth0
87: eth0@if88: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.2/16 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::42:acff:fe11:2/64 scope link
valid_lft forever preferred_lft forever
user@docker1:~$ docker exec -it web2 curl http://172.17.0.2
user@docker1:~$

In the preceding example, web2 is not able to access the web server on web1. Now let's delete and recreate the web2 container, this time linking it to web1:

user@docker1:~$ docker stop web2
web2
user@docker1:~$ docker rm web2
web2
user@docker1:~$ docker run -d -P --name=web2 --link=web1 jonlangemak/web_server_2
4c77916bb08dfc586105cee7ae328c30828e25fcec1df55f8adba8545cbb2d30
user@docker1:~$ docker exec -it web2 curl http://172.17.0.2
<body>
<html>
<h1><span style="color:#FF0000;font-size:72px;">Web Server #1 - Running on port 80</span></h1>
</body>
</html>
user@docker1:~$

We can see that with the link in place, the communication is allowed as expected. Once again, as with the name resolution the link provides, this access is allowed in only one direction. It should be noted that linking works differently when using user-defined networks. In this recipe, we covered what are now being called legacy links. Linking with user-defined networks will be covered in the following recipes.

Leveraging Docker DNS

The introduction of user-defined networks signaled a big change in Docker networking. While the ability to provision custom networks was the big news, there were also major enhancements in name resolution. User-defined networks can benefit from what's being called embedded DNS. The Docker engine itself now has the ability to provide name resolution to all of the containers. This is a marked improvement from the legacy solution, where the only means of name resolution was external DNS or linking, which relied on the hosts file. In this recipe, we'll walk through how to use and configure embedded DNS.

Getting Ready

In this recipe we'll be demonstrating the configuration on a single Docker host. It is assumed that this host has Docker installed and that Docker is in its default configuration. We'll be altering name resolution settings on the host, so you'll need root-level access.

How to do it…

As mentioned, the embedded DNS system only works on user-defined Docker networks. That being said, let's provision a user-defined network and then start a simple container on it:

user@docker1:~$ docker network create -d bridge mybridge1
0d75f46594eb2df57304cf3a2b55890fbf4b47058c8e43a0a99f64e4ede98f5f
user@docker1:~$ docker run -d -P --name=web1 --net=mybridge1 jonlangemak/web_server_1
3a65d84a16331a5a84dbed4ec29d9b6042dde5649c37bc160bfe0b5662ad7d65
user@docker1:~$

As we saw in an earlier recipe, by default, Docker pulls the name resolution configuration from the Docker host and provides it to the container. This behavior can be changed by providing different DNS servers or search domains, either at the service level or at container runtime.
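As a quick illustration of those overrides, here is a minimal sketch; the DNS server 8.8.8.8, the search domain example.com, and the container name web-dns-test are example values I've chosen, and the daemon.json path assumes a typical Linux install:

# per-container override at runtime
docker run -d -P --name=web-dns-test --dns=8.8.8.8 --dns-search=example.com jonlangemak/web_server_1

# service-level override applied to every container the daemon starts
# (place in /etc/docker/daemon.json, then restart the Docker service)
# {
#   "dns": ["8.8.8.8"],
#   "dns-search": ["example.com"]
# }

Either approach changes what ends up in the container's resolv.conf file.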
In the case of containers connected to a user-defined network, the DNS settings provided to the container are slightly different. For instance, let's look at the resolv.conf file for the container we just connected to the user-defined bridge mybridge1:

user@docker1:~$ docker exec -t web1 more /etc/resolv.conf
search lab.lab
nameserver 127.0.0.11
options ndots:0
user@docker1:~$

Notice how the name server for this container is now 127.0.0.11. This IP address represents Docker's embedded DNS server and will be used for any container that is connected to a user-defined network. It is a requirement that any container connected to a user-defined network use the embedded DNS server. Containers not initially started on a user-defined network will get updated the moment they connect to one. For instance, let's start another container called web2 but have it use the default docker0 bridge:

user@docker1:~$ docker run -dP --name=web2 jonlangemak/web_server_2
d0c414477881f03efac26392ffbdfb6f32914597a0a7ba578474606d5825df3f
user@docker1:~$ docker exec -t web2 more /etc/resolv.conf
::::::::::::::
/etc/resolv.conf
::::::::::::::
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 10.20.30.13
search lab.lab
user@docker1:~$

If we now connect the web2 container to our user-defined network, Docker will update the name server to reflect the embedded DNS server:

user@docker1:~$ docker network connect mybridge1 web2
user@docker1:~$ docker exec -t web2 more /etc/resolv.conf
search lab.lab
nameserver 127.0.0.11
options ndots:0
user@docker1:~$

Since both our containers are now connected to the same user-defined network, they can reach each other by name:

user@docker1:~$ docker exec -t web1 ping web2 -c 2
PING web2 (172.18.0.3): 48 data bytes
56 bytes from 172.18.0.3: icmp_seq=0 ttl=64 time=0.107 ms
56 bytes from 172.18.0.3: icmp_seq=1 ttl=64 time=0.087 ms
--- web2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.087/0.097/0.107/0.000 ms
user@docker1:~$ docker exec -t web2 ping web1 -c 2
PING web1 (172.18.0.2): 48 data bytes
56 bytes from 172.18.0.2: icmp_seq=0 ttl=64 time=0.060 ms
56 bytes from 172.18.0.2: icmp_seq=1 ttl=64 time=0.119 ms
--- web1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.060/0.089/0.119/0.030 ms
user@docker1:~$

You'll note that the name resolution is bidirectional and works inherently without the use of any links. That being said, with user-defined networks, we can still define links for the purpose of creating local aliases. For instance, let's stop and remove both containers web1 and web2 and reprovision them as follows:

user@docker1:~$ docker run -d -P --name=web1 --net=mybridge1 --link=web2:thesecondserver jonlangemak/web_server_1
fd21c53def0c2255fc20991fef25766db9e072c2bd503c7adf21a1bd9e0c8a0a
user@docker1:~$ docker run -d -P --name=web2 --net=mybridge1 --link=web1:thefirstserver jonlangemak/web_server_2
6e8f6ab4dec7110774029abbd69df40c84f67bcb6a38a633e0a9faffb5bf625e
user@docker1:~$

The first interesting item to point out is that Docker let us link to a container that did not yet exist. When we ran the container web1, we asked Docker to link it to the container web2, and at that point web2 didn't exist. This is a notable difference in how links work with the embedded DNS server.
In legacy linking, Docker needed to know the target container's information prior to making the link. This was because it had to manually update the source container's hosts file and environmental variables. The second interesting item is that aliases are no longer listed in the container's hosts file. If we look at each container, we'll see that linking no longer generates host entries; both containers simply point at the embedded DNS server:

user@docker1:~$ docker exec -t web1 more /etc/resolv.conf
search lab.lab
nameserver 127.0.0.11
options ndots:0
user@docker1:~$ docker exec -t web2 more /etc/resolv.conf
search lab.lab
nameserver 127.0.0.11
options ndots:0
user@docker1:~$

All of the resolution is now occurring in the embedded DNS server. This includes keeping track of defined aliases and their scope. So even without host records, each container is able to resolve the other container's alias through the embedded DNS server:

user@docker1:~$ docker exec -t web1 ping thesecondserver -c2
PING thesecondserver (172.18.0.3): 48 data bytes
56 bytes from 172.18.0.3: icmp_seq=0 ttl=64 time=0.067 ms
56 bytes from 172.18.0.3: icmp_seq=1 ttl=64 time=0.067 ms
--- thesecondserver ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.067/0.067/0.067/0.000 ms
user@docker1:~$ docker exec -t web2 ping thefirstserver -c 2
PING thefirstserver (172.18.0.2): 48 data bytes
56 bytes from 172.18.0.2: icmp_seq=0 ttl=64 time=0.062 ms
56 bytes from 172.18.0.2: icmp_seq=1 ttl=64 time=0.042 ms
--- thefirstserver ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.042/0.052/0.062/0.000 ms
user@docker1:~$

The aliases created have a scope that is local to the container itself. For instance, a third container on the same user-defined network is not able to resolve the aliases created as part of the links:

user@docker1:~$ docker run -d -P --name=web3 --net=mybridge1 jonlangemak/web_server_1
d039722a155b5d0a702818ce4292270f30061b928e05740d80bb0c9cb50dd64f
user@docker1:~$ docker exec -it web3 ping thefirstserver -c 2
ping: unknown host
user@docker1:~$ docker exec -it web3 ping thesecondserver -c 2
ping: unknown host
user@docker1:~$

You'll recall that legacy linking also automatically created a set of environmental variables on the source container. These environmental variables referenced the target container and any ports it might be exposing. Linking in user-defined networks does not create these environmental variables:

user@docker1:~$ docker exec web1 printenv
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=4eba77b66d60
APACHE_RUN_USER=www-data
APACHE_RUN_GROUP=www-data
APACHE_LOG_DIR=/var/log/apache2
HOME=/root
user@docker1:~$

As we saw in the previous recipe, keeping these variables up to date wasn't achievable even with legacy links, so it's not a total surprise that the functionality doesn't exist when dealing with user-defined networks. In addition to providing local container resolution, the embedded DNS server also handles any external requests. As we saw in the preceding example, the search domain from the Docker host (lab.lab in my case) was still being passed down to the containers and configured in their resolv.conf file. The name server learned from the host becomes a forwarder for the embedded DNS server. This allows the embedded DNS server to process any container name resolution requests and hand off external requests to the name server used by the Docker host.
This behavior can be overridden either at the service level or by passing the --dns or --dns-search flag to a container at runtime. For instance, we can start two more containers from the same image and specify a specific DNS server for each:

user@docker1:~$ docker run -dP --net=mybridge1 --name=web4 --dns=10.20.30.13 jonlangemak/web_server_1
19e157b46373d24ca5bbd3684107a41f22dea53c91e91e2b0d8404e4f2ccfd68
user@docker1:~$ docker run -dP --net=mybridge1 --name=web5 --dns=8.8.8.8 jonlangemak/web_server_1
700f8ac4e7a20204100c8f0f48710e0aab8ac0f05b86f057b04b1bbfe8141c26
user@docker1:~$

Note that web4 would receive 10.20.30.13 as a DNS forwarder even if we didn't specify it explicitly. This is because that's also the DNS server used by the Docker host, and when not specified, the container inherits it from the host. It is specified here for the sake of the example. Now if we try to resolve a local DNS record on either container, we can see that it works on web4, which has the local DNS server defined, whereas the lookup on web5 fails because 8.8.8.8 doesn't know about the lab.lab domain:

user@docker1:~$ docker exec -it web4 ping docker1.lab.lab -c 2
PING docker1.lab.lab (10.10.10.101): 48 data bytes
56 bytes from 10.10.10.101: icmp_seq=0 ttl=64 time=0.080 ms
56 bytes from 10.10.10.101: icmp_seq=1 ttl=64 time=0.078 ms
--- docker1.lab.lab ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.078/0.079/0.080/0.000 ms
user@docker1:~$ docker exec -it web5 ping docker1.lab.lab -c 2
ping: unknown host
user@docker1:~$

Summary

In this article we discussed the available options for container name resolution, covering both the default name resolution behavior and the new embedded DNS server functionality that comes with user-defined networks. You should now have a clear picture of how name server assignment is determined under each of these scenarios.

Resources for Article:

Further resources on this subject: Managing Application Configuration [article] Virtualizing Hosts and Applications [article] Deploying a Play application on CoreOS and Docker [article]
Hosting on Google App Engine

Packt
21 Oct 2016
22 min read
In this article by Mat Ryer, the author of the book Go Programming Blueprints Second Edition, we will see how to create a successful Google Application and deploy it in Google App Engine along with Googles Cloud data storage facility for App Engine Developers. (For more resources related to this topic, see here.) Google App Engine gives developers a NoOps (short for No Operations, indicating that developers and engineers have no work to do in order to have their code running and available) way of deploying their applications, and Go has been officially supported as a language option for some years now. Google's architecture runs some of the biggest applications in the world, such as Google Search, Google Maps, Gmail, among others, so is a pretty safe bet when it comes to deploying our own code. Google App Engine allows you to write a Go application, add a few special configuration files, and deploy it to Google's servers, where it will be hosted and made available in a highly available, scalable, and elastic environment. Instances will automatically spin up to meet demand and tear down gracefully when they are no longer needed with a healthy free quota and preapproved budgets. Along with running application instances, Google App Engine makes available a myriad of useful services, such as fast and high-scale data stores, search, memcache, and task queues. Transparent load balancing means you don't need to build and maintain additional software or hardware to ensure servers don't get overloaded and that requests are fulfilled quickly. In this article, we will build the API backend for a question and answer service similar to Stack Overflow or Quora and deploy it to Google App Engine. In the process, we'll explore techniques, patterns, and practices that can be applied to all such applications as well as dive deep into some of the more useful services available to our application. Specifically, in this article, you will learn: How to use the Google App Engine SDK for Go to build and test applications locally before deploying to the cloud How to use app.yaml to configure your application How Modules in Google App Engine let you independently manage the different components that make up your application How the Google Cloud Datastore lets you persist and query data at scale A sensible pattern for the modeling of data and working with keys in Google Cloud Datastore How to use the Google App Engine Users API to authenticate people with Google accounts A pattern to embed denormalized data into entities The Google App Engine SDK for Go In order to run and deploy Google App Engine applications, we must download and configure the Go SDK. Head over to https://cloud.google.com/appengine/downloads and download the latest Google App Engine SDK for Go for your computer. The ZIP file contains a folder called go_appengine, which you should place in an appropriate folder outside of your GOPATH, for example, in /Users/yourname/work/go_appengine. It is possible that the names of these SDKs will change in the future—if that happens, ensure that you consult the project home page for notes pointing you in the right direction at https://github.com/matryer/goblueprints. Next, you will need to add the go_appengine folder to your $PATH environment variable, much like what you did with the go folder when you first configured Go. 
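For instance, assuming you placed the SDK at /Users/yourname/work/go_appengine as in the example above, a minimal sketch of the change (added to your shell profile, such as ~/.bash_profile) might be:

# make the App Engine SDK tools, such as goapp, available on the command line
export PATH="$PATH:/Users/yourname/work/go_appengine"

The exact profile file you edit depends on your shell and operating system; the folder location is just the example path used earlier.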
To test your installation, open a terminal and type this: goapp version You should see something like the following: go version go1.6.1 (appengine-1.9.37) darwin/amd64 The actual version of Go is likely to differ and is often a few months behind actual Go releases. This is because the Cloud Platform team at Google needs to do work on its end to support new releases of Go. The goapp command is a drop-in replacement for the go command with a few additional subcommands; so you can do things like goapp test and goapp vet, for example. Creating your application In order to deploy an application to Google's servers, we must use the Google Cloud Platform Console to set it up. In a browser, go to https://console.cloud.google.com and sign in with your Google account. Look for the Create Project menu item, which often gets moved around as the console changes from time to time. If you already have some projects, click on a project name to open a submenu, and you'll find it in there. If you can't find what you're looking for, just search Creating App Engine project and you'll find it. When the New Project dialog box opens, you will be asked for a name for your application. You are free to call it whatever you like (for example, Answers), but note the Project ID that is generated for you; you will need to refer to this when you configure your app later. You can also click on Edit and specify your own ID, but know that the value must be globally unique, so you'll have to get creative when thinking one up. Here we will use answersapp as the application ID, but you won't be able to use that one since it has already been taken. You may need to wait a minute or two for your project to get created; there's no need to watch the page—you can continue and check back later. App Engine applications are Go packages Now that the Google App Engine SDK for Go is configured and our application has been created, we can start building it. In Google App Engine, an application is just a normal Go package with an init function that registers handlers via the http.Handle or http.HandleFunc functions. It does not need to be the main package like normal tools. Create a new folder (somewhere inside your GOPATH folder) called answersapp/api and add the following main.go file: package api import ( "io" "net/http" ) func init() { http.HandleFunc("/", handleHello) } func handleHello(w http.ResponseWriter, r *http.Request) { io.WriteString(w, "Hello from App Engine") } You will be familiar with most of this by now, but note that there is no ListenAndServe call, and the handlers are set inside the init function rather than main. We are going to handle every request with our simple handleHello function, which will just write a welcoming string. The app.yaml file In order to turn our simple Go package into a Google App Engine application, we must add a special configuration file called app.yaml. The file will go at the root of the application or module, so create it inside the answersapp/api folder with the following contents: application: YOUR_APPLICATION_ID_HERE version: 1 runtime: go api_version: go1 handlers: - url: /.* script: _go_app The file is a simple human–(and machine) readable configuration file in YAML (Yet Another Markup Language format—refer to yaml.org for more details). The following table describes each property: Property Description application The application ID (copied and pasted from when you created your project). 
version Your application version number—you can deploy multiple versions and even split traffic between them to test new features, among other things. We'll just stick with version 1 for now. runtime The name of the runtime that will execute your application. Since we're building a Go application, we'll use go. api_version The go1 api version is the runtime version supported by Google; you can imagine that this could be go2 in the future. handlers A selection of configured URL mappings. In our case, everything will be mapped to the special _go_app script, but you can also specify static files and folders here. Running simple applications locally Before we deploy our application, it makes sense to test it locally. We can do this using the App Engine SDK we downloaded earlier. Navigate to your answersapp/api folder and run the following command in a terminal: goapp serve You should see the following output: This indicates that an API server is running locally on port :56443, an admin server is running on :8000, and our application (the module default) is now serving at localhost:8080, so let's hit that one in a browser. As you can see by the Hello from App Engine response, our application is running locally. Navigate to the admin server by changing the port from :8080 to :8000. The preceding screenshot shows the web portal that we can use to interrogate the internals of our application, including viewing running instances, inspecting the data store, managing task queues, and more. Deploying simple applications to Google App Engine To truly understand the power of Google App Engine's NoOps promise, we are going to deploy this simple application to the cloud. Back in the terminal, stop the server by hitting Ctrl+C and run the following command: goapp deploy Your application will be packaged and uploaded to Google's servers. Once it's finished, you should see something like the following: Completed update of app: theanswersapp, version: 1 It really is as simple as that. You can prove this by navigating to the endpoint you get for free with every Google App Engine application, remembering to replace the application ID with your own: https://YOUR_APPLICATION_ID_HERE.appspot.com/. You will see the same output as earlier (the font may render differently since Google's servers will make assumptions about the content type that the local dev server doesn't). The application is being served over HTTP/2 and is already capable of pretty massive scale, and all we did was write a config file and a few lines of code. Modules in Google App Engine A module is a Go package that can be versioned, updated, and managed independently. An app might have a single module, or it can be made up of many modules: each distinct but part of the same application with access to the same data and services. An application must have a default module—even if it doesn't do much. Our application will be made up of the following modules: Description The module name The obligatory default module default An API package delivering RESTful JSON api A static website serving HTML, CSS, and JavaScript that makes AJAX calls to the API module web Each module will be a Go package and will, therefore, live inside its own folder. Let's reorganize our project into modules by creating a new folder alongside the api folder called default. We are not going to make our default module do anything other than use it for configuration, as we want our other modules to do all the meaningful work. 
But if we leave this folder empty, the Google App Engine SDK will complain that it has nothing to build. Inside the default folder, add the following placeholder main.go file: package defaultmodule func init() {} This file does nothing except allowing our default module to exist. It would have been nice for our package names to match the folders, but default is a reserved keyword in Go, so we have a good reason to break that rule. The other module in our application will be called web, so create another folder alongside the api and default folders called web. Here we are only going to build the API for our application and cheat by downloading the web module. Head over to the project home page at https://github.com/matryer/goblueprints, access the content for Second Edition, and look for the download link for the web components for this article in the Downloads section of the README file. The ZIP file contains the source files for the web component, which should be unzipped and placed inside the web folder. Now, our application structure should look like this: /answersapp/api /answersapp/default /answersapp/web Specifying modules To specify which module our api package will become, we must add a property to the app.yaml inside our api folder. Update it to include the module property: application: YOUR_APPLICATION_ID_HERE version: 1 runtime: go module: api api_version: go1 handlers: - url: /.* script: _go_app Since our default module will need to be deployed as well, we also need to add an app.yaml configuration file to it. Duplicate the api/app.yaml file inside default/app.yaml, changing the module to default: application: YOUR_APPLICATION_ID_HERE version: 1 runtime: go module: default api_version: go1 handlers: - url: /.* script: _go_app Routing to modules with dispatch.yaml In order to route traffic appropriately to our modules, we will create another configuration file called dispatch.yaml, which will let us map URL patterns to the modules. We want all traffic beginning with the /api/ path to be routed to the api module and everything else to the web module. As mentioned earlier, we won't expect our default module to handle any traffic, but it will have more utility later. In the answersapp folder (alongside our module folders—not inside any of the module folders), create a new file called dispatch.yaml with the following contents: application: YOUR_APPLICATION_ID_HERE dispatch: - url: "*/api/*" module: api - url: "*/*" module: web The same application property tells the Google App Engine SDK for Go which application we are referring to, and the dispatch section routes URLs to modules. Google Cloud Datastore One of the services available to App Engine developers is Google Cloud Datastore, a NoSQL document database built for automatic scaling and high performance. Its limited feature-set guarantees very high scale, but understanding the caveats and best practices is vital to a successful project. Denormalizing data Developers with experience of relational databases (RDBMS) will often aim to reduce data redundancy (trying to have each piece of data appear only once in their database) by normalizing data, spreading it across many tables, and adding references (foreign keys) before joining it back via a query to build a complete picture. In schemaless and NoSQL databases, we tend to do the opposite. We denormalize data so that each document contains the complete picture it needs, making read times extremely fast—since it only needs to go and get a single thing. 
For example, consider how we might model tweets in a relational database such as MySQL or Postgres: A tweet itself contains only its unique ID, a foreign key reference to the Users table representing the author of the tweet, and perhaps many URLs that were mentioned in TweetBody. One nice feature of this design is that a user can change their Name or AvatarURL and it will be reflected in all of their tweets, past and future: something you wouldn't get for free in a denormalized world. However, in order to present a tweet to the user, we must load the tweet itself, look up (via a join) the user to get their name and avatar URL, and then load the associated data from the URLs table in order to show a preview of any links. At scale, this becomes difficult because all three tables of data might well be physically separated from each other, which means lots of things need to happen in order to build up this complete picture. Consider what a denormalized design would look like instead: We still have the same three buckets of data, except that now our tweet contains everything it needs in order to render to the user without having to look up data from anywhere else. The hardcore relational database designers out there are realizing what this means by now, and it is no doubt making them feel uneasy. Following this approach means that: Data is repeated—AvatarURL in User is repeated as UserAvatarURL in the tweet (waste of space, right?) If the user changes their AvatarURL, UserAvatarURL in the tweet will be out of date Database design, at the end of the day, comes down to physics. We are deciding that our tweet is going to be read far more times than it is going to be written, so we'd rather take the pain up-front and take a hit in storage. There's nothing wrong with repeated data as long as there is an understanding about which set is the master set and which is duplicated for speed. Changing data is an interesting topic in itself, but let's think about a few reasons why we might be OK with the trade-offs. Firstly, the speed benefit to reading tweets is probably worth the unexpected behavior of changes to master data not being reflected in historical documents; it would be perfectly acceptable to decide to live with this emerged functionality for that reason. Secondly, we might decide that it makes sense to keep a snapshot of data at a specific moment in time. For example, imagine if someone tweets asking whether people like their profile picture. If the picture changed, the tweet context would be lost. For a more serious example, consider what might happen if you were pointing to a row in an Addresses table for an order delivery and the address later changed. Suddenly, the order might look like it was shipped to a different place. Finally, storage is becoming increasingly cheaper, so the need for normalizing data to save space is lessened. Twitter even goes as far as copying the entire tweet document for each of your followers. 100 followers on Twitter means that your tweet will be copied at least 100 times, maybe more for redundancy. This sounds like madness to relational database enthusiasts, but Twitter is making smart trade-offs based on its user experience; they'll happily spend a lot of time writing a tweet and storing it many times to ensure that when you refresh your feed, you don't have to wait very long to get updates. If you want to get a sense of the scale of this, check out the Twitter API and look at what a tweet document consists of. It's a lot of data. 
Then, go and look at how many followers Lady Gaga has. This has become known in some circles as "the Lady Gaga problem" and is addressed by a variety of different technologies and techniques that are out of the scope of this article. Now that we have an understanding of good NoSQL design practices, let's implement the types, functions, and methods required to drive the data part of our API. Entities and data access To persist data in Google Cloud Datastore, we need a struct to represent each entity. These entity structures will be serialized and deserialized when we save and load data through the datastore API. We can add helper methods to perform the interactions with the data store, which is a nice way to keep such functionality physically close to the entities themselves. For example, we will model an answer with a struct called Answer and add a Create method that in turn calls the appropriate function from the datastore package. This prevents us from bloating our HTTP handlers with lots of data access code and allows us to keep them clean and simple instead. One of the foundation blocks of our application is the concept of a question. A question can be asked by a user and answered by many. It will have a unique ID so that it is addressable (referable in a URL), and we'll store a timestamp of when it was created. type Question struct { Key *datastore.Key `json:"id" datastore:"-"` CTime time.Time `json:"created"` Question string `json:"question"` User UserCard `json:"user"` AnswersCount int `json:"answers_count"` } The UserCard struct represents a denormalized User entity, both of which we'll add later. You can import the datastore package in your Go project using this: import "google.golang.org/appengine/datastore" It's worth spending a little time understanding the datastore.Key type. Keys in Google Cloud Datastore Every entity in Datastore has a key, which uniquely identifies it. They can be made up of either a string or an integer depending on what makes sense for your case. You are free to decide the keys for yourself or let Datastore automatically assign them for you; again, your use case will usually decide which is the best approach to take Keys are created using datastore.NewKey and datastore.NewIncompleteKey functions and are used to put and get data into and out of Datastore via the datastore.Get and datastore.Put functions. In Datastore, keys and entity bodies are distinct, unlike in MongoDB or SQL technologies, where it is just another field in the document or record. This is why we are excluding Key from our Question struct with the datastore:"-" field tag. Like the json tags, this indicates that we want Datastore to ignore the Key field altogether when it is getting and putting data. Keys may optionally have parents, which is a nice way of grouping associated data together and Datastore makes certain assurances about such groups of entities, which you can read more about in the Google Cloud Datastore documentation online. Putting data into Google Cloud Datastore Before we save data into Datastore, we want to ensure that our question is valid. Add the following method underneath the Question struct definition: func (q Question) OK() error { if len(q.Question) < 10 { return errors.New("question is too short") } return nil } The OK function will return an error if something is wrong with the question, or else it will return nil. In this case, we just check to make sure the question has at least 10 characters. 
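Before we write the code that persists a question, it may help to see the two key constructors from the discussion above side by side. This is a minimal sketch rather than code from the book, the numeric ID 1234 is purely illustrative, and it assumes the google.golang.org/appengine/datastore package and a context.Context (golang.org/x/net/context in App Engine SDKs of this era) are imported:

func exampleKeys(ctx context.Context) {
	// a complete key where we choose the numeric ID ourselves
	complete := datastore.NewKey(ctx, "Question", "", 1234, nil)
	// an incomplete key; Datastore assigns the ID for us during datastore.Put
	incomplete := datastore.NewIncompleteKey(ctx, "Question", nil)
	_, _ = complete, incomplete // placeholders so the sketch compiles on its own
}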
To persist this data in the data store, we are going to add a method to the Question struct itself. At the bottom of questions.go, add the following code: func (q *Question) Create(ctx context.Context) error { log.Debugf(ctx, "Saving question: %s", q.Question) if q.Key == nil { q.Key = datastore.NewIncompleteKey(ctx, "Question", nil) } user, err := UserFromAEUser(ctx) if err != nil { return err } q.User = user.Card() q.CTime = time.Now() q.Key, err = datastore.Put(ctx, q.Key, q) if err != nil { return err } return nil } The Create method takes a pointer to Question as the receiver, which is important because we want to make changes to the fields. If the receiver was (q Question)—without *, we would get a copy of the question rather than a pointer to it, and any changes we made to it would only affect our local copy and not the original Question struct itself. The first thing we do is use log (from the google.golang.org/appengine/log package) to write a debug statement saying we are saving the question. When you run your code in a development environment, you will see this appear in the terminal; in production, it goes into a dedicated logging service provided by Google Cloud Platform. If the key is nil (that means this is a new question), we assign an incomplete key to the field, which informs Datastore that we want it to generate a key for us. The three arguments we pass are context.Context (which we must pass to all datastore functions and methods), a string describing the kind of entity, and the parent key; in our case, this is nil. Once we know there is a key in place, we call a method (which we will add later) to get or create User from an App Engine user and set it to the question and then set the CTime field (created time) to time.Now—timestamping the point at which the question was asked. One we have our Question function in good shape, we call datastore.Put to actually place it inside the data store. As usual, the first argument is context.Context, followed by the question key and the question entity itself. Since Google Cloud Datastore treats keys as separate and distinct from entities, we have to do a little extra work if we want to keep them together in our own code. The datastore.Put method returns two arguments: the complete key and error. The key argument is actually useful because we're sending in an incomplete key and asking the data store to create one for us, which it does during the put operation. If successful, it returns a new datastore.Key object to us, representing the completed key, which we then store in our Key field in the Question object. If all is well, we return nil. Add another helper to update an existing question: func (q *Question) Update(ctx context.Context) error { if q.Key == nil { q.Key = datastore.NewIncompleteKey(ctx, "Question", nil) } var err error q.Key, err = datastore.Put(ctx, q.Key, q) if err != nil { return err } return nil } This method is very similar except that it doesn't set the CTime or User fields, as they will already have been set. 
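To see how these helpers keep HTTP handlers thin, here is a rough sketch of a handler that could call the Create method. This is not the book's actual handler; the JSON decoding and the error responses are assumptions made for illustration, and it assumes the net/http, encoding/json, and google.golang.org/appengine packages are imported:

func handleCreateQuestion(w http.ResponseWriter, r *http.Request) {
	ctx := appengine.NewContext(r) // App Engine context for this request
	var q Question
	// decode the incoming JSON body into our entity struct
	if err := json.NewDecoder(r.Body).Decode(&q); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// validate before touching the data store
	if err := q.OK(); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	if err := q.Create(ctx); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	// echo the saved question (now carrying its completed key) back to the caller
	json.NewEncoder(w).Encode(&q)
}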
Reading data from Google Cloud Datastore

Reading data is as simple as putting it, this time using the datastore.Get function; but since we want to maintain keys in our entities (and the datastore functions don't work like that), it's common to add a helper function like the one we are going to add to questions.go:

func GetQuestion(ctx context.Context, key *datastore.Key) (*Question, error) {
	var q Question
	err := datastore.Get(ctx, key, &q)
	if err != nil {
		return nil, err
	}
	q.Key = key
	return &q, nil
}

The GetQuestion function takes a context.Context and the datastore.Key of the question to get. It then does the simple task of calling datastore.Get and assigning the key to the entity before returning it. Of course, errors are handled in the usual way. This is a nice pattern to follow so that users of your code know that they never have to interact with datastore.Get and datastore.Put directly, but rather use the helpers that ensure the entities are properly populated with their keys (along with any other tweaks that they might want to make before saving or after loading).

Summary

This article gave us an idea of how Go applications run on Google App Engine: we created a simple application, configured it, and deployed it to Google's servers. We also looked at modules in Google App Engine and at Google Cloud Datastore, Google's cloud data storage facility for App Engine developers.

Resources for Article:

Further resources on this subject: Google Forms for Multiple Choice and Fill-in-the-blank Assignments [article] Publication of Apps [article] Prerequisites for a Map Application [article]
Jupyter and Python Scripting

Packt
21 Oct 2016
9 min read
In this article by Dan Toomey, author of the book Learning Jupyter, we will see data access in Jupyter with Python and the effect of pandas on Jupyter. We will also see Python graphics and, lastly, Python random numbers. (For more resources related to this topic, see here.)

Python data access in Jupyter

I started a notebook for this example using Python Data Access as the name. We will read in a large dataset and compute some standard statistics on the data. We are interested in seeing how we use pandas in Jupyter, how well the script performs, and what information is stored in the metadata (especially if it is a larger dataset). Our script accesses the iris dataset built into one of the Python packages. All we are looking to do is read in a slightly large number of items and calculate some basic operations on the dataset. We are really interested in seeing how much of the data is cached in the IPYNB file. The Python code is:

# import the datasets package
from sklearn import datasets
# pull in the iris data
iris_dataset = datasets.load_iris()
# grab the first two columns of data
X = iris_dataset.data[:, :2]
# calculate some basic statistics
x_count = len(X.flat)
x_min = X[:, 0].min() - .5
x_max = X[:, 0].max() + .5
x_mean = X[:, 0].mean()
# display our results
x_count, x_min, x_max, x_mean

I broke these steps into a couple of cells in Jupyter, as shown in the following screenshot: Now, run the cells (using Cell | Run All) and you get the display below. The only difference is the last Out line where our values are displayed. It seemed to take longer to load the library (the first time I ran the script) than to read the data and calculate the statistics. If we look in the IPYNB file for this notebook, we see that none of the data is cached in the IPYNB file. We simply have code references to the library, our code, and the output from when we last ran the script:

{ "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(300, 3.7999999999999998, 8.4000000000000004, 5.8433333333333337)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# calculate some basic statistics\n", "x_count = len(X.flat)\n", "x_min = X[:, 0].min() - .5\n", "x_max = X[:, 0].max() + .5\n", "x_mean = X[:, 0].mean()\n", "\n", "# display our results\n", "x_count, x_min, x_max, x_mean" ] }

Python pandas in Jupyter

One of the most widely used Python libraries is pandas, a freely available package of data analysis tools. In this example, we will develop a Python script that uses pandas to see if there is any effect to using it in Jupyter. I am using the Titanic dataset from http://www.kaggle.com/c/titanic-gettingStarted/download/train.csv. I am sure the same data is available from a variety of sources. Here is the Python script that we want to run in Jupyter:

from pandas import *
training_set = read_csv('train.csv')
training_set.head()
male = training_set[training_set.sex == 'male']
female = training_set[training_set.sex == 'female']
womens_survival_rate = float(sum(female.survived))/len(female)
mens_survival_rate = float(sum(male.survived))/len(male)

The result is that we calculate the survival rates of the passengers based on sex. We create a new notebook, enter the script into the appropriate cells, add displays of the calculated data at each point, and produce our results.
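As an aside, once the column-name issues described below are sorted out, pandas can compute the same split more compactly. This is just an alternative sketch, assuming the Kaggle file's actual column names Sex and Survived:

# survival rate per sex in a single step
training_set.groupby('Sex')['Survived'].mean()

The cell-by-cell version above is easier to follow in a notebook, which is why we stick with it here.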
Here is our notebook laid out where we added displays of calculated data at each cell, as shown in the following screenshot: When I ran this script, I had two problems:

On Windows, it is common to use the backslash ("\") to separate parts of a filename. However, this coding uses the backslash as a special character. So, I had to change over to using the forward slash ("/") in my CSV file path. I originally had a full path to the CSV in the above code example.

The dataset column names are taken directly from the file and are case sensitive. In this case, I was originally using the 'sex' field in my script, but in the CSV file the column is named Sex. Similarly, I had to change survived to Survived.

The final script and result look like the following screenshot when we run it: I have used the head() function to display the first few lines of the dataset. It is interesting… the amount of detail that is available for all of the passengers. If you scroll down, you see the results as shown in the following screenshot: We see that 74% of the women survived versus just 19% of the men. I would like to think chivalry is not dead! Note that these are per-sex survival rates, so there is no reason for them to total 100%. And, like every other dataset I have seen, there is missing and/or inaccurate data present.

Python graphics in Jupyter

How do Python graphics work in Jupyter? I started another notebook for this, named Python Graphics, so as to distinguish the work. If we were to build a sample dataset of baby names and the number of births in a year for each name, we could then plot the data. The Python coding is simple:

import pandas
import matplotlib
%matplotlib inline
baby_name = ['Alice','Charles','Diane','Edward']
number_births = [96, 155, 66, 272]
dataset = list(zip(baby_name,number_births))
df = pandas.DataFrame(data = dataset, columns=['Name', 'Number'])
df['Number'].plot()

The steps of the script are as follows:

Import the graphics library (and data library) that we need
Define our data
Convert the data into a format that allows for easy graphical display
Plot the data

We would expect a resultant graph of the number of births by baby name. Taking the above script and placing it into cells of our Jupyter notebook, we get something that looks like the following screenshot: I have broken the script into different cells for easier readability. Having different cells also allows you to develop the script easily step by step, where you can display the values computed so far to validate your results. I have done this in most of the cells by displaying the dataset and DataFrame at the bottom of those cells. When we run this script (Cell | Run All), we see the results at each step displayed as the script progresses: And finally we see our plot of the births as shown in the following screenshot. I was curious what metadata was stored for this script. Looking into the IPYNB file, you can see the expected value for the formula cells.
The tabular data display of the DataFrame is stored as HTML—convenient:

{ "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\">\n", "<th></th>\n", "<th>Name</th>\n", "<th>Number</th>\n", "</tr>\n", "</thead>\n", "<tbody>\n", "<tr>\n", "<th>0</th>\n", "<td>Alice</td>\n", "<td>96</td>\n", "</tr>\n", "<tr>\n", "<th>1</th>\n", "<td>Charles</td>\n", "<td>155</td>\n", "</tr>\n", "<tr>\n", "<th>2</th>\n", "<td>Diane</td>\n", "<td>66</td>\n", "</tr>\n", "<tr>\n", "<th>3</th>\n", "<td>Edward</td>\n", "<td>272</td>\n", "</tr>\n", "</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Name Number\n", "0 Alice 96\n", "1 Charles 155\n", "2 Diane 66\n", "3 Edward 272" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ],

The graphic output cell is stored like this:

{ "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "<matplotlib.axes._subplots.AxesSubplot at 0x47cf8f0>" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "<a few hundred lines of base64 data> …/wc/B0RRYEH0EQAAAABJRU5ErkJggg==\n", "text/plain": [ "<matplotlib.figure.Figure at 0x47d8e30>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# plot the data\n", "df['Number'].plot()\n" ] } ],

Here the image/png tag contains a large base64-encoded string representation of the graphical image displayed on screen (I abbreviated the display in the coding shown). So, the actual generated image is stored in the metadata for the page.

Python random numbers in Jupyter

For many analyses we are interested in calculating repeatable results. However, much of the analysis relies on random numbers being used. In Python, you can set the seed for the random number generator to achieve repeatable results with the random.seed() function. In this example, we simulate rolling a pair of dice and looking at the outcome. We would expect the average total of the two dice to be 7, the halfway point of the possible totals (2 through 12). The script we are using is this:

import pylab
import random
random.seed(113)
samples = 1000
dice = []
for i in range(samples):
    total = random.randint(1,6) + random.randint(1,6)
    dice.append(total)
pylab.hist(dice, bins= pylab.arange(1.5,12.6,1.0))
pylab.show()

Once we have the script in Jupyter and execute it, we have this result: I added some more statistics. I'm not sure I would have counted on such a high standard deviation, but with two dice the totals really do spread out that much; increasing the number of samples smooths the histogram rather than shrinking the spread. The resulting graph was opened in a new window, much as it would be if you ran this script in another Python development environment. The toolbar at the top of the graphic is extensive, allowing you to manipulate the graphic in many ways.

Summary

In this article, we walked through simple data access in Jupyter through Python. Then we saw an example of using pandas. We looked at a graphics example. Finally, we looked at an example using random numbers in a Python script.

Resources for Article:

Further resources on this subject: Python Data Science Up and Running [article] Mining Twitter with Python – Influence and Engagement [article] Unsupervised Learning [article]
Prepare for our 2017 Awards with Mapt

Packt
21 Oct 2016
2 min read
At Packt, we're committed to supporting developers in learning the skills they need to remain relevant in their field. But what exactly does relevant mean? To us, relevance is about the impact you have. And we believe that software should always have an impact, whether it's for a business, for customers - whoever it is, it's ultimately about making a difference. We want to reward developers who make an impact. Whether you're a web developer who's creating awesome applications and websites that are engaging users every single day, or even a data analyst who has used Machine Learning to uncover revealing insights about healthcare or the environment, we're going to want to hear from you. We don't want to give too much away right now, but we're confident that you're going to be interested in our award... So, to prepare yourself for our awards, get started on Mapt and find your route through some of the most important skills in software today. What are you waiting for? We're sponsoring seats on Mapt for limited prices this week. That means you'll be able to get a subscription for a special discounted price - but be quick, each discount is time limited! Subscribe here.