How-To Tutorials

Building Components Using Angular

Packt
06 Apr 2017
11 min read
In this article by Shravan Kumar Kasagoni, the author of the book Angular UI Development, we will learn how to use the new features of the Angular framework to build web components. After going through this article you will understand the following:

- What web components are
- How to set up a project for Angular application development
- Data binding in Angular

(For more resources related to this topic, see here.)

Web components

In today's web world, if we need to use any of the UI components provided by libraries such as jQuery UI or YUI, we have to write a lot of imperative JavaScript code; we can't simply use them in a declarative fashion like HTML markup. There are fundamental problems with this approach:

- There is no way to define custom HTML elements and use them in a declarative fashion.
- The JavaScript and CSS code inside UI components can accidentally modify other parts of our web pages, and our code can also accidentally modify the UI components, which is unintended.
- There is no standard way to encapsulate code inside these UI components.

Web Components provide a solution to all of these problems. Web Components are a set of specifications for building reusable UI components. The Web Components specification is comprised of four parts:

- Templates: Allow us to declare fragments of HTML that can be cloned and inserted into the document by script
- Shadow DOM: Solves the DOM tree encapsulation problem
- Custom elements: Allow us to define custom HTML tags for UI components
- HTML imports: Allow us to add UI components to a web page using an import statement

More information on web components can be found at: https://www.w3.org/TR/components-intro/.

Components are the fundamental building blocks of any Angular application. Components in Angular are built on top of the web components specification. The web components specification is still under development and might change in the future, and not all browsers support it. But Angular provides a very high level of abstraction, so we don't need to deal with the multiple technologies behind web components. Even if the specification changes, Angular can take care of it internally; it provides a much simpler API for writing web components.

Getting started with Angular

We know Angular was completely rewritten from scratch, so everything in it is new. In this article we will discuss a few important features such as data binding, the new templating syntax, and built-in directives. We are going to use a practical approach to learn these new features. In the next section we will look at a partially implemented Angular application, and we will incrementally use new Angular features to complete it. Follow the instructions in the next section to set up the sample application.

Project Setup

Here is a sample application with the required Angular configuration and some sample code.

Application Structure

Create the directory structure and files as mentioned below and copy the code into the files from the next section.

Source Code

package.json

We are going to use npm as our package manager to download the libraries and packages required for our application development. Copy the following code to the package.json file.
{ "name": "display-data", "version": "1.0.0", "description": "", "main": "index.js", "scripts": { "tsc": "tsc", "tsc:w": "tsc -w", "lite": "lite-server", "start": "concurrent "npm run tsc:w" "npm run lite" " }, "author": "Shravan", "license": "ISC", "dependencies": { "angular2": "^2.0.0-beta.1", "es6-promise": "^3.0.2", "es6-shim": "^0.33.13", "reflect-metadata": "^0.1.2", "rxjs": "^5.0.0-beta.0", "systemjs": "^0.19.14", "zone.js": "^0.5.10" }, "devDependencies": { "concurrently": "^1.0.0", "lite-server": "^1.3.2", "typescript": "^1.7.5" } } The package.json file holds metadata for npm, in the preceding code snippet there are two important sections: dependencies: It holds all the packages required for an application to run devDependencies: It holds all the packages required only for development Once we add the preceding package.json file to our project we should run the following command at the root of our application. $ npm install The preceding command will create node_modules directory in the root of project and downloads all the packages mentioned in dependencies, devDependencies sections into node_modules directory. There is one more important section, that is scripts. We will discuss about scripts section, when we are ready to run our application. tsconfig.json Copy the below code to tsconfig.json file. { "compilerOptions": { "target": "es5", "module": "system", "moduleResolution": "node", "sourceMap": true, "emitDecoratorMetadata": true, "experimentalDecorators": true, "removeComments": false, "noImplicitAny": false }, "exclude": [ "node_modules" ] } We are going to use TypeScript for developing our Angular applications. The tsconfig.json file is the configuration file for TypeScript compiler. Options specified in this file are used while transpiling our code into JavaScript. This is totally optional, if we don't use it TypeScript compiler use are all default flags during compilation. But this is the best way to pass the flags to TypeScript compiler. Following is the expiation for each flag specified in tsconfig.json: target: Specifies ECMAScript target version: 'ES3' (default), 'ES5', or 'ES6' module: Specifies module code generation: 'commonjs', 'amd', 'system', 'umd' or 'es6' moduleResolution: Specifies module resolution strategy: 'node' (Node.js) or 'classic' (TypeScript pre-1.6) sourceMap: If true generates corresponding '.map' file for .js file emitDecoratorMetadata: If true enables the output JavaScript to create the metadata for the decorators experimentalDecorators: If true enables experimental support for ES7 decorators removeComments: If true, removes comments from output JavaScript files noImplicitAny: If true raise error if we use 'any' type on expressions and declarations exclude: If specified, the compiler will not compile the TypeScript files in the containing directory and subdirectories index.html Copy the following code to index.html file. 
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Top 10 Fastest Cars in the World</title>
  <link rel="stylesheet" href="app/site.css">
  <script src="node_modules/angular2/bundles/angular2-polyfills.js"></script>
  <script src="node_modules/systemjs/dist/system.src.js"></script>
  <script src="node_modules/rxjs/bundles/Rx.js"></script>
  <script src="node_modules/angular2/bundles/angular2.dev.js"></script>
  <script>
    System.config({
      transpiler: 'typescript',
      typescriptOptions: {emitDecoratorMetadata: true},
      map: {typescript: 'node_modules/typescript/lib/typescript.js'},
      packages: {
        'app': {defaultExtension: 'ts'}
      }
    });
    System.import('app/boot').then(null, console.error.bind(console));
  </script>
</head>
<body>
  <cars-list>Loading...</cars-list>
</body>
</html>
```

This is the startup page of our application; it contains the required Angular scripts and the SystemJS configuration for module loading. The body tag contains a <cars-list> tag, which renders the root component of our application. However, I want to point out one specific statement: the System.import('app/boot') statement imports the boot module from the app package. Physically, it loads the boot.js file under the app folder.

car.ts

Copy the following code to the car.ts file.

```ts
export interface Car {
  make: string;
  model: string;
  speed: number;
}
```

We are defining a car model using a TypeScript interface; we are going to use this car model object in our components.

app.component.ts

Copy the following code to the app.component.ts file.

```ts
import {Component} from 'angular2/core';

@Component({
  selector: 'cars-list',
  template: ''
})
export class AppComponent {
  public heading = "Top 10 Fastest Cars in the World";
}
```

Important points about the AppComponent class:

- The AppComponent class is our application's root component; it has one public property named 'heading'
- The AppComponent class is decorated with the @Component() function, with selector and template properties in its configuration object
- The @Component() function is imported using ES2015 module import syntax from the 'angular2/core' module in the Angular library
- We are also exporting the AppComponent class as a module using the export keyword
- Other modules in the application can import the AppComponent class by its module name (app.component, the file name without the extension) using ES2015 module import syntax

boot.ts

Copy the following code to the boot.ts file.

```ts
import {bootstrap} from 'angular2/platform/browser';
import {AppComponent} from './app.component';

bootstrap(AppComponent);
```

In this file we import the bootstrap() function from the 'angular2/platform/browser' module and the AppComponent class from the 'app.component' module. Next we invoke the bootstrap() function with the AppComponent class as a parameter; this instantiates an Angular application with AppComponent as the root component.

site.css

Copy the following code to the site.css file.

```css
* {
  font-family: 'Segoe UI Light', 'Helvetica Neue', 'Segoe UI', 'Segoe';
  color: rgb(51, 51, 51);
}
```

This file contains some basic styles for our application.

Working with data in Angular

In any typical web application, we need to display data on an HTML page and read data from input controls on an HTML page. In Angular everything is a component; the HTML page is represented as a template, and it is always associated with a component class. Application data lives in the component class's properties. To push values to the template or pull values from it, we need to bind the properties of the component class to the controls on the template. This mechanism is known as data binding.
Data binding in Angular allows us to use a simple syntax to push or pull data. When we bind the properties of the component class to the controls on the template and the data in those properties changes, Angular will automatically update the template to display the latest data, and vice versa. We can also control the direction of data flow (from component to template, or from template to component).

Displaying Data using Interpolation

If we go back to the AppComponent class in our sample application, it has a heading property. We need to display this heading property on the template. Here is the revised AppComponent class:

app/app.component.ts

```ts
import {Component} from 'angular2/core';

@Component({
  selector: 'cars-list',
  template: '<h1>{{heading}}</h1>'
})
export class AppComponent {
  public heading = "Top 10 Fastest Cars in the World";
}
```

In the @Component() function we updated the template property with the expression {{heading}} wrapped in an h1 tag. The double curly braces are Angular's interpolation syntax: for any class property we want to display on the template, we use the property name surrounded by double curly braces, and Angular will automatically render the value of that property in the browser.

Let's run our application. Go to the command line, navigate to the root of the application structure, and run the following command:

$ npm start

The preceding start command is part of the scripts section in the package.json file. It invokes two other commands, npm run tsc:w and npm run lite.

npm run tsc:w performs the following actions:

- It invokes the TypeScript compiler in watch mode
- The TypeScript compiler compiles all our TypeScript files to JavaScript using the configuration mentioned in tsconfig.json
- The TypeScript compiler does not exit after the compilation is over; it waits for changes in the TypeScript files
- Whenever we modify any TypeScript file, the compiler compiles it to JavaScript on the fly

npm run lite starts a lightweight Node.js web server and launches our application in the browser.

Now we can continue making changes to our application. Changes are detected and the browser refreshes automatically with the updates.

Let's further extend this simple application by binding the heading property to a textbox. Here is the revised template:

```ts
template: `
  <h1>{{heading}}</h1>
  <input type="text" value="{{heading}}"/>
`
```

Notice that the template is now a multi-line string and is surrounded by ` (backquote/backtick) symbols instead of single or double quotes. Backticks are the new multi-line (template) string syntax in ECMAScript 2015. We don't need to start our application again; as mentioned earlier, the browser refreshes automatically with the updated output until we stop the 'npm start' command at the command line.

The textbox now displays the same value held in the heading property. Let's change the value in the textbox by typing something, then hit the Tab key. We don't see any change in the browser. But, as mentioned earlier, with data binding, whenever we change the value of a control on the template that is bound to a property of the component class, it should update the property value, and any other controls bound to the same property should display the updated value. The h1 tag in the browser should display the same text we typed into the textbox, but it doesn't.
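Why doesn't it? The value="{{heading}}" binding only pushes data from the component to the template; nothing pulls the typed text back into the heading property. As an illustrative aside, and not part of the book's sample code, the following sketch shows how the template could be revised with Angular's [(ngModel)] two-way binding syntax; it assumes the form directives bundled with the Angular 2 beta (which provide ngModel) are available to the template:

```ts
// Illustrative sketch only: two-way binding with [(ngModel)].
// Assumes the Angular 2 beta form directives (which provide ngModel)
// are available to this template.
import {Component} from 'angular2/core';

@Component({
  selector: 'cars-list',
  template: `
    <h1>{{heading}}</h1>
    <input type="text" [(ngModel)]="heading"/>
  `
})
export class AppComponent {
  public heading = "Top 10 Fastest Cars in the World";
}
```

With a binding like this, typing in the textbox would update the heading property, and the h1 element would re-render with the new text, which is the behaviour the interpolation-only binding above cannot provide.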
Summary

We started this article with an introduction to web components. Next we discussed a sample application, which is the foundation for this article. Then we discussed how to write components using new features in Angular, such as data binding and the new templating syntax, using plenty of examples. By the end of this article, you should have a good understanding of the new Angular concepts and should be able to write basic components.

Resources for Article:

Further resources on this subject:

- Get Familiar with Angular [article]
- Gearing Up for Bootstrap 4 [article]
- Angular's component architecture [article]


Pros and Cons of the Serverless Model

Packt
06 Apr 2017
22 min read
In this article by Diego Zanon, author of the book Building Serverless Web Applications, we give you an introduction to the Serverless model and the pros and cons that you should consider before building a serverless application. (For more resources related to this topic, see here.)

Introduction to the Serverless model

Serverless can be a model, a kind of architecture, a pattern, or anything else you prefer to call it. For me, serverless is an adjective, a word that qualifies a way of thinking. It's a way to abstract how the code that you write will be executed. Thinking serverless is to not think in servers. You code, you test, you deploy, and that's (almost) enough.

Serverless is a buzzword. You still need servers to run your applications, but you should not worry about them that much. Maintaining a server is none of your business. The focus is on development and writing code, not on operations. DevOps is still necessary, although with a smaller role. You need to automate deployment and have at least minimal monitoring of how your application is operating and what it is costing, but you won't start or stop machines to match usage, and you'll neither replace failed instances nor apply security patches to the operating system.

Thinking serverless

A serverless solution is entirely request-driven. Every time a user requests some information, a trigger notifies your cloud vendor to pick up your code and execute it to retrieve the answer. In contrast, a traditional solution also works to answer requests, but the code is always up and running, consuming machine resources that were reserved specifically for you, even when no one is using your website.

In a serverless architecture, it's not necessary to load the entire codebase into a running machine to process a single request. For a faster loading step, only the code that is necessary to answer the request is selected to run. This small piece of the solution is referred to as a function. So we only run functions on demand. Although we simply call it a function, it's usually a zipped package that contains a piece of code that runs as an entry point, along with all the modules that this code depends on to execute.

What makes the Serverless model so interesting is that you are only billed for the time that was needed to execute your function, which is usually measured in fractions of seconds, not hours of use. If no one is using your service, you pay nothing. Also, if you have a sudden peak of users accessing your app, the cloud service will load different instances to handle all the simultaneous requests. If one of those cloud machines fails, another one will be made available automatically, without your interference.

Serverless and PaaS

Serverless is often confused with Platform as a Service (PaaS). PaaS is a kind of cloud computing model that allows developers to launch applications without worrying about the infrastructure. According to this definition, they have the same objective, which is true. Serverless is like a rebranding of PaaS, or you can call it the next generation of PaaS.

The main difference between PaaS and Serverless is that in PaaS you don't manage machines, but you know that they exist. You are billed for provisioning them, even if there is no user actively browsing your website. In PaaS, your code is always running and waiting for new requests. In Serverless, there is a service that is listening for requests and will trigger your code to run only when necessary. This is reflected in your bill.
You will pay only for the fractions of seconds that your code was executed and the number of requests that were made to this listener.

IaaS and on-premises

Besides PaaS, Serverless is frequently compared to Infrastructure as a Service (IaaS) and to on-premises solutions to expose its advantages. IaaS is another strategy to deploy cloud solutions, where you hire virtual machines and are allowed to connect to them to configure everything that you need in the guest operating system. It gives you greater flexibility, but it comes with more responsibilities. You need to apply security patches, handle occasional failures, and set up new servers to handle usage peaks. Also, you pay the same per hour whether you are using 5% or 100% of the machine's CPU.

On-premises is the traditional kind of solution where you buy the physical computers and run them inside your company. You get total flexibility and control with this approach. Hosting your own solution can be cheaper, but that happens only when your traffic usage is extremely stable. Over- or under-provisioning computers is so frequent that it's hard to see real gains with this approach, especially when you add the risks and costs of hiring a team to manage those machines. Cloud providers may look expensive, but several detailed use cases prove that the return on investment (ROI) is larger running on the cloud than on-premises. When using the cloud, you benefit from the economy of scale of many gigantic data centers. Running on your own exposes your business to a wide range of risks and costs that you'll never be able to anticipate.

The main goals of serverless

To define a service as serverless, it must have at least the following features:

- Scale as you need: There is no under- or over-provisioning
- Highly available: Fault-tolerant and always online
- Cost-efficient: Never pay for idle servers

Scalability

Regarding IaaS, you can achieve infinite scalability with any cloud service; you just need to hire new machines as your usage grows. You can also automate the process of starting and stopping servers as your demand changes. But this is not a fast way to scale. When you start a new machine, you usually need something like five minutes before it is usable for processing new requests. Also, as starting and stopping machines is costly, you only do this after you are certain that you need to. So your automated process will wait for some minutes to confirm that your demand has changed before taking any action. IaaS is able to handle well-behaved usage changes, but it can't handle unexpected high peaks that happen after announcements or marketing campaigns.

With serverless, your scalability is measured in seconds, not minutes. Besides being scalable, it's very fast to scale. Also, it scales per request, without needing to provision capacity. When you consider a high usage frequency on a scale of minutes, IaaS struggles to satisfy the needed capacity, while serverless meets even higher usage in less time. In the following figure, the left graph shows how scalability occurs with IaaS; the right graph shows how well the demand can be satisfied using a serverless solution:

With an on-premises approach, this is an even bigger problem. As usage grows, new machines must be bought and prepared. However, increasing the infrastructure requires purchase orders to be created and approved, delays waiting for the new servers to arrive, and time for the team to configure and test them.
It can take weeks to grow, or even months if the company is very big and requires many steps and procedures to be followed.

Availability

A highly available solution is one that is fault-tolerant to hardware failures. If one machine goes down, you can keep running with satisfactory performance. However, if you lose an entire data center due to a power outage, you need machines in another data center to keep the service online. This generally means duplicating your entire infrastructure by placing each half in a different data center.

Highly available solutions are usually very expensive in IaaS and on-premises. If you have multiple machines to handle your workload, placing them in different physical locations and running a load balancing service can be enough. If one data center goes down, you keep the traffic on the remaining machines and scale to compensate. However, there are cases where you pay extra without using those machines. For example, if you have a huge relational database that is scaled vertically, you will end up paying for another expensive machine just as a slave to keep availability. Even for NoSQL databases, if you set up a MongoDB replica set in a consistent model, you pay for instances that act only as secondaries, without serving to alleviate read requests.

Instead of running idle machines, you can keep them in a cold start state, meaning that the machine is prepared but switched off to reduce costs. However, if you run a website that sells products or services, you can lose customers even in small downtimes. Cold start for web servers can take a few minutes to recover, but databases need several minutes. Considering these scenarios, you get high availability for free in serverless: the cost is already considered in what you pay to use it.

Another aspect of availability is how to handle Distributed Denial of Service (DDoS) attacks. When you receive a huge load of requests in a very short time, how do you handle it? There are some tools and techniques that help mitigate the problem, for example, blacklisting IPs that go over a specific request rate, but before those tools start to work, you need to scale the solution, and it needs to scale really fast to prevent availability from being compromised. Here again, serverless has the best scaling speed.

Cost efficiency

It's impossible to match traffic usage exactly with what you have provisioned. Considering IaaS or on-premises, as a rule of thumb, CPU and RAM usage must always be lower than 90% to consider the machine healthy, and it is desirable to have a CPU using less than 20% of its capacity under normal traffic. In this case, you are paying for 80% waste, where the capacity sits idle. Paying for computer resources that you don't use is not efficient.

Many cloud providers advertise that you just pay for what you use, but they usually offer significant discounts when you provision for 24 hours of usage in the long term (one year or more). This means that you pay for machines that keep running even in very low traffic hours. Also, even if you want to shut down machines in hours with very low traffic, you need to keep at least a minimum infrastructure 24/7 to keep your web server and databases always online. Regarding high availability, you need extra machines to add redundancy; again, it's a waste of resources.

Another efficiency problem is related to databases, especially relational ones. Scaling vertically is a very troublesome task, so relational databases are always provisioned considering maximum peaks.
This means that you pay for an expensive machine when most of the time you don't need one. In serverless, you shouldn't worry about provisioning or idle times. You pay exactly for the CPU and RAM time that is used, measured in fractions of seconds, not hours. With a serverless database, you still need to store data permanently, so storage represents a cost even if no one is using your system; however, storage is usually very cheap. The higher cost, which is related to the database engine that runs queries and manipulates data, is billed only by the amount of time used, without considering idle times.

Running a serverless system continuously for one hour has a much higher cost than one hour in a traditional system. The difference is that it will never keep one machine at 100% for one hour straight. The cost efficiency of serverless is perceived more clearly in websites with varying traffic than in those with flat traffic.

Pros

We can list the following strengths of the Serverless model:

- Fast scalability
- High availability
- Efficient usage of resources
- Reduced operational costs
- Focus on business, not on infrastructure
- System security is outsourced
- Continuous delivery
- Microservices friendly
- Cost model is startup friendly

Let's skip the first three benefits, since they were already covered in the previous section, and take a look at the others.

Reduced operational costs

As the infrastructure is fully managed by the cloud vendor, it reduces operational costs, since you don't need to worry about hardware failures, applying security patches to the operating system, or fixing network issues. It effectively means that you need to spend fewer sysadmin hours to keep your application running. It also helps reduce risk: if you make an investment to deploy a new service and it ends up a failure, you don't need to worry about selling machines or disposing of the data center that you have built.

Focus on business

Lean software development advocates that you must spend time on what adds value to the final product. In a serverless project, the focus is on the business; infrastructure is a second-class citizen. Configuring a large infrastructure is a costly and time-consuming task. If you want to validate an idea through a Minimum Viable Product (MVP) without losing time to market, consider using serverless to save time. There are tools that automate the deployment, such as the Serverless Framework, which helps the developer launch a prototype with minimum effort. If the idea fails, infrastructure costs are minimized since there are no payments in advance.

System security

The cloud vendor is responsible for managing the security of the operating system, runtime, physical access, networking, and all related technologies that enable the platform to operate. The developer still needs to handle authentication, authorization, and code vulnerabilities, but the rest is outsourced to the cloud provider. This is a positive feature if you consider that a large team of specialists is focused on implementing the best security practices and applying new bug fixes as soon as possible to serve hundreds of their customers. That's economy of scale.

Continuous delivery

Serverless is based on breaking a big project into dozens of packages, each one represented by a top-level function that handles requests.
Deploying a new version of a function means uploading a ZIP package to replace the previous one and updating the endpoint configuration that specifies how this function can be triggered. Executing this task manually, for dozens of functions, is an exhausting task. Automation is a must-have feature when working on a serverless project. For this task, we can use the Serverless Framework, which helps developers manage and organize solutions, making deployment as simple as executing a one-line command. With automation, continuous delivery is a consequence that brings many benefits, like the ability to deploy short development cycles and easier rollbacks at any time.

Another related benefit when deployment is automated is the creation of different environments. You can create a new test environment, which is an exact duplicate of the development environment, using simple commands. The ability to replicate the environment is very important for acceptance tests and deployment to production.

Microservices friendly

A microservices architecture is encouraged in a serverless project. As your functions are single units of deployment, you can have different teams working concurrently on different use cases. You can also use different programming languages in the same project and take advantage of emerging technologies or team skills.

Cost model

Suppose that you have built a serverless store. The average user will make some requests to see a few products and a few more requests to decide whether they will buy something or not. In serverless, a single unit of code has a predictable time to execute for a given input. After collecting some data, you can predict how much a single user costs on average, and this unit cost will remain almost constant as your application grows in usage. Knowing how much a single user costs and keeping this number fixed is very important for a startup. It helps to decide how much you need to charge for a service, or earn through ads or sales, to make a profit. In a traditional infrastructure, you need to make payments in advance, and scaling your application means increasing your capacity in steps, so calculating the unit cost of a user is more difficult and it's a variable number. The following chart makes this difference easier to understand:

Cons

Serverless is great, but no technology is a silver bullet. You should be aware of the following issues:

- Higher latency
- Constraints
- Hidden inefficiencies
- Vendor dependency
- Debugging
- Atomic deploys
- Uncertainties

We will discuss these drawbacks in detail now.

Higher latency

Serverless is request-driven and your code is not running all the time. When a request is made, it triggers a service that finds your function, unzips the package, loads it into a container, and makes it available to be executed. The problem is that those steps take time, up to a few hundred milliseconds. This issue is called the cold start delay and is the trade-off between the cost-effective serverless model and the lower latency of traditional hosting.

There are some solutions to this performance problem. For example, you can configure your function to reserve more RAM, which gives it a faster start and better overall performance. The programming language is also important: Java has a much higher cold start time than JavaScript (Node.js). Another solution is to benefit from the fact that the cloud provider may cache the loaded code, which means that the first execution will have a delay but further requests will benefit from lower latency.
You can optimize a serverless function by aggregating a large number of functionalities into a single function. The consequence is that this package will be executed with a higher frequency and will frequently skip the cold start issue. The problem is that a big package will take more time to load and cause a higher first start time. As a last resort, you could schedule another service to ping your functions periodically, say once every couple of minutes, to prevent them from being put to sleep. It adds cost, but removes the cold start problem.

There is also a concept of serverless databases, which refers to services where the database is fully managed by the vendor and you are charged only for the storage and the time to execute the database engine. Those solutions are wonderful, but they add even more delay to your requests when compared with traditional databases. Proceed with caution when selecting them.

Constraints

If you go serverless, you need to know what the vendor's constraints are. For example, on Amazon Web Services (AWS), you can't run a Lambda function for more than 5 minutes. That makes sense, because if you're doing this, you are using it wrong: serverless was designed to be cost-efficient in short bursts. For constant and predictable processing, it will be expensive.

Another constraint on AWS Lambda is the number of concurrent executions across all functions within a given region. Amazon limits this to 100. Suppose that your functions need 100 milliseconds on average to execute; in this scenario, you can handle up to 1,000 users per second. The reasoning behind this restriction is to avoid excessive costs due to programming errors that may create potential runaways or recursive iterations.

AWS Lambda has a default limit of 100 concurrent executions. However, you can file a case with the AWS Support Center to raise this limit. If you say that your application is ready for production and that you understand the risks, they will happily increase this value. When monitoring your Lambda functions using Amazon CloudWatch, there is an option called Throttles: each invocation that exceeds the safety limit of concurrent calls is counted as one throttle.

Hidden inefficiencies

Some people are calling serverless a NoOps solution. That's not true. DevOps is still necessary. You don't need to worry much about servers, because they are second-class citizens and the focus is on your business. However, adding metrics and monitoring your applications will always be a good practice. It's so easy to scale that a specific function may be deployed with poor performance, take much more time than necessary, and remain unnoticed forever because no one is monitoring the operation. Also, over- or under-provisioning is still possible (in a smaller sense), since you need to configure your function by setting the amount of RAM that it will reserve and the threshold at which execution times out. It's a very different scale of provisioning, but you need to keep it in mind to avoid mistakes.

Vendor dependency

When you build a serverless solution, you entrust your business to a third-party vendor. You should be aware that companies fail, and you can suffer downtimes, security breaches, and performance issues. Also, the vendor may change the billing model, increase costs, introduce bugs into their services, have poor documentation, modify an API forcing you to upgrade, and terminate services. A whole bunch of bad things may happen.
What you need to weigh is whether it's worth trusting another company or making a big investment to build it all yourself. You can mitigate these problems by doing market research before selecting a vendor. However, you still need to count on luck. For example, Parse was a vendor that offered managed services with really nice features. It was bought by Facebook in 2013, which gave it more reliability due to the fact that it was backed by a big company. Unfortunately, Facebook decided to shut down all the servers in 2016, giving customers one year of notice to migrate to other vendors.

Vendor lock-in is another big issue. When you use cloud services, it's very likely that one specific service has a completely different implementation than another vendor's, making them two different APIs. You need to rewrite code if you decide to migrate. It's already a common problem: if you use a managed service to send e-mails, you need to rewrite part of your code before migrating to another vendor. What raises a red flag here is that a serverless solution is entirely based on one vendor, and migrating the entire codebase can be much more troublesome. To mitigate this problem, some tools like the Serverless Framework are moving to include multivendor support. Currently, only AWS is supported, but they expect to include the Microsoft, Google, and IBM clouds in the future, without requiring code rewrites to migrate. Multivendor support represents safety for your business and gives power to competitiveness.

Debugging

Unit testing a serverless solution is fairly simple, because any code that your functions rely on can be separated into modules and unit tested. Integration tests are a little more complicated, because you need to be online to test with external services. You can build stubs, but you lose some fidelity in your tests. When it comes to debugging to test a feature or fix an error, it's a whole different problem. You can't hook into an external service and step slowly through the processing to see how your code behaves. Also, those serverless APIs are not open source, so you can't run them in-house for testing. All you have is the ability to log steps, which is a slow debugging approach, or you can extract the code, adapt it to host on your own servers, and make local calls.

Atomic deploys

Deploying a new version of a serverless function is easy. You update the code, and the next time a trigger requests this function, your newly deployed code will be selected to run. This means that, for a brief moment, two instances of the same function can be executed concurrently with different implementations. Usually that's not a problem, but when you deal with persistent storage and databases, you should be aware that a new piece of code can insert data in a format that an old version can't understand. Also, if you want to deploy a function that relies on a new implementation of another function, you need to be careful about the order in which you deploy those functions. Ordering is often not guaranteed by the tools that automate the deployment process.

The problem here is that current serverless implementations consider deployment to be an atomic process for each function. You can't batch deploy a group of functions atomically. You can mitigate this issue by disabling the event source while you deploy a specific group, but that means introducing downtime into the deployment process; alternatively, you can use a monolith approach instead of a microservices architecture for your serverless application.
Uncertainties

Serverless is still a pretty new concept. Early adopters are braving this field, testing what works and which kinds of patterns and technologies can be used. Emerging tools are defining the development process. Vendors are releasing and improving new services. There are high expectations for the future, but the future hasn't come yet, and some uncertainties still worry developers when it comes to building large applications. Being a pioneer can be rewarding, but risky.

Technical debt is a concept that compares software development with finance: the easiest solution in the short run is not always the best overall solution, and when you take a bad decision in the beginning, you pay later with extra hours to fix it. Software is not perfect. Every single architecture has pros and cons that append technical debt in the long run. The question is: how much technical debt does serverless add to the software development process? Is it more, less, or equivalent to the kind of architecture that you are using today?

Summary

In this article, you learned about the Serverless model and how it is different from other traditional approaches. You now know the main benefits and advantages that it may offer for your next application. Also, you are aware that no technology is a silver bullet: you know what kinds of problems you may have with serverless and how to mitigate some of them.

Resources for article

Refer to Building Serverless Architectures, available at https://www.packtpub.com/application-development/building-serverless-architectures, for further resources on this subject.

Resources for Article:

Further resources on this subject:

- Deployment and DevOps [article]
- Customizing Xtext Components [article]
- Heads up to MvvmCross [article]


Synchronization – An Approach to Delivering Successful Machine Learning Projects

Packt
06 Apr 2017
15 min read
"In the midst of chaos, there is also opportunity"
- Sun Tzu

In this article by Cory Lesmeister, the author of the book Mastering Machine Learning with R - Second Edition, Cory provides insights on ensuring the success and value of your machine learning endeavors. (For more resources related to this topic, see here.)

Framing the problem

Raise your hand if any of the following has happened or is currently happening to you:

- You've been part of a project team that failed to deliver anything of business value
- You attend numerous meetings, but they don't seem productive; maybe they are even complete time wasters
- Different teams are not sharing information with each other; thus, you are struggling to understand what everyone else is doing, and they have no idea what you are doing or why you are doing it
- An unknown stakeholder, feeling threatened by your project, comes from out of nowhere and disparages you and/or your work
- The Executive Committee congratulates your team on their great effort, but decides not to implement it, or even worse, tells you to go back and do it all over again, only this time solve the real problem

OK, you can put your hand down now. If you didn't raise your hand, please send me your contact information, because you are about as rare as a unicorn. All organizations, regardless of their size, struggle with integrating different functions, current operations, and other projects. In short, the real world is filled with chaos. It doesn't matter how many advanced degrees people have, how experienced they are, how much money is thrown at the problem, what technology is used, or how brilliant and powerful the machine learning algorithm is; problems such as those listed above will happen. The bottom line is that implementing machine learning projects in the business world is complicated and prone to failure. However, out of this chaos you have the opportunity to influence your organization by integrating disparate people and teams and fostering a collaborative environment that can adapt to unforeseen changes.

But, be warned, this is not easy. If it was easy, everyone would be doing it. However, it works, and it works well. By "it", I'm talking about the methodology I developed about a dozen years ago, a method I refer to as the "Synchronization Process". If we ask ourselves, "what are the challenges to implementation?", it seems to me that the following blog post clearly and succinctly sums it up: https://www.capgemini.com/blog/capping-it-off/2012/04/four-key-challenges-for-business-analytics

It enumerates four challenges:

- Strategic alignment
- Agility
- Commitment
- Information maturity

This blog addresses business analytics, but it can be extended to machine learning projects. One could even say machine learning is becoming the analytics tool of choice in many organizations. As such, I will make the case below that the Synchronization Process can effectively deal with the first three challenges. Not only that, the process can provide additional benefits. By overcoming the challenges, you can deliver an effective project; by delivering an effective project, you can increase actionable insights; and by increasing actionable insights, you will improve decision-making, and that is where the real business value resides.
Defining the process

"In preparing for battle, I have always found that plans are useless, but planning is indispensable."
- Dwight D. Eisenhower

I adopted the term synchronization from the US Army's operations manual, FM 3-0, where it is described as a battlefield tenet and force multiplier. The manual defines synchronization as "…arranging activities in time, space and purpose to mass maximum relative combat power at a decisive place and time". If we overlay this military definition onto the context of a competitive marketplace, we come up with a definition I find more relevant. For our purpose, synchronization is defined as "arranging business functions and/or tasks in time and purpose to produce the proper amount of focus on a critical event or events".

These definitions put synchronization in the context of an "endstate" based on a plan and a vision. However, it is in the process of seeking to achieve that endstate that the true benefits come to fruition. So we can look at synchronization not only as an endstate, but also as a process. The military's solution to synchronizing operations before implementing a plan is the wargame. Like the military, businesses and corporations have utilized wargaming to facilitate decision-making and create integration across different business functions. Following the synchronization process techniques explained below, you can take the concept of business wargaming to a new level. I will discuss and provide specific ideas, steps, and deliverables that you can implement immediately. Before we begin that discussion, I want to cover the benefits that the process will deliver.

Exploring the benefits of the process

When I created this methodology about a dozen years ago, I was part of a market research team struggling to commit our limited resources to numerous projects, all of which were someone's top priority, in a highly uncertain environment. Or, as I like to refer to it, just another day at the office. I knew from my military experience that I had the tools and techniques to successfully tackle these challenges. It worked then and has been working for me ever since. I have found that it delivers the following benefits to an organization:

- Integration of business partners and stakeholders
- Timely and accurate measurement of performance and effectiveness
- Anticipation of and planning for possible events
- Adaptation to unforeseen threats
- Exploitation of unforeseen opportunities
- Improvement in teamwork
- Fostering a collaborative environment
- Improving focus and prioritization

In market research, and I believe this applies to all analytical endeavors, including machine learning, we talked about focusing on three specific questions about what to measure:

- What are we measuring?
- When do we measure it?
- How will we measure it?

We found that successfully answering those questions facilitated improved decision-making by informing leadership what to STOP doing, what to START doing, and what to KEEP doing. I have found myself in many meetings going nowhere when I would ask a question like, "what are you looking to stop doing?" Ask leadership what they want to stop, start, or continue to do, and you will get to the core of the problem. Then, your job will be to frame the business decision as the measurement/analytical problem. The Synchronization Process can bring this all together in a coherent fashion.
I've often been asked what triggers, in my mind, that a project requires going through the Synchronization Process. Here are some of the questions you should consider; if you answer "yes" to any of them, it may be a good idea to implement the process:

- Are resources constrained to the point that several projects will suffer poor results or not be done at all?
- Do you face multiple, conflicting priorities?
- Could the external environment change and dramatically influence the project(s)?
- Are numerous stakeholders involved in or influenced by a project's result?
- Is the project complex and facing a high level of uncertainty?
- Does the project involve new technology?
- Does the project face actual or potential organizational change?

You may be thinking, "Hey, we have a project manager for all this?" OK, how is that working out? Let me be crystal clear here: this is not just project management! This is about improving decision-making! A Gantt chart or task management software won't do that. You must be the agent of change. With that, let's turn our attention to the process itself.

Exploring the process

Any team can take the methods elaborated on below and incorporate them into their specific situation with their specific business partners. If executed properly, one can expect the initial investment in time and effort to provide a substantial payoff within weeks of initiating the process. There are just four steps to incorporate, with each having several tasks for you and your team members to complete. The four steps are as follows:

1. Project kick-off
2. Project analysis
3. Synchronization exercise
4. Project execution

Let's cover each of these in detail. I will provide what I like to refer to as a "Quad Chart" for each process step, along with appropriate commentary.

Project kick-off

I recommend you lead the kick-off meeting to ensure all team members understand and agree to the upcoming process steps. You should place emphasis on the importance of completing the pre-work and understanding key definitions, particularly around facts and critical assumptions. The operational definitions are as follows:

- Facts: Data or information that will likely have an impact on the project
- Critical assumptions: Valid and necessary suppositions in the absence of facts that, if proven false, would adversely impact planning or execution

It is an excellent practice to link facts and assumptions. Here is an example of how that would work: It is a FACT that Information Technology is beta-testing cloud-based solutions. We must ASSUME, for planning purposes, that we can operate machine learning solutions on the cloud by the fourth quarter of this year.

See, we've linked a fact and an assumption together, and if this cloud-based solution is not available, let's say it would negatively impact our ability to scale up our machine learning solutions. If so, then you may want to have a contingency plan of some sort already thought through and prepared for implementation. Don't worry if you haven't thought of all possible assumptions or if you end up with a list of dozens; the synchronization exercise will help in identifying and prioritizing them. In my experience, identifying and tracking 10 critical assumptions at the project level is adequate. The following is the quad chart for this process step:

Figure 1: Project kick-off quad chart

Notice what is likely a new term, "Synchronization Matrix". That is merely the tool used by the team to capture notes during the Synchronization Exercise.
What you are doing is capturing time and events on the X-axis, and functions and teams on the Y-axis. Of course, this is highly customizable based on the specific circumstances, and we will discuss it further in process step number 3, the Synchronization exercise, but here is an abbreviated example:

Figure 2: Synchronization matrix example

You can see in the matrix that I've included a row to capture critical assumptions. I can't overstate how important it is to articulate, capture, and track them. In fact, this is probably my favorite quote on the subject:

"… flawed assumptions are the most common cause of flawed execution."
- Harvard Business Review, The High Performance Organization, July-August 2005

OK, I think I've made my point, so let's look at the next process step.

Project analysis

At this step, the participants prepare by analyzing the situation, collecting data, and making judgements as necessary. The goal is for each participant in the Synchronization Exercise to come to that meeting fully prepared. A good technique is to provide project participants with a worksheet template for them to use to complete the pre-work. A team can complete this step individually, collectively, or both. Here is the quad chart for the process step:

Figure 3: Project analysis quad chart

Let me expand on a couple of points. The idea of a team member creating information requirements is quite important. These are often tied back to your critical assumptions. Take the example above of the assumption around fielding a cloud-based capability. Can you think of some information requirements that you might have as a potential end-user? Furthermore, can you prioritize them? OK, having done that, can you think of a plan to acquire that information and confirm or deny the underlying critical assumption? Notice also how that ties together with decision points you or others may have to make and how they may trigger contingency plans. This may sound rather basic and simplistic, but unless people are asked to think like this, articulate their requirements, and share the information, don't expect anything to change anytime soon. It will be business as usual, and let me ask again, "how is that working out for you?" There is opportunity in all that chaos, so embrace it, and in the next step you will see the magic happen.

Synchronization exercise

The focus and discipline of the participants determine the success of this process step. This is a wargame-type exercise where team members portray their plan over time. Now everyone gets to see how their plan relates to, or even inhibits, someone else's plan, and vice versa. I've done this step several different ways, including building the matrix in software, but the method that has consistently produced the best results is to build the matrix on large paper and put it along a conference room wall. Then have the participants, one at a time, use post-it notes to portray their key events. For example, the marketing manager goes up to the wall and posts "Marketing Campaign One" in the first time phase and "Marketing Campaign Two" in the final time phase, along with "Propensity Models" in the information requirements block. Iterating by participant and by time/event leads to coordination and cooperation like nothing you've ever seen. Another method to facilitate the success of the meeting is to have a disinterested and objective third party "referee" the meeting. This will help ensure that any issues are captured or resolved and the process products are updated accordingly.
After the exercise, team members can incorporate the findings into their individual plans. This is an example quad chart for the process step:

Figure 4: Synchronization exercise quad chart

I really like the idea of execution and performance metrics. Here is how to think about them:

- Execution metrics: are we doing things right?
- Performance metrics: are we doing the right things?

As you see, execution is about plan implementation, while performance metrics are about determining whether the plan is making a difference (yes, I know that can be quite a dangerous thing to measure). Finally, we come to the fourth step, where everything comes together during the execution of the project plan.

Project execution

This is a continual step in the process where a team can utilize the synchronization products to maintain situational understanding of itself, key stakeholders, and the competitive environment. It can determine how plans are progressing and quickly react to opportunities and threats as necessary. I recommend you update and communicate changes to the documentation on a regular basis. When I was in pharmaceutical forecasting, it was imperative that I end the business week by updating the matrices on SharePoint, which were available to all pertinent team members. The following is the quad chart for this process step:

Figure 5: Project execution quad chart

Keeping up with the documentation is a quick and simple process for the most part, and by doing so you will keep people aligned and cooperating. Be aware that, like everything else that is new in the world, initial exuberance and enthusiasm will start to wane after several weeks. That is fine, as long as you keep the documentation alive and maintain systematic communication. You will soon find that behavior is changing without anyone even taking heed, which is probably the best way to actually change behavior.

A couple of words of warning. Don't expect everyone to embrace the process wholeheartedly, which is to say that office politics may create a few obstacles. Often, an individual or even an entire business function will withhold information, as "information is power", and by sharing information they may feel they are losing power. Another issue may arise where some people feel the process is needlessly complex or unnecessary. A solution to these problems is to scale back the number of core team members and utilize stakeholder analysis and a communication plan to bring the naysayers slowly into the fold. Change is never easy, but it is necessary nonetheless.

Summary

In this article, I've covered, at a high level, a successful and proven process for delivering machine learning projects that will drive business value. I developed it from my numerous years of planning and evaluating military operations, including a one-year stint as a strategic advisor to the Iraqi Oil Police, adapting it to the needs of any organization. Utilizing the Synchronization Process will help any team avoid the common pitfalls of projects and improve efficiency and decision-making. It will help you become an agent of change and create influence in an organization without positional power.

Resources for Article:

Further resources on this subject:

- Machine Learning with R [article]
- Machine Learning Using Spark MLlib [article]
- Welcome to Machine Learning Using the .NET Framework [article]

Convolutional Neural Networks with Reinforcement Learning
Packt
06 Apr 2017
9 min read
In this article by Antonio Gulli and Sujit Pal, the authors of the book Deep Learning with Keras, we will learn about reinforcement learning, or more specifically deep reinforcement learning, that is, the application of deep neural networks to reinforcement learning. We will also see how convolutional neural networks leverage spatial information and are therefore very well suited for classifying images.

(For more resources related to this topic, see here.)

Deep convolutional neural network

A Deep Convolutional Neural Network (DCNN) consists of many neural network layers. Two different types of layers, convolutional and pooling, are typically alternated. The depth of each filter increases from left to right in the network. The last stage is typically made of one or more fully connected layers as shown here:

There are three key intuitions behind ConvNets:

Local receptive fields
Shared weights
Pooling

Let's review them together.

Local receptive fields

If we want to preserve the spatial information, then it is convenient to represent each image with a matrix of pixels. Then, a simple way to encode the local structure is to connect a submatrix of adjacent input neurons into one single hidden neuron belonging to the next layer. That single hidden neuron represents one local receptive field. Note that this operation is named convolution, and it gives its name to this type of network. Of course, we can encode more information by having overlapping submatrices. For instance, let's suppose that the size of each single submatrix is 5 x 5 and that those submatrices are used with MNIST images of 28 x 28 pixels; then we will be able to generate 24 x 24 local receptive field neurons in the next hidden layer. In fact, it is possible to slide the submatrices by only 23 positions before touching the borders of the images, which leaves 28 - 5 + 1 = 24 valid positions along each axis. In Keras, the amount by which a submatrix is shifted at each step is called the stride, and both the stride and the submatrix (kernel) size are hyper-parameters which can be fine-tuned during the construction of our nets. We call the map from one layer to another a feature map. Of course, we can have multiple feature maps which learn independently from each hidden layer. For instance, we can start with 28 x 28 input neurons for processing MNIST images, and then have k feature maps of size 24 x 24 neurons each (again with 5 x 5 submatrices) in the next hidden layer.

Shared weights and bias

Let's suppose that we want to move away from the pixel representation in a row by gaining the ability of detecting the same feature independently from the location where it is placed in the input image. A simple intuition is to use the same set of weights and bias for all the neurons in the hidden layers. In this way, each layer will learn a set of position-independent latent features derived from the image. Assuming that the input image has shape (256, 256) on 3 channels with tf (Tensorflow) ordering, this is represented as (256, 256, 3). Note that with th (Theano) mode the channels dimension (the depth) is at index 1, while in tf mode it is at index 3. In Keras, if we want to add a convolutional layer with dimensionality of the output 32 and extension of each filter 3 x 3 we will write:

model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(256, 256, 3)))

This means that we are applying a 3 x 3 convolution on a 256 x 256 image with 3 input channels (or input filters), resulting in 32 output channels (or output filters).
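To make the receptive field arithmetic above concrete, here is a minimal sketch, assuming a Keras 1.x installation (the Convolution2D signature above follows the Keras 1.x API) and tf dimension ordering; the choice of 10 feature maps is arbitrary and only for illustration:

from keras.models import Sequential
from keras.layers import Convolution2D

model = Sequential()
# 10 feature maps, each produced by a 5 x 5 local receptive field,
# applied to a 28 x 28 single-channel image (tf dimension ordering)
model.add(Convolution2D(10, 5, 5, input_shape=(28, 28, 1)))
model.summary()

With the default border mode ('valid') and a stride of 1, the summary should report an output shape of (None, 24, 24, 10), matching the 28 - 5 + 1 = 24 valid positions per axis discussed above.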
An example of convolution is provided in the following diagram: Pooling layers Let's suppose that we want to summarize the output of a feature map. Again, we can use the spatial contiguity of the output produced from a single feature map and aggregate the values of a submatrix into one single output value synthetically describing the meaning associated with that physical region. Max pooling One easy and common choice is the so-called max pooling operator which simply outputs the maximum activation as observed in the region. In Keras, if we want to define a max pooling layer of size 2 x 2 we will write: model.add(MaxPooling2D(pool_size = (2, 2))) An example of max pooling operation is given in the following diagram: Average pooling Another choice is the average pooling which simply aggregates a region into the average values of the activations observed in that region. Keras implements a large number of pooling layers and a complete list is available online. In short, all the pooling operations are nothing more than a summary operation on a given region. Reinforcement learning Our objective is to build a neural network to play the game of catch. Each game starts with a ball being dropped from a random position from the top of the screen. The objective is to move a paddle at the bottom of the screen using the left and right arrow keys to catch the ball by the time it reaches the bottom. As games go, this is quite simple. At any point in time, the state of this game is given by the (x, y) coordinates of the ball and paddle. Most arcade games tend to have many more moving parts, so a general solution is to provide the entire current game screen image as the state. The following diagram shows four consecutive screenshots of our catch game: Astute readers might note that our problem could be modeled as a classification problem, where the input to the network are the game screen images and the output is one of three actions - move left, stay, or move right. However, this would require us to provide the network with training examples, possibly from recordings of games played by experts. An alternative and simpler approach might be to build a network and have it play the game repeatedly, giving it feedback based on whether it succeeds in catching the ball or not. This approach is also more intuitive and is closer to the way humans and animals learn. The most common way to represent such a problem is through a Markov Decision Process (MDP). Our game is the environment within which the agent is trying to learn. The state of the environment at time step t is given by st (and contains the location of the ball and paddle). The agent can perform certain actions at (such as moving the paddle left or right). These actions can sometimes result in a reward rt, which can be positive or negative (such as an increase or decrease in the score). Actions change the environment and can lead to a new state st+1, where the agent can perform another action at+1, and so on. The set of states, actions and rewards, together with the rules for transitioning from one state to the other, make up a Markov decision process. A single game is one episode of this process, and is represented by a finite sequence of states, actions and rewards: Since this is a Markov decision process, the probability of state st+1 depends only on current state st and action at. Maximizing future rewards As an agent, our objective is to maximize the total reward from each game. 
The total reward can be represented as follows:

In order to maximize the total reward, the agent should try to maximize the total reward from any time point t in the game. The total reward at time step t is given by Rt and is represented as:

However, it is harder to predict the value of the rewards the further we go into the future. In order to take this into consideration, our agent should try to maximize the total discounted future reward at time t instead. This is done by discounting the reward at each future time step by a factor γ over the previous time step. If γ is 0, then our network does not consider future rewards at all, and if γ is 1, then future rewards are weighted just as heavily as immediate rewards. A good value for γ is around 0.9. Factoring the equation allows us to express the total discounted future reward at a given time step recursively, as the sum of the current reward and the total discounted future reward at the next time step:

Q-learning

Deep reinforcement learning utilizes a model-free reinforcement learning technique called Q-learning. Q-learning can be used to find an optimal action for any given state in a finite Markov decision process. Q-learning tries to maximize the value of the Q-function, which represents the maximum discounted future reward when we perform action a in state s:

Once we know the Q-function, the optimal action a at a state s is the one with the highest Q-value. We can then define a policy π(s) that gives us the optimal action at any state:

We can define the Q-function for a transition point (st, at, rt, st+1) in terms of the Q-function at the next point (st+1, at+1, rt+1, st+2), similar to how we did with the total discounted future reward. This equation is known as the Bellman equation. The Q-function can be approximated using the Bellman equation. You can think of the Q-function as a lookup table (called a Q-table) where the states (denoted by s) are rows and actions (denoted by a) are columns, and the elements (denoted by Q(s, a)) are the rewards that you get if you are in the state given by the row and take the action given by the column. The best action to take at any state is the one with the highest reward. We start by randomly initializing the Q-table, then carry out random actions and observe the rewards to update the Q-table iteratively according to the following algorithm:

initialize Q-table Q
observe initial state s
repeat
    select and carry out action a
    observe reward r and move to new state s'
    Q(s, a) = Q(s, a) + α(r + γ max_a' Q(s', a') - Q(s, a))
    s = s'
until game over

You will realize that the algorithm is basically doing stochastic gradient descent on the Bellman equation, backpropagating the reward through the state space (or episode) and averaging over many trials (or epochs). Here, α is the learning rate that determines how much of the difference between the previous Q-value and the discounted new maximum Q-value should be incorporated.

Summary

We have seen the application of deep neural networks to reinforcement learning. We have also seen convolutional neural networks and how they are well suited for classifying images.

Resources for Article:

Further resources on this subject:

Deep learning and regression analysis [article]
Training neural networks efficiently using Keras [article]
Implementing Artificial Neural Networks with TensorFlow [article]

Supervised Learning: Classification and Regression
Packt
06 Apr 2017
30 min read
In this article by Alexey Grigorev, author of the book Mastering Java for Data Science, we will look at how to do pre-processing of data in Java and how to do Exploratory Data Analysis inside and outside Java. Now, when we covered the foundation, we are ready to start creating Machine Learning models. First, we start with Supervised Learning. In the supervised settings we have some information attached to each observation – called labels – and we want to learn from it, and predict it for observations without labels. There are two types of labels: the first are discrete and finite, such as True/False or Buy/Sell, and second are continuous, such as salary or temperature. These types correspond to two types of Supervised Learning: Classification and Regression. We will talk about them in this article. This article covers: Classification problems Regression Problems Evaluation metrics for each type Overview of available implementations in Java (For more resources related to this topic, see here.) Classification In Machine Learning, the classification problem deals with discrete targets with finite set of possible values. The Binary Classification is the most common type of classification problems: the target variable can have only two possible values, such as True/False, Relevant/Not Relevant, Duplicate/Not Duplicate, Cat/Dog and so on. Sometimes the target variable can have more than two outcomes, for example: colors, category of an item, model of a car, and so on, and we call it Multi-Class Classification. Typically each observation can only have one label, but in some settings an observation can be assigned several values. Typically, Multi-Class Classification can be converted to a set of Binary Classification problems, which is why we will mostly concentrate on Binary Classification. Binary Classification Models There are many models for solving the binary classification problem and it is not possible to cover all of them. We will briefly cover the ones that are most often used in practice. They include: Logistic Regression Support Vector Machines Decision Trees Neural Networks We assume that you are already familiar with these methods. Deep familiarity is not required, but for more information you can check the following book: An Introduction to Statistical Learning by G. James, D. Witten, T. Hastie and R. Tibshirani Python Machine Learning by S. Raschka When it comes to libraries, we will cover the following: Smile, JSAT, LIBSVM and LIBLINEAR and Encog. Smile Smile (Statistical Machine Intelligence and Learning Engine) is a library with a large set of classification and other machine learning algorithms. You can see the full list here: https://github.com/haifengl/smile. The library is available on Maven Central and the latest version at the moment of writing is 1.1.0. To include it to your project, add the following dependency: <dependency> <groupId>com.github.haifengl</groupId> <artifactId>smile-core</artifactId> <version>1.1.0</version> </dependency> It is being actively developed; new features and bug fixes are added quite often, but not released as frequently. We recommend to use the latest available version of Smile, and to get it, you will need to build it from the sources. To do it: Install sbt – a tool for building scala projects. 
You can follow this instruction: http://www.scala-sbt.org/release/docs/Manual-Installation.html Use git to clone the project from https://github.com/haifengl/smile To build and publish the library to local maven repository, run the following command: sbt core/publishM2 The Smile library consists of several sub-modules, such as smile-core, smile-nlp, and smile-plot and so on. So, after building it, add the following dependency to your pom: <dependency> <groupId>com.github.haifengl</groupId> <artifactId>smile-core</artifactId> <version>1.2.0</version> </dependency> The models from Smile expect the data to be in a form of two-dimensional arrays of doubles, and the label information is one dimensional array of integers. For binary models, the values should be 0 or 1. Some models in Smile can handle Multi-Class Classification problems, so it is possible to have more labels. The models are built using the Builder pattern: you create a special class, set some parameters and at the end it returns the object it builds. In Smile this builder class is typically called Trainer, and all models should have a trainer for them. For example, consider training a Random Forest model: double[] X = ...// training data int[] y = ...// 0 and 1 labels RandomForest model = new RandomForest.Trainer() .setNumTrees(100) .setNodeSize(4) .setSamplingRates(0.7) .setSplitRule(SplitRule.ENTROPY) .setNumRandomFeatures(3) .train(X, y); The RandomForest.Trainer class takes in a set of parameters and the training data, and at the end produces the trained Random Forest model. The implementation of Random Forest from Smile has the following parameters: numTrees: number of trees to train in the model. nodeSize: the minimum number of items in the leaf nodes. samplingRate: the ratio of training data used to grow each tree. splitRule: the impurity measure used for selecting the best split. numRandomFeatures: the number of features the model randomly chooses for selecting the best split. Similarly, a logistic regression is trained as follows: LogisticRegression lr = new LogisticRegression.Trainer() .setRegularizationFactor(lambda) .train(X, y); Once we have a model, we can use it for predicting the label of previously unseen items. For that we use the predict method: double[] row = // data int prediction = model.predict(row); This code outputs the most probable class for the given item. However, often we are more interested not in the label itself, but in the probability of having the label. If a model implements the SoftClassifier interface, then it is possible to get these probabilities like this: double[] probs = new double[2]; model.predict(row, probs); After running this code, the probs array will contain the probabilities. JSAT JSAT (Java Statistical Analysis Tool) is another Java library which contains a lot of implementations of common Machine Learning algorithms. You can check the full list of implemented models at https://github.com/EdwardRaff/JSAT/wiki/Algorithms. To include JSAT to a Java project, add this to pom: <dependency> <groupId>com.edwardraff</groupId> <artifactId>JSAT</artifactId> <version>0.0.5</version> </dependency> Unlike Smile, which just takes arrays of doubles, JSAT requires a special wrapper class for data instances. If we have an array, it is converted to the JSAT representation like this: double[][] X = ... // data int[] y = ... 
// labels // change to more classes for more classes for multi-classification CategoricalData binary = new CategoricalData(2); List<DataPointPair<Integer>> data = new ArrayList<>(X.length); for (int i = 0; i < X.length; i++) { int target = y[i]; DataPoint row = new DataPoint(new DenseVector(X[i])); data.add(new DataPointPair<Integer>(row, target)); } ClassificationDataSet dataset = new ClassificationDataSet(data, binary); Once we have prepared the dataset, we can train a model. Let us consider the Random Forest classifier again: RandomForest model = new RandomForest(); model.setFeatureSamples(4); model.setMaxForestSize(150); model.trainC(dataset); First, we set some parameters for the model, and then, at we end, we call the trainC method (which means “train a classifier”). In the JSAT implementation, Random Forest has fewer options for tuning: only the number of features to select and the number of trees to grow. There are several implementations of Logistic Regression. The usual Logistic Regression model does not have any parameters, and it is trained like this: LogisticRegression model = new LogisticRegression(); model.trainC(dataset); If we want to have a regularized model, then we need to use the LogisticRegressionDCD class (DCD stands for “Dual Coordinate Descent” - this is the optimization method used to train the logistic regression). We train it like this: LogisticRegressionDCD model = new LogisticRegressionDCD(); model.setMaxIterations(maxIterations); model.setC(C); model.trainC(fold.toJsatDataset()); In this code, C is the regularization parameter, and the smaller values of C correspond to stronger regularization effect. Finally, for outputting the probabilities, we can do the following: double[] row = // data DenseVector vector = new DenseVector(row); DataPoint point = new DataPoint(vector); CategoricalResults out = model.classify(point); double probability = out.getProb(1); The class CategoricalResults contains a lot of information, including probabilities for each class and the most likely label. LIBSVM and LIBLINEAR Next we consider two similar libraries: LIBSVM and LIBLINEAR. LIBSVM (https://www.csie.ntu.edu.tw/~cjlin/libsvm/) is a library with implementation of Support Vector Machine models, which include support vector classifiers. LIBLINEAR (https://www.csie.ntu.edu.tw/~cjlin/liblinear/) is a library for fast linear classification algorithms such as Liner SVM and Logistic Regression. Both these libraries come from the same research group and have very similar interfaces. LIBSVM is implemented in C++ and has an officially supported java version. It is available on Maven Central: <dependency> <groupId>tw.edu.ntu.csie</groupId> <artifactId>libsvm</artifactId> <version>3.17</version> </dependency> Note that the Java version of LIBSVM is updated not as often as the C++ version. Nevertheless, the version above is stable and should not contain bugs, but it might be slower than its C++ version. To use SVM models from LIBSVM, you first need to specify the parameters. For this, you create a svm_parameter class. Inside, you can specify many parameters, including: the kernel type (RBF, POLY or LINEAR), the regularization parameter C, probability, which you can set to 1 to be able to get probabilities, the svm_type should be set to C_SVC: this tells that the model should be a classifier. 
Here is an example of how you can configure an SVM classifier with the Linear kernel which can output probabilities: svm_parameter param = new svm_parameter(); param.svm_type = svm_parameter.C_SVC; param.kernel_type = svm_parameter.LINEAR; param.probability = 1; param.C = C; // default parameters param.cache_size = 100; param.eps = 1e-3; param.p = 0.1; param.shrinking = 1; The polynomial kernel is specified by the following formula: It has three additional parameters: gamma, coeff0 and degree; and also C – the regularization parameter. You can specify it like this: svm_parameter param = new svm_parameter(); param.svm_type = svm_parameter.C_SVC; param.kernel_type = svm_parameter.POLY; param.C = C; param.degree = degree; param.gamma = 1; param.coef0 = 1; param.probability = 1; // plus defaults from the above Finally, the Gaussian kernel (or RBF) has the following formula:' So there is one parameter gamma, which controls the width of the Gaussians. We can specify the model with the RBF kernel like this: svm_parameter param = new svm_parameter(); param.svm_type = svm_parameter.C_SVC; param.kernel_type = svm_parameter.RBF; param.C = C; param.gamma = gamma; param.probability = 1; // plus defaults from the above Once we created the configuration object, we need to convert the data in the right format. The LIBSVM command line application reads files in the SVMLight format, so the library also expects sparse data representation. For a single array, the conversion is following: double[] dataRow = // single row vector svm_node[] svmRow = new svm_node[dataRow.length]; for (int j = 0; j < dataRow.length; j++) { svm_node node = new svm_node(); node.index = j; node.value = dataRow[j]; svmRow[j] = node; } For a matrix, we do this for every row: double[][] X = ... // data int n = X.length; svm_node[][] nodes = new svm_node[n][]; for (int i = 0; i < n; i++) { nodes[i] = wrapAsSvmNode(X[i]); } Where wrapAsSvmNode is a function, that wraps a vector into an array of svm_node objects. Now we can put the data and the labels together into svm_problem object: double[] y = ... // labels svm_problem prob = new svm_problem(); prob.l = n; prob.x = nodes; prob.y = y; Now using the data and the parameters we can train the SVM model: svm_model model = svm.svm_train(prob, param);  Once the model is trained, we can use it for classifying unseen data. Getting probabilities is done this way: double[][] X = // test data int n = X.length; double[] results = new double[n]; double[] probs = new double[2]; for (int i = 0; i < n; i++) { svm_node[] row = wrapAsSvmNode(X[i]); svm.svm_predict_probability(model, row, probs); results[i] = probs[1]; } Since we used the param.probability = 1, we can use svm.svm_predict_probability method to predict probabilities. Like Smile, the method takes an array of doubles and writes the results there. Then we can get the probabilities there. Finally, while training, LIBSVM outputs a lot of things on the console. If we are not interested in this output, we can disable it with the following code snippet: svm.svm_set_print_string_function(s -> {}); Just add this in the beginning of your code and you will not see this anymore. The next library is LIBLINEAR, which provides very fast and high performing linear classifiers such as SVM with Linear Kernel and Logistic Regression. It can easily scale to tens and hundreds of millions of data points. Unlike LIBSVM, there is no official Java version of LIBLINEAR, but there is unofficial Java port available at http://liblinear.bwaldvogel.de/. 
To use it, include the following: <dependency> <groupId>de.bwaldvogel</groupId> <artifactId>liblinear</artifactId> <version>1.95</version> </dependency> The interface is very similar to LIBSVM. First, you define the parameters: SolverType solverType = SolverType.L1R_LR; double C = 0.001; double eps = 0.0001; Parameter param = new Parameter(solverType, C, eps); We have to specify three parameters here: solverType: defines the model which will be used; C: the amount regularization, the smaller C the stronger the regularization; epsilon: tolerance for stopping the training process. A reasonable default is 0.0001; For classification there are the following solvers: Logistic Regression: L1R_LR or L2R_LR SVM: L1R_L2LOSS_SVC or L2R_L2LOSS_SVC According to the official FAQ (which can be found here: https://www.csie.ntu.edu.tw/~cjlin/liblinear/FAQ.html) you should: Prefer SVM to Logistic Regression as it trains faster and usually gives higher accuracy. Try L2 regularization first unless you need a sparse solution – in this case use L1 The default solver is L2-regularized support linear classifier for the dual problem. If it is slow, try solving the primal problem instead. Then you define the dataset. Like previously, let's first see how to wrap a row: double[] row = // data int m = row.length; Feature[] result = new Feature[m]; for (int i = 0; i < m; i++) { result[i] = new FeatureNode(i + 1, row[i]); } Please note that we add 1 to the index – the 0 is the bias term, so the actual features should start from 1. We can put this into a function wrapRow and then wrap the entire dataset as following: double[][] X = // data int n = X.length; Feature[][] matrix = new Feature[n][]; for (int i = 0; i < n; i++) { matrix[i] = wrapRow(X[i]); } Now we can create the Problem class with the data and labels: double[] y = // labels Problem problem = new Problem(); problem.x = wrapMatrix(X); problem.y = y; problem.n = X[0].length + 1; problem.l = X.length; Note that here we also need to provide the dimensionality of the data, and it's the number of features plus one. We need to add one because it includes the bias term. Now we are ready to train the model: Model model = LibLinear.train(fold, param); When the model is trained, we can use it to classify unseen data. In the following example we will output probabilities: double[] dataRow = // data Feature[] row = wrapRow(dataRow); Linear.predictProbability(model, row, probs); double result = probs[1]; The code above works fine for the Logistic Regression model, but it will not work for SVM: SVM cannot output probabilities, so the code above will throw an error for solvers like L1R_L2LOSS_SVC. What you can do instead is get the raw output: double[] values = new double[1]; Feature[] row = wrapRow(dataRow); Linear.predictValues(model, row, values); double result = values[0]; In this case the results will not contain probability, but some real value. When this value is greater than zero, the model predicts that the class is positive. If we would like to map this value to the [0, 1] range, we can use the sigmoid function for that: public static double[] sigmoid(double[] scores) { double[] result = new double[scores.length]; for (int i = 0; i < result.length; i++) { result[i] = 1 / (1 + Math.exp(-scores[i])); } return result; } Finally, like LIBSVM, LIBLINEAR also outputs a lot of things to standard output. 
If you do not wish to see it, you can mute it with the following code: PrintStream devNull = new PrintStream(new NullOutputStream()); Linear.setDebugOutput(devNull); Here, we use NullOutputStream from Apache IO which does nothing, so the screen stays clean. When to use LIBSVM and when to use LIBLINEAR? For large datasets often it is not possible to use any kernel methods. In this case you should prefer LIBLINEAR. Additionally, LIBLINEAR is especially good for Text Processing purposes such as Document Classification. Encog Finally, we consider a library for training Neural Networks: Encog. It is available on Maven Central and can be added with the following snippet: <dependency> <groupId>org.encog</groupId> <artifactId>encog-core</artifactId> <version>3.3.0</version> </dependency> With this library you first need to specify the network architecture: BasicNetwork network = new BasicNetwork(); network.addLayer(new BasicLayer(new ActivationSigmoid(), true, noInputNeurons)); network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 30)); network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 1)); network.getStructure().finalizeStructure(); network.reset(); Here we create a network with one input layer, one inner layer with 30 neurons and one output layer with 1 neuron. In each layer we use sigmoid as the activation function and add the bias input (the true parameter). The last line randomly initializes the weights in the network. For both input and output Encog expects two-dimensional double arrays. In case of binary classification we typically have a one dimensional array, so we need to convert it: double[][] X = // data double[] y = // labels double[][] y2d = new double[y.length][]; for (int i = 0; i < y.length; i++) { y2d[i] = new double[] { y[i] }; } Once the data is converted, we wrap it into special wrapper class: MLDataSet dataset = new BasicMLDataSet(X, y2d); Then this dataset can be used for training: MLTrain trainer = new ResilientPropagation(network, dataset); double lambda = 0.01; trainer.addStrategy(new RegularizationStrategy(lambda)); int noEpochs = 101; for (int i = 0; i < noEpochs; i++) { trainer.iteration(); } There are a lot of other Machine Learning libraries available in Java. For example Weka, H2O, JavaML and others. It is not possible to cover all of them, but you can also try them and see if you like them more than the ones we have covered. Evaluation We have covered many Machine Learning libraries, and many of them implement the same algorithms like Random Forest or Logistic Regression. Also, each individual model can have many different parameters: a Logistic Regression has the regularization coefficient, an SVM is configured by setting the kernel and its parameters. How do we select the best single model out of so many possible variants? For that we first define some evaluation metric and then select the model which achieves the best possible performance with respect to this metric. For binary classification we can select one of the following metrics: Accuracy and Error Precision, Recall and F1 AUC (AU ROC) Result Evaluation K-Fold Cross Validation Training, Validation and Testing. Accuracy Accuracy tells us for how many examples the model predicted the correct label. Calculating it is trivial: int n = actual.length; double[] proba = // predictions; double[] prediction = Arrays.stream(proba).map(p -> p > threshold ? 
1.0 : 0.0).toArray(); int correct = 0; for (int i = 0; i < n; i++) { if (actual[i] == prediction[i]) { correct++; } } double accuracy = 1.0 * correct / n; Accuracy is the simplest evaluation metric and everybody understands it. Precision, Recall and F1 In some cases accuracy is not the best measure of model performance. For example, suppose we have an unbalanced dataset: there are only 1% of examples that are positive. Then a model which always predict negative is right in 99% cases, and hence will have accuracy of 0.99. But this model is not useful. There are alternatives to accuracy that can overcome this problem. Precision and Recall are among these metrics: they both look at the fraction of positive items that the model correctly recognized. They can be calculated using the Confusion Matrix: a table which summarizes the performance of a binary classifier: Precision is the fraction of correctly predicted positive items among all items the model predicted positive. In terms of the confusion matrix, Precision is TP / (TP + FP). Recall is the fraction of correctly predicted positive items among items that are actually positive. With values from the confusion matrix, Recall is TP / (TP + FN). It is often hard to decide whether one should optimize Precision or Recall. But there is another metric which combines both Precision and Recall into one number, and it is called F1 score. For calculating Precision and Recall, we first need to calculate the values for the cells of the confusion matrix: int tp = 0, tn = 0, fp = 0, fn = 0; for (int i = 0; i < actual.length; i++) { if (actual[i] == 1.0 && proba[i] > threshold) { tp++; } else if (actual[i] == 0.0 && proba[i] <= threshold) { tn++; } else if (actual[i] == 0.0 && proba[i] > threshold) { fp++; } else if (actual[i] == 1.0 && proba[i] <= threshold) { fn++; } } Then we can use the values to calculate Precision and Recall: double precision = 1.0 * tp / (tp + fp); double recall = 1.0 * tp / (tp + fn); Finally, F1 can be calculated using the following formula: double f1 = 2 * precision * recall / (precision + recall); ROC and AU ROC (AUC) The metrics above are good for binary classifiers which produce hard output: they only tell if the class should be assigned a positive label or negative. If instead our model outputs some score such that the higher the values of the score the more likely the item is to be positive, then the binary classifier is called a ranking classifier. Most of the models can output probabilities of belonging to a certain class, and we can use it to rank examples such that the positive are likely to come first. The ROC Curve visually tells us how good a ranking classifier separates positive examples from negative ones. The way a ROC curve is build is the following: We sort the observations by their score and then starting from the origin we go up if the observation is positive and right if it is negative. This way, in the ideal case, we first always go up, and then always go right – and this will result in the best possible ROC curve. In this case we can say that the separation between positive and negative examples is perfect. If the separation is not perfect, but still OK, the curve will go up for positive examples, but sometimes will turn right when a misclassification occurred. Finally, a bad classifier will not be able to tell positive and negative examples apart and the curve would alternate between going up and right. 
The diagonal line on the plot represents the baseline – the performance that a random classifier would achieve. The further away the curve from the baseline, the better. Unfortunately, there is no available easy-to-use implementation of ROC curves in Java. So the algorithm for drawing a ROC curve is the following: Let POS be number of positive labels, and NEG be the number of negative labels Order data by the score, decreasing Start from (0, 0) For each example in the sorted order, if the example is positive, move 1 / POS up in the graph, otherwise, move 1 / NEG right in the graph. This is a simplified algorithm and assumes that the scores are distinct. If the scores aren't distinct, and there are different actual labels for the same score, some adjustment needs to be made. It is implemented in the class RocCurve which you will find in the source code. You can use it as following: RocCurve.plot(actual, prediction); Calling it will create a plot similar to this one: The area under the curve says how good the separation is. If the separation is very good, then the area will be close to one. But if the classifier cannot distinguish between positive and negative examples, the curve will go around the random baseline curve, and the area will be close to 0.5. Area Under the Curve is often abbreviated as AUC, or, sometimes, AU ROC – to emphasize that the Curve is a ROC Curve. AUC has a very nice interpretation: the value of AUC corresponds to probability that a randomly selected positive example is scored higher than a randomly selected negative example. Naturally, if this probability is high, our classifier does a good job separating positive and negative examples. This makes AUC a to-go evaluation metric for many cases, especially when the dataset is unbalanced – in the sense that there are a lot more examples of one class than another. Luckily, there are implementations of AUC in Java. For example, it is implemented in Smile. You can use it like this: double[] predicted = ... // int[] truth = ... // double auc = AUC.measure(truth, predicted); Result Validation When learning from data there is always the danger of overfitting. Overfitting occurs when the model starts learning the noise in the data instead of detecting useful patterns. It is always important to check if a model overfits – otherwise it will not be useful when applied to unseen data. The typical and most practical way of checking whether a model overfits or not is to emulate “unseen data” – that is, take a part of the available labeled data and do not use it for training. This technique is called “hold out”: we hold out a part of the data and use it only for evaluation. Often we shuffle the original data set before splitting. In many cases we make a simplifying assumption that the order of data is not important – that is, one observation has no influence on another. In this case shuffling the data prior to splitting will remove effects that the order of items might have. On the other hand, if the data is a Time Series data, then shuffling it is not a good idea, because there is some dependence between observations. So, let us implement the hold out split. We assume that the data that we have is already represented by X – a two-dimensional array of doubles with features and y – a one-dimensional array of labels. 
First, let us create a helper class for holding the data: public class Dataset { private final double[][] X; private final double[] y; // constructor and getters are omitted } Splitting our dataset should produce two datasets, so let us create a class for that as well: public class Split { private final Dataset train; private final Dataset test; // constructor and getters are omitted } Now suppose we want to split the data into two parts: train and test. We also want to specify the size of the train set, we will do it using a testRatio parameter: the percentage of items that should go to the test set. So the first thing we do is generating an array with indexes and then splitting it according to testRatio: int[] indexes = IntStream.range(0, dataset.length()).toArray(); int trainSize = (int) (indexes.length * (1 - testRatio)); int[] trainIndex = Arrays.copyOfRange(indexes, 0, trainSize); int[] testIndex = Arrays.copyOfRange(indexes, trainSize, indexes.length); We can also shuffle the indexes if we want: Random rnd = new Random(seed); for (int i = indexes.length - 1; i > 0; i--) { int index = rnd.nextInt(i + 1); int tmp = indexes[index]; indexes[index] = indexes[i]; indexes[i] = tmp; } Then we can select instances for the training set as follows: int trainSize = trainIndex.length; double[][] trainX = new double[trainSize][]; double[] trainY = new double[trainSize]; for (int i = 0; i < trainSize; i++) { int idx = trainIndex[i]; trainX[i] = X[idx]; trainY[i] = y[idx]; } And then finally wrap it into our Dataset class: Dataset train = new Dataset(trainX, trainY); If we repeat the same for the test set, we can put both train and test sets into a Split object: Split split = new Split(train, test); And now we can use train fold for training and test fold for testing the models. If we put all the code above into a function of the Dataset class, for example, trainTestSplit, we can use it as follows: Split split = dataset.trainTestSplit(0.2); Dataset train = split.getTrain(); // train the model using train.getX() and train.getY() Dataset test = split.getTest(); // test the model using test.getX(); test.getY(); K-Fold Cross Validation Holding out only one part of the data may not always be the best option. What we can do instead is splitting it into K parts and then testing the models only on 1/Kth of the data. This is called K-Fold Cross-Validation: it not only gives the performance estimation, but also the possible spread of the error. Typically we are interested in models which give good and consistent performance. K-Fold Cross-Validation helps us to select such models. Then we prepare the data for K-Fold Cross-Validation is the following: First, split the data into K parts Then for each of these parts Take one part as the validation set Take the remaining K-1 parts as the training set If we translate this into Java, the first step will look like this: int[] indexes = IntStream.range(0, dataset.length()).toArray(); int[][] foldIndexes = new int[k][]; int step = indexes.length / k; int beginIndex = 0; for (int i = 0; i < k - 1; i++) { foldIndexes[i] = Arrays.copyOfRange(indexes, beginIndex, beginIndex + step); beginIndex = beginIndex + step; } foldIndexes[k - 1] = Arrays.copyOfRange(indexes, beginIndex, indexes.length); This creates an array of indexes for each fold. You can also shuffle the indexes array as previously. 
Now we can create splits from each fold: List<Split> result = new ArrayList<>(); for (int i = 0; i < k; i++) { int[] testIdx = folds[i]; int[] trainIdx = combineTrainFolds(folds, indexes.length, i); result.add(Split.fromIndexes(dataset, trainIdx, testIdx)); } In the code above we have two additional methods: combineTrainFolds K-1 arrays with indexes and combines them into one Split.fromIndexes creates a split gives train and test indexes. We have already covered the second function when we created a simple hold-out test set. And the first function, combineTrainFolds, is implemented like this: private static int[] combineTrainFolds(int[][] folds, int totalSize, int excludeIndex) { int size = totalSize - folds[excludeIndex].length; int result[] = new int[size]; int start = 0; for (int i = 0; i < folds.length; i++) { if (i == excludeIndex) { continue; } int[] fold = folds[i]; System.arraycopy(fold, 0, result, start, fold.length); start = start + fold.length; } return result; } Again, we can put the code above into a function of the Dataset class and call it like follows: List<Split> folds = train.kfold(3); Now when we have a list of Split objects, we can create a special function for performing Cross-Validation: public static DescriptiveStatistics crossValidate(List<Split> folds, Function<Dataset, Model> trainer) { double[] aucs = folds.parallelStream().mapToDouble(fold -> { Dataset foldTrain = fold.getTrain(); Dataset foldValidation = fold.getTest(); Model model = trainer.apply(foldTrain); return auc(model, foldValidation); }).toArray(); return new DescriptiveStatistics(aucs); } What this function does takes a list of folds and a callback which inside creates a model. Then, after the model is trained, we calculate AUC for it. Additionally, we take advantage of Java's ability to parallelize loops and train models on each fold at the same time. Finally, we put the AUCs calculated on each fold into a DescriptiveStatistics object, which can later on be used to return the mean and the standard deviation of the AUCs. As you probably remember, the DescriptiveStatistics class comes from the Apache Commons Math library. Let us consider an example. Suppose we want to use Logistic Regression from LIBLINEAR and select the best value for the regularization parameter C. We can use the function above this way: double[] Cs = { 0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 10.0 }; for (double C : Cs) { DescriptiveStatistics summary = crossValidate(folds, fold -> { Parameter param = new Parameter(SolverType.L1R_LR, C, 0.0001); return LibLinear.train(fold, param); }); double mean = summary.getMean(); double std = summary.getStandardDeviation(); System.out.printf("L1 logreg C=%7.3f, auc=%.4f ± %.4f%n", C, mean, std); } Here, LibLinear.train is a helper method which takes a Dataset object and a Parameter object and then trains a LIBLINEAR model. This will print AUC for all provided values of C, so you can see which one is the best, and pick the one with highest mean AUC. Training, Validation and Testing When doing the Cross-Validation there’s still a danger of overfitting. Since we try a lot of different experiments on the same validation set, we might accidentally pick the model which just happened to do well on the validation set – but it may later on fail to generalize to unseen data. The solution to this problem is to hold out a test set at the very beginning and do not touch it at all until we select what we think is the best model. And we use it only for evaluating the final model on it. 
So how do we select the best model? What we can do is to do Cross-Validation on the remaining train data. It can be either hold out or K-Fold Cross-Validation. In general, you should prefer doing K-Fold Cross-Validation because it also gives you the spread of performance, and you may use it for model selection as well. The typical workflow should be the following:

(0) Select some metric for validation, e.g. accuracy or AUC.
(1) Split all the data into train and test sets
(2) Split the training data further and hold out a validation dataset or split it into K folds
(3) Use the validation data for model selection and parameter optimization
(4) Select the best model according to the validation set and evaluate it against the hold out test set

It is important to avoid looking at the test set too often. It should be used only occasionally for final evaluation to make sure the selected model does not overfit. If the validation scheme is set up properly, the validation score should correspond to the final test score. If this happens, we can be sure that the model does not overfit and is able to generalize to unseen data. Using the classes and the code we created previously, it translates to the following Java code:

Dataset data = new Dataset(X, y);
Split split = data.trainTestSplit(0.2);
Dataset train = split.getTrain();
List<Split> folds = train.kfold(3);
// now use crossValidate(folds, ...) to select the best model
Dataset test = split.getTest();
// do final evaluation of the best model on test

With this information we are ready to do a project on Binary Classification.

Summary

In this article, we spoke about supervised machine learning and about two common supervised problems: Classification and Regression. We also covered the libraries in which the commonly used algorithms are implemented, and how to evaluate the performance of these algorithms. There is another family of Machine Learning algorithms that do not require the label information: these methods are called Unsupervised Learning.

Resources for Article:

Further resources on this subject:

Supervised Machine Learning [article]
Clustering and Other Unsupervised Learning Methods [article]
Data Clustering [article]

Learning Cassandra
Packt
06 Apr 2017
17 min read
In this article by Sandeep Yarabarla, the author of the book Learning Apache Cassandra - Second Edition, we assume you have built products using relational databases like MySQL and PostgreSQL, and perhaps experimented with NoSQL databases including a document store like MongoDB or a key-value store like Redis. While each of these tools has its strengths, you will now consider whether a distributed database like Cassandra might be the best choice for the task at hand. In this article, we'll begin with the need for NoSQL databases to satisfy the conundrum of ever growing data. We will see why NoSQL databases are becoming the de facto choice for big data and real-time web applications. We will also talk about the major reasons to choose Cassandra from among the many database options available to you. Having established that Cassandra is a great choice, we'll go through the nuts and bolts of getting a local Cassandra installation up and running.

(For more resources related to this topic, see here.)

What is Big Data

Big Data is a relatively new term which has been gathering steam over the past few years. Big Data is a term used for data sets that are too large to be stored in a traditional database system or processed by traditional data processing pipelines. This data could be structured, semi-structured or unstructured data. The data sets that belong to this category usually scale to terabytes or petabytes of data. Big Data usually involves one or more of the following:

Velocity: Data moves at an unprecedented speed and must be dealt with in a timely manner.
Volume: Organizations collect data from a variety of sources, including business transactions, social media and information from sensor or machine-to-machine data. This could involve terabytes to petabytes of data. In the past, storing it would've been a problem – but new technologies have eased the burden.
Variety: Data comes in all sorts of formats, ranging from structured data to be stored in traditional databases to unstructured data (blobs) such as images, audio files and text files.

These are known as the 3 Vs of Big Data. In addition to these, we tend to associate another term with Big Data.

Complexity: Today's data comes from multiple sources, which makes it difficult to link, match, cleanse and transform data across systems. However, it's necessary to connect and correlate relationships, hierarchies and multiple data linkages or your data can quickly spiral out of control. It must be able to traverse multiple data centers, cloud and geographical zones.

Challenges of modern applications

Before we delve into the shortcomings of relational systems to handle Big Data, let's take a look at some of the challenges faced by modern web facing and Big Data applications. Later, this will give an insight into how NoSQL data stores, or Cassandra in particular, help solve these issues. One of the most important challenges faced by a web facing application is that it should be able to handle a large number of concurrent users. Think of a search engine like Google, which handles millions of concurrent users at any given point of time, or a large online retailer. The response from these applications should be swift even as the number of users keeps on growing. Modern applications need to be able to handle large amounts of data which can scale to several petabytes of data and beyond.
Consider a large social network with a few hundred million users: Think of the amount of data generated in tracking and managing those users Think of how this data can be used for analytics Business critical applications should continue running without much impact even when there is a system failure or multiple system failures (server failure, network failure, and so on.). The applications should be able to handle failures gracefully without any data loss or interruptions. These applications should be able to scale across multiple data centers and geographical regions to support customers from different regions around the world with minimum delay. Modern applications should be implementing fully distributed architectures and should be capable of scaling out horizontally to support any data size or any number of concurrent users. Why not relational Relational database systems (RDBMS) have been the primary data store for enterprise applications for 20 years. Lately, NoSQL databases have been picking a lot of steam and businesses are slowly seeing a shift towards non-relational databases. There are a few reasons why relational databases don't seem like a good fit for modern web applications. Relational databases are not designed for clustered solutions. There are some solutions that shard data across servers but these are fragile, complex and generally don't work well. Sharding solutions implemented by RDBMS MySQL's product MySQL cluster provides clustering support which adds many capabilities of non-relational systems. It is actually a NoSQL solution that integrates with MySQL relational database. It partitions the data onto multiple nodes and the data can be accessed via different APIs. Oracle provides a clustering solution, Oracle RAC which involves multiple nodes running an oracle process accessing the same database files. This creates a single point of failure as well as resource limitations in accessing the database itself. They are not a good fit current hardware and architectures. Relational databases are usually scaled up by using larger machines with more powerful hardware and maybe clustering and replication among a small number of nodes. Their core architecture is not a good fit for commodity hardware and thus doesn't work with scale out architectures. Scale out vs Scale up architecture Scaling out means adding more nodes to a system such as adding more servers to a distributed database or filesystem. This is also known as horizontal scaling. Scaling up means adding more resources to a single node within the system such as adding more CPU, memory or disks to a server. This is also known as vertical scaling. How to handle Big Data Now that we are convinced the relational model is not a good fit for Big Data, let's try to figure out ways to handle Big Data. These are the solutions that paved the way for various NoSQL databases. Clustering: The data should be spread across different nodes in a cluster. The data should be replicated across multiple nodes in order to sustain node failures. This helps spread the data across the cluster and different nodes contain different subsets of data. This improves performance and provides fault tolerance. A node is an instance of database software running on a server. Multiple instances of the same database could be running on the same server. Flexible schema: Schemas should be flexible unlike the relational model and should evolve with data. 
Relax consistency: We should embrace the concept of eventual consistency, which means data will eventually be propagated to all the nodes in the cluster (in case of replication). Eventual consistency allows data replication across nodes with minimum overhead. This allows for fast writes without the need for distributed locking.
De-normalization of data: De-normalize data to optimize queries. This has to be done at the cost of writing and maintaining multiple copies of the same data.

What is Cassandra and why Cassandra

Cassandra is a fully distributed, masterless database, offering superior scalability and fault tolerance to traditional single master databases. Compared with other popular distributed databases like Riak, HBase, and Voldemort, Cassandra offers a uniquely robust and expressive interface for modeling and querying data. What follows is an overview of several desirable database capabilities, with accompanying discussions of what Cassandra has to offer in each category.

Horizontal scalability

Horizontal scalability refers to the ability to expand the storage and processing capacity of a database by adding more servers to a database cluster. A traditional single-master database's storage capacity is limited by the capacity of the server that hosts the master instance. If the data set outgrows this capacity, and a more powerful server isn't available, the data set must be sharded among multiple independent database instances that know nothing of each other. Your application bears responsibility for knowing to which instance a given piece of data belongs.

Cassandra, on the other hand, is deployed as a cluster of instances that are all aware of each other. From the client application's standpoint, the cluster is a single entity; the application need not know, nor care, which machine a piece of data belongs to. Instead, data can be read or written to any instance in the cluster, referred to as a node; this node will forward the request to the instance where the data actually belongs. The result is that Cassandra deployments have an almost limitless capacity to store and process data. When additional capacity is required, more machines can simply be added to the cluster. When new machines join the cluster, Cassandra takes care of rebalancing the existing data so that each node in the expanded cluster has a roughly equal share. Also, the performance of a Cassandra cluster is directly proportional to the number of nodes within the cluster. As you keep on adding instances, the read and write throughput will keep increasing proportionally.

Cassandra is one of several popular distributed databases inspired by the Dynamo architecture, originally published in a paper by Amazon. Other widely used implementations of Dynamo include Riak and Voldemort. You can read the original paper at http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf

High availability

The simplest database deployments are run as a single instance on a single server. This sort of configuration is highly vulnerable to interruption: if the server is affected by a hardware failure or network connection outage, the application's ability to read and write data is completely lost until the server is restored. If the failure is catastrophic, the data on that server might be lost completely. Master-slave architectures improve this picture a bit. The master instance receives all write operations, and then these operations are replicated to follower instances.
The application can read data from the master or any of the follower instances, so a single host becoming unavailable will not prevent the application from continuing to read data. A failure of the master, however, will still prevent the application from performing any write operations, so while this configuration provides high read availability, it doesn't completely provide high availability.

Cassandra, on the other hand, has no single point of failure for reading or writing data. Each piece of data is replicated to multiple nodes, but none of these nodes holds the authoritative master copy. All the nodes in a Cassandra cluster are peers without a master node. If a machine becomes unavailable, Cassandra will continue writing data to the other nodes that share data with that machine, and will queue the operations and update the failed node when it rejoins the cluster. This means that in a typical configuration, multiple nodes must fail simultaneously for there to be any application-visible interruption in Cassandra's availability.

How many copies?
When you create a keyspace (Cassandra's version of a database), you specify how many copies of each piece of data should be stored; this is called the replication factor. A replication factor of 3 is a common choice for most use cases.

Write optimization
Traditional relational and document databases are optimized for read performance. Writing data to a relational database will typically involve making in-place updates to complicated data structures on disk, in order to maintain a data structure that can be read efficiently and flexibly. Updating these data structures is a very expensive operation from a standpoint of disk I/O, which is often the limiting factor for database performance. Since writes are more expensive than reads, you'll typically avoid any unnecessary updates to a relational database, even at the expense of extra read operations.

Cassandra, on the other hand, is highly optimized for write throughput, and in fact never modifies data on disk; it only appends to existing files or creates new ones. This is much easier on disk I/O and means that Cassandra can provide astonishingly high write throughput. Since both writing data to Cassandra and storing data in Cassandra are inexpensive, denormalization carries little cost and is a good way to ensure that data can be efficiently read in various access scenarios.

Because Cassandra is optimized for write volume, you shouldn't shy away from writing data to the database. In fact, it's most efficient to write without reading whenever possible, even if doing so might result in redundant updates. Just because Cassandra is optimized for writes doesn't make it bad at reads; in fact, a well-designed Cassandra database can handle very heavy read loads with no problem.

Structured records
The first three database features we looked at are commonly found in distributed data stores. However, databases like Riak and Voldemort are purely key-value stores; these databases have no knowledge of the internal structure of a record that's stored at a particular key. This means useful functions like updating only part of a record, reading only certain fields from a record, or retrieving records that contain a particular value in a given field are not possible.
Relational databases like PostgreSQL, document stores like MongoDB, and, to a limited extent, newer key-value stores like Redis do have a concept of the internal structure of their records, and most application developers are accustomed to taking advantage of the possibilities this allows. None of these databases, however, offer the advantages of a masterless distributed architecture.

In Cassandra, records are structured much in the same way as they are in a relational database—using tables, rows, and columns. Thus, applications using Cassandra can enjoy all the benefits of masterless distributed storage while also getting all the advanced data modeling and access features associated with structured records.

Secondary indexes
A secondary index, commonly referred to as an index in the context of a relational database, is a structure allowing efficient lookup of records by some attribute other than their primary key. This is a widely useful capability: for instance, when developing a blog application, you would want to be able to easily retrieve all of the posts written by a particular author. Cassandra supports secondary indexes; while Cassandra's version is not as versatile as indexes in a typical relational database, it's a powerful feature in the right circumstances.

Materialized views
Data modelling principles in Cassandra compel us to denormalize data as much as possible. Prior to Cassandra 3.0, the only way to query on a non-primary key column was to create a secondary index and query on it. However, secondary indexes come with a performance tradeoff if they contain high-cardinality data. Often, high-cardinality secondary indexes have to scan data on all the nodes and aggregate it to return the query results. This defeats the purpose of having a distributed system. To avoid secondary indexes and client-side denormalization, Cassandra introduced materialized views, which perform server-side denormalization. You can create views for a base table, and Cassandra ensures eventual consistency between the base and the view. This lets us do very fast lookups on each view following the normal Cassandra read path. Materialized views maintain a correspondence of one CQL row each in the base and the view, so we need to ensure that each CQL row that is required for the view is reflected in the base table's primary keys. Although materialized views allow fast lookups on non-primary-key columns, this comes at a performance cost to writes.

Efficient result ordering
It's quite common to want to retrieve a record set ordered by a particular field; for instance, a photo sharing service will want to retrieve the most recent photographs in descending order of creation. Since sorting data on the fly is a fundamentally expensive operation, databases must keep information about record ordering persisted on disk in order to efficiently return results in order. In a relational database, this is one of the jobs of a secondary index. In Cassandra, secondary indexes can't be used for result ordering, but tables can be structured such that rows are always kept sorted by a given column or columns, called clustering columns. Sorting by arbitrary columns at read time is not possible, but the capacity to efficiently order records in any way, and to retrieve ranges of records based on this ordering, is an unusually powerful capability for a distributed database.
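As a rough sketch of what clustering columns look like in practice, the following Python snippet creates a table whose rows are kept sorted per partition and reads them back in that order. It assumes a local single-node cluster and the DataStax Python driver (pip install cassandra-driver); the keyspace, table, and column names are purely illustrative and are not taken from the book:

# Illustrative example only: a photos table ordered newest-first per user.
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])   # assumes Cassandra is listening locally
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS demo.photos (
        username   text,
        created_at timestamp,
        caption    text,
        PRIMARY KEY (username, created_at)
    ) WITH CLUSTERING ORDER BY (created_at DESC)
""")

# Rows within the 'username' partition are stored sorted by created_at (descending),
# so the ten most recent photos come back in order with no sorting at read time.
rows = session.execute(
    "SELECT created_at, caption FROM demo.photos WHERE username = %s LIMIT 10",
    ('alice',))
for row in rows:
    print(row.created_at, row.caption)

Here username is the partition key and created_at is the clustering column; the ordering is baked into how the rows are stored on disk, which is exactly the capability described above.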
Immediate consistency
When we write a piece of data to a database, it is our hope that the data is immediately available to any other process that may wish to read it. From another point of view, when we read some data from a database, we would like to be guaranteed that the data we retrieve is the most recently updated version. This guarantee is called immediate consistency, and it's a property of most common single-master databases like MySQL and PostgreSQL.

Distributed systems like Cassandra typically do not provide an immediate consistency guarantee. Instead, developers must be willing to accept eventual consistency, which means that when data is updated, the system will reflect that update at some point in the future. Developers are willing to give up immediate consistency precisely because it is a direct tradeoff with high availability. In the case of Cassandra, that tradeoff is made explicit through tunable consistency. Each time you design a write or read path for data, you have the option of immediate consistency with less resilient availability, or eventual consistency with extremely resilient availability.

Discretely writable collections
While it's useful for records to be internally structured into discrete fields, a given property of a record isn't always a single value like a string or an integer. One simple way to handle fields that contain collections of values is to serialize them using a format like JSON, and then save the serialized collection into a text field. However, in order to update collections stored in this way, the serialized data must be read from the database, decoded, modified, and then written back to the database in its entirety. If two clients try to perform this kind of modification to the same record concurrently, one of the updates will be overwritten by the other. For this reason, many databases offer built-in collection structures that can be discretely updated: values can be added to, and removed from, collections without reading and rewriting the entire collection. Neither the client nor Cassandra itself needs to read the current state of the collection in order to update it, meaning collection updates are also blazingly efficient.

Relational joins
In real-world applications, different pieces of data relate to each other in a variety of ways. Relational databases allow us to perform queries that make these relationships explicit, for instance, to retrieve a set of events whose location is in the state of New York (this is assuming events and locations are different record types). Cassandra, however, is not a relational database, and does not support anything like joins. Instead, applications using Cassandra typically denormalize data and make clever use of clustering in order to perform the sorts of data access that would use a join in a relational database. For datasets that aren't already denormalized, applications can also perform client-side joins, which mimic the behavior of a relational database by performing multiple queries and joining the results at the application level. Client-side joins are less efficient than reading data that has been denormalized in advance, but offer more flexibility.

Summary
In this article, you explored the reasons to choose Cassandra from among the many databases available, and having determined that Cassandra is a great choice, you installed it on your development machine.
You had your first taste of the Cassandra Query Language when you issued your first few commands through the CQL shell in order to create a keyspace, table, and insert and read data. You're now poised to begin working with Cassandra in earnest. Resources for Article: Further resources on this subject: Setting up a Kubernetes Cluster [article] Dealing with Legacy Code [article] Evolving the data model [article]

Creating Fabric Policies

Packt
06 Apr 2017
9 min read
This article by Stuart Fordham, the author of the book Cisco ACI Cookbook, helps us to understand ACI and the APIC and explains how to create fabric policies. (For more resources related to this topic, see here.)

Understanding ACI and the APIC
ACI is for the data center. It is a fabric (which is just a fancy name for the layout of the components) that can span data centers using OTV or similar overlay technologies, but it is not for the WAN. We can implement a similar level of programmability on our WAN links through APIC-EM (Application Policy Infrastructure Controller Enterprise Module), which uses ISR or ASR series routers along with the APIC-EM virtual machine to control and program them. APIC and APIC-EM are very similar; just the object of their focus is different. APIC-EM is outside the scope of this book, as we will be looking at data center technologies.

The APIC is our frontend. Through this, we can create and manage our policies, manage the fabric, create tenants, and troubleshoot. Most importantly, the APIC is not associated with the data path. If we lose the APIC for any reason, the fabric will continue to forward the traffic.

To give you the technical elevator pitch, ACI uses a number of APIs (application programming interfaces) such as REST (Representational State Transfer) using languages such as JSON (JavaScript Object Notation) and XML (eXtensible Markup Language), as well as the CLI and the GUI to manage the fabric, and other protocols such as OpFlex to supply the policies to the network devices. The first set (those that manage the fabric) are referred to as northbound protocols. Northbound protocols allow lower-level network components to talk to higher-level ones. OpFlex is a southbound protocol. Southbound protocols (such as OpFlex and OpenFlow, which is another protocol you will hear about in relation to SDN) allow the controllers to push policies down to the nodes (the switches).

Figure 1

This is a very brief introduction to the how. Now let's look at the why. What does ACI give us that the traditional network does not? In a multi-tenant environment, we have defined goals. The primary purpose is that one tenant remains separate from another. We can achieve this in a number of ways. We could have each of the tenants in their own DMZ (demilitarized zone), with firewall policies to permit or restrict traffic as required. We could use VLANs to provide a logical separation between tenants. This approach has two drawbacks. It places a greater onus on the firewall to direct traffic, which is fine for northbound traffic (traffic leaving the data center) but is not suitable when the majority of the traffic is east-west bound (traffic between applications within the data center; see figure 2).

Figure 2

We could use switches to provide layer-3 routing and use access lists to control and restrict traffic; these are well designed for that purpose. Also, in using VLANs, we are restricted to a maximum of 4,096 potential tenants (due to the 12-bit VLAN ID). An alternative would be to use VRFs (virtual routing and forwarding). VRFs are self-contained routing tables, isolated from each other unless we instruct the router or switch to share the routes by exporting and importing route targets (RTs). This approach is much better for traffic isolation, but when we need to use shared services, such as an Internet pipe, VRFs can become much harder to keep secure. One way around this would be to use route leaking.
Instead of having a separate VRF for the Internet, this is kept in the global routing table and then leaked to both tenants. This maintains the security of the tenants, and as we are using VRFs instead of VLANs, we have a service that we can offer to more than 4,096 potential customers. However, we also have a much bigger administrative overhead. For each new tenant, we need more manual configuration, which increases our chances of human error. ACI allows us to mitigate all of these issues. By default, ACI tenants are completely separated from each other. To get them talking to each other, we need to create contracts, which specify what network resources they can and cannot see. There are no manual steps required to keep them separate from each other, and we can offer Internet access rapidly during the creation of the tenant. We also aren't bound by the 4,096 VLAN limit. Communication is through VXLAN, which raises the ceiling of potential segments (per fabric) to 16 million (by using a 24-bit segment ID). Figure 3 VXLAN is an overlay mechanism that encapsulates layer-2 frames within layer-4 UDP packets, also known as MAC-in-UDP (figure 3). Through this, we can achieve layer-2 communication across a layer-3 network. Apart from the fact that through VXLAN, tenants can be placed anywhere in the data center and that the number of endpoints far outnumbers the traditional VLAN approach, the biggest benefit of VXLAN is that we are no longer bound by the Spanning Tree Protocol. With STP, the redundant paths in the network are blocked (until needed). VXLAN, by contrast, uses layer-3 routing, which enables it to use equal-cost multipathing (ECMP) and link aggregation technologies to make use of all the available links, with recovery (in the event of a link failure) in the region of 125 microseconds. With VXLAN, we have endpoints, referred to as VXLAN Tunnel Endpoints (VTEPs), and these can be physical or virtual switchports. Head-End Replication (HER) is used to forward broadcast, unknown destination address, and multicast traffic, which is referred to (quite amusingly) as BUM traffic. This 16M limit with VXLAN is more theoretical, however. Truthfully speaking, we have a limit of around 1M entries in terms of MAC addresses, IPv4 addresses, and IPv6 addresses due to the size of the TCAM (ternary content-addressable memory). The TCAM is high-speed memory, used to speed up the reading of routing tables and performing matches against access control lists. The amount of available TCAM became a worry back in 2014 when the BGP routing table first exceeded 512 thousand routes, which was the maximum number supported by many of the Internet routers. The likelihood of having 1M entries within the fabric is also pretty rare, but even at 1M entries, ACI remains scalable in that the spine switches let the leaf switches know about only the routes and endpoints they need to know about. If you are lucky enough to be scaling at this kind of magnitude, however, it would be time to invest in more hardware and split the load onto separate fabrics. Still, a data center with thousands of physical hosts is very achievable. Creating fabric policies In this recipe, we will create an NTP policy and assign it to our pod. NTP is a good place to start, as having a common and synced time source is critical for third-party authentication, such as LDAP and logging. In this recipe, we will use the Quick Start menu to create an NTP policy, in which we will define our NTP servers. 
We will then create a POD policy and attach our NTP policy to it. Lastly, we'll create a POD profile, which calls the policy and applies it to our pod (our fabric). We can assign pods to different profiles, and we can share policies between policy groups, so we may have one NTP policy but different SNMP policies for different pods. The ACI fabric is very flexible in this respect.

How to do it...
1. From the Fabric menu, select Fabric Policies.
2. From the Quick Start menu, select Create an NTP Policy. A new window will pop up, and here we'll give our new policy a name and (optional) description and enable it. We can also define any authentication keys, if the servers use them.
3. Clicking on Next takes us to the next page, where we specify our NTP servers. Click on the plus sign on the right-hand side, and enter the IP address or Fully Qualified Domain Name (FQDN) of the NTP server(s). We can also select a management EPG, which is useful if the NTP servers are outside of our network. Then, click on OK.
4. Click on Finish.
5. We can now see our custom policy under Pod Policies. At the moment, though, we are not using it: clicking on Show Usage at the bottom of the screen shows that no nodes or policies are using the policy. To use the policy, we must assign it to a pod, as we can see from the Quick Start menu. Clicking on the arrow in the circle will show us a handy video on how to do this.
6. We need to go into the policy groups under Pod Policies and create a new policy. To create the policy, click on the Actions menu, and select Create Pod Policy Group. Name the new policy PoD-Policy.
7. From here, we can attach our NTP-POLICY to the PoD-Policy. To attach the policy, click on the drop-down next to Date Time Policy, and select NTP-POLICY from the list of options. We can see our new policy.
8. We have not finished yet, as we still need to create a POD profile and assign the policy to the profile. The process is similar to before: we go to Profiles (under the Pod Policies menu), select Actions, and then Create Pod Profile. We give it a name and associate our policy to it.

How it works...
Once we create a policy, we must associate it with a POD policy. The POD policy must then be associated with a POD profile. We can see the results here: our APIC is set to use the new profile, which will be pushed down to the spine and leaf nodes. We can also check the NTP status from the APIC CLI, using the command show ntp:

apic1# show ntp
 nodeid   remote           refid     st  t  when  poll  reach  delay  offset  jitter
 -------- ---------------- --------- --- -- ----- ----- ------ ------ ------- -------
 1        216.239.35.4     .INIT.    16  u  -     16    0      0.000  0.000   0.000
apic1#

Summary
In this article, we summarized ACI and the APIC, and learned how to create fabric policies.

Resources for Article: Further resources on this subject: The Fabric library – the deployment and development task manager [article] Web Services and Forms [article] The Alfresco Platform [article]

Conditional Statements, Functions, and Lists

Packt
05 Apr 2017
14 min read
In this article by Sai Yamanoor and Srihari Yamanoor, the authors of the book Python Programming with Raspberry Pi Zero, you will learn about conditional statements and how to make use of logical operators to check conditions using conditional statements. Next, you will learn to write simple functions in Python and discuss interfacing inputs to the Raspberry Pi's GPIO header using a tactile switch (momentary push button). We will also discuss motor control (this is a run-up to the final project) using the Raspberry Pi Zero and control the motors using the switch inputs. Let's get to it!

In this article, we will discuss the following topics:
Conditional statements in Python
Using conditional inputs to take actions based on GPIO pin states
Breaking out of loops using conditional statements
Functions in Python
GPIO callback functions
Motor control in Python
(For more resources related to this topic, see here.)

Conditional statements
In Python, conditional statements are used to determine if a specific condition is met by testing whether a condition is True or False. Conditional statements are used to determine how a program is executed. For example, conditional statements could be used to determine whether it is time to turn on the lights. The syntax is as follows:

if condition_is_true:
    do_something()

The condition is usually tested using a logical operator, and the set of tasks under the indented block is executed. Let's consider the example check_address_if_statement.py, where the user input to a program needs to be verified using a yes or no question:

check_address = input("Is your address correct(yes/no)? ")
if check_address == "yes":
    print("Thanks. Your address has been saved")
if check_address == "no":
    del(address)
    print("Your address has been deleted. Try again")
    check_address = input("Is your address correct(yes/no)? ")

In this example, the program expects a yes or no input. If the user provides the input yes, the condition if check_address == "yes" is true, and the message Your address has been saved is printed on the screen. Likewise, if the user input is no, the program executes the indented code block under the logical test condition if check_address == "no" and deletes the variable address.

An if-else statement
In the preceding example, we used an if statement to test each condition. In Python, there is an alternative option named the if-else statement. The if-else statement enables testing an alternative condition if the main condition is not true:

check_address = input("Is your address correct(yes/no)? ")
if check_address == "yes":
    print("Thanks. Your address has been saved")
else:
    del(address)
    print("Your address has been deleted. Try again")

In this example, if the user input is yes, the indented code block under if is executed. Otherwise, the code block under else is executed.

An if-elif-else statement
In the preceding example, the program executes any piece of code under the else block for any user input other than yes, that is, if the user pressed the return key without providing any input or provided random characters instead of no. The if-elif-else statement works as follows:

check_address = input("Is your address correct(yes/no)? ")
if check_address == "yes":
    print("Thanks. Your address has been saved")
elif check_address == "no":
    del(address)
    print("Your address has been deleted. Try again")
else:
    print("Invalid input. Try again")

If the user input is yes, the indented code block under the if statement is executed.
If the user input is no, the indented code block under elif (else-if) is executed. If the user input is something else, the program prints the message Invalid input. Try again.

It is important to note that the code block indentation determines the block of code that needs to be executed when a specific condition is met. We recommend modifying the indentation of the conditional statement block and finding out what happens to the program execution. This will help you understand the importance of indentation in Python. In the three examples that we discussed so far, it could be noted that an if statement does not need to be complemented by an else statement. The else and elif statements need to have a preceding if statement, or the program execution would result in an error.

Breaking out of loops
Conditional statements can be used to break out of a loop execution (for loop and while loop). When a specific condition is met, an if statement can be used to break out of a loop:

i = 0
while True:
    print("The value of i is ", i)
    i += 1
    if i > 100:
        break

In the preceding example, the while loop is executed as an infinite loop. The value of i is incremented and printed on the screen. The program breaks out of the while loop when the value of i is greater than 100, so the value of i is printed from 0 to 100.

The applications of conditional statements: executing tasks using GPIO
Let's discuss an example where a simple push button is pressed. A button press is detected by reading the GPIO pin state. We are going to make use of conditional statements to execute a task based on the GPIO pin state. Let us connect a button to the Raspberry Pi's GPIO. All you need to get started are a button, a pull-up resistor, and a few jumper wires. The figure given later shows an illustration of connecting the push button to the Raspberry Pi Zero. One of the push button's terminals is connected to the ground pin of the Raspberry Pi Zero's GPIO header. The schematic of the button's interface is shown here:

Raspberry Pi GPIO schematic

The other terminal of the push button is pulled up to 3.3V using a 10K resistor. The junction of the push button terminal and the 10K resistor is connected to GPIO pin 2.

Interfacing the push button to the Raspberry Pi Zero's GPIO—an image generated using Fritzing

Let's review the code required to read the button state. We make use of loops and conditional statements to read the button inputs using the Raspberry Pi Zero. We will be making use of the gpiozero library. For now, let’s briefly discuss the concept of classes for this example. A class in Python is a blueprint that contains all the attributes that define an object. For example, the Button class of the gpiozero library contains all the attributes required to interface a button to the Raspberry Pi Zero’s GPIO interface. These attributes include button states, functions required to check the button states, and so on. In order to interface a button and read its states, we need to make use of this blueprint. The process of creating a copy of this blueprint is called instantiation. Let's get started with importing the gpiozero library and instantiating the Button class of the gpiozero library. The button is interfaced to GPIO pin 2.
We need to pass the pin number as an argument during instantiation:

from gpiozero import Button

# button is interfaced to GPIO 2
button = Button(2)

The gpiozero library's documentation is available at http://gpiozero.readthedocs.io/en/v1.2.0/api_input.html. According to the documentation, there is a variable named is_pressed in the Button class that could be tested using a conditional statement to determine if the button is pressed:

if button.is_pressed:
    print("Button pressed")

Whenever the button is pressed, the message Button pressed is printed on the screen. Let's stick this code snippet inside an infinite loop:

from gpiozero import Button

# button is interfaced to GPIO 2
button = Button(2)

while True:
    if button.is_pressed:
        print("Button pressed")

In an infinite while loop, the program constantly checks for a button press and prints the message as long as the button is being pressed. Once the button is released, it goes back to checking whether the button is pressed.

Breaking out of a loop by counting button presses
Let's review another example where we would like to count the number of button presses and break out of the infinite loop when the button has received a predetermined number of presses:

i = 0
while True:
    if button.is_pressed:
        button.wait_for_release()
        i += 1
        print("Button pressed")
    if i >= 10:
        break

In this example, the program checks the state of the is_pressed variable. On receiving a button press, the program can be paused until the button is released using the method wait_for_release. When the button is released, the variable used to store the number of presses is incremented by 1. The program breaks out of the infinite loop when the button has received 10 presses.

A red momentary push button interfaced to Raspberry Pi Zero GPIO pin 2

Functions in Python
We briefly discussed functions in Python. Functions execute a predefined set of tasks. print is one example of a function in Python. It enables printing something to the screen. Let's discuss writing our own functions in Python. A function can be declared in Python using the def keyword. A function could be defined as follows:

def my_func():
    print("This is a simple function")

In this function, my_func, the print statement is written under an indented code block. Any block of code that is indented under the function definition is executed when the function is called during the code execution. The function could be executed as my_func().

Passing arguments to a function:
A function is always defined with parentheses. The parentheses are used to pass any requisite arguments to a function. Arguments are parameters required to execute a function. In the earlier example, there are no arguments passed to the function. Let's review an example where we pass an argument to a function:

def add_function(a, b):
    c = a + b
    print("The sum of a and b is ", c)

In this example, a and b are arguments to the function. The function adds a and b and prints the sum on the screen. The function add_function is called by passing the arguments 3 and 2 as add_function(3, 2), where a=3 and b=2, respectively. Hence, the arguments a and b are required to execute the function; calling the function without the arguments would result in an error. Errors related to missing arguments could be avoided by setting default values for the arguments:

def add_function(a=0, b=0):
    c = a + b
    print("The sum of a and b is ", c)

The preceding function expects two arguments. If we pass only one argument to this function, the other defaults to zero.
For example, with add_function(a=3), b defaults to 0, and with add_function(b=2), a defaults to 0. When an argument is not furnished while calling a function, it defaults to zero (as declared in the function). Similarly, the print function prints any variable passed as an argument. If the print function is called without any arguments, a blank line is printed.

Returning values from a function
Functions can perform a set of defined operations and finally return a value at the end. Let's consider the following example:

def square(a):
    return a**2

In this example, the function returns the square of the argument. In Python, the return keyword is used to return a value requested upon completion of execution.

The scope of variables in a function
There are two types of variables in a Python program: local and global variables. Local variables are local to a function, that is, a variable declared within a function is accessible only within that function. The example is as follows:

def add_function():
    a = 3
    b = 2
    c = a + b
    print("The sum of a and b is ", c)

In this example, the variables a and b are local to the function add_function. Let's consider an example of a global variable:

a = 3
b = 2
def add_function():
    c = a + b
    print("The sum of a and b is ", c)

add_function()

In this case, the variables a and b are declared in the main body of the Python script. They are accessible across the entire program. Now, let's consider this example:

a = 3
def my_function():
    a = 5
    print("The value of a is ", a)

my_function()
print("The value of a is ", a)

In this case, when my_function is called, the value of a is 5, while the value of a is 3 in the print statement of the main body of the script. In Python, it is not possible to explicitly modify the value of global variables inside functions. In order to modify the value of a global variable, we need to make use of the global keyword:

a = 3
def my_function():
    global a
    a = 5
    print("The value of a is ", a)

my_function()
print("The value of a is ", a)

In general, it is not recommended to modify variables inside functions, as it is not a very safe practice. The best practice would be passing variables as arguments and returning the modified value. Consider the following example:

a = 3
def my_function(a):
    a = 5
    print("The value of a is ", a)
    return a

a = my_function(a)
print("The value of a is ", a)

In the preceding program, the value of a is 3. It is passed as an argument to my_function. The function returns 5, which is saved to a. We were able to safely modify the value of a.

GPIO callback functions
Let's review some uses of functions with the GPIO example. Functions can be used to handle specific events related to the GPIO pins of the Raspberry Pi. For example, the gpiozero library provides the capability of calling a function either when a button is pressed or released:

from gpiozero import Button

def button_pressed():
    print("button pressed")

def button_released():
    print("button released")

# button is interfaced to GPIO 2
button = Button(2)
button.when_pressed = button_pressed
button.when_released = button_released

while True:
    pass

In this example, we make use of the attributes when_pressed and when_released of the Button class. When the button is pressed, the function button_pressed is executed. Likewise, when the button is released, the function button_released is executed. We make use of the while loop to avoid exiting the program and keep listening for button events.
The pass keyword is used to avoid an error, and nothing happens when the pass keyword is executed.

This capability of being able to execute different functions for different events is useful in applications like home automation. For example, it could be used to turn on lights when it is dark and vice versa.

DC motor control in Python
In this section, we will discuss motor control using the Raspberry Pi Zero. Why discuss motor control? In order to control a motor, we need an H-bridge motor driver (discussing H-bridges is beyond our scope; there are several resources for H-bridge motor drivers at http://www.mcmanis.com/chuck/robotics/tutorial/h-bridge/). There are several motor driver kits designed for the Raspberry Pi. In this section, we will make use of the following kit: https://www.pololu.com/product/2753. The Pololu product page also provides instructions on how to connect the motor. Let's get to writing some Python code to operate the motor:

from gpiozero import Motor
from gpiozero import OutputDevice
import time

motor_1_direction = OutputDevice(13)
motor_2_direction = OutputDevice(12)
motor = Motor(5, 6)

motor_1_direction.on()
motor_2_direction.on()
motor.forward()
time.sleep(10)
motor.stop()
motor_1_direction.off()
motor_2_direction.off()

Raspberry Pi based motor control

In order to control the motor, let's declare the motor's speed pins and direction pins. As per the motor driver's documentation, the motors are controlled by GPIO pins 12, 13 and 5, 6, respectively:

from gpiozero import Motor
from gpiozero import OutputDevice
import time

motor_1_direction = OutputDevice(13)
motor_2_direction = OutputDevice(12)
motor = Motor(5, 6)

Controlling the motor is as simple as turning on the motor using the on() method and moving the motor in the forward direction using the forward() method:

motor.forward()

Similarly, reversing the motor direction could be done by calling the method reverse(). Stopping the motor could be done by:

motor.stop()

Some mini-project challenges for the reader: In this article, we discussed interfacing inputs for the Raspberry Pi and controlling motors. Think about a project where a mobile robot reads inputs from whisker switches and uses them to drive around. Is it possible to build a wall-following robot by combining the limit switches and motors? We discussed controlling a DC motor in this article. How do we control a stepper motor using a Raspberry Pi? How can we interface a motion sensor to control the lights at home using a Raspberry Pi Zero?

Summary
In this article, we discussed conditional statements and the applications of conditional statements in Python. We also discussed functions in Python, passing arguments to a function, returning values from a function, and the scope of variables in a Python program. We discussed callback functions and motor control in Python.

Resources for Article: Further resources on this subject: Sending Notifications using Raspberry Pi Zero [article] Raspberry Pi Gaming Operating Systems [article] Raspberry Pi LED Blueprints [article]

Using React Router for Client Side Redirecting

Antonio Cucciniello
05 Apr 2017
6 min read
If you are using React in the front end of your web application and you would like to use React Router in order to handle routing, then you have come to the right place. Today, we will learn how to have your client side redirect to another page after you have completed some processing. Let's get started! Installs First we will need to make sure we have a couple of things installed. The first thing here is to make sure you have Node and NPM installed. In order to make this as simple as possible, we are going to use create-react-app to get React fully working in our application. Install this by doing: $ npm install -g create-react-app Now create a directory for this example and enter it; here we will call it client-redirect. $ mkdir client-redirect $ cd client-redirect Once in that directory, initialize create-react-app in the repo. $ create-react-app client Once it is done, test that it is working by running: $ npm start You should see something like this: The last thing you must install is react-router: $ npm install --save react-router Now that you are all set up, let's start with some code. Code First we will need to edit the main JavaScript file, in this case, located at client-redirect/client/src/App.js. App.js App.js is where we handle all the routes for our application and acts as the main js file. Here is what the code looks like for App.js: import React, { Component } from 'react' import { Router, Route, browserHistory } from 'react-router' import './App.css' import Result from './modules/result' import Calculate from './modules/calculate' class App extends Component { render () { return ( <Router history={browserHistory}> <Route path='/' component={Calculate} /> <Route path='/result' component={Result} /> </Router> ) } } export default App If you are familiar with React, you should not be too confused as to what is going on. At the top, we are importing all of the files and modules we need. We are importing React, react-router, our App.css file for styles, and result.js and calculate.js files (do not worry we will show the implementation for these shortly). The next part is where we do something different. We are using react-router to set up our routes. We chose to use a history of type browserHistory. History in react-router listens to the browser's address bar for anything changing and parses the URL from the browser and stores it in a location object so the router can match the specified route and render the different components for that path. We then use <Route> tags in order to specify what path we would like a component to be rendered on. In this case, we are using '/' path for the components in calculate.js and '/result' path for the components in result.js. Let's define what those pages will look like and see how the client can redirect using browserHistory. Calculate.js This page is a basic page with two text boxes and a button. Each text box should receive a number and when the button is clicked, we are going to calculate the sum of the two numbers given. 
Here is what that looks like: import React, { Component } from 'react' import { browserHistory } from 'react-router' export default class Calculate extends Component { render () { return ( <div className={'Calculate-page'} > <InputBox type='text' name='first number' id='firstNum' /> <InputBox type='text' name='second number' id='secondNum' /> <CalculateButton type='button' value='Calculate' name='Calculate' onClick='result()' /> </div> ) } } var InputBox = React.createClass({ render: function () { return <div className={'input-field'}> <input type={this.props.type} value={this.props.value} name={this.props.name} id={this.props.id} /> </div> } }) var CalculateButton = React.createClass({ result: function () { var firstNum = document.getElementById('firstNum').value var secondNum = document.getElementById('secondNum').value var sum = Number(firstNum) + Number(secondNum) if (sum !== undefined) { const path = '/result' browserHistory.push(path) } window.sessionStorage.setItem('sum', sum) return console.log(sum) }, render: function () { return <div className={'calculate-button'}> <button type={this.props.type} value={this.props.value} name={this.props.name} onClick={this.result} > Calculate </button> </div> } }) The important part we want to focus on is inside the result function of the CalculateButton class. We take the two numbers and sum them. Once we have the sum, we create a path variable to hold the route we would like to go to next. Then browserHistory.push(path) redirects the client to a new path of localhost:3000/result. We then store the sum in sessionStorage in order to retrieve it on the next page. result.js This is simply a page that will display your result from the calculation, but it serves as the page you redirected to with react-router. Here is the code: import React, { Component } from 'react' export default class Result extends Component { render () { return ( <div className={'result-page'} > Result : <DisplayNumber id='result' /> </div> ) } } var DisplayNumber = React.createClass({ componentDidMount () { document.getElementById('result').innerHTML = window.sessionStorage.getItem('sum') }, render: function () { return ( <div className={'display-number'}> <p id={this.props.id} /> </div> ) } }) We simply create a class that wraps a paragraph tag. It also has a componentDidMount() function, which allows us to access the sessionStorage for the sum once the component output has been rendered by the DOM. We update the innerHTML of the paragraph element with the sum's value. Test Let's get back to the client directory of our application. Once we are there, we can run: $ npm start This should open a tab in your web browser at localhost:3000. This is what it should look like when you add numbers in the text boxes (here I added 17 and 17). This should redirect you to another page: Conclusion There you go! You now have a web app that utilizes client-side redirecting! To summarize what we did, here is a quick list of what happened: Installed the prerequisites: node, npm Installed create-react-app Created a create-react-app Installed react-router Added our routing to our App.js Created a module that calculated the sum of two numbers and redirected to a new page Displayed the number on the new redirected page Check out the code for this tutorial on GitHub. Possible Resources Check out my GitHub View my personal blog GitHub pages for: react-router create-react-app About the author Antonio Cucciniello is a software engineer with a background in C, C++, and JavaScript (Node.js). 
He is from New Jersey, USA. His most recent project called Edit Docs is an Amazon Echo skill that allows users to edit Google Drive files using their voice. He loves building cool things with software, reading books on self-help and improvement, finance, and entrepreneurship. To contact Antonio, e-mail him at Antonio.cucciniello16@gmail.com, follow him on twitter at @antocucciniello, and follow him on GitHub here: https://github.com/acucciniello.

Layout Management for Python GUI

Packt
05 Apr 2017
13 min read
In this article written by Burkhard A. Meier, the author of the book Python GUI Programming Cookbook - Second Edition, we will lay out our GUI using Python 3.6 and above:

Arranging several labels within a label frame widget
Using padding to add space around widgets
How widgets dynamically expand the GUI
Aligning the GUI widgets by embedding frames within frames

(For more resources related to this topic, see here.)

In this article, we will explore how to arrange widgets within widgets to create our Python GUI. Learning the fundamentals of GUI layout design will enable us to create great-looking GUIs. There are certain techniques that will help us to achieve this layout design. The grid layout manager is one of the most important layout tools built into tkinter that we will be using. We can very easily create menu bars, tabbed controls (aka Notebooks), and many more widgets using tkinter.

Arranging several labels within a label frame widget
The LabelFrame widget allows us to design our GUI in an organized fashion. We are still using the grid layout manager as our main layout design tool, yet by using LabelFrame widgets we get much more control over our GUI design.

Getting ready
We are starting to add more and more widgets to our GUI, and we will make the GUI fully functional in the coming recipes. Here we are starting to use the LabelFrame widget. Add the following code just above the main event loop towards the bottom of the Python module. Running the code will result in the GUI looking like this:

Uncomment line 111 and notice the different alignment of the LabelFrame. We can easily align the labels vertically by changing our code, as shown next. Note that the only change we had to make was in the column and row numbering. Now the GUI LabelFrame looks as such:

How it works...
In line 109 we create our first ttk LabelFrame widget and assign the resulting instance to the variable buttons_frame. The parent container is win, our main window. In lines 114-116, we create labels and place them in the LabelFrame. buttons_frame is the parent of the labels. We are using the important grid layout tool to arrange the labels within the LabelFrame. The column and row properties of this layout manager give us the power to control our GUI layout.

The parent of our labels is the buttons_frame instance variable of the LabelFrame, not the win instance variable of the main window. We can see the beginning of a layout hierarchy here. We can see how easy it is to change our layout via the column and row properties. Note how we change the column to 0, and how we layer our labels vertically by numbering the row values sequentially.

The name ttk stands for themed tk. The tk-themed widget set was introduced in Tk 8.5.

There's more...
In a recipe later in this article we will embed LabelFrame widgets within LabelFrame widgets, nesting them to control our GUI layout.

Using padding to add space around widgets
Our GUI is being created nicely. Next, we will improve the visual aspects of our widgets by adding a little space around them, so they can breathe.

Getting ready
While tkinter might have had a reputation for creating ugly GUIs, this has dramatically changed since version 8.5. You just have to know how to use the tools and techniques that are available. That's what we will do next. tkinter version 8.6 ships with Python 3.6.
Our LabelFrame looks a bit tight as it blends into the main window towards the bottom. Let's fix this now. Modify line 110 by adding padx and pady: And now our LabelFrame got some breathing space: How it works... In tkinter, adding space horizontally and vertically is done by using the built-in properties named padx and pady. These can be used to add space around many widgets, improving horizontal and vertical alignments, respectively. We hard-coded 20 pixels of space to the left and right of the LabelFrame, and we added 40 pixels to the top and bottom of the frame. Now our LabelFrame stands out more than it did before. The screenshot above only shows the relevant change. We can use a loop to add space around the labels contained within the LabelFrame: Now the labels within the LabelFrame widget have some space around them too: The grid_configure() function enables us to modify the UI elements before the main loop displays them. So, instead of hard-coding values when we first create a widget, we can work on our layout and then arrange spacing towards the end of our file, just before the GUI is being created. This is a neat technique to know. The winfo_children() function returns a list of all the children belonging to the buttons_frame variable. This enables us to loop through them and assign the padding to each label. One thing to notice is that the spacing to the right of the labels is not really visible. This is because the title of the LabelFrame is longer than the names of the labels. We can experiment with this by making the names of the labels longer. Now our GUI looks like this. Note how there is now some space added to the right of the long label next to the dots. The last dot does not touch the LabelFrame which it otherwise would without the added space. We can also remove the name of the LabelFrame to see the effect padx has on positioning our labels. By setting the text property to an empty string, we remove the name that was previously displayed for the LabelFrame. How widgets dynamically expand the GUI You probably noticed in previous screenshots and by running the code that widgets have a capability to extend themselves to the space they need to visually display their text. Java introduced the concept of dynamic GUI layout management. In comparison, visual development IDEs like VS.NET lay out the GUI in a visual manner, and are basically hard-coding the x and y coordinates of UI elements. Using tkinter, this dynamic capability creates both an advantage and a little bit of a challenge, because sometimes our GUI dynamically expands when we would prefer it rather not to be so dynamic! Well, we are dynamic Python programmers so we can figure out how to make the best use of this fantastic behavior! Getting ready At the beginning of the previous recipe we added a LabelFrame widget. This moved some of our controls to the center of column 0. We might not wish this modification to our GUI layout. Next we will explore some ways to solve this. How to do it... Let us first become aware of the subtle details that are going on in our GUI layout in order to understand it better. We are using the grid layout manager widget and it lays out our widgets in a zero-based grid. This is very similar to an Excel spreadsheet or a database table. 
Grid Layout Manager Example with 2 Rows and 3 Columns:

Row 0; Col 0 | Row 0; Col 1 | Row 0; Col 2
Row 1; Col 0 | Row 1; Col 1 | Row 1; Col 2

Using the grid layout manager, what is happening is that the width of any given column is determined by the longest name or widget in that column. This affects all rows. By adding our LabelFrame widget and giving it a title that is longer than some hard-coded size widget like the top-left label and the text entry below it, we dynamically move those widgets to the center of column 0, adding space to the left and right sides of those widgets. Incidentally, because we used the sticky property for the Checkbutton and ScrolledText widgets, those remain attached to the left side of the frame.

Let's look in more detail at the screenshot from the first recipe of this article. We added the following code to create the LabelFrame and then placed labels into this frame. Since the text property of the LabelFrame, which is displayed as the title of the LabelFrame, is longer than both our Enter a name: label and the text box entry below it, those two widgets are dynamically centered within the new width of column 0. The Checkbutton and Radiobutton widgets in column 0 did not get centered because we used the sticky=tk.W property when we created those widgets. For the ScrolledText widget we used sticky=tk.WE, which binds the widget to both the west (aka left) and east (aka right) side of the frame.

Let's remove the sticky property from the ScrolledText widget and observe the effect this change has. Now our GUI has new space around the ScrolledText widget both on the left and right sides. Because we used the columnspan=3 property, our ScrolledText widget still spans all three columns. If we remove columnspan=3 we get the following GUI which is not what we want. Now our ScrolledText only occupies column 0 and, because of its size, it stretches the layout. One way to get our layout back to where we were before adding the LabelFrame is to adjust the grid column position. Change the column value from 0 to 1. Now our GUI looks like this:

How it works...
Because we are still using individual widgets, our layout can get messed up. By moving the column value of the LabelFrame from 0 to 1, we were able to get the controls back to where they used to be and where we prefer them to be. At least the left-most label, text, Checkbutton, ScrolledText, and Radiobutton widgets are now located where we intended them to be. The second label and text Entry located in column 1 aligned themselves to the center of the length of the Labels in a Frame widget so we basically moved our alignment challenge one column to the right. It is not so visible because the size of the Choose a number: label is almost the same as the size of the Labels in a Frame title, and so the column width was already close to the new width generated by the LabelFrame.

There's more...
In the next recipe we will embed frames within frames to avoid the accidental misalignment of widgets we just experienced in this recipe.

Aligning the GUI widgets by embedding frames within frames
We have much better control of our GUI layout if we embed frames within frames. This is what we will do in this recipe.

Getting ready
The dynamic behavior of Python and its GUI modules can create a little bit of a challenge to really get our GUI looking the way we want. Here we will embed frames within frames to get more control of our layout.
This will establish a stronger hierarchy among the different UI elements, making the visual appearance easier to achieve. We will continue to use the GUI we created in the previous recipe.

How to do it...
Here, we will create a top-level frame that will contain other frames and widgets. This will help us to get our GUI layout just the way we want. In order to do so, we will have to embed our current controls within a central ttk.LabelFrame. This ttk.LabelFrame is a child of the main parent window, and all controls will be children of this ttk.LabelFrame. Up to this point in our recipes we have assigned all widgets to our main GUI frame directly. Now we will only assign our LabelFrame to our main window, and after that we will make this LabelFrame the parent container for all the widgets. This creates the following hierarchy in our GUI layout:

In this diagram, win is the variable that holds a reference to our main GUI tkinter window frame; mighty is the variable that holds a reference to our LabelFrame and is a child of the main window frame (win); and Label and all other widgets are now placed into the LabelFrame container (mighty).

Add the following code towards the top of our Python module: Next we will modify all the following controls to use mighty as the parent, replacing win. Here is an example of how to do this: Note how all the widgets are now contained in the Mighty Python LabelFrame, which surrounds all of them with a barely visible thin line. Next, we can reset the Labels in a Frame widget to the left without messing up our GUI layout: Oops - maybe not. While our frame within another frame aligned nicely to the left, it again pushed our top widgets into the center (a default). In order to align them to the left, we have to force our GUI layout by using the sticky property. By assigning it 'W' (West), we can control the widget to be left-aligned.

How it works...
Note how we aligned the label, but not the text box below it. We have to use the sticky property for all the controls we want to left-align. We can do that in a loop, using the winfo_children() and grid_configure(sticky='W') properties, as we did before in the second recipe of this article. The winfo_children() function returns a list of all the children belonging to the parent. This enables us to loop through all of the widgets and change their properties. When using tkinter to force widgets to the left, right, top, or bottom, the naming is very similar to Java: west, east, north, and south, abbreviated to 'W' and so on. We can also use the following syntax: tk.W instead of 'W'. This requires having imported the tkinter module aliased as tk.

In a previous recipe we combined both 'W' and 'E' to make our ScrolledText widget attach itself both to the left and right sides of its container using 'WE'. We can add more combinations: 'NSE' will stretch our widget to the top, bottom, and right side. If we have only one widget in our form, for example a button, we can make it fill the entire frame by using all options: 'NSWE'. We can also use tuple syntax: sticky=(tk.N, tk.S, tk.W, tk.E). Let's align the entry in column 0 to the left. Now both the label and the Entry are aligned towards the West (left).

In order to separate the influence that the length of our Labels in a Frame LabelFrame has on the rest of our GUI layout, we must not place this LabelFrame into the same LabelFrame as the other widgets, but assign it directly to the main GUI form (win).
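Since the code listings for this article appear in the accompanying screenshots, here is a minimal, self-contained sketch of the frames-within-frames idea described above. The variable names (win, mighty, buttons_frame) follow the text, but the exact widgets, rows, and columns are assumptions for illustration rather than the book's exact listing:

# Minimal sketch of the nested-frame layout; widget set and grid positions are illustrative.
import tkinter as tk
from tkinter import ttk

win = tk.Tk()
win.title('Python GUI')

# Top-level LabelFrame assigned to the main window; it becomes the parent of everything else
mighty = ttk.LabelFrame(win, text=' Mighty Python ')
mighty.grid(column=0, row=0, padx=8, pady=4)

ttk.Label(mighty, text='Enter a name:').grid(column=0, row=0, sticky='W')
name = tk.StringVar()
ttk.Entry(mighty, width=12, textvariable=name).grid(column=0, row=1, sticky='W')

# A second LabelFrame embedded inside mighty, holding the vertically stacked labels
buttons_frame = ttk.LabelFrame(mighty, text=' Labels in a Frame ')
buttons_frame.grid(column=0, row=7, sticky='W')
for row, text in enumerate(['Label1', 'Label2', 'Label3']):
    ttk.Label(buttons_frame, text=text).grid(column=0, row=row)

# Loop over the nested frame's children to add breathing space around each label
for child in buttons_frame.winfo_children():
    child.grid_configure(padx=8, pady=4)

win.mainloop()

Running this shows the same effect the recipes describe: the inner LabelFrame only influences its own children, and the sticky='W' plus grid_configure() loop keeps everything left-aligned with a little padding.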
Summary

We have learned layout management for Python GUIs using the following recipes: arranging several labels within a label frame widget, using padding to add space around widgets, letting widgets dynamically expand the GUI, and aligning the GUI widgets by embedding frames within frames.

Resources for Article: Further resources on this subject: Python Scripting Essentials [article] An Introduction to Python Lists and Dictionaries [article] Test all the things with Python [article]

Getting started with C++ Features

Packt
05 Apr 2017
7 min read
In this article by Jacek Galowicz author of the book C++ STL Cookbook, we will learn new C++ features and how to use structured bindings to return multiple values at once. (For more resources related to this topic, see here.) Introduction C++ got a lot of additions in C++11, C++14, and most recently C++17. By now, it is a completely different language than it was just a decade ago. The C++ standard does not only standardize the language, as it needs to be understood by the compilers, but also the C++ standard template library (STL). We will see how to access individual members of pairs, tuples, and structures comfortably with structured bindings, and how to limit variable scopes with the new if and switch variable initialization capabilities. The syntactical ambiguities, which were introduced by C++11 with the new bracket initialization syntax, which looks the same for initializer lists, were fixed by new bracket initializer rules. The exact type of template class instances can now be deduced from the actual constructor arguments, and if different specializations of a template class shall result in completely different code, this is now easily expressible with constexpr-if. The handling of variadic parameter packs in template functions became much easier in many cases with the new fold expressions. At last, it became more comfortable to define static globally accessible objects in header-only libraries with the new ability to declare inline variables, which was only possible for functions before. Using structured bindings to return multiple values at once C++17 comes with a new feature which combines syntactic sugar and automatic type deduction: Structured bindings. These help assigning values from pairs, tuples, and structs into individual variables. How to do it... Applying a structured binding in order to assign multiple variables from one bundled structure is always one step: Accessing std::pair: Imagine we have a mathematical function divide_remainder, which accepts a dividend and a divisor parameter, and returns the fraction of both as well as the remainder. It returns those values using an std::pair bundle: std::pair<int, int> divide_remainder(int dividend, int divisor); Instead of accessing the individual values of the resulting pair like this: const auto result (divide_remainder(16, 3)); std::cout << "16 / 3 is " << result.first << " with a remainder of " << result.second << "n"; We can now assign them to individual variables with expressive names, which is much better to read: auto [fraction, remainder] = divide_remainder(16, 3); std::cout << "16 / 3 is " << fraction << " with a remainder of " << remainder << "n"; Structured bindings also work with std::tuple: Let's take the following example function, which gets us online stock information: std::tuple<std::string, std::chrono::time_point, double> stock_info(const std::string &name); Assigning its result to individual variables looks just like in the example before: const auto [name, valid_time, price] = stock_info("INTC"); Structured bindings also work with custom structures: Let's assume a structure like the following: struct employee { unsigned id; std::string name; std::string role; unsigned salary; }; Now we can access these members using structured bindings. 
We will even do that in a loop, assuming we have a whole vector of those: int main() { std::vector<employee> employees {/* Initialized from somewhere */}; for (const auto &[id, name, role, salary] : employees) { std::cout << "Name: " << name << "Role: " << role << "Salary: " << salary << "n"; } } How it works... Structured bindings are always applied with the same pattern: auto [var1, var2, ...] = <pair, tuple, struct, or array expression>; The list of variables var1, var2, ... must exactly match the number of variables which are contained by the expression being assigned from. The <pair, tuple, struct, or array expression> must be one of the following: An std::pair. An std::tuple. A struct. All members must be non-static and be defined in the same base class. An array of fixed size. The type can be auto, const auto, const auto& and even auto&&. Not only for the sake of performance, always make sure to minimize needless copies by using references when appropriate. If we write too many or not enough variables between the square brackets, the compiler will error out, telling us about our mistake: std::tuple<int, float, long> tup {1, 2.0, 3}; auto [a, b] = tup; This example obviously tries to stuff a tuple variable with three members into only two variables. The compiler immediately chokes on this and tells us about our mistake: error: type 'std::tuple<int, float, long>' decomposes into 3 elements, but only 2 names were provided auto [a, b] = tup; There's more... A lot of fundamental data structures from the STL are immediately accessible using structured bindings without us having to change anything. Consider for example a loop, which prints all items of an std::map: std::map<std::string, size_t> animal_population { {"humans", 7000000000}, {"chickens", 17863376000}, {"camels", 24246291}, {"sheep", 1086881528}, /* … */ }; for (const auto &[species, count] : animal_population) { std::cout << "There are " << count << " " << species << " on this planet.n"; } This particular example works, because when we iterate over a std::map container, we get std::pair<key_type, value_type> items on every iteration step. And exactly those are unpacked using the structured bindings feature (Assuming that the species string is the key, and the population count the value being associated with the key), in order to access them individually in the loop body. Before C++17, it was possible to achieve a similar effect using std::tie: int remainder; std::tie(std::ignore, remainder) = divide_remainder(16, 5); std::cout << "16 % 5 is " << remainder << "n"; This example shows how to unpack the result pair into two variables. std::tie is less powerful than structured bindings in the sense that we have to define all variables we want to bind to before. On the other hand, this example shows a strength of std::tie which structured bindings do not have: The value std::ignore acts as a dummy variable. The fraction part of the result is assigned to it, which leads to that value being dropped because we do not need it in that example. 
Back in the past, the divide_remainder function would have been implemented the following way, using output parameters: bool divide_remainder(int dividend, int divisor, int &fraction, int &remainder); Accessing it would have looked like the following: int fraction, remainder; const bool success {divide_remainder(16, 3, fraction, remainder)}; if (success) { std::cout << "16 / 3 is " << fraction << " with a remainder of " << remainder << "n"; } A lot of people will still prefer this over returning complex structures like pairs, tuples, and structs, arguing that this way the code would be faster, due to avoided intermediate copies of those values. This is not true any longer for modern compilers, which optimize intermediate copies away. Apart from the missing language features in C, returning complex structures via return value was considered slow for a long time, because the object had to be initialized in the returning function, and then copied into the variable which shall contain the return value on the caller side. Modern compilers support return value optimization (RVO), which enables for omitting intermediate copies. Summary Thus we successfully studied how to use structured bindings to return multiple values at once in C++ 17 using code examples. Resources for Article: Further resources on this subject: Creating an F# Project [article] Hello, C#! Welcome, .NET Core! [article] Exploring Structure from Motion Using OpenCV [article]


Understanding the Dependencies of a C++ Application

Packt
05 Apr 2017
9 min read
This article by Richard Grimes, author of the book, Beginning C++ Programming explains the dependencies of a C++ application. A C++ project will produce an executable or library, and this will be built by the linker from object files. The executable or library is dependent upon these object files. An object file will be compiled from a C++ source file (and potentially one or more header files). The object file is dependent upon these C++ source and header files. Understanding dependencies is important because it helps you understand the order to compile the files in your project, and it allows you to make your project builds quicker by only compiling those files that have changed. (For more resources related to this topic, see here.) Libraries When you include a file within your source file the code within that header file will be accessible to your code. Your include file may contain whole function or class definitions (these will be covered in later chapters) but this will result in a problem: multiple definitions of a function or class. Instead, you can declare a class or function prototype, which indicates how calling code will call the function without actually defining it. Clearly the code will have to be defined elsewhere, and this could be a source file or a library, but the compiler will be happy because it only sees one definition. A library is code that has already been defined, it has been fully debugged and tested, and therefore users should not need to have access to the source code. The C++ Standard Library is mostly shared through header files, which helps you when you debug your code, but you must resist any temptation to edit these files. Other libraries will be provided as compiled libraries. There are essentially two types of compiled libraries: static libraries and dynamic link libraries. If you use a static library then the compiler will copy the compiled code that you use from the static library and place it in your executable. If you use a dynamic link (or shared) library then the linker will add information used during runtime (it may be when the executable is loaded, or it may even be delayed until the function is called) to load the shared library into memory and access the function. Windows uses the extension lib for static libraries and dll for dynamic link libraries. GNU gcc uses the extension a for static libraries and so for shared libraries. If you use library code in a static or dynamic link library the compiler will need to know that you are calling a function correctly—to make sure your code calls a function with the correct number of parameters and correct types. This is the purpose of a function prototype—it gives the compiler the information it needs to know about calling the function without providing the actual body of the function, the function definition. In general, the C++ Standard Library will be included into your code through the standard header files. The C Runtime Library (which provides some code for the C++ Standard Library) will be static linked, but if the compiler provides a dynamic linked version you will have a compiler option to use this. Pre-compiled Headers When you include a file into your source file the preprocessor will include the contents of that file (after taking into account any conditional compilation directives) and recursively any files included by that file. As illustrated earlier, this could result in thousands of lines of code. 
As you develop your code you will often compile the project so that you can test the code. Every time you compile your code the code defined in the header files will also be compiled even though the code in library header files will not have changed. With a large project this can make the compilation take a long time. To get around this problem compilers often offer an option to pre-compile headers that will not change. Creating and using precompiled headers is compiler specific. For example, with gcc you compile a header as if it is a C++ source file (with the /x switch) and the compiler creates a file with an extension of gch. When gcc compiles source files that use the header it will search for the gch file and if it finds the precompiled header it will use that, otherwise it will use the header file. In Visual C++ the process is a bit more complicated because you have to specifically tell the compiler to look for a precompiled header when it compiles a source file. The convention in Visual C++ projects is to have a source file called stdafx.cpp which has a single line that includes the file stdafx.h. You put all your stable header file includes in stdafx.h. Next, you create a precompiled header by compiling stdafx.cpp using the /Yc compiler option to specify that stdafx.h contains the stable headers to compile. This will create a pch file (typically, Visual C++ will name it after your project) containing the code compiled up to the point of the inclusion of the stdafx.h header file. Your other source files must include the stdafx.h header file as the first header file, but it may also include other files. When you compile your source files you use the /Yu switch to specify the stable header file (stdafx.h) and the compiler will use the precompiled header pch file instead of the header. When you examine large projects you will often find precompiled headers are used, and as you can see, it alters the file structure of the project. The example later in this chapter will show how to create and use precompiled headers. Project Structure It is important to organize your code into modules to enable you to maintain it effectively. Even if you are writing C-like procedural code (that is, your code involves calls to functions in a linear way) you will also benefit from organizing it into modules. For example, you may have functions that manipulate strings and other functions that access files, so you may decide to put the definition of the string functions in one source file, string.cpp, and the definition of the file functions in another file, file.cpp. So that other modules in the project can use these files you must declare the prototypes of the functions in a header file and include that header in the module that uses the functions. There is no absolute rule in the language about the relationship between the header files and the source files that contain the definition of the functions. You may have a header file called string.h for the functions in string.cpp and a header file called file.h for the functions in file.cpp. Or you may have just one file called utilities.h that contains the declarations for all the functions in both files. The only rule that you have to abide by is that at compile time the compiler must have access to a declaration of the function in the current source file, either through a header file, or the function definition itself. 
The compiler will not look forward in a source file, so if a function calls another function in the same source file that called function must have already been defined before the calling function, or there must be a prototype declaration. This leads to a typical convention of having a header file associated with each source file that contains the prototypes of the functions in the source file, and the source file includes this header. This convention becomes more important when you write classes. Managing Dependencies When a project is built with a building tool, checks are performed to see if the output of the build exist and if not, perform the appropriate actions to build it. Common terminology is that the output of a build step is called a target and the inputs of the build step (for example, source files) are the dependencies of that target. Each target's dependencies are the files used to make them. The dependencies may themselves be a target of a build action and have their own dependencies. For example, the following picture shows the dependencies in a project: In this project there are three source files (main.cpp, file1.cpp, file2.cpp) each of these includes the same header utils.h which is precompiled (and hence why there is a fourth source file, utils.cpp, that only contains utils.h). All of the source files depend on utils.pch, which in turn depends upon utils.h. The source file main.cpp has the main function and calls functions in the other two source files (file1.cpp and file2.cpp), and accesses the functions through the associated header files file1.h and file2.h. On the first compilation the build tool will see that the executable depends on the four object files and so it will look for the rule to build each one. In the case of the three C++ source files this means compiling the cpp files, but since utils.obj is used to support the precompiled header, the build rule will be different to the other files. When the build tool has made these object files it will then link them together along with any library code (not shown here). Subsequently, if you change file2.cpp and build the project, the build tool will see that only file2.cpp has changed and since only file2.obj depends on file2.cpp all the make tool needs to do is compile file2.cpp and then link the new file2.obj with the existing object files to create the executable. If you change the header file, file2.h, the build tool will see that two files depend on this header file, file2.cpp and main.cpp and so the build tool will compile these two source files and link the new two object files file2.obj and main.obj with the existing object files to form the executable. If, however, the precompiled header source file, util.h, changes it means that all of the source files will have to be compiled. Summary For a small project, dependencies are easy to manage, and as you have seen, for a single source file project you do not even have to worry about calling the linker because the compiler will do that automatically. As a C++ project gets bigger, managing dependencies gets more complex and this is where development environments like Visual C++ become vital. Resources for Article: Further resources on this subject: Introduction to C# and .NET [article] Preparing to Build Your Own GIS Application [article] Writing a Fully Native Application [article]


Getting Started with Docker Storage

Packt
05 Apr 2017
12 min read
In this article by Scott Gallagher, author of the book Mastering Docker – Second Edition, we will cover the places you store your containers, such as Docker Hub and Docker Hub Enterprises. We will also cover Docker Registry that you can use to run your own local storage for the Docker containers. We will review the differences between them all and when and how to use each of them. It will also cover how to set up automated builds using web hooks as well as the pieces that are all required to set them up. Lastly, we will run through an example of how to set up your own Docker Registry. Let's take a quick look at the topics we will be covering in this article: Docker Hub Docker Hub Enterprise Docker Registry Automated builds (For more resources related to this topic, see here.) Docker Hub In this section, we will focus on that Docker Hub, which is a free public option, but also has a private option that you can use to secure your images. We will focus on the web aspect of Docker Hub and the management you can do there. The login page is like the one shown in the following screenshot: Dashboard After logging into the Docker Hub, you will be taken to the following landing page. This page is known as the Dashboard of Docker Hub. From here, you can get to all the other sub pages of Docker Hub. In the upcoming sections, we will go through everything you see on the dashboard, starting with the dark blue bar you have on the top. Exploring the repositories page The following is the screenshot of the Explore link you see next to Dashboard at the top of the screen: As you can see in the screenshot, this is a link to show you all the official repositories that Docker has to offer. Official repositories are those that come directly from Docker or from the company responsible for the product. They are regularly updated and patched as needed. Organizations Organizations are those that you have either created or have been added to. Organizations allow you to layer on control, for say, a project that multiple people are collaborating on. The organization gets its own setting such as whether to store repositories as public or private by default, changing plans that will allow for different amounts of private repositories, and separate repositories all together from the ones you or others have. You can also access or switch between accounts or organizations from the Dashboard just below the Docker log, where you will typically see your username when you log in. This is a drop-down list, where you can switch between all the organizations you belong to. The Create menu The Create menu is the new item along the top bar of the Dashboard. From this drop-down menu, you can perform three actions: Create repository Create automated build Create organization A pictorial representation is shown in the following screenshot: The Settings Page Probably, the first section everyone jumps to once they have created an account on the Docker Hub—the Settings page. I know, that's what I did at least. The Account Settings page can be found under the drop-down menu that is accessed in the upper-right corner of the dashboard on selecting Settings. 
The page allows you to set up your public profile; change your password; see what organization you belong to, the subscriptions for e-mail updates you belong to, what specific notifications you would like to receive, what authorized services have access to your information, linked accounts (such as your GitHub or Bitbucket accounts); as well as your enterprise licenses, billing, and global settings. The only global setting as of now is the choice between having your repositories default to public or private upon creation. The default is to create them as public repositories. The Stars page Below the dark blue bar at the top of the Dashboard page are two more areas that are yet to be covered. The first, the Stars page, allows you to see what repositories you yourself have starred. This is very useful if you come across some repositories that you prefer to use and want to access them to see whether they have been updated recently or whether any other changes have occurred on these repositories. The second is a new setting in the new version of Docker Hub called Contributed. In this section, there will be a list of repositories you have contributed to outside of the ones within your Repositories list. Docker Hub Enterprise Docker Hub Enterprise, as it is currently known, will eventually be called Docker Subscription. We will focus on Docker Subscription, as it's the new and shiny piece. We will view the differences between Docker Hub and Docker Subscription (as we will call it moving forward) and view the options to deploy Docker Subscription. Let's first start off by comparing Docker Hub to Docker Subscription and see why each is unique and what purpose each serves: Docker Hub Shareable image, but it can be private No hassle of self-hosting Free (except for a certain number of private images) Docker Subscription Integrated into your authentication services (that is, AD/LDAP) Deployed on your own infrastructure (or cloud) Commercial support Docker Subscription for server Docker Subscription for server allows you to deploy both Docker Trusted Registry as well as Docker Engine on the infrastructure that you manage. Docker Trusted Registry is the location where you store the Docker images that you have created. You can set these up to be internal only or share them out publicly as well. Docker Subscription gives you all the benefits of running your own dedicated Docker hosted registry with the added benefits of getting support in case you need it. Docker Subscription for cloud As we saw in the previous section, we can also deploy Docker Subscription to a cloud provider if we wish. This allows us to leverage our existing cloud environments without having to roll our own server infrastructure up to host our Docker images. The setup is the same as we reviewed in the previous section; but this time, we will be targeting our existing cloud environment instead. Docker Registry In this section, we will be looking at Docker Registry. Docker Registry is an open source application that you can run anywhere you please and store your Docker image in. We will look at the comparison between Docker Registry and Docker Hub and how to choose among the two. By the end of the section, you will learn how to run your own Docker Registry and see whether it's a true fit for you. An overview of Docker Registry Docker Registry, as stated earlier, is an open source application that you can utilize to store your Docker images on a platform of your choice. 
This allows you to keep them 100% private if you wish or share them as needed. The registry can be found at https://docs.docker.com/registry/. This will run you through the setup and the steps to follow while pushing images to Docker Registry compared to Docker Hub. Docker Registry makes a lot of sense if you want to roll your own registry without having to pay for all the private features of Docker Hub. Next, let's take a look at some comparisons between Docker Hub and Docker Registry, so you can make an educated decision as to which platform to choose to store your images. Docker Registry will allow you to do the following: Host and manage your own registry from which you can serve all the repositories as private, public, or a mix between the two Scale the registry as needed based on how many images you host or how many pull requests you are serving out All are command-line-based for those that live on the command line Docker Hub will allow you to: Get a GUI-based interface that you can use to manage your images A location already set up on the cloud that is ready to handle public and/or private images Peace of mind of not having to manage a server that is hosting all your images Automated builds In this section, we will look at automated builds. Automated builds are those that you can link to your GitHub or Bitbucket account(s) and, as you update the code in your code repository, you can have the image automatically built on Docker Hub. We will look at all the pieces required to do so and, by the end, you'll be automating all your builds. Setting up your code The first step to create automated builds is to set up your GitHub or Bitbucket code. These are the two options you have while selecting where to store your code. For our example, I will be using GitHub; but the setup will be the same for GitHub and Bitbucket. First, we set up our GitHub code that contains just a simple README file that we will edit for our purpose. This file could be anything as far as a script or even multiple files that you want to manipulate for your automated builds. One key thing is that we can't just leave the README file alone. One key piece is that a Dockerfile is required to do the builds when you want it to for them to be automated. Next, we need to set up the link between our code and Docker Hub. Setting up Docker Hub On Docker Hub, we are going to use the Create drop-down menu and select Create Automated Build. After selecting it, we will be taken to a screen that will show you the accounts you have linked to either GitHub or Bitbucket. You then need to search and select the repository from either of the locations you want to create the automated build from. This will essentially create a web hook that when a commit is done on a selected code repository, then a new build will be created on Docker Hub. After you select the repository you would like to use, you will be taken to a screen similar to the following one: For the most part, the defaults will be used by most. You can select a different branch if you want to use one, say a testing branch if you use one before the code may go to the master branch. The one thing that will not be filled out, but is required, is the description field. You must enter something here or you will not be able to continue past this page. Upon clicking Create, you will be taken to a screen similar to the next screenshot: On this screen, you can see a lot of information on the automated build you have set up. 
Information such as tags, the Dockerfile in the code repository, build details, build settings, collaborators on the code, web hooks, and settings that include making the repository public or private and deleting the automated build repository as well. Putting all the pieces together So, let's take a run at doing a Docker automated build and see what happens when we have all the pieces in place and exactly what we have to do to kick off this automated build and be able to create our own magic: Update the code or any file inside your GitHub or Bitbucket repository. Upon committing the update, the automated build will be kicked off and logged in Docker Hub for that automated repository. Creating your own registry To create a registry of your own, use the following command: $ docker-machine create --driver vmwarefusion registry Creating SSH key... Creating VM... Starting registry... Waiting for VM to come online... To see how to connect Docker to this machine, run the following command: $ docker-machine env registry export DOCKER_TLS_VERIFY="1" export DOCKER_HOST="tcp://172.16.9.142:2376" export DOCKER_CERT_PATH="/Users/scottpgallagher/.docker/machine/machines/ registry" export DOCKER_MACHINE_NAME="registry" # Run this command to configure your shell: # eval "$(docker-machine env registry)" $ eval "$(docker-machine env registry)" $ docker pull registry $ docker run -p 5000:5000 -v <HOST_DIR>:/tmp/registry-dev registry:2 This will specify to use version 2 of the registry. For AWS (as shown in example from https://hub.docker.com/_/registry/): $ docker run -e SETTINGS_FLAVOR=s3 -e AWS_BUCKET=acme-docker -e STORAGE_PATH=/registry -e AWS_KEY=AKIAHSHB43HS3J92MXZ -e AWS_SECRET=xdDowwlK7TJajV1Y7EoOZrmuPEJlHYcNP2k4j49T -e SEARCH_BACKEND=sqlalchemy -p 5000:5000 registry:2 Again, this will use version 2 of the self-hosted registry. Then, you need to modify your Docker startups to point to the newly set up registry. Add the following line to the Docker startup in the /etc/init.d/docker file: -H tcp://127.0.0.1:2375 -H unix:///var/run/docker.sock --insecureregistry <REGISTRY_HOSTNAME>:5000 Most of these settings might already be there and you might only need to add --insecure-registry <REGISTRY_HOSTNAME>:5000: To access this file, you will need to use docker-machine: $ docker-machine ssh <docker-host_name> Now, you can pull a registry from the public Docker Hub as follows: $ docker pull debian Tag it, so when we do a push, it will go to the registry we set up: $ docker tag debian <REGISTRY_URL>:5000/debian Then, we can push it to our registry: $ docker push <REGISTRY_URL>:5000/debian We can also pull it for any future clients (or after any updates we have pushed for it): $ docker pull <REGISTRY_URL>:5000/debian Summary In this article, we dove deep into Docker Hub and also reviewed the new shiny Docker Subscription as well as the self-hosted Docker Registry. We have gone through the extensive review of each of them. You learned of the differences between them all and how to utilize each one. In this article, we also looked deep into setting up automated builds. We took a look at how to set up your own Docker Hub Registry. We have encompassed a lot in this chapter and I hope you have learned a lot and will like to put it all into good use. Resources for Article: Further resources on this subject: Docker in Production [article] Docker Hosts [article] Hands On with Docker Swarm [article]

API and Intent-Driven Networking

Packt
05 Apr 2017
19 min read
In this article by Eric Chou, author of the book Mastering Python Networking we will look at following topics: Treating Infrastructure as code and data modeling Cisco NX-API and Application centric infrastruucture (For more resources related to this topic, see here.) Infrastructure as Python code In a perfect world, network engineers and people who design and manage networks should focus on what they want the network to achieve instead of the device-level interactions. In my first job as an Intern for a local ISP, wide-eyed and excited, I received my first assignment to install a router on customer site to turn up their fractional frame relay link (remember those?). How would I do that? I asked, I was handed a standard operating procedure for turning up frame relay links. I went to the customer site, blindly type in the commands and looked at the green lights flash, happily packed my bag and pad myself on the back for a job well done. As exciting as that first assignment was, I did not fully understand what I was doing. I was simply following instructions without thinking about the implication of the commands I was typing in. How would I troubleshoot something if the light was red instead of green? I think I would have called back to the office. Of course network engineering is not about typing in commands onto a device, it is about building a way that allow services to be delivered from one point to another with as little friction as possible. The commands we have to use and the output we have to interpret are merely a mean to an end. I would like to hereby argue that we should focus as much on the Intent of the network as possible for an Intent-Driven Networking and abstract ourselves from the device-level interaction on an as-needed basis. In using API, it is my opinion that it gets us closer to a state of Intent-Driven Networking. In short, because we abstract the layer of specific command executed on destination device, we focus on our intent instead of the specific command given to the device. For example, if our intend is to deny an IP from entering our network, we might use 'access-list and access-group' on a Cisco and 'filter-list' on a Juniper. However, in using API, our program can start asking the executor for their Intent while masking what kind of physical device it is they are talking to. Screen Scraping vs. API Structured Output Imagine a common scenario where we need to log in to the device and make sure all the interfaces on the devices are in an up/up state (both status and protocol are showing as up). For the human network engineers getting to a Cisco NX-OS device, it is simple enough to issue the show IP interface brief command and easily tell from the output which interface is up: nx-osv-2# show ip int brief IP Interface Status for VRF "default"(1) Interface IP Address Interface Status Lo0 192.168.0.2 protocol-up/link-up/admin-up Eth2/1 10.0.0.6 protocol-up/link-up/admin-up nx-osv-2# The line break, white spaces, and the first line of column title are easily distinguished from the human eye. In fact, they are there to help us line up, say, the IP addresses of each interface from line 1 to line 2 and 3. If we were to put ourselves into the computer's eye, all these spaces and line breaks are only taking away from the real important output, which is 'which interfaces are in the up/up state'? 
To illustrate this point, we can look at the Paramiko output again: >>> new_connection.send('sh ip int briefn') 16 >>> output = new_connection.recv(5000) >>> print(output) b'sh ip int briefrrnIP Interface Status for VRF "default"(1)rnInterface IP Address Interface StatusrnLo0 192.168.0.2 protocol-up/link-up/admin-up rnEth2/1 10.0.0.6 protocol-up/link-up/admin-up rnrnxosv- 2# ' >>> If we were to parse out that data, of course there are many ways to do it, but here is what I would do in a pseudo code fashion: Split each line via line break. I may or may not need the first line that contain the executed command, for now, I don't think I need it. Take out everything on the second line up until the VRF and save that in a variable as we want to know which VRF the output is showing of. For the rest of the lines, because we do not know how many interfaces there are, we will do a regular expression to search if the line starts with possible interfaces, such as lo for loopback and 'Eth'. We will then split this line into three sections via space, each consist of name of interface, IP address, then the interface status. The interface status will then be split further using the forward slash (/) to give us the protocol, link, and admin status. Whew, that is a lot of work just for something that a human being can tell in a glance! You might be able to optimize the code and the number of lines, but in general this is what we need to do when we need to 'screen scrap' something that is somewhat unstructured. There are many downsides to this method, the few bigger problems that I see are: Scalability: We spent so much time in painstakingly details for each output, it is hard to imagine we can do this for the hundreds of commands that we typically run. Predicability: There is really no guarantee that the output stays the same. If the output is changed ever so slightly, it might just enter with our hard earned battle of information gathering. Vendor and software lock-in: Perhaps the biggest problem is that once we spent all these time parsing the output for this particular vendor and software version, in this case Cisco NX-OS, we need to repeat this process for the next vendor that we pick. I don't know about you, but if I were to evaluate a new vendor, the new vendor is at a severe on-boarding disadvantage if I had to re-write all the screen scrap code again. Let us compare that with an output from an NX-API call for the same 'show IP interface brief' command. We will go over the specifics of getting this output from the device later in this article, but what is important here is to compare the follow following output to the previous screen scraping steps: { "ins_api":{ "outputs":{ "output":{ "body":{ "TABLE_intf":[ { "ROW_intf":{ "admin-state":"up", "intf-name":"Lo0", "iod":84, "ip-disabled":"FALSE", "link-state":"up", "prefix":"192.168.0.2", "proto-state":"up" } }, { "ROW_intf":{ "admin-state":"up", "intf-name":"Eth2/1", "iod":36, "ip-disabled":"FALSE", "link-state":"up", "prefix":"10.0.0.6", "proto-state":"up" } } ], "TABLE_vrf":[ { "ROW_vrf":{ "vrf-name-out":"default" } }, { "ROW_vrf":{ "vrf-name-out":"default" } } ] }, "code":"200", "input":"show ip int brief", "msg":"Success" } }, "sid":"eoc", "type":"cli_show", "version":"1.2" } } NX-API can return output in XML or JSON, this is obviously the JSON output that we are looking at. Right away you can see the answered are structured and can be mapped directly to Python dictionary data structure. 
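Looking back at the screen-scraping steps listed a moment ago, here is a rough sketch of what they might look like in Python. It assumes the raw bytes from the Paramiko recv() call are held in the output variable shown earlier; the regular expressions and tokenizing are simplified for illustration and would need hardening for real devices:

import re

raw = output.decode('utf-8')   # raw bytes returned by Paramiko recv()

vrf = None
interfaces = {}
for line in raw.splitlines():
    line = line.strip()
    if 'IP Interface Status for VRF' in line:
        # Keep the VRF name that appears between the double quotes.
        vrf = re.search(r'"(\S+)"', line).group(1)
    elif re.match(r'^(Lo|Eth)', line):
        # Interface lines: name, IP address, then protocol/link/admin status.
        name, ip_address, status = line.split()
        protocol, link, admin = [field.split('-')[1] for field in status.split('/')]
        interfaces[name] = {'ip': ip_address, 'protocol': protocol,
                            'link': link, 'admin': admin}

print(vrf, interfaces)

Even this small parser is tightly coupled to the exact spacing and wording of one command on one platform, which is precisely the scalability and predictability problem described above.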
There is no parsing required, simply pick the key you want and retrieve the value associated with that key. There is also an added benefit of a code to indicate command success or failure, with a message telling the sender reasons behind the success or failure. You no longer need to keep track of the command issued, because it is already returned to you in the 'input' field. There are also other meta data such as the version of the NX-API. This type of exchange makes life easier for both vendors and operators. On the vendor side, they can easily transfer configuration and state information, as well as add and expose extra fields when the need rises. On the operator side, they can easily ingest the information and build their infrastructure around it. It is generally agreed on that automation is much needed and a good thing, the questions usually centered around which format and structure the automation should take place. As you can see later in this article, there are many competing technologies under the umbrella of API, on the transport side alone, we have REST API, NETCONF, RESTCONF, amongst others. Ultimately the overall market will decide, but in the mean time, we should all take a step back and decide which technology best suits our need. Data modeling for infrastructure as code According to Wikipedia, "A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to properties of the real world entities. For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner." The data modeling process can be illustrated in the following graph:  Data Modeling Process (source:  https://en.wikipedia.org/wiki/Data_model) When applied to networking, we can applied this concept as an abstract model that describe our network, let it be datacenter, campus, or global Wide Area Network. If we take a closer look at a physical datacenter, a layer 2 Ethernet switch can be think of as a device containing a table of Mac addresses mapped to each ports. Our switch data model described how the Mac address should be kept in a table, which are the keys, additional characteristics (think of VLAN and private VLAN), and such. Similarly, we can move beyond devices and map the datacenter in a model. We can start with the number of devices are in each of the access, distribution, core layer, how they are connected, and how they should behave in a production environment. For example, if we have a Fat-Tree network, how many links should each of the spine router should have, how many routes they should contain, and how many next-hop should each of the prefixes have. These characteristics can be mapped out in a format that can be referenced against as the ideal state that we should always checked against. One of the relatively new network data modeling language that is gaining traction is YANG. Yet Another Next Generation (YANG) (Despite common belief, some of the IETF workgroup do have a sense of humor). It was first published in RFC 6020 in 2010, and has since gain traction among vendors and operators. At the time of writing, the support for YANG varies greatly from vendors to platforms, the adaptation rate in production is therefore relatively low. However, it is a technology worth keeping an eye out for. 
Cisco API and ACI Cisco Systems, as the 800 pound gorilla in the networking space, has not missed on the trend of network automation. The problem has always been the confusion surrounding Cisco's various product lines and level of technology support. With product lines spans from routers, switches, firewall, servers (unified computing), wireless, collaboration software and hardware, analytic software, to name a few, it is hard to know where to start. Since this book focuses on Python and networking, we will scope the section to the main networking products. In particular we will cover the following: Nexus product automation with NX-API Cisco NETCONF and YANG examples Cisco application centric infrastructure for datacenter Cisco application centric infrastructure for enterprise For the NX-API and NETCONF examples here, we can either use the Cisco DevNet always-on lab devices or locally run Cisco VIRL. Since ACI is a separated produce and license on top of the physical switches, for the following ACI examples, I would recommend using the DevNet labs to get an understanding of the tools. Unless, of course, that you are one of the lucky ones who have a private ACI lab that you can use. Cisco NX-API Nexus is Cisco's product line of datacenter switches. NX-API (http://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/6-x/programmability/guide/b_Cisco_Nexus_9000_Series_NX-OS_Programmability_Guide/b_Cisco_Nexus_9000_Series_NX-OS_Programmability_Guide_chapter_011.html) allows the engineer to interact with the switch outside of the device via a variety of transports, including SSH, HTTP, and HTTPS. Installation and preparation Here are the Ubuntu Packages that we will install, you may already have some of the packages such as Python development, pip, and Git: $ sudo apt-get install -y python3-dev libxml2-dev libxslt1-dev libffi-dev libssl-dev zlib1g-dev python3-pip git python3-requests If you are using Python 2: sudo apt-get install -y python-dev libxml2-dev libxslt1-dev libffi-dev libssl-dev zlib1g-dev python-pip git python-requests The ncclient (https://github.com/ncclient/ncclient) library is a Python library for NETCONF clients, we will install from the GitHub repository to install the latest version: $ git clone https://github.com/ncclient/ncclient $ cd ncclient/ $ sudo python3 setup.py install $ sudo python setup.py install NX-API on Nexus devices is off by default, we will need to turn it on. We can either use the user already created or create a new user for the NETCONF procedures. feature nxapi username cisco password 5 $1$Nk7ZkwH0$fyiRmMMfIheqE3BqvcL0C1 role networkopera tor username cisco role network-admin username cisco passphrase lifetime 99999 warntime 14 gracetime 3 For our lab, we will turn on both HTTP and sandbox configuration, they should be turned off in production. nx-osv-2(config)# nxapi http port 80 nx-osv-2(config)# nxapi sandbox We are now ready to look at our first NX-API example. NX-API examples Since we have turned on sandbox, we can launch a web browser and take a look at the various message format, requests, and response based on the CLI command that we are already familiar with. In the following example, I selected JSON-RPC and CLI command type for the command show version:   The sandbox comes in handy if you are unsure about the supportability of message format or if you have questions about the field key for which the value you want to retrieve in your code. 
In our first example, we are just going to connect to the Nexus device and print out the capabilities exchanged when the connection was first made: #!/usr/bin/env python3 from ncclient import manager conn = manager.connect( host='172.16.1.90', port=22, username='cisco', password='cisco', hostkey_verify=False, device_params={'name': 'nexus'}, look_for_keys=False) for value in conn.server_capabilities: print(value) conn.close_session() The connection parameters of host, port, username, and password are pretty self-explanatory. The device parameter specifies the kind of device the client is connecting to, as we will also see a differentiation in the Juniper NETCONF sections. The hostkey_verify bypass the known_host requirement for SSH while the look_for_keys option disables key authentication but use username and password for authentication. The output will show that the XML and NETCONF supported feature by this version of NX-OS: $ python3 cisco_nxapi_1.py urn:ietf:params:xml:ns:netconf:base:1.0 urn:ietf:params:netconf:base:1.0 Using ncclient and NETCONF over SSH is great because it gets us closer to the native implementation and syntax. We will use the library more later on. For NX-API, I personally feel that it is easier to deal with HTTPS and JSON-RPC. In the earlier screenshot of NX-API Developer Sandbox, if you noticed in the Request box, there is a box labeled Python. If you click on it, you would be able to get an automatically converted Python script based on the requests library.  Requests is a very popular, self-proclaimed HTTP for humans library used by companies like Amazon, Google, NSA, amongst others. You can find more information about it on the official site (http://docs.python-requests.org/en/master/). For the show version example, the following Python script is automatically generated for you. I am pasting in the output without any modification: """ NX-API-BOT """ import requests import json """ Modify these please """ url='http://YOURIP/ins' switchuser='USERID' switchpassword='PASSWORD' myheaders={'content-type':'application/json-rpc'} payload=[ { "jsonrpc": "2.0", "method": "cli", "params": { "cmd": "show version", "version": 1.2 }, "id": 1 } ] response = requests.post(url,data=json.dumps(payload), headers=myheaders,auth=(switchuser,switchpassword)).json() In cisco_nxapi_2.py file, you will see that I have only modified the URL, username and password of the preceding file, and parse the output to only include the software version. Here is the output: $ python3 cisco_nxapi_2.py 7.2(0)D1(1) [build 7.2(0)ZD(0.120)] The best part about using this method is that the same syntax works with both configuration command as well as show commands. This is illustrated in cisco_nxapi_3.py file. For multi-line configuration, you can use the id field to specify the order of operations. In cisco_nxapi_4.py, the following payload was listed for changing the description of interface Ethernet 2/12 in the interface configuration mode. { "jsonrpc": "2.0", "method": "cli", "params": { "cmd": "interface ethernet 2/12", "version": 1.2 }, "id": 1 }, { "jsonrpc": "2.0", "method": "cli", "params": { "cmd": "description foo-bar", "version": 1.2 }, "id": 2 }, { "jsonrpc": "2.0", "method": "cli", "params": { "cmd": "end", "version": 1.2 }, "id": 3 }, { "jsonrpc": "2.0", "method": "cli", "params": { "cmd": "copy run start", "version": 1.2 }, "id": 4 } ] In the next section, we will look at examples for Cisco NETCONF and YANG model. 
Cisco and YANG model Earlier in the article, we looked at the possibility of expressing the network using data modeling language YANG. Let us look into it a little bit. First off, we should know that YANG only defines the type of data sent over NETCONF protocol and NETCONF exists as a standalone protocol as we saw in the NX-API section. YANG being relatively new, the supportability is spotty across vendors and product lines. For example, if we run the same capability exchange script we saw preceding to a Cisco 1000v running  IOS-XE, this is what we would see: urn:cisco:params:xml:ns:yang:cisco-virtual-service?module=ciscovirtual- service&revision=2015-04-09 http://tail-f.com/ns/mibs/SNMP-NOTIFICATION-MIB/200210140000Z? module=SNMP-NOTIFICATION-MIB&revision=2002-10-14 urn:ietf:params:xml:ns:yang:iana-crypt-hash?module=iana-crypthash& revision=2014-04-04&features=crypt-hash-sha-512,crypt-hashsha- 256,crypt-hash-md5 urn:ietf:params:xml:ns:yang:smiv2:TUNNEL-MIB?module=TUNNELMIB& revision=2005-05-16 urn:ietf:params:xml:ns:yang:smiv2:CISCO-IP-URPF-MIB?module=CISCOIP- URPF-MIB&revision=2011-12-29 urn:ietf:params:xml:ns:yang:smiv2:ENTITY-STATE-MIB?module=ENTITYSTATE- MIB&revision=2005-11-22 urn:ietf:params:xml:ns:yang:smiv2:IANAifType-MIB?module=IANAifType- MIB&revision=2006-03-31 <omitted> Compare that to the output that we saw, clearly IOS-XE understand more YANG model than NX-OS. Industry wide network data modeling for networking is clearly something that is beneficial to network automation. However, given the uneven support across vendors and products, it is not something that is mature enough to be used across your production network, in my opinion. For the book I have included a script called cisco_yang_1.py that showed how to parse out NETCONF XML output with YANG filters urn:ietf:params:xml:ns:yang:ietf-interfaces as a starting point to see the existing tag overlay. You can check the latest vendor support on the YANG Github project page (https://github.com/YangModels/yang/tree/master/vendor). Cisco ACI Cisco Application Centric Infrastructure (ACI) is meant to provide a centralized approach to all of the network components. In the datacenter context, it means the centralized controller is aware of and manages the spine, leaf, top of rack switches as well as all the network service functions. This can be done thru GUI, CLI, or API. Some might argue that the ACI is Cisco's answer to the broader defined software defined networking. One of the somewhat confusing point for ACI, is the difference between ACI and ACI-EM. In short, ACI focuses on datacenter operations while ACI-EM focuses on enterprise modules. Both offers a centralized view and control of the network components, but each has it own focus and share of tools. For example, it is rare to see any major datacenter deploy customer facing wireless infrastructure but wireless network is a crucial part of enterprises today. Another example would be the different approaches to network security. While security is important in any network, in the datacenter environment lots of security policy is pushed to the edge node on the server for scalability, in enterprises security policy is somewhat shared between the network devices and servers. Unlike NETCONF RPC, ACI API follows the REST model to use the HTTP verb (GET, POST, PUT, DELETE) to specify the operation intend. 
We can look at the cisco_apic_em_1.py file, which is a modified version of the Cisco sample code on lab2-1-get-network-device-list.py (https://github.com/CiscoDevNet/apicem-1.3-LL-sample-codes/blob/master/basic-labs/lab2-1-get-network-device-list.py). The abbreviated section without comments and spaces are listed here. The first function getTicket() uses HTTPS POST on the controller with path /api/vi/ticket with username and password embedded in the header. Then parse the returned response for a ticket with limited valid time. def getTicket(): url = "https://" + controller + "/api/v1/ticket" payload = {"username":"usernae","password":"password"} header = {"content-type": "application/json"} response= requests.post(url,data=json.dumps(payload), headers=header, verify=False) r_json=response.json() ticket = r_json["response"]["serviceTicket"] return ticket The second function then calls another path /api/v1/network-devices with the newly acquired ticket embedded in the header, then parse the results. url = "https://" + controller + "/api/v1/network-device" header = {"content-type": "application/json", "X-Auth-Token":ticket} The output displays both the raw JSON response output as well as a parsed table. A partial output when executed against a DevNet lab controller is shown here: Network Devices = { "version": "1.0", "response": [ { "reachabilityStatus": "Unreachable", "id": "8dbd8068-1091-4cde-8cf5-d1b58dc5c9c7", "platformId": "WS-C2960C-8PC-L", &lt;omitted&gt; "lineCardId": null, "family": "Wireless Controller", "interfaceCount": "12", "upTime": "497 days, 2:27:52.95" } ] } 8dbd8068-1091-4cde-8cf5-d1b58dc5c9c7 Cisco Catalyst 2960-C Series Switches cd6d9b24-839b-4d58-adfe-3fdf781e1782 Cisco 3500I Series Unified Access Points &lt;omitted&gt; 55450140-de19-47b5-ae80-bfd741b23fd9 Cisco 4400 Series Integrated Services Routers ae19cd21-1b26-4f58-8ccd-d265deabb6c3 Cisco 5500 Series Wireless LAN Controllers As one can see, we only query a single controller device, but we are able to get a high level view of all the network devices that the controller is aware of. The downside is, of course, the ACI controller only supports Cisco devices at this time. Summary In this article, we looked at various ways to communicate and manage network devices from Cisco. Resources for Article: Further resources on this subject: Network Exploitation and Monitoring [article] Introduction to Web Experience Factory [article] Web app penetration testing in Kali [article]


IT Operations Management

Packt
05 Apr 2017
16 min read
In this article by Ajaykumar Guggilla, the author of the book ServiceNow IT Operations Management, we will learn the ServiceNow ITOM capabilities within ServiceNow, which include: Dependency views Cloud management Discovery Credentials (For more resources related to this topic, see here.) ServiceNow IT Operations Management overview Every organization and business focuses on key strategies, some of them include: Time to market Agility Customer satisfaction Return on investment Information technology is heavily involved in supporting these strategic goals, either directly or indirectly, providing the underlying IT Services with the required IT infrastructure. IT infrastructure includes network, servers, routers, switches, desktops, laptops, and much more. IT supports these infrastructure components enabling the business to achieve their goals. IT continuously supports the IT infrastructure and its components with a set of governance, processes, and tools, which is called IT Operations Management. IT cares and feeds a business, and the business expects reliability of services provided by IT to support the underlying business services. A business cares and feeds the customers who expect satisfaction of the services offered to them without service disruption. Unlike any other tools it is important to understand the underlying relationship between IT, businesses, and customers. IT just providing the underlying infrastructure and associated components is not going to help, to effectively and efficiently support the business IT needs to understand how the infrastructure components and process are aligned and associated with the business services to understand the impact to the business with an associated incident, problem, event, or change that is arising out of an IT infrastructure component. IT needs to have a consolidated and complete view of the dependency between the business and the customers, not compromising on the technology used, the process followed, the infrastructure components used, which includes the technology used. There needs to be a connected way for IT to understand the relations of these seamless technology components to be able to proactively stop the possible outages before they occur and handle a change in the environment. On the other hand, a business expects service reliability to be able to support the business services to the customers. There is a huge financial impact of businesses not being able to provide the agreed service levels to their customers. So there is always a pressure and dependence from the business to IT to provide a reliable service and it does not matter what technology or processes are used. Customers as always expect satisfaction of the services provided by the business, at times these are adversely affected with service outages caused from the IT infrastructure. Customer satisfaction is also a key strategic goal for the business to be able to sustain in the competitive market. IT is also expected as necessarily to be able to integrate with the customer infrastructure components to provide a holistic view of the IT infrastructure view to be able to effectively support the business by proactively identifying and fixing the outages before they happen to reduce the outages and increase the reliability of IT services delivered. 
Most tools do not understand the context of the Service-Oriented Architecture (SOA) that connects the business services to the affected IT infrastructure components, which is needed both to support the business effectively and for IT to justify the cost and impact of providing an end-to-end service. Most traditional tools perform only certain ITOM functions, some partially, and some support integration with an IT Service Management (ITSM) tool suite. The missing integration piece between the traditional tools and a full-blown cloud solution platform is precisely this service-oriented view.

ServiceNow, a cloud-based platform, has looked through the lens of a true SOA and brought the ITOM suite together, leveraging native data and connecting to the customer's infrastructure to provide a holistic, end-to-end view of an IT service at any given point in time. With ServiceNow, IT has a complete view of business service and technical dependencies in real time, leveraging powerful individual capabilities, applications, and plugins within ServiceNow ITOM.

ServiceNow ITOM comprises the following applications and capabilities; some of the plugins, applications, and technologies may have license restrictions and require separate licensing to be purchased:

Management, Instrumentation, and Discovery (MID) Server: The MID Server establishes communication and data movement between ServiceNow and the external corporate network and applications
Credentials: A platform capability that stores credentials, including usernames, passwords, and certificates, in an encrypted field on the credentials table; it is leveraged by ServiceNow discovery
Service mapping: Service mapping discovers and maps the relationships between the IT components that comprise specific business services, even in dynamic, virtualized environments
Dependency views: Dependency views graphically display an infrastructure view with the relationships between configuration items and the underlying business services
Event management: Event management provides a holistic view of all the events triggered by the various event monitoring tools
Orchestration: Orchestration helps automate IT and business processes for operations management
Discovery: Works with the MID Server to explore the IT infrastructure, discover configuration items, and populate the Configuration Management Database (CMDB)
Cloud management: Helps to easily manage third-party cloud providers, including AWS, Microsoft Azure, and VMware clouds

Understanding ServiceNow IT Operations Management components

Now that we have covered what ITOM is about, with a focus on the ServiceNow ITOM capabilities, let's dive deeper and explore each capability.

Dependency views

Navigation maps have become essential in everyday life; imagine a world without GPS devices or electronic maps. There used to be hard-copy maps available all over the streets for us to find our way, and there were also special maps for utilities and other public service agencies to identify the impact of digging a tunnel, a water pipe, or an underground electric cable. These maps help them to identify the impact of making a change to the ground.
Maps also help us to understand the relationships between states, countries, cities, and streets, enriched with different sets of information in real time, including live traffic information showing accidents, construction work, and so on.

Dependency views are similar to real-life navigation maps: they provide a map of the relationships between the IT infrastructure components and the business services that are defined in scope, and, much like the real-time traffic updates on a navigation map, dependency views show active incidents, changes, and problems reported against an individual configuration item or infrastructure component in real time.

Changes happen frequently in the environment, and some of them are handled based only on legacy knowledge of how the individual components are connected to the business services; the service mapping plugin captures these connections down to the individual component level. Making a change without understanding the relationships between the IT infrastructure components can adversely affect service levels and impact the business service. ServiceNow dependency views provide a snapshot of how the underlying business service is connected to the individual Configuration Item (CI) elements. Drilling down to an individual CI element provides a view of the associated service operations and service transition data, including incidents logged against the CI, any underlying problems reported against it, and the changes associated with it.

Dependency views are based on D3 and Angular technology and provide a graphical view of configuration items and their relationships. Dependency views show the CIs and their relationships; to get a business-service perspective, you will need to enable the service mapping plugin. Having a detailed view of how the individual CI components are connected, from the business service down to each CI, complements change management by enabling effective impact analysis before any changes are made to a given CI:

Image source: wiki.servicenow.com

A dependency map starts with a root node, usually termed the root CI, which is grayed out with a gray frame. Relationships are then built up and mapped from the upstream and downstream dependencies of the infrastructure components that are scoped for discovery by ServiceNow auto discovery. Administrators control the number of levels displayed on the dependency maps. The maps are also easy to manage: you can create or modify existing relationships right from the map, and the respective changes are posted to the CMDB automatically. Each CI component on a dependency map has an indicator that shows any active and pending issues against the CI, including incidents, problems, changes, and events associated with the respective configuration item.
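The relationship data behind a dependency map lives in the CMDB and can also be read programmatically. The following Python snippet is a minimal illustrative sketch, not part of the dependency views feature itself: it uses the standard ServiceNow Table API to list the upstream and downstream relationships of a CI from the cmdb_rel_ci table. The instance URL, credentials, and CI name are placeholders you would replace with your own values:

    import requests

    INSTANCE = "https://your-instance.service-now.com"  # placeholder instance URL
    AUTH = ("admin", "password")                        # placeholder credentials
    CI_NAME = "web-server-01"                           # placeholder CI name

    def get_relationships(ci_name):
        # Query the CI relationship table, matching either end of the relationship
        url = INSTANCE + "/api/now/table/cmdb_rel_ci"
        params = {
            "sysparm_query": "parent.name=" + ci_name + "^ORchild.name=" + ci_name,
            "sysparm_fields": "parent.name,type.name,child.name",
            "sysparm_limit": "50",
        }
        response = requests.get(url, auth=AUTH, params=params,
                                headers={"Accept": "application/json"})
        response.raise_for_status()
        return response.json()["result"]

    for rel in get_relationships(CI_NAME):
        print(rel.get("parent.name"), "-", rel.get("type.name"), "->", rel.get("child.name"))

This is the same underlying relationship data that the dependency map renders graphically.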
Cloud management

In versions prior to Helsinki there was no direct way to manage cloud instances; people had to create orchestration scripts to manage their cloud instances and also create custom roles. Managing and provisioning has become easy with the ServiceNow cloud management application. The cloud management application integrates seamlessly with the ServiceNow service catalog and also provides automation capabilities through orchestration workflows.

The cloud management application fully integrates the life cycle management of virtual resources into the standard ServiceNow data collection, management, analytics, and reporting capabilities. It provides easy and quick options for the key cloud providers, which include:

AWS Cloud: Manages Amazon Web Services (AWS) using the AWS Cloud application
Microsoft Azure Cloud: The Microsoft Azure Cloud application integrates with Azure through the service catalog and provides the ability to manage virtual resources easily
VMware Cloud: The VMware Cloud application integrates with VMware vCenter to manage virtual resources through the service catalog

The following figure describes a high-level architecture of the cloud management application:

Key features of the cloud management application include the following:

A single pane of glass to manage virtual services in public and private cloud environments, including approvals, notifications, security, asset management, and so on
The ability to repurpose configurations through resource templates, which helps to reuse capability sets
Seamless integration with the service catalog; with a defined workflow and approvals, integration can be done end to end, from the user request right through to cloud provisioning
The ability to control leased resources through date controls and role-based security access
The ability to use the ServiceNow discovery application, or the standalone capability, to discover virtual resources and their relationships in the environment
The ability to determine the best virtualization server for a VM based on the data discovered by CMDB auto discovery
The ability to control and manage virtual resources effectively with a controlled termination or shutdown date
The ability to increase virtual server resources in a controlled fashion; for example, storage or memory can be increased to the required level by integrating with the service catalog and obtaining the appropriate approvals
The ability to perform price calculations and integrate managed virtual machines with asset management
The ability to provision the required cloud environment automatically or manually, with zero-click options

There are different roles within the cloud management application, some of which are listed here:

Virtual provisioning cloud administrator: The administrator owns the cloud admin portal and its end-to-end management, including the configuration of the cloud providers. Administrators have access to configure the service catalog items that will be used by requesters and the approvals required to provision the cloud environment.
Virtual provisioning cloud approver: Either approves or rejects requests for virtual resources.
Virtual provisioning cloud operator: The operator fulfills requests to manage the virtual resources and the respective cloud providers. Cloud operators are mostly involved when manual human intervention is required to manage or provision virtual resources.
Virtual provisioning cloud user: Users have access to the my virtual assets portal, which helps them manage the virtual resources they own, have requested, or are responsible for.
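Because provisioning requests flow through the service catalog, a request can also be submitted programmatically rather than through the portal, following the same flow that is outlined next. The snippet below is purely an illustrative sketch: it assumes the instance exposes the Service Catalog REST API, and the instance URL, credentials, catalog item sys_id, and variable names are all hypothetical placeholders rather than values from this article:

    import requests

    INSTANCE = "https://your-instance.service-now.com"       # placeholder instance URL
    AUTH = ("cloud.user", "password")                        # placeholder credentials
    CATALOG_ITEM_SYS_ID = "<sys_id of the VM catalog item>"  # placeholder sys_id

    def order_virtual_machine():
        # Submit an "order now" request for a catalog item with example variables
        url = (INSTANCE + "/api/sn_sc/servicecatalog/items/" +
               CATALOG_ITEM_SYS_ID + "/order_now")
        payload = {
            "sysparm_quantity": "1",
            "variables": {
                "vm_size": "small",            # hypothetical catalog variables
                "lease_end_date": "2017-12-31"
            }
        }
        response = requests.post(url, auth=AUTH, json=payload,
                                 headers={"Accept": "application/json"})
        response.raise_for_status()
        return response.json()

    print(order_virtual_machine())

The resulting request then follows the same approval and fulfillment steps described in the next section.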
How clouds are provisioned

The cloud administrator creates a service catalog item through which users can request cloud resources
The cloud user requests a virtual machine through the service catalog
The request goes to the approver, who either approves or rejects it
The cloud operator provisions the request manually, or the virtual resources are provisioned automatically

Discovery

Imagine how an atlas is mapped and how places have been discovered: satellites, manual surveys, and street-map collector devices all gather data. These devices crawl through the streets to collect different data points, including information about the streets, houses, and much more. This information is then used by consumers for various purposes, including GPS navigation, finding and exploring different areas, locating an address, and spotting incidents, construction, or road closures along the way.

ServiceNow discovery works the same way: it explores the enterprise network, looking for the devices in scope. ServiceNow discovery probes and sensors collect information about the infrastructure devices connected to a given enterprise network. Discovery uses Shazzam probes to determine which TCP ports are open and whether a device responds to SNMP queries, and it uses sensors to explore any given computer or device, starting with basic probes and then using more specific probes as it learns more. For each type of device, discovery uses different kinds of probes to extract more information about the computer or device and the software running on it.

The CMDB is updated, or data is federated, through ServiceNow discovery. Discovered devices are identified by searching the CMDB for a CI that matches the CI discovered on the network. The administrator defines what actions are taken when a device match is found during a discovery run: either the CMDB is updated with the existing CI, or a new CI is created in the CMDB. Discovery can be scheduled to run at set intervals, and configuration management keeps the CI status up to date through discovery. During discovery, the MID Server checks the ServiceNow instance for probes to run, executes them, and returns the results to the ServiceNow instance and the CMDB for processing. No data is retained on the MID Server. The data collected by the probes is processed by sensors.

ServiceNow is hosted in ServiceNow data centers spread across the globe, so the ServiceNow application by itself cannot reach into a given enterprise network. Traditionally, there are two different types of discovery tools on the market:

Agent: A piece of software is installed on each server or individual system and sends all the information about that system to the CMDB.
Agentless: Usually doesn't require any installation on the individual systems or components; a single system or piece of software probes and senses the network by scanning it and federating the data into the CMDB.

ServiceNow discovery is agentless and does not require any software to be installed on the discovered systems; it uses the MID Server instead. Discovery is available as a separate subscription from the rest of the ServiceNow platform and requires the discovery plugin.
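To make the idea of port probing more concrete, here is a small, purely conceptual Python sketch of the kind of TCP port check a Shazzam-style probe performs. This is not ServiceNow code; the target host and port list are placeholders, and SNMP (which uses UDP port 161) would need a separate UDP check:

    import socket

    TARGET = "192.0.2.10"  # placeholder host (documentation address range)
    PORTS = {22: "SSH", 80: "HTTP", 135: "WMI/RPC", 443: "HTTPS"}  # illustrative ports

    def probe_ports(host, ports, timeout=1.0):
        # Attempt a TCP connection to each port and record which ones respond
        open_ports = {}
        for port, service in ports.items():
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.settimeout(timeout)
            try:
                if sock.connect_ex((host, port)) == 0:
                    open_ports[port] = service
            finally:
                sock.close()
        return open_ports

    print(probe_ports(TARGET, PORTS))

Based on which ports respond, discovery then knows which protocol-specific probes (SSH, WMI, SNMP, and so on) to launch next.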
The MID Server is a piece of Java software that runs on any Windows, UNIX, or Linux system residing within the enterprise network that needs to be discovered. The MID Server is the bridge and communicator between the ServiceNow instance, which sits somewhere in the cloud, and the enterprise network, which is secured and controlled. The MID Server uses several techniques to probe devices without using agents. Depending on the type of infrastructure component, the MID Server uses the appropriate protocol to gather information from it; for example, to gather information from network devices the MID Server uses the Simple Network Management Protocol (SNMP), and to connect to UNIX systems it uses SSH.

The following table shows the different ServiceNow discovery probe types:

Device: Probe type
Windows computers and servers: Remote WMI queries, shell commands
UNIX and Linux servers: Shell commands (via the SSH protocol)
Storage: CIM/WBEM queries
Printers: SNMP queries
Network gear (switches, routers, and so on): SNMP queries
Web servers: HTTP header examination
Uninterruptible Power Supplies (UPS): SNMP queries

Credentials

The ServiceNow discovery and orchestration features require credentials to access the enterprise network, and these credentials vary across networks and devices. Credentials such as usernames, passwords, and certificates need a secure place to be stored. The ServiceNow credentials application stores credentials in an encrypted format in the credentials table.

Credential tagging allows workflow creators to assign individual credentials to any activity in an orchestration workflow, or to assign different credentials to each occurrence of the same activity type in an orchestration workflow. Credential tagging also works with credential affinities. Credentials can be assigned an order value that forces discovery and orchestration to try all the credentials, in order, when orchestration attempts to run a command or discovery tries to query a device. The credentials table can contain many credentials; based on the pattern of usage, the credentials application places frequently used credentials on a highly used list, which enables discovery and orchestration to work faster after the first successful connection, because the system knows which credential to use for a faster logon to the device next time.

Image source: wiki.servicenow.com

Credentials are encrypted automatically with a fixed instance key when they are submitted or updated in the credentials (discovery_credentials) table. When credentials are requested by the MID Server, the platform decrypts them using the following process:

The credentials are decrypted on the instance with the password2 fixed key.
The credentials are re-encrypted on the instance with the MID Server's public key.
The credentials are encrypted on the load balancer with SSL.
The credentials are decrypted on the MID Server with SSL.
The credentials are decrypted on the MID Server with the MID Server's private key.
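To illustrate the ordered-credential idea described above in a generic way (this is not how the MID Server is implemented, just a conceptual sketch), the following Python snippet tries an ordered list of SSH credentials against a device until one succeeds. The host and credential values are placeholders, and the third-party paramiko library is assumed to be installed:

    import paramiko

    HOST = "192.0.2.20"  # placeholder device address
    # Ordered list of credentials, tried lowest order value first
    CREDENTIALS = [
        {"order": 100, "username": "svc_discovery", "password": "secret1"},
        {"order": 200, "username": "netadmin", "password": "secret2"},
    ]

    def first_working_credential(host, credentials):
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        for cred in sorted(credentials, key=lambda c: c["order"]):
            try:
                client.connect(host, username=cred["username"],
                               password=cred["password"], timeout=5)
                client.close()
                return cred  # remember this credential for faster logons next time
            except paramiko.AuthenticationException:
                continue     # try the next credential in the ordered list
        return None

    print(first_working_credential(HOST, CREDENTIALS))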
The ServiceNow credentials application also integrates with CyberArk credential storage. The MID Server integration with the CyberArk vault enables orchestration and discovery to run without storing any credentials on the ServiceNow instance. The instance maintains only a unique identifier for each credential, the credential type (such as SSH, SNMP, or Windows), and any credential affinities. The MID Server obtains the credential identifier and IP address from the instance, and then uses the CyberArk vault to resolve these elements into a usable credential.

The CyberArk integration requires the external credential storage plugin, which is available by request. The CyberArk integration supports these ServiceNow credential types:

CIM
JMS
SNMP community
SSH
SSH private key (with key only)
VMware
Windows

Orchestration activities that use the following network protocols support the use of credentials stored in a CyberArk vault:

SSH
PowerShell
JMS
SFTP

Summary

In this article, we covered an overview of ITOM and explored the different ServiceNow ITOM components, including their high-level architecture and functional aspects, covering discovery, credentials, dependency views, and cloud management.

Resources for Article:

Further resources on this subject:

Management of SOA Composite Applications [article]
Working with Business Rules to Define Decision Points in Oracle SOA Suite 11g R1 [article]
Introduction to SOA Testing [article]