How-To Tutorials

11 Nov 2016

11 min read

Getting Started with Flocker

Packt

11 Nov 2016

15 min read

DevOps Tools and Technologies

In this article by Ritesh Modi, the author of the book DevOps with Windows Server 2016, we will introduce foundational platforms and technologies instrumental in enabling and implementing DevOps practices. These include:

Technology stack for implementing Continuous Integration, Continuous Deployment, Continuous Delivery, Configuration Management, and Continuous Improvement. These form the backbone for DevOps processes and include source code services, build services, and release services through Visual Studio Team Services.

Platform and technology used to create and deploy a sample web application. This includes technologies such as Microsoft .NET, ASP.NET, and SQL Server databases.

Tools and technology for configuration management, testing of code and application, authoring infrastructure as code, and deployment of environments. Examples of these tools and technologies are Pester for environment validation, environment provisioning through Azure Resource Manager (ARM) templates, Desired State Configuration (DSC) and PowerShell, application hosting on containers through Windows Containers and Docker, and application and database deployment through Web Deploy packages and SQL Server bacpacs.

Cloud technology

Cloud is ubiquitous. Cloud is used for our development environment, implementation of DevOps practices, and deployment of applications. Cloud is a relatively new paradigm in infrastructure provisioning, application deployment, and hosting. Prior to the advent of cloud, the only options were self-hosted on-premises deployments or services from a hosting service provider. However, cloud is changing the way enterprises look at their strategy in relation to infrastructure and application development, deployment, and hosting. In fact, the change is so enormous that it has found its way into every aspect of an organization's software development processes, tools, and practices.
Cloud computing refers to the practice of deploying applications and services on the Internet with a cloud provider. A cloud provider offers multiple types of services on cloud. They are divided into three categories based on their level of abstraction and the degree of control over services. These categories are as follows:

Infrastructure as a Service (IaaS)
Platform as a Service (PaaS)
Software as a Service (SaaS)

These three categories differ based on the level of control a cloud provider exercises compared to the cloud consumer. The services provided by a cloud provider can be divided into layers, with each layer providing a type of service. As we move higher in the stack of layers, the level of abstraction increases in line with the cloud provider's control over services. In other words, the cloud consumer loses control over services as we move higher in the stack.

Figure 1: Cloud Services – IaaS, PaaS and SaaS

Figure 1 shows the three types of service available through cloud providers and the layers that comprise these services. These layers are stacked vertically on each other and show the level of control a cloud provider has compared to a consumer. From Figure 1, it is clear that for IaaS, a cloud provider is responsible for providing, controlling, and managing layers from the network layer up to the virtualization layer. Similarly, for PaaS, a cloud provider controls and manages from the hardware layer up to the runtime layer, while the consumer controls only the application and data layers.

Infrastructure as a Service (IaaS)

As the name suggests, Infrastructure as a Service is an infrastructure service provided by a cloud provider. This service includes the physical hardware and its configuration, network hardware and its configuration, storage hardware and its configuration, load balancers, compute, and virtualization. Any layer above virtualization is the responsibility of the consumer to provision, configure, and manage.
The consumer can decide to use the provided underlying infrastructure in whatever way best suits their requirements. Consumers can use the storage, network, and virtualization layers to provision their own virtual machines on top of them. It is the consumer's responsibility to manage and control the virtual machines and everything deployed within them.

Platform as a Service (PaaS)

Platform as a Service enables consumers to deploy their applications and services on the provided platform, consuming the underlying runtime, middleware, and services. The cloud provider provides the services from infrastructure to runtime. The consumers cannot provision virtual machines as they cannot access and control them. Instead, they can only control and manage their applications. This is a comparatively faster method of development and deployment because the consumer can focus on application development and deployment. Examples of Platform as a Service include Azure Automation, Azure SQL, and Azure App Services.

Software as a Service (SaaS)

Software as a Service gives complete control of the service to the cloud provider. The cloud provider provisions, configures, and manages everything from infrastructure to the application. This includes provisioning infrastructure, deploying and configuring applications, and providing application access to the consumer. The consumer does not control or manage the application, and can use and configure only parts of it. They control only their data and configuration. Generally, multi-tenant applications used by multiple consumers, such as Office 365 and Visual Studio Team Services, are examples of SaaS.

Advantages of using cloud computing

There are multiple distinct advantages of using cloud technologies. The major ones are as follows:

Cost effective: Cloud computing helps organizations to reduce the cost of storage, networks, and physical infrastructure.
It also prevents them from having to buy expensive software licenses. The operational cost of managing these infrastructures also falls because less effort and manpower are required.

Unlimited capacity: Cloud provides what is, for practical purposes, unlimited resources to the consumer. This helps ensure that applications are not throttled by limited resource availability.

Elasticity: Cloud computing provides the notion of unlimited capacity, and applications deployed on it can scale up or down on an as-needed basis. When demand for the application increases, cloud can be configured to scale up the infrastructure and application by adding resources. At the same time, it can scale down unnecessary resources during periods of low demand.

Pay as you go: Using cloud eliminates capital expenditure and organizations pay only for what they use, thereby providing maximum return on investment. Organizations do not need to build additional infrastructure to host their application for times of peak demand.

Faster and better: Cloud provides ready-to-use applications and faster provisioning and deployment of environments. Moreover, organizations get better-managed services from their cloud provider with higher service-level agreements.

We will use Azure as our preferred cloud computing provider for the purpose of demonstrating samples and examples. However, you can use any cloud provider that provides complete end-to-end services for DevOps. We will use multiple features and services provided by Azure across IaaS and PaaS. We will consume Operational Insights and Application Insights to monitor our environment and application, which will help capture relevant telemetry for auditing purposes. We will provision Azure virtual machines running Windows and Docker Containers as a hosting platform. We will use Windows Server 2016 as the target operating system for our applications on cloud. We will use Azure Resource Manager (ARM) templates to provision our environments.
We will also use Desired State Configuration and PowerShell as our configuration management platform and tool. We will use Visual Studio Team Services (VSTS), a suite of PaaS services on cloud provided by Microsoft, to set up and implement our end-to-end DevOps practices. Microsoft also provides the same services as part of Team Foundation Server (TFS) as an on-premises solution. Technologies like Pester, DSC, and PowerShell can be deployed and configured to run on any platform. These will help both in the validation of our environment and in the configuration of both application and environment, as part of our configuration management process. Windows Server 2016 is a breakthrough operating system from Microsoft, also referred to as a cloud operating system. We will look into Windows Server 2016 in the following section.

Windows Server 2016

Windows Server has come a long way, all the way from Windows NT to Windows 2000 and 2003, then Windows 2008 (R2) and 2012 (R2), and now Windows Server 2016. Windows NT was the first popular Windows server among enterprises. However, the true enterprise servers were Windows 2000 and Windows 2003. The popularity of Windows Server 2003 was unprecedented and it was widely adopted. With Windows Server 2008 and 2008 R2, the idea of the data center took priority and enterprises with their own data center adopted it. The Windows Server 2008 series was also quite popular among enterprises. In 2010, the Microsoft cloud, Azure, was launched. The first steps towards a cloud operating system were Windows Server 2012 and 2012 R2. They had the blueprints and technology to be seamlessly provisioned on Azure. Now, when Azure and cloud are gaining enormous popularity, Windows Server 2016 is released as a true cloud operating system. The evolution of Windows Server is shown in Figure 2.

Figure 2: Windows Server evolution

Windows Server 2016 is referred to as a cloud operating system. It is built with cloud in mind.
It is also referred to as the first operating system that enables DevOps seamlessly by providing relevant tools and technologies. It makes implementing DevOps simpler and easier through its productivity tools. Let us look briefly into these tools and technologies.

Multiple choices for application platform

Windows Server 2016 comes with many choices of platform for hosting applications. It provides the following:

Windows Server 2016
Nano Server
Windows and Docker Containers
Hyper-V Containers
Nested virtual machines

Windows Server as a hosting platform

Windows Server 2016 can be used in the ways it has always been used, such as hosting applications and providing server functionalities. It provides the services necessary to make applications secure, scalable, and highly available. It also provides virtualization, directory services, certificate services, web server, databases, and more. These services can be consumed by the enterprise's services and applications.

Nano Server

Windows Server provides a new option to host applications and services. This is a new variety of lightweight, scaled-down Windows server containing only the kernel and drivers necessary to run as an operating system. Nano Servers are also known as headless servers. They do not have any graphical user interface, and the only way to interact with and manage them is through remote PowerShell. Out of the box, they do not contain any service or feature. The services need to be added to Nano Servers explicitly before use. So far, they are the most secure servers from Microsoft. They are very lightweight, and their resource requirements and consumption are less than 80% of those of a normal Windows server. The number of services running, the number of ports open, the number of processes running, and the amount of memory and storage required are also less than 80% of those of a normal Windows server.
Even though Nano Server out of the box has just the kernel and drivers, its capabilities can be enhanced by adding features and deploying Windows applications on it.

Windows Containers and Docker

Containers are one of the most revolutionary features added to Windows Server 2016 after Nano Server. With the popularity and adoption of Docker Containers, which primarily run on Linux, Microsoft decided to introduce container services to Windows Server 2016. Containers are a form of operating system virtualization. This means that multiple containers can be deployed on the same operating system and each one of them will share the host operating system kernel. It is the next level of virtualization after server virtualization (virtual machines). Containers give the impression of complete operating system isolation and independence, even though they use the same host operating system underneath. This is possible through the use of namespace isolation and image layering.

Containers are created from images. Images are immutable and cannot be modified. Each image has a base operating system and a series of instructions that are executed against it. Each instruction creates a new image on top of the previous image and contains only the modification. Finally, a writable image is stacked on top of these images. These images are combined into a single image, which can then be used for provisioning containers. A container made up of multiple image layers is shown in Figure 3.

Figure 3: Containers made up of multiple image layers

Namespace isolation helps provide containers with pristine new environments. The containers cannot see the host resources and the host cannot view the container resources. To the application within the container, it appears that a completely new installation of the operating system is available. The containers share the host's memory, CPU, and storage.
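The image layering described above can be made more concrete with a small model. The following is only an illustrative sketch, not how Docker or Windows Containers store images: the `Layer` type, the `flattenLayers` function, and the file paths are all hypothetical names chosen for this example. Each layer records only what it changes, and the container's view is the result of stacking the layers from the base upward.

```typescript
// Hypothetical model: a layer maps file paths to content.
// A null value marks a file that this layer deletes.
type Layer = Readonly<Record<string, string | null>>;

// Stack layers from the base upward; later layers override earlier ones.
// The input layers are never modified, mirroring immutable images.
function flattenLayers(layers: readonly Layer[]): Record<string, string> {
  const view: Record<string, string> = {};
  for (const layer of layers) {
    for (const path of Object.keys(layer)) {
      const content = layer[path];
      if (content === null || content === undefined) {
        delete view[path]; // the layer removed this file
      } else {
        view[path] = content; // the layer added or changed this file
      }
    }
  }
  return view;
}

// Illustrative layers: a base OS, one instruction's changes, and the
// writable layer that belongs to a single container.
const baseOs: Layer = { "/os/kernel-api": "v1" };
const installWebServer: Layer = {
  "/app/server.conf": "port=80",
  "/tmp/setup.log": "installed",
};
const writableLayer: Layer = {
  "/app/server.conf": "port=8080",
  "/tmp/setup.log": null,
};

const containerView = flattenLayers([baseOs, installWebServer, writableLayer]);
console.log(containerView);
```

In this model the container sees its own modified configuration file, while the `installWebServer` layer still holds the original value and could be shared by other containers.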
Containers offer operating system virtualization, which means the containers can host only those operating systems supported by the host operating system. A Windows container cannot run on a Linux host, and a Linux container cannot run on a Windows host operating system.

Hyper-V containers

Another type of container technology Windows Server 2016 provides is Hyper-V Containers. These containers are similar to Windows Containers. They are managed through the same Docker client and extend the same Docker APIs. However, these containers contain their own scaled-down operating system kernel. They do not share the host operating system but have their own dedicated operating system, and their own dedicated memory and CPU, assigned in exactly the same way virtual machines are assigned resources.

Hyper-V Containers bring in a higher level of isolation of containers from the host. While Windows Containers run in full trust on the host operating system, Hyper-V Containers do not have full trust from the host's perspective. It is this isolation that differentiates Hyper-V Containers from Windows Containers. Hyper-V Containers are ideal for hosting applications that might harm the host server and so affect every other container and service on it. Scenarios where users can bring in and execute their own code are examples of such applications. Hyper-V Containers provide adequate isolation and security to ensure that applications cannot access the host resources and change them.

Nested virtual machines

Another breakthrough innovation of Windows Server 2016 is that virtual machines can now host virtual machines. We can deploy multiple virtual machines containing all tiers of an application within a single virtual machine. This is made possible through software-defined networks and storage.
Enabling Microservices

Nano Servers and Containers help provide advanced lightweight deployment options through which we can decompose the entire application into multiple smaller, independent services, each with its own scalability and high availability configuration, and deploy them independently of each other. Microservices help make the entire DevOps lifecycle agile. With microservices, a change to one service does not demand that every other microservice undergo every test validation. Only the changed service needs to be tested rigorously, along with its integration with other services. Compare this to a monolithic application, where even a single small change results in having to test the entire application. Microservices also help in that they require smaller teams for development, testing of a service can happen independently of other services, and deployment can be done for each service in isolation. Continuous Integration, Continuous Deployment, and Continuous Delivery for each service can be executed in isolation rather than compiling, testing, and deploying the whole application every time there is a change.

Reduced maintenance

Because of their intrinsic nature, Windows Nano Servers and Containers are lightweight and quick to provision. They help to quickly provision and configure environments, thereby reducing the overall time needed for Continuous Integration and deployment. Also, these resources can be provisioned on Azure on demand without waiting for hours. Because of their small footprint in terms of size, storage, memory, and features, they need less maintenance. These servers are patched less often and with fewer fixes, they are secure by default, and their applications have less chance of failing, which makes them ideal for operations. The operations team needs to spend fewer hours maintaining these servers compared to normal servers. This reduces the overall cost for the organization and helps DevOps ensure a high-quality delivery.
Configuration management tools

Windows Server 2016 comes with Windows Management Framework 5.0 installed by default. Desired State Configuration (DSC) is the new configuration management platform available out of the box in Windows Server 2016. It has a rich, mature set of features that enables configuration management for both environments and applications. With DSC, the desired state and configuration of environments are authored as part of Infrastructure as Code and executed on every server on a scheduled basis. These configurations check the current state of servers against the documented desired state and bring the servers back to the desired state. DSC is available as part of PowerShell, and PowerShell helps with authoring these configuration documents.

Windows Server 2016 provides a PowerShell unit testing framework known as Pester. Historically, unit testing for infrastructure environments was always missing as a feature. Pester enables the testing of infrastructure provisioned either manually or through Infrastructure as Code using DSC configuration or ARM templates. These tests help with the operational validation of the entire environment, bringing in a high level of cadence and confidence in Continuous Integration and deployment processes.

Deployment and packaging

Package management and the deployment of utilities and tools through automation is a new concept in the Windows world. Package management has been ubiquitous in the Linux world for a long time. Package management helps search, save, install, deploy, upgrade, and remove software packages from multiple sources and repositories on demand. There are public repositories such as Chocolatey and PSGallery available for storing readily deployable packages. Tools such as NuGet can connect to these repositories and help with package management. They also help with the versioning of packages. Applications that rely on a specific package version can download it on an as-needed basis.
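The idea of locating a specific package version across several repositories can be sketched in a few lines. This is a simplified model only: the `Repository` and `PackageInfo` types, the `findPackage` function, and the package and feed names are hypothetical, and real package managers resolve dependencies, version ranges, and trust in far more detail.

```typescript
// Hypothetical package record; real repositories expose much richer metadata.
interface PackageInfo {
  name: string;
  version: string;
}

// Hypothetical repository: a named source holding a list of packages.
interface Repository {
  name: string;
  packages: PackageInfo[];
}

// Search the repositories in order and return the first exact
// name-and-version match, or undefined when no source has it.
function findPackage(
  repositories: Repository[],
  name: string,
  version: string
): { repository: string; pkg: PackageInfo } | undefined {
  for (const repo of repositories) {
    for (const pkg of repo.packages) {
      if (pkg.name === name && pkg.version === version) {
        return { repository: repo.name, pkg };
      }
    }
  }
  return undefined;
}

// Illustrative data; the package names and versions are made up.
const sources: Repository[] = [
  { name: "public-gallery", packages: [{ name: "example-tool", version: "2.0.0" }] },
  { name: "internal-feed", packages: [{ name: "example-tool", version: "1.2.0" }] },
];

const match = findPackage(sources, "example-tool", "1.2.0");
console.log(match ? match.repository : "not found");
```

An application pinned to version 1.2.0 is served from the second source here, even though the first source carries a newer version of the same package.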
Package management helps with the building of environments and application deployment. Package deployment is much easier and faster with this out-of-the-box Windows feature.

Summary

We have covered a lot of ground in this article. We discussed DevOps concepts and mapped technology to them, and saw the impetus DevOps can get from technology. We looked at cloud computing and the different services provided by cloud providers. From there, we went on to look at the benefits Windows Server 2016 brings to DevOps practices and how Windows Server 2016 makes DevOps easier and faster with its native tools and features.

Packt

11 Nov 2016

13 min read

Planning and Structuring Your Test-Driven iOS App

Packt

10 Nov 2016

15 min read

Introduction to JavaScript

In this article by Simon Timms, author of the book Mastering JavaScript Design Patterns - Second Edition, we will explore the history of JavaScript and how it came to be the important language that it is today.

JavaScript is an evolving language that has come a long way from its inception. Possibly more than any other programming language, it has grown and changed with the growth of the World Wide Web. As JavaScript has evolved and grown in importance, the need to apply rigorous methods to its construction has also grown.

The road to JavaScript

We'll never know how language first came into being. Did it slowly evolve from a series of grunts and guttural sounds made during grooming rituals? Perhaps it developed to allow mothers and their offspring to communicate. Both of these are theories, all but impossible to prove. Nobody was around to observe our ancestors during that important period. In fact, the general lack of empirical evidence led the Linguistic Society of Paris to ban further discussions on the topic, seeing it as unsuitable for serious study.

The early days

Fortunately, programming languages have developed in recent history and we've been able to watch them grow and change. JavaScript has one of the more interesting histories of modern programming languages. During what must have been an absolutely frantic 10 days in May of 1995, a programmer at Netscape wrote the foundation for what would grow up to be modern JavaScript. At the time, Netscape was involved in the first of the browser wars with Microsoft. The vision for Netscape was far grander than simply developing a browser. They wanted to create an entire distributed operating system making use of Sun Microsystems' recently released Java programming language. Java was a much more modern alternative to the C++ Microsoft was pushing. However, Netscape didn't have an answer to Visual Basic.
Visual Basic was an easier-to-use programming language, which was targeted at developers with less experience. It avoided some of the difficulties around memory management that make C and C++ notoriously difficult to program in. Visual Basic also avoided strict typing and overall allowed more leeway.

Brendan Eich was tasked with developing Netscape's repartee to VB. The project was initially codenamed Mocha, but was renamed LiveScript before Netscape 2.0 beta was released. By the time the full release was available, Mocha/LiveScript had been renamed JavaScript to tie it into the Java applet integration.

Java applets were small applications which ran in the browser. They had a different security model from the browser itself and so were limited in how they could interact with both the browser and the local system. It is quite rare to see applets these days, as much of their functionality has become part of the browser.

Java was riding a popular wave at the time and any relationship to it was played up. The name has caused much confusion over the years. JavaScript is a very different language from Java. JavaScript is an interpreted language with loose typing, which runs primarily in the browser. Java is a language that is compiled to bytecode, which is then executed on the Java Virtual Machine. It has applicability in numerous scenarios, from the browser (through the use of Java applets), to the server (Tomcat, JBoss, and so on), to full desktop applications (Eclipse, OpenOffice, and so on). In most laypersons' minds, the confusion remains.

JavaScript turned out to be really quite useful for interacting with the web browser. It was not long until Microsoft had also adopted JavaScript into Internet Explorer to complement VBScript. The Microsoft implementation was known as JScript. By late 1996, it was clear that JavaScript was going to be the winning web language for the near future.
In order to limit the amount of language deviation between implementations, Sun and Netscape began working with the European Computer Manufacturers Association (ECMA) to develop a standard with which future versions of JavaScript would need to comply. The standard was released very quickly (at least in terms of how rapidly standards organizations move), in July of 1997. On the off chance that you have not seen enough names yet for JavaScript, the standard version was called ECMAScript, a name which still persists in some circles.

Unfortunately, the standard only specified the very core parts of JavaScript. With the browser wars raging, it was apparent that any vendor that stuck with only the basic implementation of JavaScript would quickly be left behind. At the same time, there was much work going on to establish a standard Document Object Model (DOM) for browsers. The DOM was, in effect, an API for a web page that could be manipulated using JavaScript.

For many years, every JavaScript script would start by attempting to determine the browser on which it was running. This would dictate how to address elements in the DOM, as there were dramatic deviations between each browser. The spaghetti of code that was required to perform simple actions was legendary. I remember reading a year-long 20-part series on developing a Dynamic HTML (DHTML) drop-down menu such that it would work on both Internet Explorer and Netscape Navigator. The same functionality can now be achieved with pure CSS without even having to resort to JavaScript.

DHTML was a popular term in the late 1990s and early 2000s. It really referred to any web page that had some sort of dynamic content that was executed on the client side. It has fallen out of use, as the popularity of JavaScript has made almost every page a dynamic one.

Fortunately, the efforts to standardize JavaScript continued behind the scenes. Versions 2 and 3 of ECMAScript were released in 1998 and 1999.
It looked like there might finally be some agreement between the various parties interested in JavaScript. Work began in early 2000 on ECMAScript 4, which was to be a major new release.

A pause

Then, disaster struck. The various groups involved in the ECMAScript effort had major disagreements about the direction JavaScript was to take. Microsoft seemed to have lost interest in the standardization effort. It was somewhat understandable, as it was around that time that Netscape self-destructed and Internet Explorer became the de facto standard. Microsoft implemented parts of ECMAScript 4 but not all of it. Others implemented more fully featured support, but without the market leader on board, developers didn't bother using them.

Years passed without consensus and without a new release of ECMAScript. However, as frequently happens, the evolution of the Internet could not be stopped by a lack of agreement between major players. Libraries such as jQuery, Prototype, Dojo, and MooTools papered over the major differences in browsers, making cross-browser development far easier. At the same time, the amount of JavaScript used in applications increased dramatically.

The way of Gmail

The turning point was, perhaps, the release of Google's Gmail application in 2004. Although XMLHttpRequest, the technology behind Asynchronous JavaScript and XML (AJAX), had been around for about five years when Gmail was released, it had not been widely used. When Gmail was released, I was totally knocked off my feet by how smooth it was. We've grown used to applications that avoid full reloads, but at the time, it was a revolution. To make applications like that work, a great deal of JavaScript is needed.

AJAX is a method by which small chunks of data are retrieved from the server by a client instead of refreshing the entire page. The technology allows for more interactive pages that avoid the jolt of full page reloads.
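The AJAX pattern can be sketched without a real server or browser. The following is a minimal illustration under stated assumptions: the `Transport` type stands in for whatever performs the request (in a browser, that role would be played by XMLHttpRequest), and the `/messages?since=` URL, the `MessageSummary` shape, and the function names are hypothetical.

```typescript
// Hypothetical shape of one item returned by the server.
interface MessageSummary {
  id: number;
  subject: string;
}

// The request mechanism is passed in so the sketch stays self-contained.
// It calls onDone with the response body once the request completes.
type Transport = (url: string, onDone: (body: string) => void) => void;

// Ask the server only for messages newer than sinceId and hand the
// merged list to the caller; nothing else on the page needs reloading.
function refreshInbox(
  transport: Transport,
  current: MessageSummary[],
  sinceId: number,
  onUpdated: (updated: MessageSummary[]) => void
): void {
  transport("/messages?since=" + sinceId, (body) => {
    const fresh = JSON.parse(body) as MessageSummary[];
    onUpdated(current.concat(fresh));
  });
}

// A stand-in server that answers immediately with one new message.
const fakeServer: Transport = (_url, onDone) => {
  onDone(JSON.stringify([{ id: 2, subject: "Lunch on Friday?" }]));
};

let inbox: MessageSummary[] = [{ id: 1, subject: "Welcome" }];
refreshInbox(fakeServer, inbox, 1, (updated) => {
  inbox = updated; // only the message list changes
});
console.log(inbox.length);
```

The point of the design is that the client asks for a small piece of data and updates only the part of the page that depends on it, rather than requesting a whole new page.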
The popularity of Gmail was the trigger for a change that had been brewing for a while. Increasing JavaScript acceptance and standardization pushed us past the tipping point for the acceptance of JavaScript as a proper language. Up until that point, much of the use of JavaScript was for performing minor changes to the page and for validating form input. I joke with people that in the early days of JavaScript, the only function name which was used was Validate().

Applications such as Gmail that have a heavy reliance on AJAX and avoid full page reloads are known as Single Page Applications or SPAs. By minimizing the changes to the page contents, users have a more fluid experience. By transferring only a JavaScript Object Notation (JSON) payload instead of HTML, the amount of bandwidth required is also minimized. This makes applications appear to be snappier. In recent years, there have been great advances in frameworks that ease the creation of SPAs. AngularJS, Backbone.js, and Ember are all Model View Controller style frameworks. They have gained great popularity in the past two to three years and provide some interesting use of patterns. These frameworks are the evolution of years of experimentation with JavaScript best practices by some very smart people.

JSON is a human-readable serialization format for JavaScript. It has become very popular in recent years, as it is easier and less cumbersome than previously popular formats such as XML. It lacks many of the companion technologies and strict grammatical rules of XML, but makes up for it in simplicity.

At the same time as the frameworks using JavaScript are evolving, the language is too. 2015 saw the release of a much-vaunted new version of JavaScript that had been under development for some years. Initially called ECMAScript 6, the final name ended up being ECMAScript 2015. It brought with it some great improvements to the ecosystem. Browser vendors are rushing to adopt the standard.
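To give a flavor of what ECMAScript 2015 added, here is a short sketch that uses a handful of its features: classes, default parameter values, template literals, block-scoped `const`, arrow functions, and destructuring. It is written in TypeScript, so the type annotations are not part of ECMAScript 2015 itself, and the `Greeter` class and sample names are invented for illustration.

```typescript
// Class syntax arrived with ECMAScript 2015.
// The ": string" annotations are TypeScript, not ECMAScript 2015.
class Greeter {
  greeting: string;

  // Default parameter values are an ECMAScript 2015 feature.
  constructor(greeting: string = "Hello") {
    this.greeting = greeting;
  }

  greet(name: string): string {
    // Template literal instead of string concatenation.
    return `${this.greeting}, ${name}!`;
  }
}

// const gives block scoping; the arrow function keeps the callback short.
const names = ["Ada", "Brendan"];
const greeter = new Greeter();
const greetings = names.map((name) => greeter.greet(name));

// Destructuring pulls values out of an array.
const [first, second] = greetings;
console.log(first);
console.log(second);
```

Before 2015, the same code would typically have relied on constructor functions, prototype assignments, and string concatenation.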
Because of the complexity of adding new language features to the code base, coupled with the fact that not everybody is on the cutting edge of browsers, a number of other languages that transcompile to JavaScript are gaining popularity.

CoffeeScript is a Python-like language that strives to improve the readability and brevity of JavaScript.

Dart, developed by Google, is being pushed as an eventual replacement for JavaScript. Its construction addresses some of the optimizations that are impossible in traditional JavaScript. Until a Dart runtime is sufficiently popular, Google provides a Dart-to-JavaScript transcompiler.

TypeScript is a Microsoft project that adds some ECMAScript 2015 and even some ECMAScript 201X syntax, as well as an interesting typing system, to JavaScript. It aims to address some of the issues that large JavaScript projects present.

The point of this discussion about the history of JavaScript is twofold. First, it is important to remember that languages do not develop in a vacuum. Both human languages and computer programming languages mutate based on the environments in which they are used. It is a popularly held belief that the Inuit people have a great number of words for "snow", as it was so prevalent in their environment. This may or may not be true, depending on your definition of the word and exactly who makes up the Inuit people. There are, however, a great number of examples of domain-specific lexicons evolving to meet the requirements for exact definitions in narrow fields. One need look no further than a specialty cooking store to see the great number of variants of items which a layperson such as myself would call a pan.

The Sapir–Whorf hypothesis is a hypothesis within the linguistics domain, which suggests that not only is language influenced by the environment in which it is used, but also that language influences its environment.
Also known as linguistic relativity, the theory is that one's cognitive processes differ based on how the language is constructed. Behavioral economist Keith Chen has proposed a fascinating example of this. In a very highly-viewed TED talk, Dr. Chen suggested that there is a strong positive correlation between speaking a language that lacks a future tense and having a high savings rate (https://www.ted.com/talks/keith_chen_could_your_language_affect_your_ability_to_save_money/transcript). The hypothesis at which Dr. Chen arrived is that when your language does not have a strong sense of connection between the present and the future, this leads to more reckless behavior in the present.

Thus, understanding the history of JavaScript puts one in a better position to understand how and where to make use of JavaScript.

The second reason I explored the history of JavaScript is that it is absolutely fascinating to see how quickly such a popular tool has evolved. At the time of writing, it has been about 20 years since JavaScript was first built and its rise to popularity has been explosive. What more exciting thing is there than to work in an ever-evolving language?

JavaScript everywhere

Since the Gmail revolution, JavaScript has grown immensely. The renewed browser wars, which pit Internet Explorer and Edge against Chrome and Firefox, have led to the building of a number of very fast JavaScript interpreters. Brand new optimization techniques have been deployed and it is not unusual to see JavaScript compiled to machine-native code for the added performance it gains. However, as the speed of JavaScript has increased, so has the complexity of the applications built using it.

JavaScript is no longer simply a language for manipulating the browser, either. The JavaScript engine behind the popular Chrome browser has been extracted and is now at the heart of a number of interesting projects such as Node.js.
Node.js started off as a highly asynchronous method of writing server-side applications. It has grown greatly and has a very active community supporting it. A wide variety of applications have been built using the Node.js runtime. Everything from build tools to editors has been built on the base of Node.js. Recently, the JavaScript engine for Microsoft Edge, ChakraCore, was also open sourced and can be embedded in Node.js as an alternative to Google's V8. SpiderMonkey, the Firefox equivalent, is also open source and is making its way into more tools.

JavaScript can even be used to control microcontrollers. The Johnny-Five framework is a programming framework for the very popular Arduino. It brings a much simpler approach to programming devices than the traditional low-level languages used for programming these devices. Using JavaScript and Arduino opens up a world of possibilities, from building robots to interacting with real-world sensors.

All of the major smartphone platforms (iOS, Android, and Windows Phone) have an option to build applications using JavaScript. The tablet space is much the same, with tablets supporting programming using JavaScript. Even the latest version of Windows provides a mechanism for building applications using JavaScript.

JavaScript is becoming one of the most important languages in the world. Although language usage statistics are notoriously difficult to calculate, every single source which attempts to develop a ranking puts JavaScript in the top 10:

Language index        Rank of JavaScript
Langpop.com           4
Statisticbrain.com    4
Codeval.com           6
TIOBE                 8

What is more interesting is that most of these rankings suggest that the usage of JavaScript is on the rise.

The long and short of it is that JavaScript is going to be a major language in the next few years. More and more applications are being written in JavaScript and it is the lingua franca for any sort of web development.
Jeff Atwood, a developer of the popular Stack Overflow website, created Atwood's Law regarding the wide adoption of JavaScript:

"Any application that can be written in JavaScript, will eventually be written in JavaScript"
- Atwood's Law, Jeff Atwood

This insight has been proven to be correct time and time again. There are now compilers, spreadsheets, word processors, you name it, all written in JavaScript.

As the applications which make use of JavaScript increase in complexity, the developer may stumble upon many of the same issues as have been encountered in traditional programming languages: how can we write this application to be adaptable to change?

This brings us to the need for properly designing applications. No longer can we simply throw a bunch of JavaScript into a file and hope that it works properly. Nor can we rely on libraries such as jQuery to save ourselves. Libraries can only provide additional functionality and contribute nothing to the structure of an application. At least some attention must now be paid to how to construct the application to be extensible and adaptable. The real world is ever-changing, and any application that is unable to change to suit the changing world is likely to be left in the dust. Design patterns provide some guidance in building adaptable applications, which can shift with changing business needs.

Summary

JavaScript has an interesting history and is really coming of age. With server-side JavaScript taking off and large JavaScript applications becoming common, there is a need for more diligence in building JavaScript applications.
For more information on JavaScript, you can check the following books by Packt:

Mastering JavaScript Promises: https://www.packtpub.com/application-development/mastering-javascript-promises
Mastering JavaScript High Performance: https://www.packtpub.com/web-development/mastering-javascript-high-performance
JavaScript : Functional Programming for JavaScript Developers: https://www.packtpub.com/web-development/javascript-functional-programming-javascript-developers

Resources for Article:

Further resources on this subject:

API with MongoDB and Node.js [article]
Tips & Tricks for Ext JS 3.x [article]
Saying Hello! [article]

Victor Mejia

10 Nov 2016

5 min read

Testing Components with Service Dependencies

It is very common for your Angular 2 components to depend on a service that performs actions, such as fetching data. In this post we will look at testing components with service dependencies, and at testing asynchronous actions. We will be using Jasmine for our tests. If you have not read Getting Started Testing Angular 2 Components, I strongly suggest you do so before continuing.

Angular 2 Component with a Service Dependency

Continuing with our contact manager application, we need to have a ContactService that fetches data from a server:

    import { Injectable } from '@angular/core';
    import { Http } from '@angular/http';
    import 'rxjs/add/operator/map';

    @Injectable()
    export class ContactService {
      constructor(private http: Http) { }

      getContacts() {
        return this.http.get('/contacts.json')
          .map(res => res.json());
      }
    }

The Http service is injected here, and TypeScript will automatically assign the injected service to this.http. With this service ready to use, we are now ready to inject it into our ContactsComponent:

    import { Component, OnInit } from '@angular/core';
    import { ContactService } from '../shared/contact.service';

    @Component({
      selector: 'contacts',
      template: `
        <button (click)="getContacts()">Get Contacts</button>
        <profile *ngFor="let profile of contacts" [info]="profile"></profile>
      `
    })
    export class ContactsComponent implements OnInit {
      contacts: Array<any>;

      constructor(private contactService: ContactService) { }

      ngOnInit() { }

      getContacts() {
        this.contactService.getContacts()
          .subscribe(data => {
            this.contacts = data;
          });
      }
    }

We have an action set up, so when we click on the button, we make a call to our ContactService to fetch the data and assign the result. Once the call is resolved, the data will display.

Setting up your unit test

What we have to keep in mind is that we want to test our components in isolation.
What this means is that instead of using the actual ContactService implementation, we create a MockContactService that returns mock data (an array of Profile objects).

    let mockData = [
      { name: 'Victor Mejia', email: 'victor.mejia@example.com', phone: '123-456-7890' }
    ];

    class MockContactService {
      getContacts(url) {
        return Observable.create((observer: Observer<Array<Profile>>) => {
          observer.next(mockData);
        });
      }
    }

When configuring our testing module, we add a new property, providers, where we specify the usage of our mock service:

    TestBed.configureTestingModule({
      declarations: [ContactsComponent],
      providers: [
        { provide: ContactService, useClass: MockContactService }
      ]
    });

We can now go ahead and get handles on fixture, component, and element:

    import { TestBed, async, ComponentFixture } from '@angular/core/testing';
    import { ContactsComponent } from './contacts.component';
    import { ContactService } from '../shared/contact.service';
    import { Profile } from '../shared/profile.model';
    import { Observable } from 'rxjs/Observable';
    import { Observer } from 'rxjs/Observer';

    let mockData = [
      { name: 'Victor Mejia', email: 'victor.mejia@example.com', phone: '123-456-7890' }
    ];

    class MockContactService {
      getContacts(url) {
        return Observable.create((observer: Observer<Array<Profile>>) => {
          observer.next(mockData);
        });
      }
    }

    let fixture: ComponentFixture<ContactsComponent>;
    let component: ContactsComponent;
    let element: HTMLElement;

    describe('Component: Contacts', () => {
      beforeEach(async(() => {
        TestBed.configureTestingModule({
          declarations: [ContactsComponent],
          providers: [
            { provide: ContactService, useClass: MockContactService }
          ]
        });

        TestBed.compileComponents()
          .then(() => {
            fixture = TestBed.createComponent(ContactsComponent);
            component = fixture.debugElement.componentInstance;
            element = fixture.debugElement.nativeElement;
          });
      }));
    });

Ensuring calls to our service

A good test to always perform is to ensure that your actions are making the correct calls to your service.
To do so, we can spy on the getContacts() function on the service, call the component action, and then ensure that the function was indeed called:

    describe('getContacts', () => {
      it('should make a call to contactService.getContacts()', () => {
        spyOn(component.contactService, 'getContacts').and.callThrough();
        component.getContacts();
        expect(component.contactService.getContacts).toHaveBeenCalled();
      });
    });

Ensuring data is set

A follow-up test to be performed is to ensure that the data is being set on the component after the call to the API is resolved. Since our call to getContacts() is performing an asynchronous action, we should use the async function in the it:

    it('should set the contacts property after fetching data', async(() => {
      ...
    }));

It wraps the test function in an asynchronous "test zone". Basically, the test automatically completes when the asynchronous actions are complete.

Next, we can make a call to component.getContacts(). However, we don't want to run our expectations until after that call has been resolved. There is a useful function we can use on our fixture, fixture.whenStable(). This returns a promise that resolves after asynchronous activity has finished. Our test should now look as follows:

    it('should set the contacts property after fetching data', async(() => {
      component.getContacts();
      fixture.whenStable().then(() => {
        expect(component.contacts).toEqual(mockData);
      });
    }));

We simply run a check to ensure that the contacts property is set to what the API call returns.

Finer Async Control

There are times when you want finer control, such as when dealing with time intervals. To do so, we can use fakeAsync in conjunction with the tick() function to simulate the passage of time.

    it('asynchronous timed test...', fakeAsync(() => {
      component.asyncActionWithTime();
      tick(2000); // "advance" 2 seconds
      expect(...).toBe(...);
    }));

Conclusion

Angular 2 has wonderful APIs that make it really easy to test your components.
We have seen how to test components with service dependencies, along with asynchronous actions. Time to start writing tests!
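The isolation idea used throughout this post does not depend on Angular itself. The following is a minimal, framework-free TypeScript sketch of it, not the TestBed API: SimpleObservable, ContactSource, ContactsViewModel, and CountingMockService are hypothetical names invented for illustration, and the hand-rolled call counter only stands in for what Jasmine's spyOn records.

```typescript
// Hypothetical, framework-free stand-ins; none of these names come from Angular or RxJS.
interface Profile {
  name: string;
  email: string;
  phone: string;
}

// A tiny observable-like type: subscribing hands the listener one value.
interface SimpleObservable<T> {
  subscribe(next: (value: T) => void): void;
}

// The only thing the component needs to know about its service.
interface ContactSource {
  getContacts(): SimpleObservable<Profile[]>;
}

// Plays the role of ContactsComponent: it depends on the interface, not on a concrete service.
class ContactsViewModel {
  contacts: Profile[] = [];
  private source: ContactSource;

  constructor(source: ContactSource) {
    this.source = source;
  }

  getContacts(): void {
    this.source.getContacts().subscribe(data => {
      this.contacts = data;
    });
  }
}

const mockData: Profile[] = [
  { name: 'Victor Mejia', email: 'victor.mejia@example.com', phone: '123-456-7890' }
];

// Plays the role of MockContactService, with a call counter in place of a Jasmine spy.
class CountingMockService implements ContactSource {
  calls = 0;

  getContacts(): SimpleObservable<Profile[]> {
    this.calls += 1;
    return {
      subscribe: next => next(mockData) // emits synchronously; no server is involved
    };
  }
}

const mockService = new CountingMockService();
const viewModel = new ContactsViewModel(mockService);
viewModel.getContacts();
```

After viewModel.getContacts() runs, mockService.calls is 1 and viewModel.contacts holds mockData, which mirrors the two Jasmine specs shown earlier. Because the mock emits synchronously, nothing needs to be awaited here; with a real asynchronous service, the async and whenStable() helpers described in the post would still be needed.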

Packt

10 Nov 2016

7 min read

Managing Users and Groups

In this article, we will cover the following recipes:

Creating user account
Creating user accounts in batch mode
Creating a group

Introduction

In this article by Uday Sawant, the author of the book Ubuntu Server Cookbook, you will see how to add new users to the Ubuntu server and update existing users. You will get to know the default settings for new users and how to change them.

Creating user account

While installing Ubuntu, we add a primary user account on the server; if you are using the cloud image, it comes preinstalled with the default user. This single user is enough to get all tasks done in Ubuntu. There are times when you need to create more restrictive user accounts. This recipe shows how to add a new user to the Ubuntu server.

Getting ready

You will need super user or root privileges to add a new user to the Ubuntu server.

How to do it…

Follow these steps to create the new user account:

1. To add a new user in Ubuntu, enter the following command in your shell:

    $ sudo adduser bob

2. Enter your password to complete the command with sudo privileges.
3. Now enter a password for the new user.
4. Confirm the password for the new user.
5. Enter the full name and other information about the new user; you can skip this part by pressing the Enter key.
6. Enter Y to confirm that the information is correct.
7. This should have added the new user to the system. You can confirm this by viewing the file /etc/passwd.

How it works…

In Linux systems, the adduser command is a higher-level command to quickly add a new user to the system.
Since adduser requires root privileges, we need to use sudo along with the command. adduser completes the following operations:

Adds a new user
Adds a new default group with the same name as the user
Chooses UID (user ID) and GID (group ID) conforming to the Debian policy
Creates a home directory with skeletal configuration (template) from /etc/skel
Creates a password for the new user
Runs the user script, if any

If you want to skip the password prompt and finger information while adding the new user, use the following command:

    $ sudo adduser --disabled-password --gecos "" username

Alternatively, you can use the useradd command as follows:

    $ sudo useradd -s <SHELL> -m -d <HomeDir> -g <Group> UserName

Where:

-s specifies the default login shell for the user
-d sets the home directory for the user
-m creates a home directory if one does not already exist
-g specifies the default group name for the user

Creating a user with the useradd command does not set a password for the user account. You can set or change the user password with the following command:

    $ sudo passwd bob

This will change the password for the user account bob. Note that if you omit the username from the preceding command, you will end up changing the password of the root account.

There's more…

With adduser, you can do five different tasks:

Add a normal user
Add a system user with the --system option
Add a user group with the --group option and without the --system option
Add a system group when called with the --system option
Add an existing user to an existing group when called with two non-option arguments

Check out the manual page man adduser to get more details.

You can also configure various default settings for the adduser command. A configuration file /etc/adduser.conf can be used to set the default values to be used by the adduser, addgroup, and deluser commands.
Key-value pairs in this configuration file set various default values, including the home directory location, the skel directory structure to be used, default groups for new users, and so on. Check the manual page for more details on adduser.conf with the following command:

    $ man adduser.conf

See also

Check out the command useradd, a low-level command to add a new user to the system
Check out the command usermod, a command to modify a user account
See why every user has his own group at: http://unix.stackexchange.com/questions/153390/why-does-every-user-have-his-own-group

Creating user accounts in batch mode

In this recipe, we will see how to create multiple user accounts in batch mode without using any external tool.

Getting ready

You will need a user account with sudo or root privileges.

How to do it...

Follow these steps to create user accounts in batch mode:

1. Create a new text file users.txt with the following command:

    $ touch users.txt

2. Change file permissions with the following command:

    $ chmod 600 users.txt

3. Open users.txt with GNU nano and add the user account details:

    $ nano users.txt

4. Press Ctrl + O to save the changes.
5. Press Ctrl + X to exit GNU nano.
6. Enter $ sudo newusers users.txt to import all users listed in the users.txt file.
7. Check /etc/passwd to confirm that the users are created.

How it works…

We created a database of user details listed in the same format as the passwd file. The default format for each row is as follows:

    username:passwd:uid:gid:full name:home_dir:shell

Where:

username: This is the login name of the user. If a user exists, information for user will be changed; otherwise, a new user will be created.
password: This is the password of the user.
uid: This is the uid of the user. If empty, a new uid will be assigned to this user.
gid: This is the gid for the default group of user. If empty, a new group will be created with the same name as the username.
full name: This information will be copied to the gecos field.
home_dir: This defines the home directory of the user.
If empty, a new home directory will be created with ownership set to the new or existing user.
shell: This is the default login shell for the user.

The newusers command reads each row and updates the user information if the user already exists, or it creates a new user.

We made the users.txt file accessible to the owner only. This is to protect this file, as it contains the users' login names and passwords in unencrypted format.

Creating a group

A group is a way to organize and administer user accounts in Linux. Groups are used to collectively assign rights and permissions to multiple user accounts.

Getting ready

You will need super user or root privileges to add a group to the Ubuntu server.

How to do it...

1. Enter the following command to add a new group:

    $ sudo addgroup guest

2. Enter your password to complete addgroup with root privileges.

How it works…

Here, we are simply adding a new group guest to the server. As addgroup needs root privileges, we need to use sudo along with the command. After creating a new group, addgroup displays the GID of the new group.

There's more…

Similar to adduser, you can use addgroup in different modes:

Add a normal group when used without any options
Add a system group with the --system option
Add an existing user to an existing group when called with two non-option arguments

See also

Check out groupadd, a low-level utility to add a new group to the server

Summary

In this article, we have discussed how to create a user account, how to create a group, and how to create user accounts in batch mode.

Resources for Article:

Further resources on this subject:

Directory Services [article]
Getting Started with Ansible [article]
Lync 2013 Hybrid and Lync Online [article]
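The row format described in the batch-mode recipe can also be generated rather than typed by hand. The following TypeScript sketch is only an illustration under stated assumptions: NewUserEntry and formatNewusersLine are hypothetical names, the field order is the one given in the recipe (username:passwd:uid:gid:full name:home_dir:shell), and the only validation performed is rejecting colons and line breaks, which would corrupt a row. The sample user and password are made up.

```typescript
// Hypothetical helper types for illustration; not part of any Ubuntu tool.
interface NewUserEntry {
  username: string;
  password: string;
  uid?: number;      // left empty so that a new uid is assigned
  gid?: number;      // left empty so that a group named after the user is created
  fullName?: string; // copied to the gecos field
  homeDir?: string;
  shell?: string;
}

// Builds one row in the order username:passwd:uid:gid:full name:home_dir:shell.
function formatNewusersLine(entry: NewUserEntry): string {
  if (entry.username === '') {
    throw new Error('username must not be empty');
  }
  const fields: string[] = [
    entry.username,
    entry.password,
    entry.uid === undefined ? '' : String(entry.uid),
    entry.gid === undefined ? '' : String(entry.gid),
    entry.fullName ?? '',
    entry.homeDir ?? '',
    entry.shell ?? ''
  ];
  for (const field of fields) {
    // A colon or line break inside a field would shift or split the columns.
    if (/[:\r\n]/.test(field)) {
      throw new Error('Field contains a colon or line break: ' + JSON.stringify(field));
    }
  }
  return fields.join(':');
}

// Made-up sample data; uid and gid are omitted on purpose.
const line = formatNewusersLine({
  username: 'bob',
  password: 'ChangeMe123',
  fullName: 'Bob Example',
  homeDir: '/home/bob',
  shell: '/bin/bash'
});
// line is "bob:ChangeMe123:::Bob Example:/home/bob:/bin/bash"
```

Each generated row would go on its own line of users.txt. Since the passwords are in plain text, the chmod 600 step from the recipe still applies to any file produced this way.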

AJ Webb

10 Nov 2016

11 min read

Connecting React to Redux and Firebase – Part 2

This is the second part in a series on using Redux and Firebase with React. If you haven't read through the first part, then you should go back and do so, since this post will build on the other. If you have gone through the first part, then you're in the right place. In this final part of this two-part series, you will be creating actions in your app to update the store. Then you will create a Firebase app and set up async actions that will subscribe to Firebase. Whenever the data on Firebase changes, the store will automatically update and your app will receive the latest state. After all of that is wired up, you'll create a quick user interface. Let's get started.

Creating Actions in Redux

For your actions, you are going to create a new file inside of src/:

    [~/code/yak]$ touch src/actions.js

Inside of actions.js, you will create your first action; an action in Redux is simply an object that contains a type and a payload. The type allows you to catch the action inside of your reducer and then use the payload to update state. The type can be a string literal or a constant. I recommend using constants for the same reasons given by the Redux team. In this app you will set up some constants to follow this practice.

    [~/code/yak]$ touch src/constants.js

Create your first constant in src/constants.js:

    export const SEND_MESSAGE = 'CHATS:SEND_MESSAGE';

It is good practice to namespace your constants, as the CHATS prefix does in the constant above. The namespace ties the constant to the CHATS portion of the application and helps document your constants. Now go ahead and use that constant to create an action. In src/actions.js, add:

    import { SEND_MESSAGE } from './constants';

    export function sendMessage(message) {
      return {
        type: SEND_MESSAGE,
        payload: message
      };
    }

You now have your first action! You just need to add a way to handle that action in your reducer. So open up src/reducers.js and do just that. First import the constant.
    import { SEND_MESSAGE } from './constants';

Then inside of the yakApp function, you'll handle different actions:

    export function yakApp(state = initialState, action) {
      switch(action.type) {
        case SEND_MESSAGE:
          return Object.assign({}, state, {
            messages: state.messages.concat([action.payload])
          });
        default:
          return state;
      }
    }

A couple of things are happening here. You'll notice that there is a default case that returns state; your reducer must return some sort of state. If you don't return state, your app will stop working. Redux does a good job of letting you know what's happening, but it is still good to know. Another thing to notice is that the SEND_MESSAGE case does not mutate the state. Instead it creates a new state and returns it. Mutating the state can result in bad side effects that are hard to debug; do your very best to never mutate the state. You should also avoid mutating the arguments passed to a reducer, performing side effects, and calling any non-pure functions within your reducer. For the most part, your reducer is set up to take the new state and return it in conjunction with the old state.

Now that you have an action and a reducer that is handling that action, you are ready for your component to interact with them. In src/App.js add an input and a button:

    <p className="App-intro">
      <input type="text" />{' '}
      <button>Send Message</button>
    </p>

Now that you have the input and button in, you're going to add two things. The first is a function that runs when the button is clicked. The second is a reference on the input so that the function can easily find the value of your input.

    <p className="App-intro">
      <input type="text" ref={(i) => this.message = i}/>{' '}
      <button onClick={this.handleSendMessage.bind(this)}>Send Message</button>
    </p>

Now that you have those set, you will need to add the sendMessage function, but you'll also need to be able to dispatch an action.
So you'll want to map your actions to props, similar to how you mapped state to props in the previous guide. At the end of the file, under the mapStateToProps function, add the following function:

    function mapDispatchToProps(dispatch) {
      return {
        sendMessage: (msg) => dispatch(sendMessage(msg))
      };
    }

Then you'll need to add the mapDispatchToProps function to the export statement:

    export default connect(mapStateToProps, mapDispatchToProps)(App);

And you'll need to import the sendMessage action into src/App.js:

    import { sendMessage } from './actions';

Finally you'll need to create the handleSendMessage method inside of your App class, just above the render method:

    handleSendMessage() {
      const { sendMessage } = this.props;
      const { value } = this.message;

      if (!value) {
        return false;
      }

      sendMessage(value);
      this.message.value = '';
    }

If you still have a console.log statement inside of your render method, from the last guide, you should see your message being added to the array each time you click on the button. The final piece of UI that you need to create is a list of all the messages you've received. Add the following to the render method in src/App.js, just below the <p/> that contains your input:

    {messages.map((message, i) => (
      <p key={i} style={{textAlign: 'left', padding: '0 10px'}}>{message}</p>
    ))}

Now you have a crude chat interface; you can improve it later.

Set up Firebase

If you've never used Firebase before, you'll first need a Google account. If you don't have a Google account, sign up for one; then you'll be able to create a new Firebase project. Head over to Firebase and create a new project. If you have trouble naming it, try YakApp_[YourName]. After you have created your Firebase project, you'll be taken to the project. There is a button that says "Add Firebase to your web app"; click on it and you'll be able to get the configuration information that you will need for your app to be able to work with Firebase.
A dialog will open showing the configuration for your project. You will only be using the database for this project, so you'll only need a portion of the config. Keep the config handy while you prepare the app to use Firebase. First add the Firebase package to your app:

    [~/code/yak]$ npm install --save firebase

Open src/actions.js and import firebase:

    import firebase from 'firebase';

You will only be using Firebase in your actions; that is the reason you are importing it here. Once imported, you'll need to initialize Firebase. After the import section, add your Firebase config (the one from the dialog), initialize Firebase, and create a reference to messages:

    const firebaseConfig = {
      apiKey: '[YOUR API KEY]',
      databaseURL: '[YOUR DATABASE URL]'
    };

    firebase.initializeApp(firebaseConfig);

    const messages = firebase.database().ref('messages');

You'll notice that you only need the apiKey and databaseURL for now. Eventually you might want to add auth and storage, and other facets of Firebase, but for now you only need database. Now that Firebase is set up and initialized, you can use it to store the chat history.

One last thing before you leave the Firebase console: Firebase automatically sets the database rules to require users to be logged in. That is a great idea, but authentication is outside of the scope of this post. So you'll need to turn that off in the rules. In the Firebase console, click on the "Database" navigation item on the left side of the page; now there should be a tab bar with a "Rules" option. Click on "Rules" and replace what is in the textbox with:

    {
      "rules": {
        ".read": true,
        ".write": true
      }
    }

Create subscriptions

In order to subscribe to Firebase, you will need a way to send async actions. So you'll need another package to help with that. Go ahead and install redux-thunk.

    [~/code/yak]$ npm install --save redux-thunk

After it's finished installing, you'll need to add it as middleware to your store.
That means it's time to head back to src/index.js and add the extra parameters to the createStore function. First import redux-thunk into src/index.js and import another method from redux itself.

    import { createStore, applyMiddleware } from 'redux';
    import thunk from 'redux-thunk';

Now update the createStore call to be:

    const store = createStore(yakApp, undefined, applyMiddleware(thunk));

You are now ready to create async actions. In order to create more actions, you're going to need more constants. Head over to src/constants.js and add the following constant.

    export const RECEIVE_MESSAGE = 'CHATS:RECEIVE_MESSAGE';

This is the only constant you will need for now; after this guide, you should create edit and delete actions. At that point, you'll need more constants for those actions. For now, only concern yourself with adding and subscribing. You'll need to refactor your actions and reducer to handle these new features. In src/actions.js, refactor the sendMessage action to be an async action. The way redux-thunk works is that it intercepts anything that is dispatched and inspects it. If what was dispatched is a function rather than a plain object, redux-thunk stops it from reaching the reducer and runs the function. If it is a plain object, redux-thunk ignores it and allows it to continue to the reducer. Change sendMessage to look like the following:

    export function sendMessage(message) {
      return function() {
        messages.push(message);
      };
    }

Now when you type a message in and hit the "Send Message" button, the message will get stored in Firebase. Try it out!

UH OH! There's a problem! You are adding messages to Firebase but they aren't showing up in your app anymore! That's ok. You can fix that! You'll need to create a few more actions though, and refactor your reducer. First off, you can delete one of your constants.
You no longer need the constant from earlier:

    export const SEND_MESSAGE = 'CHATS:SEND_MESSAGE';

Go ahead and remove it; that leaves you with only one constant. Change your constant import in both src/actions.js and src/reducers.js:

    import { RECEIVE_MESSAGE } from './constants';

Now in actions, add the following action:

    function receiveMessage(message) {
      return {
        type: RECEIVE_MESSAGE,
        payload: message
      };
    }

That should look familiar; it's almost identical to the original sendMessage action. You'll also need to rename the action that your reducer is looking for. So now your reducer function should look like this:

    export function yakApp(state = initialState, action) {
      switch(action.type) {
        case RECEIVE_MESSAGE:
          return Object.assign({}, state, {
            messages: state.messages.concat([action.payload])
          });
        default:
          return state;
      }
    }

Before the list starts to show up again, you'll need to create a subscription to Firebase. It sounds more complicated than it really is. Add the following action to src/actions.js:

    export function subscribeToMessages() {
      return function(dispatch) {
        messages.on('child_added', data => dispatch(receiveMessage(data.val())));
      };
    }

Now, in src/App.js you'll need to dispatch that action and set up the subscription.
First change your import statement from ./actions to include subscribeToMessages:

    import { sendMessage, subscribeToMessages } from './actions';

Now, in mapDispatchToProps you need to map subscribeToMessages:

    function mapDispatchToProps(dispatch) {
      return {
        sendMessage: (msg) => dispatch(sendMessage(msg)),
        subscribeToMessages: () => dispatch(subscribeToMessages()),
      };
    }

Finally, inside of the App class, add the componentWillMount lifecycle method just above the handleSendMessage method, and call subscribeToMessages:

    componentWillMount() {
      this.props.subscribeToMessages();
    }

Once you save the file, you should see that your app is subscribed to the Firebase database, which is automatically updating your Redux state, and your UI is displaying all the messages stored in Firebase. Have some fun and open another browser window!

Conclusion

You have now created a React app from scratch, added Redux to it, refactored it to connect to Firebase and, via subscriptions, updated your entire app state! What will you do now? Of course the app could use a better UI; you could redesign it! Maybe make it a little more usable. Try learning how to create users and show who yakked about what! The sky is the limit, now that you know how to put all the pieces together! Feel free to share what you've done with the app in the comments!

About the author

AJ Webb is team lead and frontend engineer for @tannerlabs and a co-creator of Payba.cc.
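To make the redux-thunk behaviour described in this post concrete, here is a deliberately simplified TypeScript sketch. It is not the real Redux, redux-thunk, or Firebase code: createTinyStore, fakeChildAdded, and the in-memory listeners array are hypothetical stand-ins, written only to show a dispatched function being run while a plain object goes on to the reducer. Object spread is used in the reducer as an assumed equivalent of the Object.assign call in the post.

```typescript
// Simplified stand-ins for illustration only; not the Redux, redux-thunk, or Firebase APIs.
const RECEIVE_MESSAGE = 'CHATS:RECEIVE_MESSAGE';

interface State {
  messages: string[];
}

interface PlainAction {
  type: string;
  payload: string;
}

type Dispatch = (action: PlainAction | Thunk) => void;
type Thunk = (dispatch: Dispatch) => void;

// Same idea as the reducer in the post: return a new state instead of mutating the old one.
function yakApp(state: State, action: PlainAction): State {
  switch (action.type) {
    case RECEIVE_MESSAGE:
      return { ...state, messages: state.messages.concat([action.payload]) };
    default:
      return state;
  }
}

// A tiny store whose dispatch imitates what the thunk middleware does.
function createTinyStore(initial: State) {
  let state = initial;
  const dispatch: Dispatch = action => {
    if (typeof action === 'function') {
      action(dispatch); // functions are run and never reach the reducer
    } else {
      state = yakApp(state, action); // plain objects go on to the reducer
    }
  };
  return { dispatch, getState: () => state };
}

// Stand-in for a Firebase 'child_added' subscription: callbacks are kept in memory.
const listeners: Array<(message: string) => void> = [];

function fakeChildAdded(message: string): void {
  listeners.forEach(listener => listener(message));
}

function receiveMessage(message: string): PlainAction {
  return { type: RECEIVE_MESSAGE, payload: message };
}

function subscribeToMessages(): Thunk {
  return dispatch => {
    listeners.push(message => dispatch(receiveMessage(message)));
  };
}

const initialState: State = { messages: [] };
const store = createTinyStore(initialState);
store.dispatch(subscribeToMessages()); // runs the thunk, which registers the listener
fakeChildAdded('hello yak');           // simulates new data arriving from the database
```

After the simulated child_added event, store.getState().messages contains the new message while initialState is left untouched, which is the non-mutating behaviour the reducer section asks for. The real middleware also passes getState to the thunk and forwards its return value; both are left out here to keep the sketch short.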


How-To Tutorials


Packt

10 Nov 2016

16 min read

Why we need Design Patterns?


Packt

10 Nov 2016

11 min read

Using ROS with UAVs


In this article by Carol Fairchild and Dr. Thomas L. Harman, co-authors of the book ROS Robotics by Example, you will discover the field of ROS Unmanned Air Vehicles (UAVs), and quadrotors in particular. You are invited to learn about the simulated Hector Quadrotor and take it for a flight.

The ROS wiki currently contains a growing list of ROS UAVs. These UAVs are as follows:

- AscTec Pelican and Hummingbird quadrotors
- Berkeley's STARMAC
- Bitcraze Crazyflie
- DJI Matrice 100 Onboard SDK ROS support
- Erle-copter
- ETH sFly
- Lily CameraQuadrotor
- Parrot AR.Drone
- Parrot Bebop
- Penn's AscTec Hummingbird Quadrotors
- PIXHAWK MAVs
- Skybotix CoaX helicopter

Refer to http://wiki.ros.org/Robots#UAVs for future additions to this list and to http://www.ros.org/news/robots/uavs/ for the latest ROS UAV news. The preceding list contains primarily quadrotors, except for the Skybotix helicopter. A number of universities have adopted the AscTec Hummingbird as their ROS UAV of choice. For this book, we present a simulator called Hector Quadrotor and two real quadrotors, Crazyflie and Bebop, that use ROS.

Introducing Hector quadrotor

The hardest part of learning about flying robots is the constant crashing. From first learning flight control to testing new hardware or flight algorithms, the resulting failures can have a huge cost in terms of broken hardware components. To address this difficulty, a simulated air vehicle designed and developed for ROS is ideal. A simulated quadrotor UAV for the ROS Gazebo environment has been developed by Team Hector Darmstadt of Technische Universität Darmstadt. This quadrotor, called Hector Quadrotor, is provided in the hector_quadrotor metapackage. This metapackage contains the URDF description for the quadrotor UAV, its flight controllers, and launch files for running the quadrotor simulation in Gazebo.
Advanced uses of the Hector Quadrotor simulation allow the user to record sensor data from Lidar, depth cameras, and many other sensors. The quadrotor simulation can also be used to test flight algorithms and control approaches.

The hector_quadrotor metapackage contains the following key packages:

- hector_quadrotor_description: This package provides a URDF model of the Hector Quadrotor UAV and of the quadrotor configured with various sensors. Several URDF quadrotor models exist in this package, each configured with specific sensors and controllers.
- hector_quadrotor_gazebo: This package contains launch files for executing Gazebo and spawning one or more Hector Quadrotors.
- hector_quadrotor_gazebo_plugins: This package contains three UAV-specific plugins, which are as follows:
  - The simple controller gazebo_quadrotor_simple_controller subscribes to a geometry_msgs/Twist topic and calculates the required forces and torques
  - A gazebo_ros_baro sensor plugin simulates a barometric altimeter
  - The gazebo_quadrotor_propulsion plugin simulates the propulsion, aerodynamics, and drag from messages containing motor voltages and wind vector input
- hector_gazebo_plugins: This package contains generic sensor plugins not specific to UAVs, such as IMU, magnetic field, GPS, and sonar data.
- hector_quadrotor_teleop: This package provides a node and launch files for controlling a quadrotor using a joystick or gamepad.
- hector_quadrotor_demo: This package provides sample launch files that run the Gazebo quadrotor simulation and hector_slam for indoor and outdoor scenarios.

The entire list of packages for the hector_quadrotor metapackage appears in the next section.

Loading Hector Quadrotor

The repository for the hector_quadrotor software is at the following website: https://github.com/tu-darmstadt-ros-pkg/hector_quadrotor

The following commands will install the binary packages of hector_quadrotor into the ROS package repository on your computer.
If you wish to install the source files, instructions can be found at the following website: http://wiki.ros.org/hector_quadrotor/Tutorials/Quadrotor%20outdoor%20flight%20demo

(It is assumed that ros-indigo-desktop-full has been installed on your computer.)

For the binary packages, type the following commands to install the ROS Indigo version of Hector Quadrotor:

    $ sudo apt-get update
    $ sudo apt-get install ros-indigo-hector-quadrotor-demo

A large number of ROS packages are downloaded and installed along with hector_quadrotor_demo, with the main hector_quadrotor packages providing functionality that should now be somewhat familiar. This installation downloads the following packages:

- hector_gazebo_worlds
- hector_geotiff
- hector_map_tools
- hector_mapping
- hector_nav_msgs
- hector_pose_estimation
- hector_pose_estimation_core
- hector_quadrotor_controller
- hector_quadrotor_controller_gazebo
- hector_quadrotor_demo
- hector_quadrotor_description
- hector_quadrotor_gazebo
- hector_quadrotor_gazebo_plugins
- hector_quadrotor_model
- hector_quadrotor_pose_estimation
- hector_quadrotor_teleop
- hector_sensors_description
- hector_sensors_gazebo
- hector_trajectory_server
- hector_uav_msgs
- message_to_tf

A number of these packages will be discussed as the Hector Quadrotor simulations are described in the next section.

Launching Hector Quadrotor in Gazebo

Two demonstration tutorials provide simulated applications of the Hector Quadrotor for both outdoor and indoor environments. These simulations are described in the next sections.

Before you begin the Hector Quadrotor simulations, check your ROS master using the following command in your terminal window:

    $ echo $ROS_MASTER_URI

If this variable is set to localhost or the IP address of your computer, no action is needed. If not, type the following command:

    $ export ROS_MASTER_URI=http://localhost:11311

This command can also be added to your .bashrc file.
Be sure to delete or comment out (with a #) any other commands setting the ROS_MASTER_URI variable.

Flying Hector outdoors

The quadrotor outdoor flight demo software is included as part of the hector_quadrotor metapackage. Start the simulation by typing the following command:

    $ roslaunch hector_quadrotor_demo outdoor_flight_gazebo.launch

This launch file loads a rolling landscape environment into the Gazebo simulation and spawns a model of the Hector Quadrotor configured with a Hokuyo UTM-30LX sensor. An rviz node is also started and configured specifically for the quadrotor outdoor flight. A large number of flight position and control parameters are initialized and loaded into the Parameter Server. Note that the quadrotor propulsion model parameters for the quadrotor_propulsion plugin and the quadrotor drag model parameters for the quadrotor_aerodynamics plugin are displayed. Then look for the following message:

    Physics dynamic reconfigure ready.

The following screenshots show both the Gazebo and rviz display windows when the Hector outdoor flight simulation is launched. The view from the onboard camera can be seen in the lower left corner of the rviz window. If you do not see the camera image on your rviz screen, make sure that Camera has been added to your Displays panel on the left and that the checkbox has been checked. If you would like to pilot the quadrotor using the camera, it is best to uncheck the checkboxes for tf and robot_model because the visualizations sometimes block the view:

Figure: Hector Quadrotor outdoor Gazebo view
Figure: Hector Quadrotor outdoor rviz view

The quadrotor appears on the ground in the simulation, ready for takeoff. Its forward direction is marked by a red mark on its leading motor mount. To be able to fly the quadrotor, you can launch the joystick controller software for the Xbox 360 controller.
In a second terminal window, launch the joystick controller software with a launch file from the hector_quadrotor_teleop package:

    $ roslaunch hector_quadrotor_teleop xbox_controller.launch

This launch file launches joy_node to process all joystick input from the left stick and right stick on the Xbox 360 controller, as shown in the following figure. The message published by joy_node contains the current state of the joystick axes and buttons. The quadrotor_teleop node subscribes to these messages and publishes messages on the cmd_vel topic. These messages provide the velocity and direction for the quadrotor flight. Several joystick controllers are currently supported by the ROS joy package, including PS3 and Logitech devices. For this launch, the joystick device is accessed as /dev/input/js0 and is initialized with a deadzone of 0.050000. Parameters to set the joystick axes are as follows:

    * /quadrotor_teleop/x_axis: 5
    * /quadrotor_teleop/y_axis: 4
    * /quadrotor_teleop/yaw_axis: 1
    * /quadrotor_teleop/z_axis: 2

These parameters map to the Left Stick and the Right Stick controls on the Xbox 360 controller shown in the following figure. The stick controls are as follows:

Left Stick:
- Forward (up) is to ascend
- Backward (down) is to descend
- Right is to rotate clockwise
- Left is to rotate counterclockwise

Right Stick:
- Forward (up) is to fly forward
- Backward (down) is to fly backward
- Right is to fly right
- Left is to fly left

Figure: Xbox 360 joystick controls for Hector

Now use the joystick to fly around the simulated outdoor environment! The pilot's view can be seen in the Camera image view on the bottom left of the rviz screen. As you fly around in Gazebo, keep an eye on the Gazebo launch terminal window. Depending on your flying ability, the screen will display messages such as the following:

    [ INFO] [1447358765.938240016, 617.860000000]: Engaging motors!
    [ WARN] [1447358778.282568898, 629.410000000]: Shutting down motors due to flip over!
When Hector flips over, you will need to relaunch the simulation.

Within ROS, a clearer understanding of the interactions between the active nodes and topics can be obtained by using the rqt_graph tool. The following diagram depicts all currently active nodes (except debug nodes) enclosed in ovals. These nodes publish to the topics enclosed in rectangles, which the arrows point to. You can use the rqt_graph command in a new terminal window to view the same display:

Figure: ROS nodes and topics for Hector Quadrotor outdoor flight demo

The rostopic list command will provide a long list of topics currently being published. Other command-line tools such as rosnode, rosmsg, rosparam, and rosservice will help gather specific information about Hector Quadrotor's operation.

To understand the orientation of the quadrotor on the screen, use the Gazebo GUI to show the vehicle's tf reference frame. Select quadrotor in the World panel on the left, then select the translation mode on the top environment toolbar (it looks like crossed double-headed arrows). This selection will bring up the red-green-blue axes for the x-y-z axes of the tf frame, respectively. In the following figure, the x axis is pointing to the left, the y axis is pointing to the right (toward the reader), and the z axis is pointing up.

Figure: Hector Quadrotor tf reference frame

A YouTube video of the hector_quadrotor outdoor scenario demo shows the hector_quadrotor in Gazebo operated with a gamepad controller: https://www.youtube.com/watch?v=9CGIcc0jeuI

Flying Hector indoors

The quadrotor indoor SLAM demo software is included as part of the hector_quadrotor metapackage.
To launch the simulation, type the following command:

    $ roslaunch hector_quadrotor_demo indoor_slam_gazebo.launch

The following screenshots show both the rviz and Gazebo display windows when the Hector indoor simulation is launched:

Figure: Hector Quadrotor indoor rviz and Gazebo views

If you do not see this image for Gazebo, roll your mouse wheel to zoom out of the image. Then, to find the quadrotor, rotate the scene to a top-down view by pressing Shift + right mouse button. The environment is the offices at Willow Garage, and Hector starts out on the floor of one of the interior rooms.

Just like in the outdoor demo, the xbox_controller.launch file from the hector_quadrotor_teleop package should be executed:

    $ roslaunch hector_quadrotor_teleop xbox_controller.launch

If the quadrotor becomes embedded in the wall, waiting a few seconds should release it, and it should (hopefully) end up in an upright position ready to fly again. If you lose sight of it, zoom out from the Gazebo screen and look from a top-down view. Remember that the Gazebo physics engine is applying minor environmental conditions as well, which can cause the quadrotor to drift out of position.

The rqt graph of the active nodes and topics during the Hector indoor SLAM demo is shown in the following figure. As Hector is flown around the office environment, the hector_mapping node performs SLAM and creates a map of the environment.

Figure: ROS nodes and topics for Hector Quadrotor indoor SLAM demo

The following screenshot shows Hector Quadrotor mapping an interior room of Willow Garage:

Figure: Hector mapping indoors using SLAM

The 3D robot trajectory is tracked by the hector_trajectory_server node and can be shown in rviz. The map, along with the trajectory information, can be saved to a GeoTiff file with the following command:

    $ rostopic pub syscommand std_msgs/String "savegeotiff"

The savegeotiff map can be found in the hector_geotiff/map directory.
A YouTube video of the hector_quadrotor stack indoor SLAM demo shows hector_quadrotor in Gazebo operated with a gamepad controller: https://www.youtube.com/watch?v=IJbJbcZVY28

Summary

In this article, we learnt about Hector Quadrotor: loading it, launching it in Gazebo, and flying it outdoors and indoors.


How-To Tutorials

Packt

10 Nov 2016

6 min read

Data Clustering


In this article by Rodolfo Bonnin, the author of the book Building Machine Learning Projects with TensorFlow, we will start applying data transformation operations. We will begin finding interesting patterns in some given information, discovering groups of data, or clusters, and using clustering techniques.

In this process we'll also gain two new tools: the ability to generate synthetic sample sets from a collection of representative data structures via the scikit-learn library, and the ability to graphically plot our data and model results, this time via the matplotlib library.

The topics we will cover in this article are as follows:

- Getting an idea of how clustering works, and comparing it to alternative existing classification techniques
- Using scikit-learn and matplotlib to enrich the possibilities of dataset choices, and to get professional-looking graphical representations of the data
- Implementing the K-means clustering algorithm
- Testing some variations of the K-means method to improve the fit and/or the convergence rate

Three types of learning from data

Based on how we approach the supervision of the samples, we can distinguish three types of learning:

- Unsupervised learning: The fully unsupervised approach directly takes a number of undetermined elements and builds a classification of them, looking at different properties that could determine their class
- Semi-supervised learning: The semi-supervised approach has a number of known classified items and then applies techniques to discover the class of the remaining items
- Supervised learning: In supervised learning, we start from a population of samples whose type is known beforehand, and then build a model from it

Normally there are three sample populations: one from which the model grows, called the training set; one that is used to test the model, called the test set; and the samples for which we will be doing classification.
Figure: Types of data learning based on supervision: unsupervised, semi-supervised, and supervised

Unsupervised data clustering

One of the simplest operations that can be initially applied to an unknown dataset is to try to understand the possible groupings or common features that the dataset members have. To do so, we could try to find representative points that summarize a balance of the parameters of the members of each group. This value could be, for example, the mean or the median of all the cluster members. This also leads to the idea of defining a notion of distance between members: the members of a group should be closer to one another and to their representative point than to the central points of the other groups.

In the following image, we can see the results of a typical clustering algorithm and the representation of the cluster centers:

Figure: Sample clustering algorithm output

K-means

K-means is a very well-known clustering algorithm that can be easily implemented. It is very straightforward and can lead (depending on the data layout) to a good initial understanding of the provided information.

Mechanics of K-means

K-means tries to divide a set of samples into K disjoint groups or clusters, using as its main indicator the mean value (be it 1D, 2D, and so on) of the members. This point is normally called the centroid, referring to the arithmetic entity with the same name. One important characteristic of K-means is that K must be provided beforehand, so some previous knowledge of the data is needed to avoid a non-representative result.

Algorithm iteration criterion

The criterion and goal of this method is to minimize the sum of squared distances from each cluster member to the centroid of the cluster that contains it, summed over all clusters. This is also known as minimization of inertia.
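As a worked form of the criterion just described, the inertia can be written out explicitly. The notation is an assumption introduced here for illustration and does not come from the original text: K clusters C_1, ..., C_K, samples x_i, and centroids \mu_j.

```latex
J = \sum_{j=1}^{K} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

Reassigning each sample to its nearest centroid and then recomputing each centroid as the mean of its members can only lower J or leave it unchanged, which is why repeating the two steps eventually settles.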
Figure: Error minimization criteria for K-means

K-means algorithm breakdown

The mechanism of the K-means algorithm can be summarized in the following graphic:

Figure: Simplified flow chart of the K-means process

And this is a simplified summary of the algorithm:

1. We start with unclassified samples and take K elements as the starting centroids. For the sake of brevity, some simplified versions of the algorithm simply take the first K elements in the element list.
2. We then calculate the distances between the samples and the chosen starting centroids, assign each sample to the nearest one, and from those groups we get the first calculated centroids (or other representative values). In the illustration, you can see the centroids moving toward more intuitive positions.
3. After the centroids change, their displacement changes the individual distances, and so the cluster membership can change. At this point we recalculate the centroids and repeat the previous steps if the stop condition isn't met.

The stopping conditions could be of various types:

- After a fixed number n of iterations. If n is too large, we spend unnecessary rounds of computation; if the algorithm converges slowly and n is too small, the centroids will not yet be stable and the results will be unconvincing. This stop condition can also be used as a last resort if the iterative process is really long.
- Compared with a fixed iteration count, a possibly better criterion for convergence is to look at the changes in the centroids, be it in total displacement or in the total number of elements switching cluster.
The last criterion (elements switching cluster) is the one normally employed, so we stop the process once no element changes cluster:

Figure: K-means simplified graphic

Pros and cons of K-means

The advantages of this method are:

- It scales very well (most of the calculations can be run in parallel)
- It has been used in a very large range of applications

But its simplicity also has a price (the no-silver-bullet rule applies):

- It requires a priori knowledge (the number of possible clusters should be known beforehand)
- Outlier values can pull the centroids away, as they have the same weight as any other sample
- As we assume that the clusters are convex and isotropic, it doesn't work very well with clusters that are not roughly circular

Summary

In this article, we got a simple overview of some of the most basic models we can implement, but we tried to be as detailed in the explanation as possible.

From now on, we are able to generate synthetic datasets, allowing us to rapidly test the adequacy of a model for different data configurations and so evaluate its advantages and shortcomings without having to load models with a greater number of unknown characteristics.

You can also refer to the following books on similar topics:

- Getting Started with TensorFlow: https://www.packtpub.com/big-data-and-business-intelligence/getting-started-tensorflow
- R Machine Learning Essentials: https://www.packtpub.com/big-data-and-business-intelligence/r-machine-learning-essentials
- Building Machine Learning Systems with Python - Second Edition: https://www.packtpub.com/big-data-and-business-intelligence/building-machine-learning-systems-python-second-edition
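The K-means loop described in this article can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the TensorFlow implementation from the book: the function names (kmeans, squared_distance, mean_point), the random choice of starting centroids, and the max_iter safety cap are all choices made here for the example. It stops when no sample changes cluster, the criterion the article says is normally employed.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(samples, k, max_iter=100, seed=0):
    """Return (centroids, assignment, inertia) for the given samples.

    Hypothetical helper: starting centroids are k samples drawn at random,
    and max_iter is only a last-resort cap on the number of rounds.
    """
    rng = random.Random(seed)
    centroids = rng.sample(samples, k)
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(s, centroids[j]))
            for s in samples
        ]
        # Stop condition: no sample switched cluster since the last round.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [s for s, a in zip(samples, assignment) if a == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = mean_point(members)
    # Inertia: sum of squared distances from samples to their centroid.
    inertia = sum(
        squared_distance(s, centroids[a]) for s, a in zip(samples, assignment)
    )
    return centroids, assignment, inertia
```

Calling kmeans(samples, 2) on a list of 2D tuples returns the final centroids, one cluster label per sample, and the inertia, which makes it easy to compare runs with different values of K.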



Packt

10 Nov 2016

8 min read

Creating a Personal Web Portal (PWP)


In this article by Sherwin John Calleja Tragura, author of the book Spring MVC Blueprints, we will discuss creating a robust and simple personal web portal that can serve as a personal web page, or a professional reference site, for anyone. Usually, these kinds of websites are used as mashups, or dashboards, of centralized sources of information describing an individual or group.

Technically, a personal web portal is a composition of web components like CSS, HTML, and JavaScript, woven together to create a formal, simple, or exquisite presentation of any content. It can be used, in its simplest form, as a personal portfolio, or in an enterprise form, like an e-commerce content management system. Commercially, these portals are drafted and designed using the principles of the rich-client platform or responsive web design. In the industry, most companies suggest that clients try easy-to-use tools like PHP frameworks (for example, CodeIgniter, Laravel, Drupal) and seldom advise using JEE-based portals.

Overview of the project

The personal web portal (PWP) created here publishes a simple biography and the professional information that one might want to share through the web. The prototype is a session-driven one that can do dynamic transactions, like updating information on the web pages and posting notes on the page, without using any back-end database.

The following are the initial wireframe drafts and design of the web portal:

- The Home Page: This is the first page of the site. It shows updatable quotes and inspiring messages from the owner of the portal. It contains a sticky-note feature at the side that allows visitors to post short greetings to the owner in real time.
- The Personal Information Page: This page highlights personal information about the owner, including the owner's name, age, hobbies, and birth date. This page contains some part of the blogger's educational history.
The page is dynamic and can be updated at any time by the owner.
- The Professional Information Page: This page presents details about the owner's career background. It lists all the previous jobs of the account owner and enumerates all skills-related information. This page is also updatable.
- The Reach Out Page: This serves as the contact information page of the owner. Moreover, it allows visitors to send their contact information, and specifically their e-mail address, to the portal owner.
- Update pages: The Home, Personal, and Professional pages have update pages that the owner uses to change the content of the portal. The prototype can update the information presented in the content at any time the user desires.

This simple prototype, called PWP, will give clear steps on how to build personal sites from the ground up, using Spring MVC 4.x specifications. It will give enthusiasts the opportunity to start creating Spring-based web portals in just a day, without using any database backend. For those who are new to the Spring MVC 4.x concept, this article will be a good start in building full-blown portal sites.

Technical requirements

In order to start the development, the following tools need to be installed on the platform:

- Java Development Kit (JDK) 1.7.x
- Spring Tool Suite (Eclipse) 3.6
- Maven 3.x
- Spring Framework 4.1
- Apache Tomcat 7.x
- Any operating system

First, JDK 1.7.x must be installed. Visit http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html to download the installer. Next, set up the Spring Tool Suite 3.6 (Eclipse-based), which will be the official Integrated Development Environment (IDE) of this article. Download the Spring Tool Suite 3.6 at https://spring.io/tools/sts.

Setting up the development environment

This article recommends Spring Tool Suite (Eclipse) 3.6 since it has all the Spring Framework 4.x plug-ins and other dependencies needed by the projects.
To start us off, the following screenshot shows the dashboard of the STS IDE:

In addition, Apache Maven 3.x will be used to build and deploy the project for this article. Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting, and documentation from a central piece of information (https://maven.apache.org/). There is already a Maven plugin installed in the STS IDE that can be used to generate the needed development directory structure.

Among the many ways to create Spring MVC projects, this article focuses on two styles, namely:

- Converting a dynamic web project to a Maven project
- Creating a Maven project from scratch

Converting a dynamic web project to a Maven project

To start creating the project, press CTRL + N to browse the Menu wizard of the IDE. This menu wizard contains all the types of project modules you'll need to start a project. The Menu wizard should look similar to the following screenshot:

Once on the menu, browse the Web option and choose Dynamic Web Project. Afterwards, just follow the series of instructions to create the chosen project module until you reach the last menu wizard, which looks like the following figure:

This last instruction (Web Module panel) will auto-generate the deployment descriptor (web.xml) of the project. Always click on the Generate web.xml deployment descriptor checkbox option. The deployment descriptor is an XML file that must reside inside the /WEB-INF/ folder of all JEE projects. This file describes how a component, module, or application can be deployed. Traditionally, a JEE web project must always have a web.xml file; otherwise, the project will be defective.
However, since the Spring 4.x container supports the Servlet Specification 3.0 in Tomcat 7 and above, web.xml is no longer mandatory and can be replaced by the org.springframework.web.servlet.support.AbstractAnnotationConfigDispatcherServletInitializer or org.springframework.web.servlet.support.AbstractDispatcherServletInitializer class.

The next major step is to convert the newly created dynamic web project to a Maven one. To complete the conversion, right-click on the project and navigate to the Configure | Convert to Maven Project command, as shown in the following screenshot:

It is always best for the developer to study the directory structure of the project folder created before the actual implementation starts. Below is the directory structure of the Maven project after the conversion:

The project directories are just like those of the usual Eclipse dynamic web project, apart from the pom.xml file.

Creating a Maven project from scratch

Another method of creating a Spring MVC web project is to create a Maven project from the start. Be sure to install the Maven 3.2 plugin in STS Eclipse. Browse the Menu wizard again, and locate the Maven option. Click on Maven Project to generate a new Maven project.

After clicking this option, a wizard will pop up, asking whether or not an archetype is needed to create the Maven project. An archetype is a Maven plugin whose main objective is to create a project structure as per its template. To start quickly, choose an archetype plugin to create a simple Java application here. It is recommended to create the project using the archetype maven-archetype-webapp. However, skipping the archetype selection is still a valid option.

After you've done this, proceed with the Select an Archetype window shown in the following screenshot. Locate maven-archetype-webapp, then proceed with the last step.
Selecting the archetype maven-archetype-webapp requires the input of Maven parameters before the whole process ends with a new Maven project:

The required parameters for the Maven group or project are as follows:

- Group Id (groupId): This is the ID of the project's group and must be unique among all the project's groups
- Artifact Id (artifactId): This is the ID of the project. This is generally the name of the project
- Version (version): This is the version of the project
- Package (package): The initial or core package of the sources

For more information on Maven plugins and configuration details, visit the documentation and samples at http://maven.apache.org/.

After providing the Maven parameters, the project source folder structure will be similar to the following screenshot:

Summary

Using the basic Spring Framework 4.x APIs, web portal creators can create their own platform to promote their personal philosophy, business, ideology, religion, and other concepts. Although it is an advantage to use existing portal platforms made in other languages like PHP and Python, it is still fulfilling to design and develop our own portal based on an open-source framework. The PWP is just prototype software that needs to be upgraded with a backend database, security, and social media plugins in order to make it commercially competitive.


How-To Tutorials

Packt

10 Nov 2016

16 min read

Internationalization



Packt

10 Nov 2016

20 min read

Introduction to Practical Business Intelligence


JanuVerma

10 Nov 2016

6 min read

How to Build a Search Engine


In this post, I will discuss how you can build a search engine like Google. We will start with a definition of a search engine, and I will go on to present the core concepts (mathematics and implementation).

A search engine is an information retrieval system that returns documents relevant to a user-submitted query, ranked by their relevance to the query. There are many subtle issues in the last statement; for example, what does relevance mean in this context? We will make all of these notions clear as we go further. In this post, by documents I mean textual documents or webpages. There are certainly search engines for images, videos, and other types of documents, but discussing them is beyond the scope of this post.

Information Retrieval System

Consider a corpus of textual documents for which we want to build a system that takes a query and returns the documents relevant to it. The easiest way to accomplish this is to return all of the documents that contain the query words. Such a model is called the Boolean model of information retrieval (BIR). Mathematically, we can represent a document (and a query) as the set of words it contains. The BIR model retrieves the documents that have a non-empty intersection with the query. Notice that the retrieved documents are all indistinguishable; there is no inherent ranking method.

Another class of models is the vector space model, and all modern search engines use such a model as their base system. In these models, a document is represented as a vector in a high-dimensional vector space. The dimension of the vector is equal to the number of distinct words in the corpus, and there is an entry corresponding to each word. The documents are retrieved based on the similarity of the corresponding vectors. For example, if doc = (v1, v2, ..., vk) is the vector corresponding to some document and q = (q1, q2, ..., qk) is the vector corresponding to the query, then the similarity of the document and the query can be computed as a vector similarity (or distance). There are many choices for such a measure, for example, Euclidean distance, Jaccard distance, Manhattan distance, and so on. In most modern search engines, cosine similarity is used:

$$\cos(\mathrm{doc}, q) = \frac{\mathrm{doc} \cdot q}{\|\mathrm{doc}\| \, \|q\|}$$

where $\cdot$ is the dot product of the vectors and $\|\cdot\|$ is the norm of a vector.

There are different schemes for choosing the vector entries as well. In the binary scheme, the vector entry corresponding to a word is 1 if the word is present in the document, and 0 otherwise. The most popular scheme is the so-called term frequency-inverse document frequency (tf-idf), which assigns higher scores to words that are characteristic of the current document. The tf-idf of word w in document d can be computed as:

$$\text{tf-idf}(w, d) = \mathrm{tf}(w, d) \cdot \log(N / n)$$

where tf(w, d) is the frequency of word w in document d, N is the total number of documents, and n is the number of documents that contain w. Thus tf-idf penalizes words that are present in a large number of documents, since such words are not good features for a model. The retrieved documents are ranked according to the similarity value.

There are also probabilistic models for retrieving documents, where we estimate the probability that the document d is relevant to a query q, and the documents are then ranked by this probability estimate. For implementation of these models and more, check out InfoR.

Computational Complexities

The process of retrieval based on the vector space model is incredibly expensive. For each search query, we need to compute its similarity with all the web documents, even though most of the documents don't contain any of the query words. Going by the standard method, it's impossible to achieve the results at the speed we experience while using Google.
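The brute-force retrieval described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used by any real engine: the whitespace tokenizer, the three toy documents, and the helper names (`build_idf`, `tfidf_vector`, `cosine`, `search`) are all hypothetical. It uses the tf-idf weighting and cosine similarity defined earlier, and it scores the query against every document, which is exactly the cost the text warns about.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def build_idf(tokenized_docs):
    """idf(w) = log(N / n), with n the number of documents containing w."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))
    return {word: math.log(n_docs / n) for word, n in doc_freq.items()}

def tfidf_vector(tokens, idf):
    """Sparse tf-idf vector {word: tf * idf}; words unseen in the corpus are dropped."""
    counts = Counter(tokens)
    return {w: tf * idf[w] for w, tf in counts.items() if w in idf}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def search(query, docs):
    """Score the query against EVERY document and rank by cosine similarity."""
    tokenized = [tokenize(d) for d in docs]
    idf = build_idf(tokenized)
    q_vec = tfidf_vector(tokenize(query), idf)
    scores = [(cosine(q_vec, tfidf_vector(t, idf)), i) for i, t in enumerate(tokenized)]
    return sorted(scores, reverse=True)

# Hypothetical toy corpus.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "search engines rank documents",
]
ranking = search("cat mat", docs)  # list of (similarity, document index), best first
```

Note that the third document shares no word with the query and still gets scored; on a web-sized corpus almost all of the work is wasted in this way.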
The trick is to store documents in a data structure that facilitates fast computation. The most commonly used data structure is the inverted index. A forward index stores the list of words for each document; an inverted index stores the list of documents for each word in a hashmap-style structure:

Inverted index = {word1: [doc1, doc24, doc77], word2: [doc34, doc97], ..., wordk: [doc11, doc22, doc1020, doc23]}

Given such a structure, one can jump to the words in the query in constant time and compute the similarity with only a very small number of documents, namely the ones containing the words in the query.

Building such an index also presents a serious computational problem. This problem was a major motivation behind the development of the MapReduce paradigm, a distributed, massively parallel processing model developed at Google. The use of MapReduce has since extended to many big data problems. If you are curious, refer to the original paper.

Web Search Engines

The working of a search engine can be described in the following steps:

Crawling: First of all, we need to crawl the web to extract the HTML documents. There are many open source tools that provide crawling capabilities, for example, Nutch, Scrapy, and so on. The documents are then processed to extract the text, the hyperlinks, geo-location, and so on, depending on the complexity of the retrieval model.

Indexing: We want the documents to be stored in a way that facilitates a quick search. As discussed earlier, building and maintaining an inverted index is a crucial step in building a search engine.

Retrieval: We have discussed retrieval models based on vector spaces and on probability above. Such a model is used as the base for all major web search engines. In addition to the cosine similarity, web search engines use PageRank for ranking the retrieved documents. This algorithm, developed by the founders of Google, uses the quality of the hyperlinks pointing to a webpage for ranking.
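The indexing step above can be sketched as follows. This is a minimal, hypothetical illustration (the document ids, the whitespace tokenizer, and the function names `build_inverted_index` and `candidates` are assumptions made for the example): it builds a dictionary from each word to the documents containing it, and then uses that dictionary to narrow a query down to the few documents worth scoring.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each word to the sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

def candidates(query, index):
    """Return the ids of documents that contain at least one query word."""
    found = set()
    for word in query.lower().split():
        # Dictionary lookup: no scan over the documents is needed.
        found.update(index.get(word, []))
    return sorted(found)

# Hypothetical toy corpus keyed by document id.
docs = {
    "doc1": "the cat sat on the mat",
    "doc2": "the dog chased the cat",
    "doc3": "search engines rank documents",
}
index = build_inverted_index(docs)
hits = candidates("dog mat", index)  # only these documents need a similarity score
```

Only the documents returned by `candidates` would then be passed to the similarity computation; a production index would also store term frequencies and positions, which this sketch omits.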
Mathematically, PageRank assumes that the web is a network graph where documents are linked via hyperlinks. A document receives a higher PageRank if many 'high-quality' pages link to it. The PageRank can also be interpreted as the probability that a person randomly clicking on links will arrive at this particular page. For the more mathematically inclined, this translates to computing the principal eigenvector of the transition matrix of random jumps. For more details, refer to this article on the American Mathematical Society website. Modern web search engines like Google use more than 200 heuristics on top of the standard retrieval model!

Conclusion

In this post, we discussed the central ideas of information retrieval and the challenges in building a web search engine. It's not an easy problem. Even today, 18 years after the origin of Google, significant research, both at universities and in industry, is being pursued in this direction.

About the author

Janu Verma is a researcher at the IBM T.J. Watson Research Center, New York. His research interests are in mathematics, machine learning, information visualization, computational health, and biology. He has held research positions at Cornell University, Kansas State University, Tata Institute of Fundamental Research, Indian Institute of Science, and Indian Statistical Institute. He has also written papers for IEEE Vis, KDD, International Conference on HealthCare Informatics, Computer Graphics and Applications, Nature Genetics, IEEE Sensors Journal, and so on. His current focus is on the development of visual analytics systems for prediction and understanding. Check out his personal website.



Packt

09 Nov 2016

7 min read

Machine Learning Technique: Supervised Learning


In this article by Andrea Isoni, author of the book Machine Learning for the Web, the most relevant regression and classification techniques are discussed. All of these algorithms share the same background procedure, and usually the name of the algorithm refers to both a classification and a regression method. Linear regression algorithms, Naive Bayes, decision trees, and support vector machines are going to be discussed in the following sections. To understand how to employ the techniques, a classification and a regression problem will be solved using the mentioned methods. Essentially, a labeled training dataset will be used to train the models, which means finding the values of the parameters, as we discussed in the introduction. As usual, the code is available in my GitHub folder at https://github.com/ai2010/machine_learning_for_the_web/tree/master/chapter_3/.

We will conclude the article with an extra algorithm that may be used for classification, although it is not specifically designed for this purpose (hidden Markov model). We will now begin by explaining the general causes of error in the methods when predicting the true labels associated with a dataset.

Model error estimation

We said that the trained model is used to predict the labels of new data, and the quality of the prediction depends on the ability of the model to generalize, that is, to correctly predict cases not present in the training data. This is a well-known problem in the literature and is related to two concepts: bias and variance of the outputs.

The bias is the error due to a wrong assumption in the algorithm. Given a point $x^{(t)}$ with label $y_t$, the model is biased if, when it is trained with different training sets, the predicted label $y_t^{pred}$ is always different from $y_t$. The variance error instead refers to how much the wrongly predicted labels of the given point $x^{(t)}$ differ from one another across those training sets.
A classic example to explain the concepts is to consider a circle with the true value at the center (true label), as shown in the following figure. The closer the predicted labels are to the center, the more unbiased the model, and the more tightly they cluster, the lower the variance (top left in the figure). The other three cases are also shown.

Figure: Variance and bias example.

A model with low variance and low bias errors will have the predicted labels, that is, the blue dots in the preceding figure, concentrated on the red center (true label). A high bias error occurs when the predictions are far away from the true label, while a high variance appears when the predictions are spread over a wide range of values.

We have already seen that labels can be continuous or discrete, corresponding to regression and classification problems, respectively. Most of the models are suitable for solving both problems, and we are going to use the words regression and classification to refer to the same model. More formally, given a set of N data points $x^{(i)}$ and corresponding labels $y_i$, a model with a set of parameters $\theta$ (estimating the true parameter values) will have a mean square error (MSE) equal to:

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i^{pred} - y_i \right)^2$$

We will use the MSE as a measure to evaluate the methods discussed in this article. Now we will start describing the generalized linear methods.

Generalized linear models

The generalized linear model is a group of models that try to find the M parameters $\theta_j$ that form a linear relationship between the labels $y_i$ and the feature vector $x^{(i)}$, as follows:

$$y_i = \sum_{j=1}^{M} \theta_j x_j^{(i)} + \epsilon_i$$

Here, $\epsilon_i$ are the errors of the model. The algorithm for finding the parameters tries to minimize the total error of the model defined by the cost function J:

$$J = \frac{1}{2} \sum_{i=1}^{N} \epsilon_i^2 = \frac{1}{2} \sum_{i=1}^{N} \left( y_i - \sum_{j=1}^{M} \theta_j x_j^{(i)} \right)^2$$

The minimization of J is achieved using an iterative algorithm called batch gradient descent:

$$\theta_j \leftarrow \theta_j - \alpha \frac{\partial J}{\partial \theta_j}, \qquad j = 1, \dots, M$$

Here, $\alpha$ is called the learning rate, and it is a trade-off between convergence speed and convergence precision.
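To make batch gradient descent concrete, here is a minimal pure-Python sketch for a linear model. It assumes the common cost of half the sum of squared errors (the exact normalization is an assumption), a constant first feature equal to 1 acting as the intercept, and hypothetical toy data generated from y = 1 + 2x; the function names and the values of `alpha` and `n_iters` are illustrative choices, not taken from the book's code.

```python
def predict(theta, x):
    """Linear model: sum_j theta_j * x_j (x[0] is a constant 1 for the intercept)."""
    return sum(t * xj for t, xj in zip(theta, x))

def cost(theta, X, y):
    """Half the sum of squared errors over the whole training set."""
    return 0.5 * sum((yi - predict(theta, xi)) ** 2 for xi, yi in zip(X, y))

def batch_gradient_descent(X, y, alpha=0.05, n_iters=2000):
    """Update every theta_j using the gradient summed over ALL training examples."""
    theta = [0.0] * len(X[0])
    for _ in range(n_iters):
        errors = [yi - predict(theta, xi) for xi, yi in zip(X, y)]
        # dJ/dtheta_j = -sum_i error_i * x_ij, so moving downhill adds alpha * sum(...)
        step = [sum(e * xi[j] for e, xi in zip(errors, X)) for j in range(len(theta))]
        theta = [t + alpha * s for t, s in zip(theta, step)]
    return theta

# Hypothetical toy data generated from y = 1 + 2x; the leading 1.0 is the intercept feature.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = batch_gradient_descent(X, y)  # approaches [1.0, 2.0]
```

With a learning rate that is too large the updates diverge instead of converging, which is the speed versus precision trade-off mentioned above; 0.05 was picked by hand for this toy dataset.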
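An alternative algorithm is called stochastic gradient descent, which loops over the training examples $i = 1, \dots, N$:

$$\theta_j \leftarrow \theta_j - \alpha \frac{\partial J_i}{\partial \theta_j}$$

where $J_i$ denotes the contribution of the single example i to J. Each $\theta_j$ is updated for each training example i instead of waiting to sum over the entire training set. This algorithm converges near the minimum of J, typically faster than batch gradient descent, but the final solution may oscillate around the real values of the parameters. The following paragraphs describe the most common models and the corresponding cost function, J.

Linear regression

Linear regression is the simplest algorithm and is based on the model:

$$h_\theta(x^{(i)}) = \sum_{j=1}^{M} \theta_j x_j^{(i)}$$

The cost function and update rule are:

$$J = \frac{1}{2} \sum_{i=1}^{N} \left( y_i - h_\theta(x^{(i)}) \right)^2, \qquad \theta_j \leftarrow \theta_j + \alpha \sum_{i=1}^{N} \left( y_i - h_\theta(x^{(i)}) \right) x_j^{(i)}$$

Ridge regression

Ridge regression, also known as Tikhonov regularization, adds a term to the cost function J such that:

$$J = \frac{1}{2} \sum_{i=1}^{N} \left( y_i - h_\theta(x^{(i)}) \right)^2 + \frac{\lambda}{2} \sum_{j=1}^{M} \theta_j^2$$

where $\lambda$ is the regularization parameter. The additional term serves to prefer a certain set of parameters over all the possible solutions by penalizing all the parameters $\theta_j$ different from 0. The final set of $\theta_j$ is shrunk toward 0, lowering the variance of the parameters but introducing a bias error. Indicating with the superscript l the parameters from the linear regression, the ridge regression parameters (superscript r) are related to them, in the simplest case of orthonormal features, by the following formula:

$$\theta_j^{r} = \frac{\theta_j^{l}}{1 + \lambda}$$

This clearly shows that the larger the $\lambda$ value, the more the ridge parameters are shrunk toward 0.

Lasso regression

Lasso regression is an algorithm similar to ridge regression, the only difference being that the regularization term is the sum of the absolute values of the parameters:

$$J = \frac{1}{2} \sum_{i=1}^{N} \left( y_i - h_\theta(x^{(i)}) \right)^2 + \lambda \sum_{j=1}^{M} |\theta_j|$$

Logistic regression

Despite the name, this algorithm is used for (binary) classification problems, so we define the labels $y_i \in \{0, 1\}$. The model is given by the so-called logistic function:

$$h_\theta(x^{(i)}) = \frac{1}{1 + e^{-\sum_{j=1}^{M} \theta_j x_j^{(i)}}}$$

In this case, the cost function is defined as follows:

$$J = -\sum_{i=1}^{N} \left[ y_i \log h_\theta(x^{(i)}) + (1 - y_i) \log\left( 1 - h_\theta(x^{(i)}) \right) \right]$$

From this, the update rule is formally the same as in linear regression (but the model definition, $h_\theta$, is different):

$$\theta_j \leftarrow \theta_j + \alpha \sum_{i=1}^{N} \left( y_i - h_\theta(x^{(i)}) \right) x_j^{(i)}$$

Note that the prediction for a point p, $h_\theta(x^{(p)})$, is a continuous value between 0 and 1.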
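As a rough sketch of the logistic regression update described above, the following pure-Python code fits the model with batch gradient descent on the cross-entropy cost. The toy one-feature dataset (deliberately overlapping, so that a finite optimum exists), the constant intercept feature, the function names, and the values of `alpha` and `n_iters` are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(theta, x):
    """Model output h_theta(x): a continuous value between 0 and 1."""
    return sigmoid(sum(t * xj for t, xj in zip(theta, x)))

def fit_logistic(X, y, alpha=0.05, n_iters=5000):
    """Batch gradient descent; the update has the same form as in linear regression."""
    theta = [0.0] * len(X[0])
    for _ in range(n_iters):
        errors = [yi - predict_proba(theta, xi) for xi, yi in zip(X, y)]
        theta = [t + alpha * sum(e * xi[j] for e, xi in zip(errors, X))
                 for j, t in enumerate(theta)]
    return theta

# Hypothetical toy data: leading 1.0 is the intercept feature, labels are 0 or 1.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 2.0], [1.0, 2.5], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 0, 1, 1]
theta = fit_logistic(X, y)
probs = [predict_proba(theta, x) for x in X]  # one value in (0, 1) per training point
```

The only difference from the linear regression loop is the model function: the weighted sum is passed through the logistic function before the error is computed.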
So usually, to estimate the class label, we have a threshold at $h_\theta(x^{(p)}) = 0.5$ such that:

$$y_p = \begin{cases} 1 & \text{if } h_\theta(x^{(p)}) \geq 0.5 \\ 0 & \text{otherwise} \end{cases}$$

The logistic regression algorithm is applicable to multiple-label problems using the techniques one versus all or one versus one. Using the first method, a problem with K classes is solved by training K logistic regression models, each one treating the labels of the considered class j as 1 and all the rest as 0. The second approach consists of training a model for each pair of labels ($K(K-1)/2$ trained models).

Probabilistic interpretation of generalized linear models

Now that we have seen the generalized linear model, let's find the parameters $\theta_j$ that satisfy the relationship:

$$y_i = \sum_{j=1}^{M} \theta_j x_j^{(i)} + \epsilon_i$$

In the case of linear regression, we can assume $\epsilon_i$ to be normally distributed with mean 0 and variance $\sigma^2$, such that the probability is equivalent to:

$$p(y_i \mid x^{(i)}; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{\left( y_i - \sum_{j=1}^{M} \theta_j x_j^{(i)} \right)^2}{2\sigma^2} \right)$$

Therefore, the total likelihood of the system can be expressed as follows:

$$L(\theta) = \prod_{i=1}^{N} p(y_i \mid x^{(i)}; \theta)$$

In the case of the logistic regression algorithm, we are assuming that the logistic function itself is the probability:

$$P(y_i = 1 \mid x^{(i)}; \theta) = h_\theta(x^{(i)}), \qquad P(y_i = 0 \mid x^{(i)}; \theta) = 1 - h_\theta(x^{(i)})$$

Then the likelihood can be expressed by:

$$L(\theta) = \prod_{i=1}^{N} h_\theta(x^{(i)})^{y_i} \left( 1 - h_\theta(x^{(i)}) \right)^{1 - y_i}$$

In both cases, it can be shown that maximizing the likelihood is equivalent to minimizing the cost function, so the gradient descent will be the same.

k-nearest neighbours (KNN)

This is a very simple classification (or regression) method in which, given a set of feature vectors $x^{(i)}$ with corresponding labels $y_i$, a test point $x^{(t)}$ is assigned the label value with the majority of occurrences among the K nearest neighbors, found using a distance measure such as the following:

Euclidean: $\sqrt{\sum_{j=1}^{M} \left( x_j^{(t)} - x_j^{(i)} \right)^2}$

Manhattan: $\sum_{j=1}^{M} \left| x_j^{(t)} - x_j^{(i)} \right|$

Minkowski: $\left( \sum_{j=1}^{M} \left| x_j^{(t)} - x_j^{(i)} \right|^q \right)^{1/q}$ (if q=2, this reduces to the Euclidean distance)

In the case of regression, the value $y_t$ is calculated by replacing the majority of occurrences with the average of the labels of the K nearest neighbors, $y_t = \frac{1}{K} \sum_{k=1}^{K} y_k$. The simplest average (or the majority of occurrences) has uniform weights, so each point has the same importance regardless of its actual distance from $x^{(t)}$. However, a weighted average with weights equal to the inverse distance from $x^{(t)}$ may be used.
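The KNN procedure described above can be sketched in pure Python as follows. The data, labels, and function names are hypothetical, and tie-breaking in the majority vote is left to whatever `Counter.most_common` returns first, which is an implementation convenience rather than part of the method.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def manhattan(a, b):
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def nearest(X, y, x_test, k, distance):
    """Return the k (distance, label) pairs closest to x_test."""
    pairs = sorted(((distance(xi, x_test), yi) for xi, yi in zip(X, y)),
                   key=lambda pair: pair[0])
    return pairs[:k]

def knn_classify(X, y, x_test, k=3, distance=euclidean):
    """Majority vote among the k nearest neighbours (uniform weights)."""
    labels = [label for _, label in nearest(X, y, x_test, k, distance)]
    return Counter(labels).most_common(1)[0][0]

def knn_regress(X, y, x_test, k=3, distance=euclidean, weighted=False):
    """Average of the k nearest labels, optionally weighted by inverse distance."""
    neighbours = nearest(X, y, x_test, k, distance)
    if not weighted:
        return sum(label for _, label in neighbours) / len(neighbours)
    # An exact match (distance 0) would give an infinite weight, so return its label.
    for d, label in neighbours:
        if d == 0.0:
            return label
    weights = [1.0 / d for d, _ in neighbours]
    return sum(w * label for w, (_, label) in zip(weights, neighbours)) / sum(weights)

# Hypothetical toy data: two well-separated groups of points.
X = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [6.0, 6.0], [7.0, 7.0], [6.5, 8.0]]
classes = ["a", "a", "a", "b", "b", "b"]
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

label = knn_classify(X, classes, [1.2, 1.2])
estimate = knn_regress(X, values, [6.5, 7.0], weighted=True)
```

Passing `distance=manhattan` switches the metric without touching the rest of the code, which mirrors how the choice of distance is independent of the voting or averaging step.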
Summary

In this article, the major classification and regression algorithms, together with the techniques to implement them, were discussed. You should now be able to understand in which situation each method can be used and how to implement it using Python and its libraries (sklearn and pandas).


