Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7014 Articles
article-image-getting-started-flocker
Packt
11 Nov 2016
11 min read
Save for later

Getting Started with Flocker

Packt
11 Nov 2016
11 min read
In this article by Russ McKendrick, the author of the book Docker Data Management with Flocker, we are going to look at what both what problems Flocker has been developed to resolve as well as jumping in at the deep end and perform our first installation of Flocker. (For more resources related to this topic, see here.) By the end of the article, you will have: Configured and used the default Docker volume driver Learned a little about how Flocker came about and who wrote it Installed and configured a Flocker control and storage node Integrated your Flocker installation with Docker Have visualized your volumes using the Volume Hub Interacted with your Flocker installation using the Flocker CLI Have installed and used dvol However, before we start to look at both Docker and Flocker we should talk about compute resource. Compute and Storage Resource In later articles, we will be looking at running Flocker on public cloud services such as Amazon Web Services and Google Cloud so we will not be using those services for the practical examples in our early articles, instead I will be server instances launched in DigitalOcean. DigitalOcean has quickly become the platform of choice for both developers & system administrators who want to experiment with new technologies and services. While their cloud service is simple to use their SSD backed virtual machines offer the performance required to run modern software stacks at a fraction of the price of other public clouds making them perfect for launching prototyping and sandbox environments. For more information on DigitalOcean, please see their website at: https://www.digitalocean.com/. The main reason I choose to use an external cloud provider rather than local virtual machine is that we will need to launch multiple hosts to run Flocker. You do not have to use DigitalOcean, as long as you are able to launch multiple virtual machines and attach some sort of block storage to your instances as a secondary device. Finally, where we are doing manual installations of Flocker and Docker I will give instructions, which cover both CentOS 7.2 and Ubuntu 16.04. Docker If you are reading this article then Docker does not need much of an introduction, it is one of the most talked about technologies of recent years quickly gaining support from pretty much all what I would consider the big players in software. Companies such as Google, Microsoft, Red Hat, VMWare, IBM, Amazon Web Services and Hewlett Packard all offer solutions based on Dockers container technology or have contributed towards its development. Rather than me give you a current state of the union on Docker I would recommend you watch the opening key note from the 2016 Docker Conference, it gives a very rundown of how far Docker has come over the last few years. The video can be found at the following URL https://www.youtube.com/watch?v=vE1iDPx6-Ok. Now you are all caught up with what's going in the world of Docker, let's take a look at what storage options you get out of the box with the core Docker Engine. Installing Docker Engine Let start with a small server instance, install Docker and then run a through a couple of scenarios where we may need storage for our containerized application: Before installing Docker, it is always best to check that you have the latest updates installed, to do this run one of the following commands. On CentOS you need to run: yum -y update and for Ubuntu run: apt-get -y update Both of these commands will check for and automatically install any available updates. Once your server instance is up-to-date, you can go-ahead and install Docker, the simplest way of doing this is to run the installation script provided by Docker by entering the following command: curl -fsSL https://get.docker.com/ | sh This method of installation works on both CentOS and Ubuntu, the script run checks to see which packages need to installed and then installs the latest version of the Docker Engine from Docker's own repositories. If you would prefer to check the contents of the script before running it on your server instance you can go to https://get.docker.com/ in your browser, this will display the full bash script so you can see exactly what the script is doing when executed on your server instance. If everything worked as expected, then you will have latest stable version of Docker installed. You check this by running the following command: docker --version This will return the version and build of the version of Docker which was installed, in my case this is Docker version 1.12.0, build 8eab29e. On both operating systems you can check that the Docker is running by using the following command: systemctl status docker If Docker is stopped then you can use the following command to start it: systemctl start docker The Docker local volume driver Now we have Docker installed and running on our server instance, we can start launching containers. From here, the instructions are the same if you are running CentOS or Ubuntu. We are going to be launching an application called Moby Counter, this was written by Kai Davenport to demonstrate maintaining state between containers. It is a Nodejs application which allows you draw (using Docker logos) in your browser window. The coordinates for your drawing are stored in a Redis key value store, the idea being if you stop, start and even remove the Redis service your drawing should persist: Let start by launching our Redis container, to do this run the following command: docker run -itd --name redis redis:alpine redis-server --appendonly Using a backslash: As we will sometimes have a lot options to pass to the commands we are going to be running, we are going to be using to split the command over multiple lines so that it's easier to follow what is going on, like in the command above. The Docker Hub is the registry service provided by Docker. The service allows you to distribute your container images, either by uploading images directly or by building your Docker files which are stored in either GitHub or BitBicket. We will be using the Docker Hub throughout the article, for more information on the Docker Hub see the official documentation at: https://docs.docker.com/docker-hub/. This will download the official Redis image (https://hub.docker.com/_/redis/) from the Docker Hub and then launch a container called Redis by running the redis-server --appendonly command. Adding the --appendonly flag to the Redis server command means that each time Redis receives a command such as SET it will not only execute the command in the dataset which is held in memory, it will also append it to what's called an AOF file. When Redis is restarted the AOF file is replayed to restore the state of the dataset held in RAM. Note that we are using the Alpine Linux version of the container to keep the download size down. Alpine Linux is a Linux distribution which is designed to be small, in fact the official Docker image based on Alpine Linux is only 5 MB in size! Compared to the sizes other base operating system images such as the official Debian, Ubuntu and CentOS which you can see from the following terminal session It is easy to see why the majority of official Docker images are using Alpine Linux as their base image. Now we have Redis up and running, let's launch our application container by running: docker run -itd --name moby-counter -p 80:80 --link redis:redis russmckendrick/moby-counter You check that your two containers are running using the docker ps command, you should see something like the following terminal session: Now you have the Moby Counter application container running, open your browser and go to the IP address of your server instance, in my case this was http://159.203.185.102 once the page loads you should see a white page which says Click to add logos… in the center. As per the instructions, click around the page to add some Docker logos: Now we have some data stored in the Redis store let's do the equivalent of yanking the power cord out of the container by running the following command: docker rm -f redis This will cause your browser to looking something like the following screen capture: Relaunching the Redis container using the following command: docker run -itd --name redis redis:alpine redis-server --appendonly and refreshing our browser takes us back to the page which says Click to add logos… So what gives, why didn't you see your drawing? It's quite simple, we never told Docker to start the container with any persistent storage. Luckily, the official Redis image is smart and assumes that you should really be using persistent storage to store your data in, even if you don't tell it you want to as no-one likes data loss. Using the docker volume ls command you should be able to see that there are two volumes, both with really long names, in my case the output of the command looked like the following terminal capture: So we now have two Docker Volumes, one which contains our original "master piece" and the second which is blank. Before we remove our Redis container again, let's check to see which of the two Docker Volumes is currently mounted on our Redis container, we can do this by running: docker inspect redis There quite a bit of information shown when the docker inspect command, there information we are after is in the Mounts section, and we need to know the Name: As you can see from the terminal output above, the currently mounted volume in my setup is called b76101d13afc2b33206f5a2bba9a3e9b9176f43ce57f74d5836c824c22c. Now we know the name of the blank volume, we know that the second volume has the data for our drawing. Let's terminate the running Redis container by running: docker rm -f redis and now launch our Redis container again, but this time telling Docker to use the local volume driver and also use volume we know contains the data for our drawing: docker run -itd --name redis --volume-driver=local -v 37cb395253624782836cc39be1aa815682b70f73371abb6d500a:/data redis:alpine redis-server --appendonly yes Make sure you replace the name of volume which immediately follows the -v on the fourth line with the name of the volume you know contains your own data. After a few moments, refresh your browser and you should see your original drawing. Our volume was automatically created by the Redis container as it launched which is why it has a Unique Identification Number (UID) as its name, if we wanted to we could give it more friendly name by running the following command to launch our Redis container: docker run -itd --name redis --volume-driver=local -v redis_data:/data redis:alpine redis-server --appendonly yes As you can when running the docker volume ls command again, this is a lot more identifiable: As we have seen with the simple example above, it is easy to create volumes to persist your data on using Docker Engine, however there is one drawback to using the default volume driver and that is your volumes are only available on a single Docker host. For most people just setting out on their journey into using containers this is fine, however for people who want to host their applications in production or would like a more resilient persistent data store using the local driver quickly becomes a single point of failure. This is where Flocker comes in. Summary In this article we have worked through the most basic Flocker installation possible, we have integrated the installation with Docker, the Volume Hub and also the Flocker command to interact with the volumes created and managed by Flocker. In the next article we are going to look at how Flocker can be used to provide more resilient storage for a cluster of Docker hosts rather than just the single host we have been using in this article. Resources for Article: Further resources on this subject: Docker Hosts [article] Hands On with Docker Swarm [article] Understanding Docker [article]
Read more
  • 0
  • 0
  • 2962

article-image-devops-tools-and-technologies
Packt
11 Nov 2016
15 min read
Save for later

DevOps Tools and Technologies

Packt
11 Nov 2016
15 min read
In this article by Ritesh Modi, the author of the book DevOps with Windows Server 2016, we will introduce foundational platforms and technologies instrumental in enabling and implementing DevOps practices. (For more resources related to this topic, see here.) These include: Technology stack for implementing Continuous Integration, Continuous Deployment, Continuous Deliver, Configuration Management, and Continuous Improvement. These form the backbone for DevOps processes and include source code services, build services, and release services through Visual Studio Team Services. Platform and technology used to create and deploy a sample web application. This includes technologies such as Microsoft .NET, ASP.NET and SQL Server databases. Tools and technology for configuration management, testing of code and application, authoring infrastructure as code, and deployment of environments. Examples of these tools and technologies are Pester for environment validation, environment provisioning through Azure Resource Manager (ARM) templates, Desired State Configuration (DSC) and Powershell, application hosting on containers through Windows Containers and Docker, application and database deployment through Web Deploy packages, and SQL Server bacpacs. Cloud technology Cloud is ubiquitous. Cloud is used for our development environment, implementation of DevOps practices, and deployment of applications. Cloud is a relatively new paradigm in infrastructure provisioning, application deployment, and hosting space. The only options prior to the advent of cloud was either self-hosted on-premises deployments or using services from a hosting service provider. However, cloud is changing the way enterprises look at their strategy in relation to infrastructure and application development, deployment, and hosting. In fact, the change is so enormous that it has found its way into every aspect of an organization's software development processes, tools, and practices. Cloud computing refers to the practice of deploying applications and services on the Internet with a cloud provider. A cloud provider provides multiple types of services on cloud. They are divided into three categories based on their level of abstraction and degree of control on services. These categories are as follows: Infrastructure as a Service (IaaS) Platform as a Service (PaaS) Software as a Service (SaaS) These three categories differ based on the level of control a cloud provider exercises compared to the cloud consumer. The services provided by a cloud provider can be divided into layers, with each layer providing a type of service. As we move higher in the stack of layers, the level of abstraction increases in line with the cloud provider's control over services. In other words, the cloud consumer starts to lose control over services as you move higher in each column: Figure 1: Cloud Services – IaaS, PaaS and SaaS Figure 1 shows the three types of service available through cloud providers and the layers that comprise these services. These layers are stacked vertically on each other and show the level of control a cloud provider has compared to a consumer. From Figure 1, it is clear that for IaaS, a cloud provider is responsible for providing, controlling, and managing layers from the network layer up to the virtualization layer. Similarly, for PaaS, a cloud provider controls and manages from the hardware layer up to the runtime layer, while the consumer controls only the application and data layers. Infrastructure as a Service (IaaS) As the name suggests, Infrastructure as a Service is an infrastructure service provided by a cloud provider. This service includes the physical hardware and its configuration, network hardware and its configuration, storage hardware and its configuration, load balancers, compute, and virtualization. Any layer above virtualization is the responsibility of the consumer to provision, configure, and manage. The consumer can decide to use the provided underlying infrastructure in whatever way best suits their requirements. Consumers can consume the storage, network, and virtualization to provision their virtual machines on top of. It is the consumer's responsibility to manage and control the virtual machines and the things deployed within it. Platform as a Service (PaaS) Platform as a Service enables consumers to deploy their applications and services on the provided platform, consuming the underlying runtime, middleware, and services. The cloud provider provides the services from infrastructure to runtime. The consumers cannot provision virtual machines as they cannot access and control them. Instead, they can only control and manage their applications. This is a comparatively faster method of development and deployment because now the consumer can focus on application development and deployment. Examples of Platform as a Service include Azure Automation, Azure SQL, and Azure App Services. Software as a Service (SaaS) Software as a Service provides complete control of the service to the cloud provider. The cloud provider provisions, configures, and manages everything from infrastructure to the application. It includes the provisioning of infrastructure, deployment and configuration of applications, and provides application access to the consumer. The consumer does not control and manage the application, and can use and configure only parts of the application. They control only their data and configuration. Generally, multi-tenant applications used by multiple consumers, such as Office 365 and Visual Studio Team Services, are examples of SaaS. Advantages of using cloud computing There are multiple distinct advantages of using cloud technologies. The major among them are as follows: Cost effective: Cloud computing helps organizations to reduce the cost of storage, networks, and physical infrastructure. It also prevents them from having to buy expensive software licenses. The operational cost of managing these infrastructures also reduces due to lesser effort and manpower requirements. Unlimited capacity: Cloud provides unlimited resources to the consumer. This ensures applications will never get throttled due to limited resource availability. Elasticity: Cloud computing provides the notion of unlimited capacity and applications deployed on it can scale up or down on an as-needed basis. When demand for the application increases, cloud can be configured to scale up the infrastructure and application by adding additional resources. At the same time, it can scale down unnecessary resources during periods of low demand. Pay as you go: Using cloud eliminates capital expenditure and organizations pay only for what they use, thereby providing maximum return on investment. Organizations do not need to build additional infrastructure to host their application for times of peak demand. Faster and better: Cloud provides ready-to-use applications and faster provisioning and deployment of environments. Moreover, organizations get better-managed services from their cloud provider with higher service-level agreements. We will use Azure as our preferred cloud computing provider for the purpose of demonstrating samples and examples. However, you can use any cloud provider that provides complete end-to-end services for DevOps. We will use multiple features and services provided by Azure across IaaS and PaaS. We will consume Operational Insights and Application Insights to monitor our environment and application, which will help capture relevant telemetry for auditing purposes. We will provision Azure virtual machines running Windows and Docker Containers as a hosting platform. We will use Windows Server 2016 as the target operating system for our applications on cloud. Azure Resource Manager (ARM). We will also use Desired State Configuration and PowerShell as our configuration management platform and tool. We will use Visual Studio Team Services (VSTS), a suite of PaaS services on cloud provided by Microsoft, to set up and implement our end-to-end DevOps practices. Microsoft also provides the same services as part of Team Foundation Services (TFS) as an on-premises solution. Technologies like Pester, DSC, and PowerShell can be deployed and configured to run on any platform. These will help both in the validation of our environment and in the configuration of both application and environment, as part of our Configuration management process. Windows Server 2016 is a breakthrough operating system from Microsoft also referred to as Cloud Operating System. We will look into Windows Server 2016 in the following section. Windows Server 2016 Windows Server 2016 has come a long way. All the way from Windows NT to Windows 2000 and 2003, then Windows 2008 (R2) and 2012 (R2), and now Windows Server 2016. Windows NT was the first popular Windows server among enterprises. However, the true enterprise servers were Windows 2000 and Windows 2003. The popularity of Windows Server 2003 was unprecedented and it was widely adopted. With Windows Server 2008 and 2008 R2, the idea of the data center took priority and enterprises with their own data center adopted it. Even the Windows Server 2008 series was quite popular among enterprises. In 2010, the Microsoft cloud, Azure, was launched. The first steps towards a cloud operating system were Windows Server 2012 and 2012 R2. They had the blueprints and technology to be seamlessly provisioned on Azure. Now, when Azure and cloud are gaining enormous popularity, Windows Server 2016 is released as a true cloud operating system. The evolution of Windows Server is shown in Figure 2: Figure 2: Windows Server evolution Windows Server 2016 is referred to as a cloud operating system. It is built with cloud in mind. It is also referred to as the first operating system that enables DevOps seamlessly by providing relevant tools and technologies. It makes implementing DevOps simpler and easier through its productivity tools. Let us look briefly into these tools and technologies. Multiple choices for Application platform Windows Server 2016 comes with many choices for application platform for applications. It provides the following: Windows Server 2016 Nano Server Windows and Docker Containers Hyper-V Containers Nested virtual machines Windows Server as a hosting platform Windows server 2016 can be used in the ways it has always been used, such as hosting applications and providing server functionalities. It provides the services necessary to make applications secure, scalable, and highly available. It also provides virtualization, directory services, certificate services, web server, databases, and more. These services can be consumed by the enterprise’s services and applications. Nano Server Windows Server provides a new option to host applications and services. This is a new variety of lightweight, scaled-down Windows server containing only the kernel and drivers necessary to run as an operating system. They are also known as headless servers. They do not have any graphical user interface and the only way to interact and manage them is through remote PowerShell. Out of the box, they do not contain any service or feature. The services need to be added to Nano servers explicitly before use. So far, they are the most secure servers from Microsoft. They are very lightweight and their resource requirements and consumption is less than 80% of a normal Windows server. The number of services running, the number of ports open, the number of processes running and the amount of memory and storage required, also are less than 80% compared to normal Windows server. Even though Nano Server out of box just has the kernel and drivers, its capabilities can be enhanced by adding features and deploying any Windows application on it. Windows Containers and Docker Containers are one of the most revolutionary features added to Windows Server 2016 after Nano Server. With the popularity and adoption of Docker Containers, which primarily run on Linux, Microsoft decided to introduce container services to Windows Server 2016. Containers are operating system virtualization. This means that multiple containers can be deployed on the same operating system and each one of them will share the host operating system kernel. It is the next level of virtualization after server virtualization (virtual machines). Containers generate the notion of complete operating system isolation and independence, even though it uses the same host operating system underneath it. This is possible through the use of namespace isolation and image layering. Containers are created from images. Images are immutable and cannot be modified. Each image has a base operating system and a series of instructions that are executed against it. Each instruction creates a new image on top of the previous image and contains only the modification. Finally, a writable image is stacked on top of these images. These images are combined into a single image, which can then be used for provisioning containers. A container made up of multiple image layers is shown in Figure 3: Figure 3: Containers made up of multiple image layers Namespace isolation helps provide containers with pristine new environments. The containers cannot see the host resources and the host cannot view the container resources. For the application within the container, a complete new installation of the operating system is available. The containers share the host's memory, CPU, and storage. Containers offer operating system virtualization, which means the containers can host only those operating systems supported by the host operating system. There cannot be a Windows container running on a Linux host, and a Linux container cannot run on a Windows host operating system. Hyper-V containers Another type of container technology Windows Server 2016 provides is Hyper-V Containers. These containers are similar to Windows Containers. They are managed through the same Docker client and extend the same Docker APIs. However, these containers contain their own scaled down operating system kernel. They do not share the host operating system but have their own dedicated operating system, and their own dedicated memory and CPU assigned in exactly the same way virtual machines are assigned resources. Hyper-V Containers brings in a higher level of isolation of containers from the host. While Windows Containers runs in full trust on the host operating system, Hyper-V Containers does not have full trust from the host’s perspective. It is this isolation that differentiates Hyper-V Containers from Windows Containers. Hyper-V Containers is ideal for hosting applications that might harm the host server affecting every other container and service on it. Scenarios where users can bring in and execute their own code are examples of such applications. Hyper-V Containers provides adequate isolation and security to ensure that applications cannot access the host resources and change them. Nested virtual machines Another breakthrough innovation of Windows Server 2016 is that now, virtual machines can host virtual machines. Now, we can deploy multiple virtual machines containing all tiers of an application within a single virtual machine. This is made possible through software-defined networks and storage. Enabling Microservices Nano Servers and Containers helps provide advanced lightweight deployment options through which we can now decompose the entire application into multiple smaller, independent services, each with their own scalability and high availability configuration, and deploy them independently of each other. Microservices helps in making the entire DevOps lifecycle agile. With Microservices, changes to services do not demand that every other Microservices undergo every test validation. Only the changed service needs to be tested rigorously, along with its integration with other services. Compare this to a monolithic application. Even a single small change will result in having to test the entire application. Microservices helps in that it requires smaller teams for its development, testing of a service can happen independently of other services, and deployment can be done for each service in isolation. Continuous Integration, Continuous Deployment, and Continuous Delivery for each service can be executed in isolation rather than compiling, testing, and deploying the whole application every time there is a change. Reduced maintenance Because of their intrinsic nature, Windows Nano Servers and Containers are lightweight and quick to provision. They help to quickly provision and configure environments, thereby reducing the overall time needed for Continuous Integration and deployment. Also, these resources can be provisioned on Azure on-demand without waiting for hours. Because of their small footprint in terms of size, storage, memory, and features, they need less maintenance. These servers are patched less often, with fewer fixes, they are secure by default, and have less chance of failing applications, which makes them ideal for operations. The operations team needs to spend fewer hours maintaining these servers compared to normal servers. This reduces the overall cost for the organization and helps DevOps ensure a high-quality delivery. Configuration management tools Windows Server 2016 comes with Windows Management Framework 5.0 installed by default. Desired State Configuration (DSC) is the new configuration management platform available out of the box in Windows Server 2016. It has a rich, mature set of features that enables configuration management for both environments and applications. With DSC, the desired state and configuration of environments are authored as part of Infrastructure as Code and executed on every server on a scheduled basis. They help check the current state of servers with the documented desired state and bring them back to the desired state. DSC is available as part of PowerShell and PowerShell helps with authoring these configuration documents. Windows server 2016 provides a PowerShell unit testing framework known as PESTER. Historically, unit testing for infrastructure environments was always missing as a feature. PESTER enables the testing of infrastructure provisioned either manually or through Infrastructure as Code using DSC configuration or ARM templates. These help with the operational validation of the entire environment, bringing in a high level of cadence and confidence in Continuous Integration and deployment processes. Deployment and packaging Package management and the deployment of utilities and tools through automation is a new concept in the Windows world. Package management has been ubiquitous in the Linux world for a long time. Packing management helps search, save, install, deploy, upgrade, and remove software packages from multiple sources and repositories on demand. There are public repositories such as Chocolatey and PSGallery available for storing readily deployable packages. Tools such as NuGet can connect these repositories and help with package management. They also help with the versioning of packages. Applications that rely on a specific package version can download it on an as-needed basis. Package management helps with the building of environments and application deployment. Package deployment is much easier and faster with this out-of-the-box Windows feature. Summary We have covered a lot of ground in this article. DevOps concepts were discussed mapping technology to those concepts. In this we saw the impetus DevOps can get from technology. We looked at cloud computing and the different services provided by cloud providers. From there, we went on to look at the benefits Windows Server 2016 brings to DevOps practices and how Windows Server 2016 makes DevOps easier and faster with its native tools and features. Resources for Article: Further resources on this subject: Introducing Dynamics CRM [article] Features of Dynamics GP [article] Creating Your First Plug-in [article]
Read more
  • 0
  • 0
  • 11686

article-image-planning-and-structuring-your-test-driven-ios-app
Packt
11 Nov 2016
13 min read
Save for later

Planning and Structuring Your Test-Driven iOS App

Packt
11 Nov 2016
13 min read
In this article written by Dr. Dominik Hauser, author of the book Test–Driven iOS Development with Swift 3.0, you will learn that when starting TDD, writing unit tests would be easy for most people. The hard part is to transfer the knowledge from writing the test to driving the development. What can be assumed? What should be done before one writes the first test? What should be tested to end up with a complete app? (For more resources related to this topic, see here.) As a developer, you are used to thinking in terms of code. When you see a feature on the requirement list for an app, your brain already starts to layout the code for this feature. And for recurring problems in iOS development (such as building table views), you most probably have already developed your own best practices. In TDD, you should not think about the code while working on the test. The tests have to describe what the unit under test should do and not how it should do it. It should be possible to change the implementation without breaking the tests. To practice this approach of development, we will develop a simple to-do list app in the remainder of this book. It is, on purpose, a boring and easy app. We want to concentrate on the TDD workflow, not complex implementations. An interesting app would distract from what is important in this book—how to do TDD. This article introduces the app that we are going to build, and it shows the views that the finished app will have. We will cover the following topics in this article: The task list view The task detail view The task input view The structure of an app Getting started with Xcode Setting up useful Xcode behaviors for testing The task list view When starting the app, the user sees a list of to-do items. The items in the list consist of a title, an optional location, and the due date. New items can be added to the list by an add (+) button, which is shown in the navigation bar of the view. The task list view will look like this: User stories: As a user, I want to see the list of to-do items when I open the app As a user, I want to add to-do items to the list In a to-do list app, the user will obviously need to be able to check items when they are finished. The checked items are shown below the unchecked items, and it is possible to uncheck them again. The app uses the delete button in the UI of UITableView to check and uncheck items. Checked items will be put at the end of the list in a section with the Finished header. The user can also delete all the items from the list by tapping the trash button. The UI for the to-do item list will look like this: User stories: As a user, I want to check a to-do item to mark it as finished As a user, I want to see all the checked items below the unchecked items As a user, I want to uncheck a to-do item As a user, I want to delete all the to-do items When the user taps an entry, the details of this entry is shown in the task detail view. The task detail view The task detail view shows all the information that's stored for a to-do item. The information consists of a title, due date, location (name and address), and a description. If an address is given, a map with an address is shown. The detail view also allows checking the item as finished. The detail view looks like this: User stories: As a user, given that I have tapped a to-do item in the list, I want to see its details As a user, I want to check a to-do item from its details view The task input view When the user selects the add (+) button in the list view, the task input view is shown. The user can add information for the task. Only the title is required. The Save button can only be selected when a title is given. It is not possible to add a task that is already in the list. The Cancel button dismisses the view. The task input view will look like this: User stories: As a user, given that I have tapped the add (+) button in the item list, I want to see a form to put in the details (title, optional date, optional location name, optional address, and optional description) of a to-do item As a user, I want to add a to-do item to the list of to-do items by tapping on the Save button We will not implement the editing and deletion of tasks. But when you have worked through this book completely, it will be easy for you to add this feature yourself by writing the tests first. Keep in mind that we will not test the look and design of the app. Unit tests cannot figure out whether an app looks like it was intended. Unit tests can test features, and these are independent of their presentation. In principle, it would be possible to write unit tests for the position and color of UI elements. But such things are very likely to change a lot in the early stages of development. We do not want to have failing tests only because a button has moved 10 points. However, we will test whether the UI elements are present on the view. If your user cannot see the information for the tasks, or if it is not possible to add all the information of a task, then the app does not meet the requirements. The structure of the app The following diagram shows the structure of the app: The Table View Controller, the delegate, and the data source In iOS apps, data is often presented using a table view. Table views are highly optimized for performance; they are easy to use and to implement. We will use a table view for the list of to-do items. A table view is usually represented by UITableViewController, which is also the data source and delegate for the table view. This often leads to a massive Table View Controller, because it is doing too much: presenting the view, navigating to other view controllers, and managing the presentation of the data in the table view. It is a good practice to split up the responsibility into several classes. Therefore, we will use a helper class to act as the data source and delegate for the table view. The communication between the Table View Controller and the helper class will be defined using a protocol. Protocols define what the interface of a class looks like. This has a great benefit: if we need to replace an implementation with a better version (maybe because we have learned how to implement the feature in a better way), we only need to develop against the clear interface. The inner workings of other classes do not matter. Table view cells As you can see in the preceding screenshots, the to-do list items have a title and, optionally, they can have a due date and a location name. The table view cells should only show the set data. We will accomplish this by implementing our own custom table view cell. The model The model of the application consists of the to-do item, the location, and an item manager, which allows the addition and removal of items and is also responsible for managing the items. Therefore, the controller will ask the item manager for the items to present. The item manager will also be responsible for storing the items on disc. Beginners often tend to manage the model objects within the controller. Then, the controller has a reference to a collection of items, and the addition and removal of items is directly done by the controller. This is not recommended, because if we decide to change the storage of the items (for example, by using Core Data), their addition and removal would have to be changed within the controller. It is difficult to keep an overview of such a class; and because of this reason, it is a source of bugs. It is much easier to have a clear interface between the controller and the model objects, because if we need to change how the model objects are managed, the controller can stay the same. We could even replace the complete model layer if we just keep the interface the same. Later in the article, we will see that this decoupling also helps to make testing easier. Other view controllers The application will have two more view controllers: a task detail View Controller and a View Controller for the input of the task. When the user taps a to-do item in the list, the details of the item are presented in the task detail View Controller. From the Details screen, the user will be able to check an item. New to-do items will be added to the list of items using the view presented by the input View Controller. The development strategy In this book, we will build the app from inside out. We will start with the model, and then build the controllers and networking. At the end of the book, we will put everything together. Of course, this is not the only way to build apps. But by separating on the basis of layers instead of features, it is easier to follow and keep an overview of what is happening. When you later need to refresh your memory, the relevant information you need is easier to find. Getting started with Xcode Now, let's start our journey by creating a project that we will implement using TDD. Open Xcode and create a new iOS project using the Single View Application template. In the options window, add ToDo as the product name, select Swift as language, choose iPhone in the Devices option, and check the box next to Include Unit Tests. Let the Use Core Data and Include UI Tests boxes stay unchecked. Xcode creates a small iOS project with two targets: one for the implementation code and the other for the unit tests. The template contains code that presents a single view on screen. We could have chosen to start with the master-detail application template, because the app will show a master and a detail view. However, we have chosen the Single View Application template because it comes with hardly any code. And in TDD, we want to have all the implementation code demanded by failing tests. To take a look at how the application target and test target fit together, select the project in Project Navigator, and then select the ToDoTests target. In the General tab, you'll find a setting for the Host Application that the test target should be able to test. It will look like this: Xcode has already set up the test target correctly to allow the testing of the implementations that we will write in the application target. Xcode has also set up a scheme to build the app and run the tests. Click on the Scheme selector next to the stop button in the toolbar, and select Edit Scheme.... In the Test action, all the test bundles of the project will be listed. In our case, only one test bundle is shown—ToDoTests. On the right-hand side of the shown window is a column named Test, with a checked checkbox. This means that if we run the tests while this scheme is selected in Xcode, all the tests in the selected test suite will be run. Setting up useful Xcode behaviors for testing Xcode has a feature called behaviors. With the use of behaviors and tabs, Xcode can show useful information depending on its state. Open the Behaviors window by going to Xcode | Behaviors | Edit Behaviors. On the left-hand side are the different stages for which you can add behaviors (Build, Testing, Running, and so on). The following behaviors are useful when doing TDD. The behaviors shown here are those that I find useful. Play around with the settings to find the ones most useful for you. Overall, I recommend using behaviors because I think they speed up development. Useful build behaviors When building starts, Xcode compiles the files and links them together. To see what is going on, you can activate the build log when building starts. It is recommended that you open the build log in a new tab because this allows switching back to the code editor when no error occurs during the build. Select the Starts stage and check Show tab named. Put in the Log name and select in active window. Check the Show navigator setting and select Issue Navigator. At the bottom of the window, check Navigate to and select current log. After you have made these changes, the settings window will look like this: Build and run to see what the behavior looks like. Testing behaviors To write some code, I have an Xcode tab called Coding. Usually, in this tab, the test is open on the left-hand side, and in the Assistant Editor which is on the right-hand side, there is the code to be tested (or in the case of TDD, the code to be written). It looks like this: When the test starts, we want to see the code editor again. So, we add a behavior to show the Coding tab. In addition to this, we want to see the Test Navigator and debugger with the console view. When the test succeeds, Xcode should show a bezel to notify us that all tests have passed. Go to the Testing | Succeeds stage, and check the Notify using bezel or system notification setting. In addition to this, it should hide the navigator and the debugger, because we want to concentrate on refactoring or writing the next test. In case the testing fails (which happens a lot in TDD), Xcode will show a bezel again. I like to hide the debugger, because usually, it is not the best place to figure out what is going on in the case of a failing test. And in most of the cases in TDD, we already know what the problem is. But we want to see the failing test. Therefore, check Show navigator and select Issue navigator. At the bottom of the window, check Navigate to and select first new issue. You can even make your Mac speak the announcements. Check Speak announcements using and select the voice you like. But be careful not to annoy your coworkers. You might need their help in the future. Now, the project and Xcode are set up, and we can start our TDD journey. Summary In this article, we took a look at the app that we are going to build throughout the course of this book. We took a look at how the screens of the app will look when we are finished with it. We created the project that we will use later on and learned about Xcode behaviors. Resources for Article: Further resources on this subject: Thinking Functionally [article] Hosting on Google App Engine [article] Cloud and Async Communication [article]
Read more
  • 0
  • 0
  • 27364

article-image-introduction-javascript
Packt
10 Nov 2016
15 min read
Save for later

Introduction to JavaScript

Packt
10 Nov 2016
15 min read
In this article by Simon Timms, author of the book, Mastering JavaScript Design Patterns - Second Edition, we will explore the history of JavaScript and how it came to be the important language that it is today (For more resources related to this topic, see here.) JavaScript is an evolving language that has come a long way from its inception. Possibly more than any other programming language, it has grown and changed with the growth of the World Wide Web. As JavaScript has evolved and grown in importance, the need to apply rigorous methods to its construction has also grown. The road to JavaScript We'll never know how language first came into being. Did it slowly evolve from a series of grunts and guttural sounds made during grooming rituals? Perhaps it developed to allow mothers and their offspring to communicate. Both of these are theories, all but impossible to prove. Nobody was around to observe our ancestors during that important period. In fact, the general lack of empirical evidence lead the Linguistic Society of Paris to ban further discussions on the topic, seeing it as unsuitable for serious study. The early days Fortunately, programming languages have developed in recent history and we've been able to watch them grow and change. JavaScript has one of the more interesting histories of modern programming languages. During what must have been an absolutely frantic 10 days in May of 1995, a programmer at Netscape wrote the foundation for what would grow up to be modern JavaScript. At the time, Netscape was involved in the first of the browser wars with Microsoft. The vision for Netscape was far grander than simply developing a browser. They wanted to create an entire distributed operating system making use of Sun Microsystems' recently-released Java programming language. Java was a much more modern alternative to the C++ Microsoft was pushing. However, Netscape didn't have an answer to Visual Basic. Visual Basic was an easier to use programming language, which was targeted at developers with less experience. It avoided some of the difficulties around memory management that make C and C++ notoriously difficult to program. Visual Basic also avoided strict typing and overall allowed more leeway: Brendan Eich was tasked with developing Netscape repartee to VB. The project was initially codenamed Mocha, but was renamed LiveScript before Netscape 2.0 beta was released. By the time the full release was available, Mocha/LiveScript had been renamed JavaScript to tie it into the Java applet integration. Java Applets were small applications which ran in the browser. They had a different security model from the browser itself and so were limited in how they could interact with both the browser and the local system. It is quite rare to see applets these days, as much of their functionality has become part of the browser. Java was riding a popular wave at the time and any relationship to it was played up. The name has caused much confusion over the years. JavaScript is a very different language from Java. JavaScript is an interpreted language with loose typing, which runs primarily on the browser. Java is a language that is compiled to bytecode, which is then executed on the Java Virtual Machine. It has applicability in numerous scenarios, from the browser (through the use of Java applets), to the server (Tomcat, JBoss, and so on), to full desktop applications (Eclipse, OpenOffice, and so on). In most laypersons' minds, the confusion remains. JavaScript turned out to be really quite useful for interacting with the web browser. It was not long until Microsoft had also adopted JavaScript into their Internet Explorer to complement VBScript. The Microsoft implementation was known as JScript. By late 1996, it was clear that JavaScript was going to be the winning web language for the near future. In order to limit the amount of language deviation between implementations, Sun and Netscape began working with the European Computer Manufacturers Association (ECMA) to develop a standard to which future versions of JavaScript would need to comply. The standard was released very quickly (very quickly in terms of how rapidly standards organizations move), in July of 1997. On the off chance that you have not seen enough names yet for JavaScript, the standard version was called ECMAScript, a name which still persists in some circles. Unfortunately, the standard only specified the very core parts of JavaScript. With the browser wars raging, it was apparent that any vendor that stuck with only the basic implementation of JavaScript would quickly be left behind. At the same time, there was much work going on to establish a standard Document Object Model (DOM) for browsers. The DOM was, in effect, an API for a web page that could be manipulated using JavaScript. For many years, every JavaScript script would start by attempting to determine the browser on which it was running. This would dictate how to address elements in the DOM, as there were dramatic deviations between each browser. The spaghetti of code that was required to perform simple actions was legendary. I remember reading a year-long 20-part series on developing a Dynamic HTML (DHTML) drop down menu such that it would work on both Internet Explorer and Netscape Navigator. The same functionally can now be achieved with pure CSS without even having to resort to JavaScript. DHTML was a popular term in the late 1990s and early 2000s. It really referred to any web page that had some sort of dynamic content that was executed on the client side. It has fallen out of use, as the popularity of JavaScript has made almost every page a dynamic one. Fortunately, the efforts to standardize JavaScript continued behind the scenes. Versions 2 and 3 of ECMAScript were released in 1998 and 1999. It looked like there might finally be some agreement between the various parties interested in JavaScript. Work began in early 2000 on ECMAScript 4, which was to be a major new release. A pause Then, disaster struck. The various groups involved in the ECMAScript effort had major disagreements about the direction JavaScript was to take. Microsoft seemed to have lost interest in the standardization effort. It was somewhat understandable, as it was around that time that Netscape self-destructed and Internet Explorer became the de-facto standard. Microsoft implemented parts of ECMAScript 4 but not all of it. Others implemented more fully-featured support, but without the market leader on-board, developers didn't bother using them. Years passed without consensus and without a new release of ECMAScript. However, as frequently happens, the evolution of the Internet could not be stopped by a lack of agreement between major players. Libraries such as jQuery, Prototype, Dojo, and Mootools, papered over the major differences in browsers, making cross-browser development far easier. At the same time, the amount of JavaScript used in applications increased dramatically. The way of GMail The turning point was, perhaps, the release of Google's GMail application in 2004. Although XMLHTTPRequest, the technology behind Asynchronous JavaScript and XML (AJAX), had been around for about five years when GMail was released, it had not been well-used. When GMail was released, I was totally knocked off my feet by how smooth it was. We've grown used to applications that avoid full reloads, but at the time, it was a revolution. To make applications like that work, a great deal of JavaScript is needed. AJAX is a method by which small chunks of data are retrieved from the server by a client instead of refreshing the entire page. The technology allows for more interactive pages that avoid the jolt of full page reloads. The popularity of GMail was the trigger for a change that had been brewing for a while. Increasing JavaScript acceptance and standardization pushed us past the tipping point for the acceptance of JavaScript as a proper language. Up until that point, much of the use of JavaScript was for performing minor changes to the page and for validating form input. I joke with people that in the early days of JavaScript, the only function name which was used was Validate(). Applications such as GMail that have a heavy reliance on AJAX and avoid full page reloads are known as Single Page Applications or SPAs. By minimizing the changes to the page contents, users have a more fluid experience. By transferring only JavaScript Object Notation (JSON) payload instead of HTML, the amount of bandwidth required is also minimized. This makes applications appear to be snappier. In recent years, there have been great advances in frameworks that ease the creation of SPAs. AngularJS, backbone.js, and ember are all Model View Controller style frameworks. They have gained great popularity in the past two to three years and provide some interesting use of patterns. These frameworks are the evolution of years of experimentation with JavaScript best practices by some very smart people. JSON is a human-readable serialization format for JavaScript. It has become very popular in recent years, as it is easier and less cumbersome than previously popular formats such as XML. It lacks many of the companion technologies and strict grammatical rules of XML, but makes up for it in simplicity. At the same time as the frameworks using JavaScript are evolving, the language is too. 2015 saw the release of a much-vaunted new version of JavaScript that had been under development for some years. Initially called ECMAScript 6, the final name ended up being ECMAScript-2015. It brought with it some great improvements to the ecosystem. Browser vendors are rushing to adopt the standard. Because of the complexity of adding new language features to the code base, coupled with the fact that not everybody is on the cutting edge of browsers, a number of other languages that transcompile to JavaScript are gaining popularity. CoffeeScript is a Python-like language that strives to improve the readability and brevity of JavaScript. Developed by Google, Dart is being pushed by Google as an eventual replacement for JavaScript. Its construction addresses some of the optimizations that are impossible in traditional JavaScript. Until a Dart runtime is sufficiently popular, Google provides a Dart to the JavaScript transcompiler. TypeScript is a Microsoft project that adds some ECMAScript-2015 and even some ECMAScript-201X syntax, as well as an interesting typing system, to JavaScript. It aims to address some of the issues that large JavaScript projects present. The point of this discussion about the history of JavaScript is twofold: first, it is important to remember that languages do not develop in a vacuum. Both human languages and computer programming languages mutate based on the environments in which they are used. It is a popularly held belief that the Inuit people have a great number of words for "snow", as it was so prevalent in their environment. This may or may not be true, depending on your definition for the word and exactly who makes up the Inuit people. There are, however, a great number of examples of domain-specific lexicons evolving to meet the requirements for exact definitions in narrow fields. One need look no further than a specialty cooking store to see the great number of variants of items which a layperson such as myself would call a pan. The Sapir–Whorf hypothesis is a hypothesis within the linguistics domain, which suggests that not only is language influenced by the environment in which it is used, but also that language influences its environment. Also known as linguistic relativity, the theory is that one's cognitive processes differ based on how the language is constructed. Cognitive psychologist Keith Chen has proposed a fascinating example of this. In a very highly-viewed TED talk, Dr. Chen suggested that there is a strong positive correlation between languages that lack a future tense and those that have high savings rates (https://www.ted.com/talks/keith_chen_could_your_language_affect_your_ability_to_save_money/transcript). The hypothesis at which Dr. Chen arrived is that when your language does not have a strong sense of connection between the present and the future, this leads to more reckless behavior in the present. Thus, understanding the history of JavaScript puts one in a better position to understand how and where to make use of JavaScript. The second reason I explored the history of JavaScript is because it is absolutely fascinating to see how quickly such a popular tool has evolved. At the time of writing, it has been about 20 years since JavaScript was first built and its rise to popularity has been explosive. What more exciting thing is there than to work in an ever-evolving language? JavaScript everywhere Since the GMail revolution, JavaScript has grown immensely. The renewed browser wars, which pit Internet Explorer and Edge against Chrome, against Firefox, have lead to building a number of very fast JavaScript interpreters. Brand new optimization techniques have been deployed and it is not unusual to see JavaScript compiled to machine-native code for the added performance it gains. However, as the speed of JavaScript has increased, so has the complexity of the applications built using it. JavaScript is no longer simply a language for manipulating the browser, either. The JavaScript engine behind the popular Chrome browser has been extracted and is now at the heart of a number of interesting projects such as Node.js. Node.js started off as a highly asynchronous method of writing server-side applications. It has grown greatly and has a very active community supporting it. A wide variety of applications have been built using the Node.js runtime. Everything from build tools to editors have been built on the base of Node.js. Recently, the JavaScript engine for Microsoft Edge, ChakraCore, was also open sourced and can be embedded in NodeJS as an alternative to Google's V8. SpiderMonkey, the FireFox equivalent, is also open source and is making its way into more tools. JavaScript can even be used to control microcontrollers. The Johnny-Five framework is a programming framework for the very popular Arduino. It brings a much simpler approach to programming devices than the traditional low-level languages used for programming these devices. Using JavaScript and Arduino opens up a world of possibilities, from building robots to interacting with real-world sensors. All of the major smartphone platforms (iOS, Android, and Windows Phone) have an option to build applications using JavaScript. The tablet space is much the same with tablets supporting programming using JavaScript. Even the latest version of Windows provides a mechanism for building applications using JavaScript: JavaScript is becoming one of the most important languages in the world. Although language usage statistics are notoriously difficult to calculate, every single source which attempts to develop a ranking puts JavaScript in the top 10: Language index Rank of JavaScript Langpop.com 4 Statisticbrain.com 4 Codeval.com 6 TIOBE 8 What is more interesting is that most of of these rankings suggest that the usage of JavaScript is on the rise. The long and short of it is that JavaScript is going to be a major language in the next few years. More and more applications are being written in JavaScript and it is the lingua franca for any sort of web development. Developer of the popular Stack Overflow website Jeff Atwood created Atwood's Law regarding the wide adoption of JavaScript: "Any application that can be written in JavaScript, will eventually be written in JavaScript" – Atwood's Law, Jeff Atwood This insight has been proven to be correct time and time again. There are now compilers, spreadsheets, word processors—you name it—all written in JavaScript. As the applications which make use of JavaScript increase in complexity, the developer may stumble upon many of the same issues as have been encountered in traditional programming languages: how can we write this application to be adaptable to change? This brings us to the need for properly designing applications. No longer can we simply throw a bunch of JavaScript into a file and hope that it works properly. Nor can we rely on libraries such as jQuery to save ourselves. Libraries can only provide additional functionality and contribute nothing to the structure of an application. At least some attention must now be paid to how to construct the application to be extensible and adaptable. The real world is ever-changing, and any application that is unable to change to suit the changing world is likely to be left in the dust. Design patterns provide some guidance in building adaptable applications, which can shift with changing business needs. Summary JavaScript has an interesting history and is really coming of age. With server-side JavaScript taking off and large JavaScript applications becoming common, there is a need for more diligence in building JavaScript applications. For more information on JavaScript, you can check other books by Packt mentioned as follows: Mastering JavaScript Promises: https://www.packtpub.com/application-development/mastering-javascript-promises Mastering JavaScript High Performance: https://www.packtpub.com/web-development/mastering-javascript-high-performance JavaScript : Functional Programming for JavaScript Developers: https://www.packtpub.com/web-development/javascript-functional-programming-javascript-developers Resources for Article: Further resources on this subject: API with MongoDB and Node.js [article] Tips & Tricks for Ext JS 3.x [article] Saying Hello! [article]
Read more
  • 0
  • 0
  • 12306

article-image-testing-components-service-dependencies
Victor Mejia
10 Nov 2016
5 min read
Save for later

Testing Components with Service Dependencies

Victor Mejia
10 Nov 2016
5 min read
It is very common for your Angular 2 components to depend on a service that performs actions, such as fetching data. In this post we will look at testing components with service dependencies, and at testing asynchronous actions. We will be using Jasmine for our tests. If you have not read Getting Started Testing Angular 2 Components, I strongly suggest you do so before continuing. Angular 2 Component with a Service Dependency Continuing with our contact manager application, we need to have a ContactService that fetches data from a server: import { Injectable } from '@angular/core'; import { Http } from '@angular/http'; import 'rxjs/add/operator/map'; @Injectable() export class ContactService { constructor(private http: Http){ } getContacts() { return this.http.get('/contacts.json') .map(res => res.json()); } } The Http service is injected here, and TypeScript will automatically assign the injected service to this.http. With this service ready to use, we are now ready to inject it into our ContactsComponent : import { Component, OnInit } from '@angular/core'; import { ContactService } from '../shared/contact.service'; @Component({ selector: 'contacts', template: ` <button (click)="getContacts()">Get Contacts</button> <profile *ngFor="let profile of contacts" [info]="profile"></profile> ` }) export class ContactsComponent implements OnInit { contacts: Array<any>; constructor(private contactService: ContactService) { } ngOnInit() { } getContacts() { this.contactService.getContacts() .subscribe(data => { this.contacts = data; }); } } We have an action set up, so when we click on the button, we make a call to our ContactService to fetch the data and assign the result. Once the call is resolved, the data will display. Setting up your unit test What we have to keep in mind is that we want to test our components in isolation. What this means is that instead of using the actual ContactService implementation, we create a MockContactService that returns mock data (array of Profile s). let mockData = [ { name: 'Victor Mejia', email: 'victor.mejia@example.com', phone: '123-456-7890' } ]; class MockContactService { getContacts(url) { return Observable.create((observer: Observer<Array<Profile>>) => { observer.next(mockData); }); } } When configuring our testing module, we add a new property,providers, where we specify the usage of our mock service: TestBed.configureTestingModule({ declarations: [ContactsComponent], providers: [ { provide: ContactService, useClass: MockContactService } ] }); We can now go ahead and get handles on fixture, component, and element : import { TestBed, async, ComponentFixture } from '@angular/core/testing'; import { ContactsComponent } from './contacts.component'; import { ContactService } from '../shared/contact.service'; import { Profile } from '../shared/profile.model'; import { Observable } from 'rxjs/Observable'; import { Observer } from 'rxjs/Observer'; let mockData = [ { name: 'Victor Mejia', email: 'victor.mejia@example.com', phone: '123-456-7890' } ]; class MockContactService { getContacts(url) { return Observable.create((observer: Observer<Array<Profile>>) => { observer.next(mockData); }); } } let fixture: ComponentFixture<ContactsComponent>; let component: ContactsComponent; let element: HTMLElement; describe('Component: Contacts', () => { beforeEach(async(() => { TestBed.configureTestingModule({ declarations: [ContactsComponent], providers: [ { provide: ContactService, useClass: MockContactService } ] }); TestBed.compileComponents() .then(() => { fixture = TestBed.createComponent(ContactsComponent); component = fixture.debugElement.componentInstance; element = fixture.debugElement.nativeElement }); })); }); Ensuring calls to our service A good test to always perform is to ensure that your actions are making the correct calls to your service. To do so, we can spy on the getContacts() function on the service, calling the component action and then ensuring that the function was indeed called: describe('getContacts', () => { it('should make a call to contactService.getContacts()', () => { spyOn(component.contactService, 'getContacts').and.callThrough(); component.getContacts(); expect(component.contactService.getContacts).toHaveBeenCalled(); }); }); Ensuring data is set A follow-up test to be performed is to ensure that the data is being set on the component after the call to the API is resolved. Since our call to getContacts() is performing an asynchronous action, we should use the async function in the it: it('should set the contacts property after fetching data', async(() => { ... })); It wraps the test function in an asynchronous “test zone”. Basically, it automatically completes when the asynchronous actions are complete. Next, we can make a call to component.getContacts() . However, we don’t want to run our specs until after that call has been resolved. There is a useful function we can use in our fixture, fixture.whenStable(). This returns a promise that resolves after asynchronous activity. Our test should now look as follows: it('should set the contacts property after fetching data', async(() => { component.getContacts(); fixture.whenStable().then(() => { expect(component.contacts).toEqual(mockData); }); })); We simply run a check to ensure that the contacts property is set to what the API call returns. Finer Async Control There are times when you want finer control, such as dealing with time intervals, and so on. To do so, we can simply use the fakeAsync in conjunction with the tick() function to simulate the passage of time. it('asynchronous timed test...', fakeAsync(() => { component.asyncActionWithTime(); tick(2000); // "advance" 2 seconds expect(...).toBe(...); })); Conclusion Angular 2 has wonderful APIs that make it really easy to test your components. We have seen how to test components with service dependencies, along with asynchronous actions. Time to start writing tests!
Read more
  • 0
  • 0
  • 9531

article-image-managing-users-and-groups
Packt
10 Nov 2016
7 min read
Save for later

Managing Users and Groups

Packt
10 Nov 2016
7 min read
In this article, we will cover the following recipes: Creating user account Creating user accounts in batch mode Creating a group Introduction In this article by Uday Sawant, the author of the book Ubuntu Server Cookbook, you will see how to add new users to the Ubuntu server, update existing users. You will get to know the default setting for new users and how to change them. (For more resources related to this topic, see here.) Creating user account While installing Ubuntu, we add a primary user account on the server; if you are using the cloud image, it comes preinstalled with the default user. This single user is enough to get all tasks done in Ubuntu. There are times when you need to create more restrictive user accounts. This recipe shows how to add a new user to the Ubuntu server. Getting ready You will need super user or root privileges to add a new user to the Ubuntu server. How to do it… Follow these steps to create the new user account: To add a new user in Ubuntu, enter following command in your shell: $ sudo adduser bob Enter your password to complete the command with sudo privileges: Now enter a password for the new user: Confirm the password for the new user: Enter the full name and other information about new user; you can skip this part by pressing the Enter key. Enter Y to confirm that information is correct: This should have added new user to the system. You can confirm this by viewing the file /etc/passwd: How it works… In Linux systems, the adduser command is higher level command to quickly add a new user to the system. Since adduser requires root privileges, we need to use sudo along with the command, adduser completes following operations: Adds a new user Adds a new default group with the same name as the user Chooses UID (user ID) and GID (group ID) conforming to the Debian policy Creates a home directory with skeletal configuration (template) from /etc/skel Creates a password for the new user Runs the user script, if any If you want to skip the password prompt and finger information while adding the new user, use the following command: $ sudo adduser --disabled-password --gecos "" username Alternatively, you can use the useradd command as follows: $ sudo useradd -s <SHELL> -m -d <HomeDir> -g <Group> UserName Where: -s specifies default login shell for the user -d sets the home directory for the user -m creates a home directory if one does not already exist -g specifies the default group name for the user Creating a user with the command useradd does not set password for the user account. You can set or change the user password with the following command: $sudo passwd bob This will change the password for the user account bob. Note that if you skip the username part from the preceding command you will end up changing the password of root account. There's more… With adduser, you can do five different tasks: Add a normal user Add a system user with system option Add user group with the--group option and without the--system option Add a system group when called with the --system option Add an existing user to existing group when called with two non-option arguments Check out the manual page man adduser to get more details. You can also configure various default settings for the adduser command. A configuration file /etc/adduser.conf can be used to set the default values to be used by the adduser, addgroup, and deluser commands. A key value pair of configuration can set various default values, including the home directory location, directory structure skel to be used, default groups for new users, and so on. Check the manual page for more details on adduser.conf with following command: $ man adduser.conf See also Check out the command useradd, a low level command to add new user to system Check out the command usermod, a command to modify a user account See why every user has his own group at: http://unix.stackexchange.com/questions/153390/why-does-every-user-have-his-own-group Creating user accounts in batch mode In this recipe, we will see how to create multiple user accounts in batch mode without using any external tool. Getting ready You will need a user account with root or root privileges. How to do it... Follow these steps to create a user account in batch mode: Create a new text file users.txt with the following command: $ touch users.txt Change file permissions with the following command: $ chmod 600 users.txt Open users.txt with GNU nano and add user accounts details: $ nano users.txt Press Ctrl + O to save the changes. Press Ctrl + X to exit GNU nano. Enter $ sudo newusers users.txt to import all users listed in users.txt file. Check /etc/passwd to confirm that users are created: How it works… We created a database of user details listed in format as the passwd file. The default format for each row is as follows: username:passwd:uid:gid:full name:home_dir:shell Where: username: This is the login name of the user. If a user exists, information for user will be changed; otherwise, a new user will be created. password: This is the password of the user. uid: This is the uid of the user. If empty, a new uid will be assigned to this user. gid: This is the gid for the default group of user. If empty, a new group will be created with the same name as the username. full name: This information will be copied to the gecos field. home_dir: This defines the home directory of the user. If empty, a new home directory will be created with ownership set to new or existing user. shell: This is the default login shell for the user. The new user command reads each row and updates the user information if user already exists, or it creates a new user. We made the users.txt file accessible to owner only. This is to protect this file, as it contains the user's login name and password in unencrypted format. Creating a group Group is a way to organize and administer user accounts in Linux. Groups are used to collectively assign rights and permissions to multiple user accounts. Getting ready You will need super user or root privileges to add a group to the Ubuntu server. How to do it... Enter the following command to add a new group: $ sudo addgroup guest Enter your password to complete addgroup with root privileges. How it works… Here, we are simply adding a new group guest to the server. As addgroup needs root privileges, we need to use sudo along with the command. After creating a new group, addgroup displays the GID of the new group. There's more… Similar to adduser, you can use addgroup in different modes: Add a normal group when used without any options Add a system group with the--system option Add an existing user to existing group when called with two non-option arguments Check out groupadd, a low level utility to add new group to the server See also Check out groupadd, a low level utility to add new group to the server Summary In this article, we have discussed how to create user account, how to create a group and also about how to create user accounts in batch mode. Resources for Article: Further resources on this subject: Directory Services [article] Getting Started with Ansible [article] Lync 2013 Hybrid and Lync Online [article]
Read more
  • 0
  • 0
  • 25543
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
article-image-connecting-react-redux-and-firebase-part-2
AJ Webb
10 Nov 2016
11 min read
Save for later

Connecting React to Redux and Firebase – Part 2

AJ Webb
10 Nov 2016
11 min read
This is the second part in a series on using Redux and Firebase with React. If you haven't read through the first part, then you should go back and do so, since this post will build on the other. If you have gone through the first part, then you're in the right place. In this final part of this two-part series, you will be creating actions in your app to update the store. Then you will create a Firebase app and set up async actions that will subscribe to Firebase. Whenever the data on Firebase changes, the store will automatically update and your app will receive the latest state. After all of that is wired up, you'll create a quick user interface. Lets get started. Creating Actions in Redux For your actions, you are going to create a new file inside of src/: [~/code/yak]$ touch src/actions.js Inside of actions.js, you will create your first action; an action in Redux is simply an object that contains a type and a payload. The type allows you to catch the action inside of your reducer and then use the payload to update state. The type can be a string literal or a constant. I recommend using constants for the same reasons given by the Redux team. In this app you will set up some constants to follow this practice. [~/code/yak]$ touch src/constants.js Create your first constant in src/constants.js: export const SEND_MESSAGE = 'CHATS:SEND_MESSAGE'; It is good practice to namespace your constants, like the constant above the CHATS. Namespace can be specific to the CHATS portion of the application and helps document your constants. Now go ahead and use that constant to create an action. In src/actions.js, add: import { SEND_MESSAGE } from './constants'; export function sendMessage(message) { return { type: SEND_MESSAGE, payload: message }; } You now have your first action! You just need to add a way to handle that action in your reducer. So open up src/reducers.js and do just that. First import the constant. import { SEND_MESSAGE } from './constants'; Then inside of the yakApp function, you'll handle different actions: export function yakApp(state = initialState, action) { switch(action.type) { case SEND_MESSAGE: return Object.assign({}, state, { messages: state.messages.concat([action.payload]) }); default: return state; } } A couple of things are happening here, you'll notice that there is a default case that returns state, your reducer must return some sort of state. If you don't return state your app will stop working. Redux does a good job of letting you know what's happening, but it is still good to know. Another thing to notice is that the SEND_MESSAGE case does not mutate the state. Instead it creates a new state and returns it. Mutating the state can result in bad side effects that are hard to debug; do your very best to never mutate the state. You should also avoid mutating the arguments passed to a reducer, performing side effects and calling any non-pure functions within your reducer. For the most part, your reducer is set up to take the new state and return it in conjunction with the old state. Now that you have an action and a reducer that is handling that action, you are ready for your component to interact with them. In src/App.js add an input and a button: <p className="App-intro"> <input type="text" />{' '} <button>Send Message</button> </p> Now that you have the input and button in, you're going to add two things. The first is a function that runs when the button is clicked. The second is a reference on the input so that the function can easily find the value of your input. js <p className="App-intro"> <input type="text" ref={(i) => this.message = i}/>{' '} <button onClick={this.handleSendMessage.bind(this)}>Send Message</button> </p> Now that you have those set, you will need to add the sendMessage function, but you'll also need to be able to dispatch an action. So you'll want to map your actions to props, similar to how you mapped state to props in the previous guide. At the end of the file, under the mapStateToProps function, add the following function: function mapDispatchToProps(dispatch) { return { sendMessage: (msg) => dispatch(sendMessage(msg)) }; } Then you'll need to add the mapDispatchToProps function to the export statement: export default connect(mapStateToProps, mapDispatchToProps)(App); And you'll need to import the sendMessage action into src/App.js: import { sendMessage } from './actions'; Finally you'll need to create the handleSendMessage method inside of your App class, just above the render method: handleSendMessage() { const { sendMessage } = this.props; const { value } = this.message; if (!value) { return false; } sendMessage(value); this.message.value = ''; } If you still have a console.log statement inside of your render method, from the last guide, you should see your message being added to the array each time you click on the button. The final piece of UI that you need to create is a list of all the messages you've received. Add the following to the render method in src/App.js, just below the <p/> that contains your input: {messages.map((message, i) => ( <p key={i} style={{textAlign: 'left', padding: '0 10px'}}>{message}</p> ))} Now you have a crude chat interface; you can improve it later. Set up Firebase If you've never used Firebase before, you'll first need a Google account. If you don't have a Google account, sign up for one; then you'll be able to create a new Firebase project. Head over to Firebase and create a new project. If you have trouble naming it, try YakApp_[YourName]. After you have created your Firebase project, you'll be taken to the project. There is a button that says "Add Firebase to your web app"; click on it and you'll be able to get the configuration information that you will need for your app to be able to work with Firebase. A dialog will open and you should see something like this: You will only be using the database for this project, so you'll only need a portion of the config. Keep the config handy while you prepare the app to use Firebase. First add the Firebase package to your app: [~/code/yak]$ npm install --save firebase Open src/actions.js and import firebase: import firebase from 'firebase' You will only be using Firebase in your actions; that is the reason you are importing it here. Once imported, you'll need to initialize Firebase. After the import section, add your Firebase config (the one from the image above), initialize Firebase and create a reference to messages: const firebaseConfig = { apiKey: '[YOUR API KEY]', databaseURL: '[YOUR DATABASE URL]' } firebase.initializeApp(firebaseConfig); const messages = firebase.database().ref('messages'); You'll notice that you only need the apiKey and databaseUrl for now. Eventually you might want to add auth and storage, and other facets of Firebase, but for now you only need database. Now that Firebase is set up and initialized, you can use it to store the chat history. One last thing before you leave the Firebase console: Firebase automatically sets the database rules to require users to be logged in. That is a great idea, but authentication is outside of the scope of this post. So you'll need to turn that off in the rules. In the Firebase console, click on the "Database" navigation item on the left side of the page; now there should be a tab bar with a "Rules" option. Click on "Rules" and replace what is in the textbox with: { "rules": { ".read": true, ".write": true } } Create subscriptions In order to subscribe to Firebase, you will need a way to send async actions. So you'll need another package to help with that. Go ahead and install redux-thunk. [~/code/yak]$ npm install --save redux-thunk After it's finished installing, you'll need to add it as middleware to your store. That means it's time to head back to src/index.js and add the extra parameters to the createStore function. First import redux-thunk into src/index.js and import another method from redux itself. import { createStore, applyMiddleware } from 'redux';import thunk from 'redux-thunk'; Now update the createStore call to be: const store = createStore(yakApp, undefined, applyMiddleware(thunk)); You are now ready to create async actions. In order to create more actions, you're going to need more constants. Head over to src/constants.js and add the following constant. export const RECEIVE_MESSAGE = 'CHATS:RECEIVE_MESSAGE'; This is the only constant you will need for now; after this guide, you should create edit and delete actions. At that point, you'll need more constants for those actions. For now, only concern yourself with adding and subscribing. You'll need to refactor your actions and reducer to handle these new features. In src/actions.js, refactor the sendMessage action to be an async action. The way redux-thunk works is that it intercepts any action that is dispatched and inspects it. If the action returns a function instead of an object, redux-thunk stops it from reaching the reducer and runs the function. If the action returns an object, redux-thunk ignores it and allows it to continue to the reducer. Change sendMessage to look like the following: export function sendMessage(message) { return function() { messages.push(message); }; } Now when you type a message in and hit the "Submit Message" button, the message will get stored in Firebase. Try it out! UH OH! There's a problem! You are adding messages to Firebase but they aren't showing up in your app anymore! That's ok. You can fix that! You'll need to create a few more actions though, and refactor your reducer. First off, you can delete one of your constants. You no longer need the constant from earlier: export const SEND_MESSAGE = 'CHATS:SEND_MESSAGE'; Go ahead and remove it; that leaves you with only one constant. Change your constant import in both src/actions.js and src/reducer.js: import { RECEIVE_MESSAGE } from './constants'; Now in actions, add the following action: function receiveMessage(message) { return { type: RECEIVE_MESSAGE, payload: message }; } That should look familiar; it's almost identical to the original sendMessage action. You'll also need to rename the action that your reducer is looking for. So now your reducer function should look like this: export function yakApp(state = initialState, action) { switch(action.type) { case RECEIVE_MESSAGE: return Object.assign({}, state, { messages: state.messages.concat([action.payload]) }); default: return state; } } Before the list starts to show up again, you'll need to create a subscription to Firebase. Sounds more complicated than it really is. Add the following action to src/actions.js: export function subscribeToMessages() { return function(dispatch) { messages.on('child_added', data => dispatch(receiveMessage(data.val()))); } } Now, in src/App.js you'll need to dispatch that action and set up the subscription. First change your import statement from ./actions to include subscribeToMessages: import { sendMessage, subscribeToMessages } from './actions'; Now, in mapDispatchToProps you need to map subscribeToMessages: function mapDispatchToProps(dispatch) { return { sendMessage: (msg) => dispatch(sendMessage(msg)), subscribeToMessages: () => dispatch(subscribeToMessages()), }; } Finally, inside of the App class, add the componentWillMount life cycle method just above the handleSendMessage method, and call subscribeToMessages: componentWillMount() { this.props.subscribeToMessages(); } Once you save the file, you should see that your app is subscribed to the Firebase database, which is automatically updating your Redux state, and your UI is displaying all the messages stored in Firebase. Have some fun and open another browser window! Conclusion You have now created a React app from scratch, added Redux to it, refactored it to connect to Firebase and, via subscriptions, updated your entire app state! What will you do now? Of course the app could use a better UI; you could redesign it! Maybe make it a little more usable. Try learning how to create users and show who yakked about what! The sky is the limit, now that you know how to put all the pieces together! Feel free to share what you've done with the app in the comments! About the author AJ Webb is team lead and frontend engineer for @tannerlabs and a co-creator of Payba.cc.
Read more
  • 0
  • 0
  • 17251

article-image-why-we-need-design-patterns
Packt
10 Nov 2016
16 min read
Save for later

Why we need Design Patterns?

Packt
10 Nov 2016
16 min read
In this article by Praseed Pai, and Shine Xavier, authors of the book .NET Design Patterns, we will try to understand the necessity of choosing a pattern-based approach to software development. We start with some principles of software development, which one might find useful while undertaking large projects. The working example in the article starts with a requirements specification and progresses towards a preliminary implementation. We will then try to iteratively improve the solution using patterns and idioms, and come up with a good design that supports a well-defined programming Interface. In this process, we will learn about some software development principles (listed below) one can adhere to, including the following: SOLID principles for OOP Three key uses of design patterns Arlow/Nuestadt archetype patterns Entity, value, and data transfer objects Leveraging the .NET Reflection API for plug-in architecture (For more resources related to this topic, see here.) Some principles of software development Writing quality production code consistently is not easy without some foundational principles under your belt. The purpose of this section is to whet the developer's appetite, and towards the end, some references are given for detailed study. Detailed coverage of these principles warrants a separate book on its own scale. The authors have tried to assimilate the following key principles of software development which would help one write quality code: KISS: Keep it simple, Stupid DRY: Don't repeat yourself YAGNI: You aren't gonna need it Low coupling: Minimize coupling between classes SOLID principles: Principles for better OOP William of Ockham had framed the maxim Keep it simple, Stupid (KISS). It is also called law of parsimony. In programming terms, it can be translated as "writing code in a straightforward manner, focusing on a particular solution that solves the problem at hand". This maxim is important because, most often, developers fall into the trap of writing code in a generic manner for unwarranted extensibility. Even though it initially looks attractive, things slowly go out of bounds. The accidental complexity introduced in the code base for catering to improbable scenarios, often reduces readability and maintainability. The KISS principle can be applied to every human endeavor. Learn more about KISS principle by consulting the Web. Don't repeat yourself (DRY), a maxim which most programmers often forget while implementing their domain logic. Most often, in a collaborative development scenario, code gets duplicated inadvertently due to lack of communication and proper design specifications. This bloats the code base, induces subtle bugs, and make things really difficult to change. By following the DRY maxim at all stages of development, we can avoid additional effort and make the code consistent. The opposite of DRY is write everything twice (WET). You aren't gonna need it (YAGNI), a principle that compliments the KISS axiom. It serves as a warning for people who try to write code in the most general manner, anticipating changes right from the word go,. Too often, in practice, most of this code is not used to make potential code smells. While writing code, one should try to make sure that there are no hard-coded references to concrete classes. It is advisable to program to an interface as opposed to an implementation. This is a key principle which many patterns use to provide behavior acquisition at runtime. A dependency injection framework could be used to reduce coupling between classes. SOLID principles are a set of guidelines for writing better object-oriented software. It is a mnemonic acronym that embodies the following five principles: 1 Single Responsibility Principle (SRP) A class should have only one responsibility. If it is doing more than one unrelated thing, we need to split the class. 2 Open Close Principle (OCP) A class should be open for extension, closed for modification. 3 Liskov Substitution Principle (LSP) Named after Barbara Liskov, a Turing Award laureate, who postulated that a sub-class (derived class) could substitute any super class (base class) references without affecting the functionality. Even though it looks like stating the obvious, most implementations have quirks which violate this principle. 4 Interface segregation principle (ISP) It is more desirable to have multiple interfaces for a class (such classes can also be called components) than having one Uber interface that forces implementation of all methods (both relevant and non-relevant to the solution context). 5 Dependency Inversion (DI) This is a principle which is very useful for Framework design. In the case of Frameworks, the client code will be invoked by server code, as opposed to the usual process of client invoking the server. The main principle here is that abstraction should not depend upon details, rather, details should depend upon abstraction. This is also called the "Hollywood Principle" (Do not call us, we will call you back). The authors consider the preceding five principles primarily as a verification mechanism. This will be demonstrated by verifying the ensuing case study implementations for violation of these principles. Karl Seguin has written an e-book titled Foundations of Programming – Building Better Software, which covers most of what has been outlined here. Read his book to gain an in-depth understanding of most of these topics. The SOLID principles are well covered in the Wikipedia page on the subject, which can be retrieved from https://en.wikipedia.org/wiki/SOLID_(object-oriented_design). Robert Martin's Agile Principles, Patterns, and Practices in C# is a definitive book on learning about SOLID, as Robert Martin itself is the creator of these principles, even though Michael Feathers coined the acronym. Why patterns are required? According to the authors, the three key advantages of pattern-oriented software development that stand out are as follows: A language/platform-agnostic way to communicate about software artifacts A tool for refactoring initiatives (targets for refactoring) Better API design With the advent of the pattern movement, the software development community got a canonical language to communicate about software design, architecture, and implementation. Software development is a craft which has got trade-offs attached to each strategy, and there are multiple ways to develop software. The various pattern catalogues brought some conceptual unification for this cacophony in software development. Most developers around the world today who are worth their salt, can understand and speak this language. We believe you will be able to do the same at the end of the article. Fancy yourself stating the following about your recent implementation: For our tax computation example, we have used command pattern to handle the computation logic. The commands (handlers) are configured using an XML file, and a factory method takes care of the instantiation of classes on the fly using Lazy loading. We cache the commands, and avoid instantiation of more objects by imposing singleton constraints on the invocation. We support prototype pattern where command objects can be cloned. The command objects have a base implementation, where concrete command objects use the template method pattern to override methods which are necessary. The command objects are implemented using the design by contracts idiom. The whole mechanism is encapsulated using a Façade class, which acts as an API layer for the application logic. The application logic uses entity objects (reference) to store the taxable entities, attributes like tax parameters are stored as value objects. We use data transfer object (DTO) to transfer the data from the application layer to the computational layer. Arlow/Nuestadt-based archetype pattern is the unit of structuring the tax computation logic. For some developers, the preceding language/platform-independent description of the software being developed is enough to understand the approach taken. This will boos developer productivity (during all phases of SDLC, including development, maintenance, and support) as the developers will be able to get a good mental model of the code base. Without Pattern catalogs, such succinct descriptions of the design or implementation would have been impossible. In an Agile software development scenario, we develop software in an iterative fashion. Once we reach a certain maturity in a module, developers refactor their code. While refactoring a module, patterns do help in organizing the logic. The case study given next will help you to understand the rationale behind "Patterns as refactoring targets". APIs based on well-defined patterns are easy to use and impose less cognitive load on programmers. The success of the ASP.NET MVC framework, NHibernate, and API's for writing HTTP modules and handlers in the ASP.NET pipeline are a few testimonies to the process. Personal income tax computation - A case study Rather than explaining the advantages of patterns, the following example will help us to see things in action. Computation of the annual income tax is a well-known problem domain across the globe. We have chosen an application domain which is well known to focus on the software development issues. The application should receive inputs regarding the demographic profile (UID, Name, Age, Sex, Location) of a citizen and the income details (Basic, DA, HRA, CESS, Deductions) to compute his tax liability. The System should have discriminants based on the demographic profile, and have a separate logic for senior citizens, juveniles, disabled people, old females, and others. By discriminant we mean demographic that parameters like age, sex and location should determine the category to which a person belongs and apply category-specific computation for that individual. As a first iteration, we will implement logic for the senior citizen and ordinary citizen category. After preliminary discussion, our developer created a prototype screen as shown in the following image: Archetypes and business archetype pattern The legendary Swiss psychologist, Carl Gustav Jung, created the concept of archetypes to explain fundamental entities which arise from a common repository of human experiences. The concept of archetypes percolated to the software industry from psychology. The Arlow/Nuestadt patterns describe business archetype patterns like Party, Customer Call, Product, Money, Unit, Inventory, and so on. An Example is the Apache Maven archetype, which helps us to generate projects of different natures like J2EE apps, Eclipse plugins, OSGI projects, and so on. The Microsoft patterns and practices describes archetypes for targeting builds like Web applications, rich client application, mobile applications, and services applications. Various domain-specific archetypes can exist in respective contexts as organizing and structuring mechanisms. In our case, we will define some archetypes which are common in the taxation domain. Some of the key archetypes in this domain are: Sr.no Archetype Description 1 SeniorCitizenFemale Tax payers who are female, and above the age of 60 years 2 SeniorCitizen Tax payers who are male, and above the age of 60 years 3 OrdinaryCitizen Tax payers who are Male/Female, and above 18 years of age 3 DisabledCitizen Tax payers who have any disability 4 MilitaryPersonnel Tax payers who are military personnel 5 Juveniles Tax payers whose age is less than 18 years We will use demographic parameters as discriminant to find the archetype which corresponds to the entity. The whole idea of inducing archetypes is to organize the tax computation logic around them. Once we are able to resolve the archetypes, it is easy to locate and delegate the computations corresponding to the archetypes. Entity, value, and data transfer objects We are going to create a class which represents a citizen. Since citizen needs to be uniquely identified, we are going to create an entity object, which is also called reference object (from DDD catalog). The universal identifier (UID) of an entity object is the handle which an application refers. Entity objects are not identified by their attributes, as there can be two people with the same name. The ID uniquely identifies an entity object. The definition of an entity object is given as follows: public class TaxableEntity { public int Id { get; set; } public string Name { get; set; } public int Age { get; set; } public char Sex { get; set; } public string Location { get; set; } public TaxParamVO taxparams { get; set; } } In the preceding class definition, Id uniquely identifies the entity object. TaxParams is a value object (from DDD catalog) associated with the entity object. Value objects do not have a conceptual identity. They describe some attributes of things (entities). The definition of TaxParams is given as follows: public class TaxParamVO { public double Basic {get;set;} public double DA { get; set; } public double HRA { get; set; } public double Allowance { get; set; } public double Deductions { get; set; } public double Cess { get; set; } public double TaxLiability { get; set; } public bool Computed { get; set; } } While writing applications ever since Smalltalk, Model-view-controller (MVC) is the most dominant paradigm for structuring applications. The application is split into a model layer ( which mostly deals with data), view layer (which acts as a display layer), and a controller (to mediate between the two). In the Web development scenario, they are physically partitioned across machines. To transfer data between layers, the J2EE pattern catalog identified the DTO to transfer data between layers. The DTO object is defined as follows: public class TaxDTO { public int id { } public TaxParamVO taxparams { } } If the layering exists within the same process, we can transfer these objects as-is. If layers are partitioned across processes or systems, we can use XML or JSON serialization to transfer objects between the layers. A computation engine We need to separate UI processing, input validation, and computation to create a solution which can be extended to handle additional requirements. The computation engine will execute different logic depending upon the command received. The GoF command pattern is leveraged for executing the logic based on the command received. The command pattern consists of four constituents. They are: Command object Parameters Command Dispatcher Client The command object's interface has an Execute method. The parameters to the command objects are passed through a bag. The client invokes the command object by passing the parameters through a bag to be consumed by the Command Dispatcher. The Parameters are passed to the command object through the following data structure: public class COMPUTATION_CONTEXT { private Dictionary<String, Object> symbols = new Dictionary<String, Object>(); public void Put(string k, Object value) { symbols.Add(k, value); } public Object Get(string k) { return symbols[k]; } } The ComputationCommand interface, which all the command objects implement, has only one Execute method, which is shown next. The Execute method takes a bag as parameter. The COMPUTATION_CONTEXT data structure acts as the bag here. Interface ComputationCommand { bool Execute(COMPUTATION_CONTEXT ctx); } Since we have already implemented a command interface and bag to transfer the parameters, it is time that we implement a command object. For the sake of simplicity, we will implement two commands where we hardcode the tax liability. public class SeniorCitizenCommand : ComputationCommand { public bool Execute(COMPUTATION_CONTEXT ctx) { TaxDTO td = (TaxDTO)ctx.Get("tax_cargo"); //---- Instead of computation, we are assigning //---- constant tax for each arcetypes td.taxparams.TaxLiability = 1000; td.taxparams.Computed = true; return true; } } public class OrdinaryCitizenCommand : ComputationCommand { public bool Execute(COMPUTATION_CONTEXT ctx) { TaxDTO td = (TaxDTO)ctx.Get("tax_cargo"); //---- Instead of computation, we are assigning //---- constant tax for each arcetypes td.taxparams.TaxLiability = 1500; td.taxparams.Computed = true; return true; } } The commands will be invoked by a CommandDispatcher Object, which takes an archetype string and a COMPUTATION_CONTEXT object. The CommandDispatcher acts as an API layer for the application. class CommandDispatcher { public static bool Dispatch(string archetype, COMPUTATION_CONTEXT ctx) { if (archetype == "SeniorCitizen") { SeniorCitizenCommand cmd = new SeniorCitizenCommand(); return cmd.Execute(ctx); } else if (archetype == "OrdinaryCitizen") { OrdinaryCitizenCommand cmd = new OrdinaryCitizenCommand(); return cmd.Execute(ctx); } else { return false; } } } The application to engine communication The data from the application UI, be it Web or Desktop, has to flow to the computation engine. The following ViewHandler routine shows how data, retrieved from the application UI, is passed to the engine, via the Command Dispatcher, by a client: public static void ViewHandler(TaxCalcForm tf) { TaxableEntity te = GetEntityFromUI(tf); if (te == null){ ShowError(); return; } string archetype = ComputeArchetype(te); COMPUTATION_CONTEXT ctx = new COMPUTATION_CONTEXT(); TaxDTO td = new TaxDTO { id = te.id, taxparams = te.taxparams}; ctx.Put("tax_cargo",td); bool rs = CommandDispatcher.Dispatch(archetype, ctx); if ( rs ) { TaxDTO temp = (TaxDTO)ctx.Get("tax_cargo"); tf.Liabilitytxt.Text = Convert.ToString(temp.taxparams.TaxLiability); tf.Refresh(); } } At this point, imagine that a change in requirement has been received from the stakeholders. Now, we need to support tax computation for new categories. Initially, we had different computations for senior citizen and ordinary citizen. Now we need to add new Archetypes. At the same time, to make the software extensible (loosely coupled) and maintainable, it would be ideal if we provide the capability to support new Archetypes in a configurable manner as opposed to recompiling the application for every new archetype owing to concrete references. The Command Dispatcher object does not scale well to handle additional archetypes. We need to change the assembly whenever a new archetype is included, as the tax computation logic varies for each archetype. We need to create a pluggable architecture to add or remove archetypes at will. The plugin system to make system extensible Writing system logic without impacting the application warrants a mechanism—that of loading a class on the fly. Luckily, the .NET Reflection API provides a mechanism for one to load a class during runtime, and invoke methods within it. A developer worth his salt should learn the Reflection API to write systems which change dynamically. In fact, most of the technologies like ASP.NET, Entity framework, .NET Remoting, and WCF work because of the availability of Reflection API in the .NET stack. Henceforth, we will be using an XML configuration file to specify our tax computation logic. A sample XML file is given next: <?xml version="1.0"?> <plugins> <plugin archetype ="OrindaryCitizen" command="TaxEngine.OrdinaryCitizenCommand"/> <plugin archetype="SeniorCitizen" command="TaxEngine.SeniorCitizenCommand"/> </plugins> The contents of the XML file can be read very easily using LINQ to XML. We will be generating a Dictionary object by the following code snippet: private Dictionary<string,string> LoadData(string xmlfile) { return XDocument.Load(xmlfile) .Descendants("plugins") .Descendants("plugin") .ToDictionary(p => p.Attribute("archetype").Value, p => p.Attribute("command").Value); } Summary In this article, we have covered quite a lot of ground in understanding why pattern-oriented software development is a good way to develop modern software. We started the article citing some key principles. We progressed further to demonstrate the applicability of these key principles by iteratively skinning an application which is extensible and resilient to changes. Resources for Article: Further resources on this subject: Debugging Your .NET Application [article] JSON with JSON.Net [article] Using ASP.NET Controls in SharePoint [article]
Read more
  • 0
  • 0
  • 37642

article-image-using-ros-uavs
Packt
10 Nov 2016
11 min read
Save for later

Using ROS with UAVs

Packt
10 Nov 2016
11 min read
In this article by Carol Fairchild and Dr. Thomas L. Harman, co-authors of the book ROS Robotics by Example, you will discover the field of ROS Unmanned Air Vehicles (UAVs), quadrotors, in particular. The reader is invited to learn about the simulated hector quadrotor and take it for a flight. The ROS wiki currently contains a growing list of ROS UAVs. These UAVs are as follows: (For more resources related to this topic, see here.) AscTec Pelican and Hummingbird quadrotors Berkeley's STARMAC Bitcraze Crazyflie DJI Matrice 100 Onboard SDK ROS support Erle-copter ETH sFly Lily CameraQuadrotor Parrot AR.Drone Parrot Bebop Penn's AscTec Hummingbird Quadrotors PIXHAWK MAVs Skybotix CoaX helicopter Refer to http://wiki.ros.org/Robots#UAVs for future additions to this list and to the website http://www.ros.org/news/robots/uavs/ to get the latest ROS UAV news. The preceding list contains primarily quadrotors except for the Skybotix helicopter. A number of universities have adopted the AscTec Hummingbird as their ROS UAV of choice. For this book, we present a simulator called Hector Quadrotor and two real quadrotors Crazyflie and Bebop that use ROS. Introducing Hector quadrotor The hardest part of learning about flying robots is the constant crashing. From the first-time learning of flight control to testing new hardware or flight algorithms, the resulting failures can have a huge cost in terms of broken hardware components. To answer this difficulty, a simulated air vehicle designed and developed for ROS is ideal. A simulated quadrotor UAV for the ROS Gazebo environment has been developed by the Team Hector Darmstadt of Technische Universität Darmstadt. This quadrotor, called Hector Quadrotor, is enclosed in the hector_quadrotor metapackage. This metapackage contains the URDF description for the quadrotor UAV, its flight controllers, and launch files for running the quadrotor simulation in Gazebo. Advanced uses of the Hector Quadrotor simulation allows the user to record sensor data such as Lidar, depth camera, and many more. The quadrotor simulation can also be used to test flight algorithms and control approaches in simulation. The hector_quadrotor metapackage contains the following key packages: hector_quadrotor_description: This package provides a URDF model of Hector Quadrotor UAV and the quadrotor configured with various sensors. Several URDF quadrotor models exist in this package each configured with specific sensors and controllers. hector_quadrotor_gazebo: This package contains launch files for executing Gazebo and spawning one or more Hector Quadrotors. hector_quadrotor_gazebo_plugins: This package contains three UAV specific plugins, which are as follows: The simple controller gazebo_quadrotor_simple_controller subscribes to a geometry_msgs/Twist topic and calculates the required forces and torques A gazebo_ros_baro sensor plugin simulates a barometric altimeter The gazebo_quadrotor_propulsion plugin simulates the propulsion, aerodynamics, and drag from messages containing motor voltages and wind vector input hector_gazebo_plugins: This package contains generic sensor plugins not specific to UAVs such as IMU, magnetic field, GPS, and sonar data. hector_quadrotor_teleop: This package provides a node and launch files for controlling a quadrotor using a joystick or gamepad. hector_quadrotor_demo: This package provides sample launch files that run the Gazebo quadrotor simulation and hector_slam for indoor and outdoor scenarios. The entire list of packages for the hector_quadrotor metapackage appears in the next section. Loading Hector Quadrotor The repository for the hector_quadrotor software is at the following website: https://github.com/tu-darmstadt-ros-pkg/hector_quadrotor The following commands will install the binary packages of hector_quadrotor into the ROS package repository on your computer. If you wish to install the source files, instructions can be found at the following website: http://wiki.ros.org/hector_quadrotor/Tutorials/Quadrotor%20outdoor%20flight%20demo (It is assumed that ros-indigo-desktop-full has been installed on your computer.) For the binary packages, type the following commands to install the ROS Indigo version of Hector Quadrotor: $ sudo apt-get update $ sudo apt-get install ros-indigo-hector-quadrotor-demo A large number of ROS packages are downloaded and installed in the hector_quadrotor_demo download with the main hector_quadrotor packages providing functionality that should now be somewhat familiar. This installation downloads the following packages: hector_gazebo_worlds hector_geotiff hector_map_tools hector_mapping hector_nav_msgs hector_pose_estimation hector_pose_estimation_core hector_quadrotor_controller hector_quadrotor_controller_gazebo hector_quadrotor_demo hector_quadrotor_description hector_quadrotor_gazebo hector_quadrotor_gazebo_plugins hector_quadrotor_model hector_quadrotor_pose_estimation hector_quadrotor_teleop hector_sensors_description hector_sensors_gazebo hector_trajectory_serve hector_uav_msgs message_to_tf A number of these packages will be discussed as the Hector Quadrotor simulations are described in the next section. Launching Hector Quadrotor in Gazebo Two demonstration tutorials are available to provide the simulated applications of the Hector Quadrotor for both outdoor and indoor environments. These simulations are described in the next sections. Before you begin the Hector Quadrotor simulations, check your ROS master using the following command in your terminal window: $ echo $ROS_MASTER_URI If this variable is set to localhost or the IP address of your computer, no action is needed. If not, type the following command: $ export ROS_MASTER_URI=http://localhost:11311 This command can also be added to your .bashrc file. Be sure to delete or comment out (with a #) any other commands setting the ROS_MASTER_URI variable. Flying Hector outdoors The quadrotor outdoor flight demo software is included as part of the hector_quadrotor metapackage. Start the simulation by typing the following command: $ roslaunch hector_quadrotor_demo outdoor_flight_gazebo.launch This launch file loads a rolling landscape environment into the Gazebo simulation and spawns a model of the Hector Quadrotor configured with a Hokuyo UTM-30LX sensor. An rviz node is also started and configured specifically for the quadrotor outdoor flight. A large number of flight position and control parameters are initialized and loaded into the Parameter Server. Note that the quadrotor propulsion model parameters for quadrotor_propulsion plugin and quadrotor drag model parameters for quadrotor_aerodynamics plugin are displayed. Then look for the following message: Physics dynamic reconfigure ready. The following screenshots show both the Gazebo and rviz display windows when the Hector outdoor flight simulation is launched. The view from the onboard camera can be seen in the lower left corner of the rviz window. If you do not see the camera image on your rviz screen, make sure that Camera has been added to your Displays panel on the left and that the checkbox has been checked. If you would like to pilot the quadrotor using the camera, it is best to uncheck the checkboxes for tf and robot_model because the visualizations sometimes block the view: Hector Quadrotor outdoor gazebo view Hector Quadrotor outdoor rviz view The quadrotor appears on the ground in the simulation ready for takeoff. Its forward direction is marked by a red mark on its leading motor mount. To be able to fly the quadrotor, you can launch the joystick controller software for the Xbox 360 controller. In a second terminal window, launch the joystick controller software with a launch file from the hector_quadrotor_teleop package: $ roslaunch hector_quadrotor_teleop xbox_controller.launch This launch file launches joy_node to process all joystick input from the left stick and right stick on the Xbox 360 controller as shown in the following figure. The message published by joy_node contains the current state of the joystick axes and buttons. The quadrotor_teleop node subscribes to these messages and publishes messages on the cmd_vel topic. These messages provide the velocity and direction for the quadrotor flight. Several joystick controllers are currently supported by the ROS joy package including PS3 and Logitech devices. For this launch, the joystick device is accessed as /dev/input/js0 and is initialized with a deadzone of 0.050000. Parameters to set the joystick axes are as follows: * /quadrotor_teleop/x_axis: 5 * /quadrotor_teleop/y_axis: 4 * /quadrotor_teleop/yaw_axis: 1 * /quadrotor_teleop/z_axis: 2 These parameters map to the Left Stick and the Right Stick controls on the Xbox 360 controller shown in the following figure. The direction of these sticks control are as follows: Left Stick: Forward (up) is to ascend Backward (down) is to descend Right is to rotate clockwise Left is to rotate counterclockwise Right Stick: Forward (up) is to fly forward Backward (down) is to fly backward Right is to fly right Left is to fly left Xbox 360 joystick controls for Hector Now use the joystick to fly around the simulated outdoor environment! The pilot's view can be seen in the Camera image view on the bottom left of the rviz screen. As you fly around in Gazebo, keep an eye on the Gazebo launch terminal window. The screen will display messages as follows depending on your flying ability: [ INFO] [1447358765.938240016, 617.860000000]: Engaging motors! [ WARN] [1447358778.282568898, 629.410000000]: Shutting down motors due to flip over! When Hector flips over, you will need to relaunch the simulation. Within ROS, a clearer understanding of the interactions between the active nodes and topics can be obtained by using the rqt_graph tool. The following diagram depicts all currently active nodes (except debug nodes) enclosed in oval shapes. These nodes publish to the topics enclosed in rectangles that are pointed to by arrows. You can use the rqt_graph command in a new terminal window to view the same display: ROS nodes and topics for Hector Quadrotor outdoor flight demo The rostopic list command will provide a long list of topics currently being published. Other command line tools such as rosnode, rosmsg, rosparam, and rosservice will help gather specific information about Hector Quadrotor's operation. To understand the orientation of the quadrotor on the screen, use the Gazebo GUI to show the vehicle's tf reference frame. Select quadrotor in the World panel on the left, then select the translation mode on the top environment toolbar (looks like crossed double-headed arrows). This selection will bring up the red-green-blue axis for the x-y-z axes of the tf frame, respectively. In the following figure, the x axis is pointing to the left, the y axis is pointing to the right (toward the reader), and the z axis is pointing up. Hector Quadrotor tf reference frame An YouTube video of hector_quadrotor outdoor scenario demo shows the hector_quadrotor in Gazebo operated with a gamepad controller: https://www.youtube.com/watch?v=9CGIcc0jeuI Flying Hector indoors The quadrotor indoor SLAM demo software is included as part of the hector_quadrotor metapackage. To launch the simulation, type the following command: $ roslaunch hector_quadrotor_demo indoor_slam_gazebo.launch The following screenshots show both the rviz and Gazebo display windows when the Hector indoor simulation is launched: Hector Quadrotor indoor rviz and gazebo views If you do not see this image for Gazebo, roll your mouse wheel to zoom out of the image. Then you will need to rotate the scene to a top-down view, in order to find the quadrotor press Shift + right mouse button. The environment was the offices at Willow Garage and Hector starts out on the floor of one of the interior rooms. Just like in the outdoor demo, the xbox_controller.launch file from the hector_quadrotor_teleop package should be executed: $ roslaunch hector_quadrotor_teleop xbox_controller.launch If the quadrotor becomes embedded in the wall, waiting a few seconds should release it and it should (hopefully) end up in an upright position ready to fly again. If you lose sight of it, zoom out from the Gazebo screen and look from a top-down view. Remember that the Gazebo physics engine is applying minor environment conditions as well. This can create some drifting out of its position. The rqt graph of the active nodes and topics during the Hector indoor SLAM demo is shown in the following figure. As Hector is flown around the office environment, the hector_mapping node will be performing SLAM and be creating a map of the environment. ROS nodes and topics for Hector Quadrotor indoor SLAM demo The following screenshot shows Hector Quadrotor mapping an interior room of Willow Garage: Hector mapping indoors using SLAM The 3D robot trajectory is tracked by the hector_trajectory_server node and can be shown in rviz. The map along with the trajectory information can be saved to a GeoTiff file with the following command: $ rostopic pub syscommand std_msgs/String "savegeotiff" The savegeotiff map can be found in the hector_geotiff/map directory. An YouTube video of hector_quadrotor stack indoor SLAM demo shows hector_quadrotor in Gazebo operated with a gamepad controller: https://www.youtube.com/watch?v=IJbJbcZVY28 Summary In this article, we learnt about Hector Quadrotors, loading Hector Quadrotors, launching Hector Quadrotor in Gazebo, and also about flying Hector outdoors and indoors. Resources for Article: Further resources on this subject: Working On Your Bot [article] Building robots that can walk [article] Detecting and Protecting against Your Enemies [article]
Read more
  • 0
  • 0
  • 37490

article-image-data-clustering
Packt
10 Nov 2016
6 min read
Save for later

Data Clustering

Packt
10 Nov 2016
6 min read
In this article by Rodolfo Bonnin, the author of the book Building Machine Learning Projects with TensorFlow, we will start applying data transforming operations. We will begin finding interesting patterns in some given information, discovering groups of data, or clusters, and using clustering techniques. (For more resources related to this topic, see here.) In this process we'll also gain two new tools: the ability to generate synthetic sample sets from a collection of representative data structures via the scikit-learn library, and the ability to graphically plot our data and model results, this time via the matplotlib library. The topics we will cover in this article are as follows: Getting an idea of how clustering works, and comparing it to other alternative existent classification techniques Using scikit-learn and matplotlib to enrichen the possibilities of dataset choices, and to get professional looking graphical representation of the data Implementing the K-means clustering algorithm Test some variations of the K-means methods to improve the fit and/or the convergence rate Three types of learning from data Based on how we approach the supervision of the samples, we can extract three types of learning: Unsupervised learning: The fully unsupervised approach directly takes a number of undetermined elements and builds a classification of them, looking at different properties that could determine its class Semi-supervised learning: The semi-supervised approach has a number of known classified items and then applies techniques to discover the class of the remaining items Supervised learning: In supervised learning, we start from a population of samples, which have a known type beforehand, and then build a model from it Normally there are three sample populations: one from which the model grows, called training set, one that is used to test the model, called training set, and then there are the samples for which we will be doing classification. Types of data learning based on supervision: unsupervised, semi-supervised, and supervised Unsupervised data clustering One of the simplest operations that can be initially applied to an unknown dataset is to try to understand the possible grouping or common features that the dataset members have. To do so, we could try to find representative points in them that summarize a balance of the parameters of the members of the group. This value could be, for example, the mean or the median of all the cluster members. This also guides to the idea of defining a notion of distance between members: all the members of the groups should be obviously at short distances between them and the representative points, that from the central points of the other groups. In the following image, we can see the results of a typical clustering algorithm and the representation of the cluster centers: Sample clustering algorithm output K-means K-means is a very well-known clustering algorithm that can be easily implemented. It is very straightforward and can guide (depending on the data layout) to a good initial understanding of the provided information. Mechanics of K-means K-means tries to divide a set of samples into K disjoint groups or clusters, using as a main indicator the mean value (be it 1D, 2D, and so on) of the members. This point is normally called centroid, referring to the arithmetic entity with the same name. One important characteristic of K-means is that K should be provided beforehand, and so some previous knowledge of the data is needed to avoid a non-representative result. Algorithm iteration criterion The criterion and goal of this method is to minimize the sum of squared distances from the cluster's member to the actual centroid of all cluster contained samples. This is also known as minimization of inertia. Error minimization criteria for K-means K-means algorithm breakdown The mechanism of the K-means algorithm can be summarized in the following graphic: Simplified flow chart of the K-means process And this is a simplified summary of the algorithm: We start with unclassified samples and take K elements as the starting centroids. There are also possible simplifications of this algorithm that take the first elements in the element list, for the sake of brevity. We then calculate the distances between the samples and the first chosen samples, and so we get the first calculated centroids (or other representative values). You can see in the moving centroids in the illustration toward a more common sense centroid. After the centroids change, their displacement will provoke the individual distances to change, and so the cluster membership can change. So this is the time when we recalculate the centroids and repeat the first steps, in case the stop condition isn't met. The stopping conditions could be of various types: After n iterations (it could be that either we chose a too large number and we'll have unnecessary rounds of computing, or it could converge slowly and we will have a very unconvincing results) if the centroid doesn't have a very stable means. This stop condition could also be used as a last resort if we have a really long iterative process. Referring to the previous mean result, a possibly better criterion for the convergence of the iterations is to take a look at the changes of the centroids, be it in total displacement or total cluster element switches. The last one is employed normally, so we will stop the process once there are no more element-changing clusters: K-means simplified graphic Pros and cons of K-means The advantages of this method are: It scales very well (most of the calculations can be run in parallel) It has been used in a very large range of applications But its simplicity has also a price (no silver bullet rule applies): It requires apriori knowledge (the number of possible clusters should be known beforehand) The outlier values can push the values of the centroids, as they have the same value as any other sample As we assume that the figure is convex and isotropic, it doesn't work very well with non-circle-like delimited clusters Summary In this article, we got a simple overview of some of the most basic models we can implement, but we tried to be as detailed in the explanation as possible. From now on, we are able to generate synthetic datasets, allowing us to rapidly test the adequacy of a model for different data configurations and so evaluate the advantages and shortcoming of them without having to load models with a greater number of unknown characteristics. You can also refer to the following books on the similar topics: Getting Started with TensorFlow: https://www.packtpub.com/big-data-and-business-intelligence/getting-started-tensorflow R Machine Learning Essentials: https://www.packtpub.com/big-data-and-business-intelligence/r-machine-learning-essentials Building Machine Learning Systems with Python - Second Edition: https://www.packtpub.com/big-data-and-business-intelligence/building-machine-learning-systems-python-second-edition Resources for Article: Further resources on this subject: Supervised Machine Learning [article] Unsupervised Learning [article] Preprocessing the Data [article]
Read more
  • 0
  • 0
  • 2294
article-image-creating-personal-web-portal-pwp
Packt
10 Nov 2016
8 min read
Save for later

Creating a Personal Web Portal (PWP)

Packt
10 Nov 2016
8 min read
In this article by Sherwin John Calleja Tragura, author of the book Spring MVC Blueprints, we will discuss about creating a robust and simple personal web portal that can serve as a personal web page, or a professional reference site, for anyone. Usually, these kinds of web sites are used as mashups, or dashboards, of centralized sources of information describing and individual or group. (For more resources related to this topic, see here.) Technically, a personal web portal is a composition of web components like CSS, HTML, and JavaScript, woven together to create a formal, simple or exquisite presentation of any content. It can be used, in its simplest form, as a personal portfolio or an enterprise form like an e-commerce content management system. Commercially, these portals are drafted and designed using the principles of the Rich-client platform or responsive web designs. In the industry, most companies suggest that clients try easy-to-use-tools like PHP frameworks (for example, CodeIgniter, Laravel, Drupal) and seldom advise using JEE-based portals. Overview of the project The personal web portal (PWP) created publishes a simple biography, and professional information, one can at least share through the web. The prototype is a session-driven one that can do dynamic transactions, like updating information on the web pages, and posting notes on the page without using any back-end database. Through using wireframes, the following are the initial drafts and design of the web portal: The Home Page: This is the first page of the site that shows updatable quotes, and inspiring messages coming from the owner of the portal. It contains a sticky-note feature at the side that allows visitors to post their short greetings to the owner in real-time. The Personal Information Page: This page highlights personal information of the owner including the owner's name, age, hobbies, birth date, and age. This page contains some part of the blogger's educational history. The page is dynamic and can be updated at anytime by the owner. The Professional Information Page: This page presents details about the owner's career background. It lists down all the previous jobs of the account owner, and enumerates all skills-related information. This page is also updatable. The Reach Out Page: This serves as the contact information page of the owner. Moreover, it allows visitors to send their contact information, and specifically their electronic mail address, to the portal owner. Update pages: The Home, Personal and Professional pages has updateable pages for the owner to update the content of the portal. The prototype has the capability to update the information presented in the content at anytime the user desires. This simple prototype, called PWP, will give clear steps on how to build personal sites from the ground-up, using Spring MVC 4.x specifications. It will give enthusiasts the opportunity to start creating Spring-based web portals in just a day, without using any database backend. To those who are new to the Spring MVC 4.x concept, this article will be a good start in building full-blown portal sites. Technical requirements In order to start the development, the following tools need to be installed onto the platform: Java Development Kit (JDK) 1.7.x Spring Tool Suite (Eclipse) 3.6 Maven 3.x Spring Framework 4.1 Apache Tomcat 7.x Any operating system First, the JDK 1.7.x installer must be installed. Visit the site http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html to download the installer. Next, setup the Spring Tool Suite 3.6 (Eclipse-based) which will be the official Integrated Development Environment (IDE) of this article. Download the Spring Tool Suite 3.6 at https://spring.io/tools/sts. Setting-up the development environment This article recommends Spring Tool Suite (Eclipse) 3.6 since it has all the Spring Framework 4.x plug-ins, and other dependencies needed by the projects. To start us off, the following screenshot shows the dashboard of the STS IDE: Conversely, Apache Maven 3.x will be used to build and deploy the project for this article. Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of information (https://maven.apache.org/). There is already a Maven plugin installed in the STS IDE that can be used to generate the needed development directory structure. Among the many ways to create Spring MVC projects, this article focuses on two styles, namely: Converting a dynamic web project to a Maven specimen Creating a Maven project from scratch Converting a dynamic web project to a Maven project To start creating the project, press CTRL + N to browse the Menu wizard of the IDE. This menu wizard contains all the types of project modules you'll need to start a project. The Menu wizard should look similar to the following screenshot: Once on the menu, browse the Web option and choose Dynamic Web Project. Afterwards, just follow the series of instructions to create the chosen project module until you reached the last menu wizard, which looks like the following figure: This last instruction (Web Module panel) will auto-generate the deployment descriptor (web.xml) of the project. Always click on the Generate web-xml deployment descriptor checkbox option. The deployment descriptor is an XML file that must reside inside the /WEB-INF/ folder of all JEE projects. This file describes how a component, module or application can be deployed. A JEE project must always be in the web.xml file otherwise the project will be defective. Since Spring 4.x container supports the Servlet Specification 3.0 in Tomcat 7 and above, web.xml is no longer mandatory and can be replaced by org.springframework.web.servlet.support.AbstractAnnotationConfigDispatcherServletInitializer or org.springframework.web.servlet.support.AbstractDispatcherServletInitializer class. The next major step is to convert the newly created dynamic web project to a Maven one. To complete the conversion, right-click on the project and navigate to the Configure | Convert Maven Project command set, as shown in the following screenshot: It is always best for the developer to study the directory structure of the project folder created before the actual implementation starts. Below is the directory structure of the Maven project after the conversion: The project directories are just like the usual Eclipse Dynamic Web project without the pom.xml file. Creating a Maven project from scratch Another method of creating a Spring MVC web project is by creating a Maven project from the start. Be sure to install the Maven 3.2 plugin in STS Eclipse. Browse the Menu wizard again, and locate the Maven option. Click on the Maven Project to generate a new Maven project. After clicking this option, a wizard will pop up, asking if an archetype is needed or not to create the Maven project. An archetype is a Maven plugin whose main objective is to create a project structure as per its template. To start quickly, choose an archetype plugin to create a simple java application here. It is recommended to create the project using the archetype maven-archetype-webapp. However, skipping the archetype selection can still be a valid option. After you've done this, proceed with the Select an Archetype window shown in the following screenshot. Locate maven-archetype-webapp then proceed with the last process. The selection of the Archetype maven-archetype-webapp will require the input of Maven parameters before the ending the whole process with a new Maven project: The required parameters for the Maven group or project are as follows: Group Id (groupId): This is the ID of the project's group and must be unique among all the project's groups Artifact Id (artifactId): This is the ID of the project. This is generally the name of the project Version (version): This is the version of the project Package (package): The initial or core package of the sources For more information on Maven plugin and configuration details, visit the documentation and samples on the site http://maven.apache.org/. After providing the Maven parameters, the project source folder structure will be similar to the following screenshot: Summary Using the basic Spring Framework 4.x APIs, web portal creators can create their own platform to promote their personal philosophy, business, ideology, religion, and other concepts. Although it is an advantage to use existing portal platforms made in other language like PHP and Python, it is still fulfilling to design and develop our own portal based on an open-source framework. The PWP is just prototype software that needs to be upgraded to have a backend database, security, and other social media plugins, in order to make the software commercially competitive. Resources for Article: Further resources on this subject: Designing your very own ASP.NET MVC Application [article] ASP.NET MVC 2: Validating MVC [article] Using ASP.NET Controls in SharePoint [article]
Read more
  • 0
  • 0
  • 9877

article-image-internationalization
Packt
10 Nov 2016
16 min read
Save for later

Internationalization

Packt
10 Nov 2016
16 min read
In this article by Jérémie Bouchet author of the book Magento Extensions Development. We will see how to handle this aspect of our extension and how it is handled in a complex extension using an EAV table structure. In this article, we will cover the following topics: The EAV approach Store relation table Translation of template interface texts (For more resources related to this topic, see here.) The EAV approach The EAV structure in Magento is used for complex models, such as customer and product entities. In our extension, if we want to add a new field for our events, we would have to add a new column in the main table. With the EAV structure, each attribute is stored in a separate table depending on its type. For example, catalog_product_entity, catalog_product_entity_varchar and catalog_product_entity_int. Each row in the subtables has a foreign key reference to the main table. In order to handle multiple store views in this structure, we will add a column for the store ID in the subtables. Let's see an example for a product entity, where our main table contains only the main attribute: The varchar table structure is as follows: The 70 attribute corresponds to the product name and is linked to our 1 entity. There is a different product name for the store view, 0 (default) and 2 (in French in this example). In order to create an EAV model, you will have to extend the right class in your code. You can inspire your development on the existing modules, such as customers or products. Store relation table In our extension, we will handle the store views scope by using a relation table. This behavior is also used for the CMS pages or blocks, reviews, ratings, and all the models that are not EAV-based and need to be store views-related. Creating the new table The first step is to create the new table to store the new data: Create the [extension_path]/Setup/UpgradeSchema.php file and add the following code: <?php namespace BlackbirdTicketBlasterSetup; use MagentoEavSetupEavSetup; use MagentoEavSetupEavSetupFactory; use MagentoFrameworkSetupUpgradeSchemaInterface; use MagentoFrameworkSetupModuleContextInterface; use MagentoFrameworkSetupSchemaSetupInterface; /** * @codeCoverageIgnore */ class UpgradeSchema implements UpgradeSchemaInterface { /** * EAV setup factory * * @varEavSetupFactory */ private $eavSetupFactory; /** * Init * * @paramEavSetupFactory $eavSetupFactory */ public function __construct(EavSetupFactory $eavSetupFactory) { $this->eavSetupFactory = $eavSetupFactory; } public function upgrade(SchemaSetupInterface $setup, ModuleContextInterface $context) { if (version_compare($context->getVersion(), '1.3.0', '<')) { $installer = $setup; $installer->startSetup(); /** * Create table 'blackbird_ticketblaster_event_store' */ $table = $installer->getConnection()->newTable( $installer->getTable('blackbird_ticketblaster_event_store') )->addColumn( 'event_id', MagentoFrameworkDBDdlTable::TYPE_SMALLINT, null, ['nullable' => false, 'primary' => true], 'Event ID' )->addColumn( 'store_id', MagentoFrameworkDBDdlTable::TYPE_SMALLINT, null, ['unsigned' => true, 'nullable' => false, 'primary' => true], 'Store ID' )->addIndex( $installer->getIdxName('blackbird_ticketblaster_event_store', ['store_id']), ['store_id'] )->addForeignKey( $installer->getFkName('blackbird_ticketblaster_event_store', 'event_id', 'blackbird_ticketblaster_event', 'event_id'), 'event_id', $installer->getTable('blackbird_ticketblaster_event'), 'event_id', MagentoFrameworkDBDdlTable::ACTION_CASCADE )->addForeignKey( $installer->getFkName('blackbird_ticketblaster_event_store', 'store_id', 'store', 'store_id'), 'store_id', $installer->getTable('store'), 'store_id', MagentoFrameworkDBDdlTable::ACTION_CASCADE )->setComment( 'TicketBlaster Event To Store Linkage Table' ); $installer->getConnection()->createTable($table); $installer->endSetup(); } } } The upgrade method will handle all the necessary updates in our database for our extension. In order to differentiate the update for a different version of the extension, we surround the script with a version_compare() condition. Once this code is set, we need to tell Magento that our extension has new database upgrades to process. Open the [extension_path]/etc/module.xml file and change the version number 1.2.0 to 1.3.0: <?xml version="1.0"?> <config xsi_noNamespaceSchemaLocation="../../../../../lib/internal/Magento/Framework/Module/etc/module.xsd"> <module name="Blackbird_TicketBlaster" setup_version="1.3.0"> <sequence> <module name="Magento_Catalog"/> <module name="Blackbird_AnotherModule"/> </sequence> </module> </config> In your terminal, run the upgrade by typing the following command: php bin/magentosetup:upgrade The new table structure now contains two columns: event_id and store_id. This table will store which events are available for store views: If you have previously created events, we recommend emptying the existing blackbird_ticketblaster_event table, because they won't have a default store view and this may trigger an error output. Adding the new input to the edit form In order to select the store view for the content, we will need to add the new input to the edit form. Before running this code, you should add a new store view: Here's how to do that: Open the [extension_path]/Block/Adminhtml/Event/Edit/Form.php file and add the following code in the _prepareForm() method, below the last addField() call: /* Check is single store mode */ if (!$this->_storeManager->isSingleStoreMode()) { $field = $fieldset->addField( 'store_id', 'multiselect', [ 'name' => 'stores[]', 'label' => __('Store View'), 'title' => __('Store View'), 'required' => true, 'values' => $this->_systemStore->getStoreValuesForForm(false, true) ] ); $renderer = $this->getLayout()->createBlock( 'MagentoBackendBlockStoreSwitcherFormRendererFieldsetElement' ); $field->setRenderer($renderer); } else { $fieldset->addField( 'store_id', 'hidden', ['name' => 'stores[]', 'value' => $this->_storeManager->getStore(true)->getId()] ); $model->setStoreId($this->_storeManager->getStore(true)->getId()); } This results in a new multiselect field in the form. Saving the new data in the new table Now we have the form and the database table, we have to write the code to save the data from the form: Open the [extension_path]/Model/Event.php file and add the following method at its end: /** * Receive page store ids * * @return int[] */ public function getStores() { return $this->hasData('stores') ? $this->getData('stores') : $this->getData('store_id'); } Open the [extension_path]/Model/ResourceModel/Event.php file and replace all the code with the following code: <?php namespace BlackbirdTicketBlasterModelResourceModel; class Event extends MagentoFrameworkModelResourceModelDbAbstractDb { [...] The afterSave() method is handling our insert queries in the new table. The afterload() and getLoadSelect() methods are handling the new load mode to select the right events. Your new table is now filled when you save your events; they are also properly loaded when you go back to your edit form. Showing the store views in the admin grid In order to inform admin users of the selected store views for one event, we will add a new column in the admin grid: Open the [extension_path]/Model/ResourceModel/Event/Collection.php file and replace all the code with the following code: <?php namespace BlackbirdTicketBlasterModelResourceModelEvent; class Collection extends MagentoFrameworkModelResourceModelDbCollectionAbstractCollection { [...] Open the [extention_path]/view/adminhtml/ui_component/ticketblaster_event_listing.xml file and add the following XML instructions before the end of the </filters> tag: <filterSelect name="store_id"> <argument name="optionsProvider" xsi_type="configurableObject"> <argument name="class" xsi_type="string">MagentoCmsUiComponentListingColumnCmsOptions</argument> </argument> <argument name="data" xsi_type="array"> <item name="config" xsi_type="array"> <item name="dataScope" xsi_type="string">store_id</item> <item name="label" xsi_type="string" translate="true">Store View</item> <item name="captionValue" xsi_type="string">0</item> </item> </argument> </filterSelect> Before the actionsColumn tag, add the new column: <column name="store_id" class="MagentoStoreUiComponentListingColumnStore"> <argument name="data" xsi_type="array"> <item name="config" xsi_type="array"> <item name="bodyTmpl" xsi_type="string">ui/grid/cells/html</item> <item name="sortable" xsi_type="boolean">false</item> <item name="label" xsi_type="string" translate="true">Store View</item> </item> </argument> </column> You can refresh your grid page and see the new column added at the end. Magento remembers the previous column's order. If you add a new column, it will always be added at the end of the table. You will have to manually reorder them by dragging and dropping them. Modifying the frontend event list Our frontend list (/events) is still listing all the events. In order to list only the events available for our current store view, we need to change a file: Edit the [extension_path]/Block/EventList.php file and replace the code with the following code: <?php namespace BlackbirdTicketBlasterBlock; use BlackbirdTicketBlasterApiDataEventInterface; use BlackbirdTicketBlasterModelResourceModelEventCollection as EventCollection; use MagentoCustomerModelContext; class EventList extends MagentoFrameworkViewElementTemplate implements MagentoFrameworkDataObjectIdentityInterface { /** * Store manager * * @var MagentoStoreModelStoreManagerInterface */ protected $_storeManager; /** * @var MagentoCustomerModelSession */ protected $_customerSession; /** * Construct * * @param MagentoFrameworkViewElementTemplateContext $context * @param BlackbirdTicketBlasterModelResourceModelEventCollectionFactory $eventCollectionFactory, * @param array $data */ public function __construct( MagentoFrameworkViewElementTemplateContext $context, BlackbirdTicketBlasterModelResourceModelEventCollectionFactory $eventCollectionFactory, MagentoStoreModelStoreManagerInterface $storeManager, MagentoCustomerModelSession $customerSession, array $data = [] ) { parent::__construct($context, $data); $this->_storeManager = $storeManager; $this->_eventCollectionFactory = $eventCollectionFactory; $this->_customerSession = $customerSession; } /** * @return BlackbirdTicketBlasterModelResourceModelEventCollection */ public function getEvents() { if (!$this->hasData('events')) { $events = $this->_eventCollectionFactory ->create() ->addOrder( EventInterface::CREATION_TIME, EventCollection::SORT_ORDER_DESC ) ->addStoreFilter($this->_storeManager->getStore()->getId()); $this->setData('events', $events); } return $this->getData('events'); } /** * Return identifiers for produced content * * @return array */ public function getIdentities() { return [BlackbirdTicketBlasterModelEvent::CACHE_TAG . '_' . 'list']; } /** * Is logged in * * @return bool */ public function isLoggedIn() { return $this->_customerSession->isLoggedIn(); } } Note that we have a new property available and instantiated in our constructor: storeManager. Thanks to this class, we can filter our collection with the store view ID by calling the addStoreFilter() method on our events collection. Restricting the frontend access by store view The events will not be listed in our list page if they are not available for the current store view, but they can still be accessed with their direct URL, for example http://[magento_url]/events/view/index/event_id/2. We will change this to restrict the frontend access by store view: Open the [extention_path]/Helper/Event.php file and replace the code with the following code: <?php namespace BlackbirdTicketBlasterHelper; use BlackbirdTicketBlasterApiDataEventInterface; use BlackbirdTicketBlasterModelResourceModelEventCollection as EventCollection; use MagentoFrameworkAppActionAction; class Event extends MagentoFrameworkAppHelperAbstractHelper { /** * @var BlackbirdTicketBlasterModelEvent */ protected $_event; /** * @var MagentoFrameworkViewResultPageFactory */ protected $resultPageFactory; /** * Store manager * * @var MagentoStoreModelStoreManagerInterface */ protected $_storeManager; /** * Constructor * * @param MagentoFrameworkAppHelperContext $context * @param BlackbirdTicketBlasterModelEvent $event * @param MagentoFrameworkViewResultPageFactory $resultPageFactory * @SuppressWarnings(PHPMD.ExcessiveParameterList) */ public function __construct( MagentoFrameworkAppHelperContext $context, BlackbirdTicketBlasterModelEvent $event, MagentoFrameworkViewResultPageFactory $resultPageFactory, MagentoStoreModelStoreManagerInterface $storeManager, ) { $this->_event = $event; $this->_storeManager = $storeManager; $this->resultPageFactory = $resultPageFactory; $this->_customerSession = $customerSession; parent::__construct($context); } /** * Return an event from given event id. * * @param Action $action * @param null $eventId * @return MagentoFrameworkViewResultPage|bool */ public function prepareResultEvent(Action $action, $eventId = null) { if ($eventId !== null && $eventId !== $this->_event->getId()) { $delimiterPosition = strrpos($eventId, '|'); if ($delimiterPosition) { $eventId = substr($eventId, 0, $delimiterPosition); } $this->_event->setStoreId($this->_storeManager->getStore()->getId()); if (!$this->_event->load($eventId)) { return false; } } if (!$this->_event->getId()) { return false; } /** @var MagentoFrameworkViewResultPage $resultPage */ $resultPage = $this->resultPageFactory->create(); // We can add our own custom page handles for layout easily. $resultPage->addHandle('ticketblaster_event_view'); // This will generate a layout handle like: ticketblaster_event_view_id_1 // giving us a unique handle to target specific event if we wish to. $resultPage->addPageLayoutHandles(['id' => $this->_event->getId()]); // Magento is event driven after all, lets remember to dispatch our own, to help people // who might want to add additional functionality, or filter the events somehow! $this->_eventManager->dispatch( 'blackbird_ticketblaster_event_render', ['event' => $this->_event, 'controller_action' => $action] ); return $resultPage; } } The setStoreId() method called on our model will load the model only for the given ID. The events are no longer available through their direct URL if we are not on their available store view. Translation of template interface texts In order to translate the texts written directly in the template file, for the interface or in your PHP class, you need to use the __('Your text here') method. Magento looks for a corresponding match within all the translation CSV files. There is nothing to be declared in XML; you simply have to create a new folder at the root of your module and create the required CSV: Create the [extension_path]/i18n folder. Create [extension_path]/i18n/en_US.csv and add the following code: "Event time:","Event time:" "Please sign in to read more details.","Please sign in to read more details." "Read more","Read more" Create [extension_path]/i18n/en_US.csv and add the following code: "Event time:","Date de l'évènement :" "Pleasesign in to read more details.","Merci de vous inscrire pour plus de détails." "Read more","Lire la suite" The CSV file contains the correspondences between the key used in the code and the value in its final language. Translation of e-mail templates: creating and translating the e-mails We will add a new form in the Details page to share the event to a friend. The first step is to declare your e-mail template. To declare your e-mail template, create a new [extension_path]/etc/email_templates.xml file and add the following code: <?xml version="1.0"?> <config xsi_noNamespaceSchemaLocation="urn:magento:module:Magento_Email:etc/email_templates.xsd"> <template id="ticketblaster_email_email_template" label="Share Form" file="share_form.html" type="text" module="Blackbird_TicketBlaster" area="adminhtml"/> </config> This XML line declares a new template ID, label, file path, module, and area (frontend or adminhtml). Next, create the corresponding template by creating the [extension_path]/view/adminhtml/email/share_form.html file and add the following code: <!--@subject Share Form@--> <!--@vars { "varpost.email":"Sharer Email", "varevent.title":"Event Title", "varevent.venue":"Event Venue" } @--> <p>{{trans "Your friend %email is sharing an event with you:" email=$post.email}}</p> {{trans "Title: %title" title=$event.title}}<br/> {{trans "Venue: %venue" venue=$event.venue}}<br/> <p>{{trans "View the detailed page: %url" url=$event.url}}</p> Note that in order to translate texts within the HTML file, we use the trans function, which works like the default PHP printf() function. The function will also use our i18n CSV files to find a match for the text. Your e-mail template can also be overridden directly from the backoffice: Marketing | Email templates. The e-mail template is ready; we will also add the ability to change it in the system configuration and allow users to determine the sender's e-mail and name: Create the [extension_path]/etc/adminhtml/system.xml file and add the following code: <?xml version="1.0"?> <config xsi_noNamespaceSchemaLocation="urn:magento:module:Magento_Config:etc/system_file.xsd"> <system> <section id="ticketblaster" translate="label" type="text" sortOrder="100" showInDefault="1" showInWebsite="1" showInStore="1"> <label>Ticket Blaster</label> <tab>general</tab> <resource>Blackbird_TicketBlaster::event</resource> <group id="email" translate="label" type="text" sortOrder="50" showInDefault="1" showInWebsite="1" showInStore="1"> <label>Email Options</label> <field id="recipient_email" translate="label" type="text" sortOrder="10" showInDefault="1" showInWebsite="1" showInStore="1"> <label>Send Emails To</label> <validate>validate-email</validate> </field> <field id="sender_email_identity" translate="label" type="select" sortOrder="20" showInDefault="1" showInWebsite="1" showInStore="1"> <label>Email Sender</label> <source_model>MagentoConfigModelConfigSourceEmailIdentity</source_model> </field> <field id="email_template" translate="label comment" type="select" sortOrder="30" showInDefault="1" showInWebsite="1" showInStore="1"> <label>Email Template</label> <comment>Email template chosen based on theme fallback when "Default" option is selected.</comment> <source_model>MagentoConfigModelConfigSourceEmailTemplate</source_model> </field> </group> </section> </system> </config> Create the [extension_path]/etc/config.xml file and add the following code: <?xml version="1.0"?> <config xsi_noNamespaceSchemaLocation="urn:magento:module:Magento_Store:etc/config.xsd"> <default> <ticketblaster> <email> <recipient_email> <![CDATA[hello@example.com]]> </recipient_email> <sender_email_identity>custom2</sender_email_identity> <email_template>ticketblaster_email_email_template</email_template> </email> </ticketblaster> </default> </config> Thanks to these two files, you can change the configuration for the e-mail template in the Admin panel (Stores | Configuration). Let's create our HTML form and the controller that will handle our submission: Open the existing [extension_path]/view/frontend/templates/view.phtml file and add the following code at the end: <form action="<?php echo $block->getUrl('events/view/share', array('event_id' => $event->getId())); ?>" method="post" id="form-validate" class="form"> <h3> <?php echo __('Share this event to my friend'); ?> </h3> <input type="email" name="email" class="input-text" placeholder="email" /> <button type="submit" class="button"><?php echo __('Share'); ?></button> </form> Create the [extension_path]/Controller/View/Share.php file and add the following code: <?php namespace BlackbirdTicketBlasterControllerView; use MagentoFrameworkExceptionNotFoundException; use MagentoFrameworkAppRequestInterface; use MagentoStoreModelScopeInterface; use BlackbirdTicketBlasterApiDataEventInterface; class Share extends MagentoFrameworkAppActionAction { [...] This controller will get the necessary configuration entirely from the admin and generate the e-mail to be sent. Testing our code by sending the e-mail Go to the page of an event and fill in the form we prepared. When you submit it, Magento will send the e-mail immediately. Summary In this article, we addressed all the main processes that are run for internationalization. We can now create and control the availability of our events with regards to Magento's stores and translate the contents of our pages and e-mails. Resources for Article: Further resources on this subject: Magento Theme Distribution [article] Installing Magento [article] Magento 2 – the New E-commerce Era [article]
Read more
  • 0
  • 0
  • 33122

article-image-introduction-practical-business-intelligence
Packt
10 Nov 2016
20 min read
Save for later

Introduction to Practical Business Intelligence

Packt
10 Nov 2016
20 min read
In this article by Ahmed Sherif, author of the book Practical Business Intelligence, is going to explain what is business intelligence? Before answering this question, I want to pose and answer another question. What isn't business intelligence? It is not spreadsheet analysis done with transactional data with thousands of rows. One of the goals of Business Intelligence or BI is to shield the users of the data from the intelligent logic lurking behind the scenes of the application that is delivering the data to them. If the integrity of the data is compromised in any way by an individual not intimately familiar with the data source, then there cannot, by definition, be intelligence in the business decisions made within that same data. The single source of truth is the key for any Business Intelligence operation whether it is a mom and pop soda shop or a Fortune 500 company. Any report, dashboard, or analytical application that is delivering information to a user through a BI tool but the numbers cannot be tied back to the original source will break the trust between the user and the data and will defeat the purpose of Business Intelligence. (For more resources related to this topic, see here.) In my opinion, the most successful tools used for business intelligence directly shield the business user from the query logic used for displaying that same data in some form of visual manner. Business Intelligence has taken many forms in terms of labels over the years. Business Intelligence is the process of delivering actionable business decisions from analytical manipulation and presentation of data within the confines of a business environment. The delivery process mentioned in the definition will focus its attention on. The beauty of BI is that it is not owned by any one particular tool that is proprietary to a specific industry or company. Business Intelligence can be delivered using many different tools, some even that were not originally intended to be used for BI. The tool itself should not be the source where the query logic is applied to generate the business logic of the data. The tool should primarily serve as the delivery mechanism of the query that is generated by the data warehouse that houses both the data, as well as the logic. In this chapter we will cover the following topics: Understanding the Kimball method Understanding business intelligence Data and SQL Working with data and SQL Working with business intelligence tools Downloading and installing MS SQL Server 2014 Downloading and installing AdventureWorks Understanding the Kimball method As we discuss the data warehouse where our data is being housed, we will be remised not to bring up Ralph Kimball, one of the original architects of the data warehouse.  Kimball's methodology incorporated dimensional modeling, which has become the standard for modeling a data warehouse for Business Intelligence purposes. Dimensional modeling incorporates joining tables that have detail data and tables that have lookup data. A detail table is known as a fact table in dimensional modeling. An example of a fact table would be a table holding thousands of rows of transactional sales from a retail store.  The table will house several ID's affiliated with the product, the sales person, the purchase date, and the purchaser just to name a few. Additionally, the fact table will store numeric data for each individual transaction, such as sales quantity for sales amount to name a few examples. These numeric values will be referred to as measures. While there is usually one fact table, there will also be several lookup or dimensional tables that will have one table for each ID that is used in a fact table. So, for example,  there would be one dimensional table for the product name affiliated with a product ID. There would be one dimensional table for the month, week, day, and year of the id affiliated with the date. These dimensional tables are also referred to as Lookup Tables, because they kind of look up what the name of a dimension ID is affiliated with. Usually, you would find as many dimensional tables as there are ID's in the fact table. The dimensional tables would all be joined to the one fact table creating something of a 'star' look. Hence, the name for this type of table join is known as a star schema which is represented diagrammatically in the following figure. It is customary that the fact table will be the largest table in a data warehouse while the lookup tables will all be quite small in rows, some as small as one row. The tables are joined by ID's, also known as surrogate keys. Surrogate keys allow for the most efficient join between a fact table and a dimensional table as they are usually a data type of integer. As more and more detail is added to a dimensional table, that new dimension is just given the next number in line, usually starting with 1. Query performance between tables joins suffers when we introduce non-numeric characters into the join or worse, symbols (although most databases will not allow that). Understanding business intelligence architecture I will continuously hammer home the point that the various tools utilized to deliver the visual and graphical BI components should not house any internal logic to filter data out of the tool nor should it be the source of any built in calculations. The tools themselves should not house this logic as they will be utilized by many different users. If each user who develops a BI app off of the tool incorporates different internal filters without the tool, the single source of truth tying back to the data warehouse will become multiple sources of truths.  Any logic applied to the data to filter out a specific dimension or to calculate a specific measure should be applied in the data warehouse and then pulled into the tool. For example, if the requirement for a BI dashboard was to show current year and prior year sales for US regions only, the filter for region code would be ideally applied in the data warehouse as opposed to inside of the tool. The following is a query written in SQL joining two tables from the AdventureWorks database that highlights the difference between dimenions and measures.  The 'region' column is a dimension column and the 'SalesYTD' and 'SalesPY' are measure columns. In this example, the 'TerritoryID' is serving as the key join between 'SalesTerritory' and 'SalesPerson'. Since the measures are coming from the 'SalesPerson' table, that table will serve as the fact table and 'SalesPerson.TerritoryID' will serve as the fact ID. Since the Region column is dimensional and coming from the 'SalesTerritory' table, that table will serve as the dimensional or lookup table and 'SalesTerritory.TerritoryID' will serve as the dimension ID. In a finely-tuned data warehouse both the fact ID and dimension ID would be indexed to allow for efficient query performance. This performance is obtained by sorting the ID's numerically so that a row from one table that is being joined to another table does not have to be searched through the entire table but only a subset of that table. When the table is only a few hundred rows, it may not seem necessary to index columns, but when the table grows to a few hundred million rows, it may become necessary. Select region.Name as Region ,round(sum(sales.SalesYTD),2) as SalesYTD ,round(sum(sales.SalesLastYear),2) as SalesPY FROM [AdventureWorks2014].[Sales].[SalesTerritory] region left outer join [AdventureWorks2014].[Sales].[SalesPerson] sales on sales.TerritoryID = region.TerritoryID where region.CountryRegionCode = 'US' Group by region.Name order by region.Name asc There are several reasons why applying the logic at the database level is considered a best practice. Most of the time, these requests for filtering data or manipulating calculations are done at the BI tool level because it is easier for the developer than to go to the source. However, if these filters are being performed due to data quality issues then applying logic at the reporting level is only masking an underlying data issue that needs to be addressed across the entire data warehouse. You would be doing yourself a disservice in the long run as you will be establishing a precedence that the data quality would be handled by the report developer as opposed to the database administrator. You are just adding additional work onto your plate. Ideal BI tools will quickly connect to the data source and then allow for slicing and dicing of your dimensions and measures in a manner that will quickly inform the business of useful and practical information. Ultimately, the choice of a BI tool by an individual or an organization will come down to the ease of use of the tool as well as the flexibility to showcase the data through various components such as graphs, charts, widgets, and infographics. Management If you are a Business Intelligence manager looking to establish a department with a variety of tools to help flesh out your requirements, could serve as a good source for interview questions to weed out unqualified candidates. A manager could use to distinguish some of the nuances between these different skillsets and prioritize hiring based on immediate needs. Data Scientist The term Data Scientist has been misused in the BI industry, in my humble opinion. It has been lumped in with Data Analyst as well as BI Developer. Unfortunately, these three positions have separate skillsets and you will do yourself a disservice by assuming one person can do multiple positions successfully. A Data Scientist will be able to apply statistical algorithms behind the data that is being extracted from the BI tools and make predictions about what will happen in the future with that same data set. Due to this skillset, a Data Scientist may find the chapters focusing on R and Python to be of particular importance because of their abilities to leverage predictive capabilities within their BI delivery mechanisms. Data Analyst A Data Analyst is probably the second most misused position behind a Data Scientist. Typically, a Data Analyst should be analyzing the data that is coming out of the BI tools that are connected to the data warehouse. Most Data Analysts are comfortable working with Microsoft Excel. Often times they are asked to take on additional roles in developing dashboards that require additional programming skills.  This is where they would find some comfort using a tool like Power BI, Tableau, or QlikView. These tools would allow for a Data Analyst to quickly develop a storyboard or visualization that would allow for quick analysis with minimal programming skills. Visualization Developer A 'dataviz' developer is someone who can create complex visualizations out of data and showcase interesting interactions between different measures inside of a dataset that cannot necessarily be seen with a traditional chart or graph. More often than not these developers possess some programming background such as JavaScript, HTML, or CSS. These developers are also used to developing applications directly for the web and therefore would find D3.js a comfortable environment to program in. Working with Data and SQL The examples and exercises that will come from the AdventureWorks database.  The AdventureWorks database has a comprehensive list of tables that mimics an actual bicycle retailor. The examples will draw on different tables from the database to highlight BI reporting from the various segments appropriate for the AdventureWorks Company. These segments include Human Resources, Manufacturing, Sales, Purchasing, and Contact Management. A different segment of the data will be highlighted in each chapter utilizing a specific set of tools. A cursory understanding of SQL (structured query language) will be helpful to get a grasp of how data is being aggregated with dimensions and measures. Additionally, an understanding of the SQL statements used will help with the validation process to ensure a single source of truth between the source data and the output inside of the BI tool of choice. For more information about learning SQL, visit the following website: www.sqlfordummies.com Working with business intelligence tools Over the course of the last 20 years, there have been a growing number of software products released that were geared towards Business Intelligence. In addition, there have been a number of software products and programming languages that were not initially built for BI but later on became a staple for the industry. The tools used were chosen based on the fact that they were either built off of open source technology or they were products from companies that provided free versions of their software for development purposes. Many companies from the big enterprise firms have their own BI tools and they are quite popular. However, unless you have a license with them, it is unlikely that you will be able to use their tool without having to shell out a small fortune. Power BI and Excel Power BI is one of the more relatively newer BI tools from Microsoft.  It is known as a self-service solution and integrates seamlessly with other data sources such as Microsoft Excel and Microsoft SQL Server.  Our primary purpose in using Power BI will be to generate interactive dashboards, reports, and datasets for users. In addition to using Power BI we will also focus on utilizing Microsoft Excel to assist with some data analysis and validation of results that are being pulled from our data warehouse.  Pivot tables are very popular within MS Excel and will be used to validate aggregation done inside of the data warehouse. D3.js D3.js, also known as data-driven documents, is a JavaScript library known for delivery beautiful visualizations by manipulating documents based on data. Since D3 is rooted in JavaScript, all visualizations make a seamless transition to the web. D3 allows for major customization to any part of visualization and because of this flexibility, it will require a steeper learning curve that probably any other software program. D3 can consume data easily as a .json or a .csv file.  Additionally, the data can also be imbedded directly within the JavaScript code that renders the visualization on the web. R R is a free and open source statistical programming language that produces beautiful graphics. The R language has been widely used among the statistical community and more recently in the data science and machine learning community as well. Due to this fact, it has picked up steam in recent years as a platform for displaying and delivering effective and practical BI. In addition to visualizing BI, R has the ability to also visualize predictive analysis with algorithms and forecasts. While R is a bit raw in its interface, there have been some IDE's (Integrated Development Environment) that have been developed to ease the user experience. RStudio will be used to deliver the visualisations developed within R. Python Python is considered the most traditional programming language of all the different languages. It is a widely used general purpose programming language with several modules that are very powerful in analysing and visualizing data. Similar to R, Python is a bit raw in its own form for delivering beautiful graphics as a BI tool; however, with the incorporation of an IDE the user interface becomes much more of a pleasurable development experience. PyCharm will be the IDE used to develop BI with Python. PyCharm is free to use and allows creation of the iPython notebook which delivers seamless integration between Python and the powerful modules that will assist with BI. As a note, all code in Python will be developed using the Python 3 syntax. QlikView QlikView is a software company specializing in delivering business intelligence solutions using their desktop tool. QlikView is one of the leaders in delivering quick visualizations based on data and queries through their desktop application. They advertise themselves to be self-service BI for business users. While they do offer solutions that target more enterprise organizations, they also offer a free version of their tool for personal use. Tableau is probably the closest competitor in terms of delivering similar BI solutions. Tableau Tableau is a software company specializing in delivering business intelligence solutions using their desktop tool. If this sounds familiar to QlikView, it's probably because it's true. Both are leaders in the field of establishing a delivery mechanism with easy installation, setup, and connectivity to the available data. Tableau has a free version of their desktop tool. Again, Tableau excels at delivering both beautiful visualizations quickly as well as self-service data discovery to more advanced business users. Microsoft SQL Server Microsoft SQL will serve as the data warehouse for the examples that we will with the BI Tools. Microsoft SQL Server is relatively simple to install and set up as well it is free to download. Additionally, there are example databases that configure seamlessly with it, such as the AdventureWorks database. Downloading and Installing MS SQL Server 2014 First things first. We will need to get started with getting our database and data warehouse up and running so that we can begin to develop our BI environment. We will visit the Microsoft website below to start the download selection process. https://www.microsoft.com/en-us/download/details.aspx?id=42299 Select the specified language that is applicable to you and also select the MS SQL Server Express version with Advanced features that is 64-bit edition as shown in the following screenshot. Ideally you'll want to be working in a 64-bit edition when dealing with servers. After selecting the file, the download process should begin. Depending on your connection speed it could take some time as the file is slightly larger than 1 GB. The next step in the process is selecting a new stand-alone instance of SQL Server 2014 unless you already have a version and wish to upgrade instead as shown in the following screenshot.. After accepting the license terms, continue through the steps in the Global Rules as well as the Product Updates to get to the setup installation files. For the feature selection tab, make sure the following features are selected for your installation as shown in the following screenshot. Our preference is to label a named instance of this database to something related to the work we are doing.  Since this will be used for Business Intelligence, I went ahead and name this instance 'SQLBI' as shown in the following screenshot: The default Server Configuration settings are sufficient for now, there is no need to change anything under that section as shown in the following screenshot. Unless you are required to do so within your company or organization, for personal use it is sufficient to just go with Windows Authentication mode for sign-on as shown in the following screenshot. We will not need to do any configuring of reporting services, so it is sufficient for our purposes to just with installing Reporting Services Native mode without any need for configuration at this time. At this point the installation will proceed and may take anywhere between 20-30 minutes depending on the cpu resources. If you continue to have issues with your installation, you can visit the following website from Microsoft for additional help. http://social.technet.microsoft.com/wiki/contents/articles/23878.installing-sql-server-2014-step-by-step-tutorial.aspx Ultimately, if everything with the installation is successful, you'll want to see all portions of the installation have a green check mark next to their name and be labeled 'Successful' as shown in the following screenshot. Downloading and Installing AdventureWorks We are almost finished with getting our business intelligence data warehouse complete. We are now at the stage where we will extract and load data into our data warehouse. The last part is to download and install the AdventureWorks database from Microsoft. The zipped file for AdventureWorks 2014 is located in the following website from Microsoft: https://msftdbprodsamples.codeplex.com/downloads/get/880661 Once the file is downloaded and unzipped, you will find a file named the following: AdventureWorks2014.bak Copy that file and paste it in the following folder where it will be incorporated with your Microsoft SQL Server 2014 Express Edition. C:Program FilesMicrosoft SQL ServerMSSQL12.SQLBIMSSQLBackup Also note that the MSSQL12.SQLBI subfolder will vary user by user depending on what you named your SQL instance when you were installing MS SQL Server 2014. Once that has been copied over, we can fire up Management Studio for SQL Server 2014 and start up a blank new query by going to File New Query with Current Connection Once you have a blank query set up, copy and paste the following code in the and execute it: use [master] Restore database AdventureWorks2014 from disk = 'C:Program FilesMicrosoft SQL ServerMSSQL12.SQLBIMSSQLBackupAdventureWorks2014.bak' with move 'AdventureWorks2014_data' to 'C:Program FilesMicrosoft SQL ServerMSSQL12.SQLBIMSSQLDATAAdventureWorks2014.mdf', Move 'AdventureWorks2014_log' to 'C:Program FilesMicrosoft SQL ServerMSSQL12.SQLBIMSSQLDATAAdventureWorks2014.ldf' , replace Once again, please note that the MSSQL12.SQLBI subfolder will vary user by user depending on what you named your SQL instance when you were installing MS SQL Server 2014. At this point in time within the database you should have received a message saying that Microsoft SQL Server has processed 24248 pages for database 'AdventureWorks2014'. Once you have refreshed your database tab on the upper left hand corner of SQL Server, the AdventureWorks database will become visible as well as all of the appropriate tables as shown in the following screenshot: One final step that we will need to verify just to make sure that your login account has all of the appropriate server settings. When you right-click on the SQL Server name on the upper left hand portion of Management Studio, select the properties.  Select Permissions inside Properties. Find your username and check all of the rights under the Grant column as shown in the following screenshot: Finally, we need to also ensure that the folder that houses Microsoft SQL Server 2014 also has the appropriate rights enabled for your current user.  That specific folder is located under C:Program FilesMicrosoft SQL Server. For purposes of our exercises, we will assign all rights for the SQL Server user to the following folder as shown in the following screenshot: We are now ready to begin connecting our BI tools to our data! Summary The emphasis will be placed on implementing Business Intelligence best practices within the various tools that will be used based on the different levels of data that is provided within the AdventureWorks database. In the next chapter we will cover extracting additional data from the web that will be joined to the AdventureWorks database. This process is known as web scraping and can be performed with great success using tools such as Python and R. In addition to collecting the data, we will focus on transforming the collected data for optimal query performance. Resources for Article: Further resources on this subject: LabVIEW Basics [article] Thinking Probabilistically [article] Clustering Methods [article]
Read more
  • 0
  • 1
  • 4035
article-image-how-build-search-engine
JanuVerma
10 Nov 2016
6 min read
Save for later

How to Build a Search Engine

JanuVerma
10 Nov 2016
6 min read
In this post, I will discuss how you can build a search engine like Google. We will start with a definition of a search engine, and I will go on to present the core concepts (mathematics and implementation). A search engine is an information retrieval system that returns documents relevant to a user submitted query, ranked by their relevance to the query. There are many subtle issues in the last statement;for example, what does relevance mean in this context? We will make all of these notions clear as we go further. In this post, by documents I mean textual documents or webpages. There are surely search engines for images, videos or any type of documents, the discussion of which is beyond the scope of this post. Information Retrieval System Consider a corpus of textual documents, and we want to build a system that searches for a query, and returns the documents relevant to the query. The easiest way to accomplish this is to return all of the documents that contain the query. Such a model is called Boolean model of information retrieval (BIR). Mathematically, we can represent a document (and query) as a set of the words it contains. The BIR model retrieves documents that have non-zero intersection with the query. Notice that the retrieved documents are all indistinguishable; there is no inherent ranking method. Another class of models is the vector space model, and all modern search engines use such a model as their base system. In these models, a document is represented as a vector in a high-dimensional vector space. The dimension of the vector is equal to the total number of words in the corpus, and there is an entry corresponding to each word. The documents are retrieved based on the similarity of the corresponding vectors. For example, doc = (v1, v2, ... , vk) is the vector corresponding to some document, and q = (q1, q2, ..., qk) is the vector corresponding to the query, then the similarity of the document and the query can be computed as the vector similarity (distance). There are many choices for such similarity,for example,Euclidean distance, Jaccard distance, Manhattan distance, and so on. In most modern search engines cosine similarity is used: cos(doc,q) = (doc . q) / ||doc|| ||q|| where . is the dot product of the vectors, and || is the norm of the vector. There are difference schemes for choosing the vectors as well, in binary scheme, so the vector entry corresponding to a word is 1 if the word is present in the document, and 0 otherwise. The most popular scheme is the so-called term frequence- inverse document frequency (tf-idf), which assigns higher score of words that are characteristic to the current document. Sothe tf-idf of word w in document d can be computed as: tf-idf(w, d) = tf(w,d) * log(N/n) Where tf(w,d) is the frequency of word w in document d, N is the total number of documents, and n is the number of documents that contain w. Thus tf-idf taxes words that are present in a large number of documents, are not good as features in a model. The retrieved documents are ranked according to the similarity value. There are also probablistic models for retrieving documents, where we estimate the probability that the document d is relevant to a query q, and the documents are then ranked by this probability estimate. For implementation of these models and more, check out InfoR. Computational Complexities The process of retrieval based on vector space model is incredibly expensive. For each search query, we need to compute its similarity with all the web documents, even though most of the documents don't even contain that word. Going by the standard method, it's impossible to achieve the results at the speed we experience while using Google. The trick is to store documents in a data structure, which facilitates fast computation. The most commonly used data structure is the inverted index. A forward index stores the lists of words for each document, an invereted index stores documents for each word in a hashmap-style structure: Inverted index = {word1: [doc1, doc24, doc77], word2: [doc34, doc97], .... wordk: [doc11, doc22, doc1020, doc23]} Given such a structure, one can just jump to the words in the query in constant time, and compute the similarity with a very small number of documents, ones containing the words in the query. Now building such an index also presents a serious computational problem. This problem was the reason behind the development of the MapReduce paradigm, which is the distributed, massively parallel system, discovered by Google. The use of MapReduce system has extended to all major big data problems. If you are curious refer to the original paper. Web Search Engines The working of search engine can be described in the following steps: Crawling: First of all, we need to crawl the web to extract the HTML documents. There are many open source services that provide the crawling capabilities, for example, Nutch, Scrappy,and so on. The documents are then processed to extract the text, the hyperlinks, geo-location, and so on. based on the complexity of the retrieval model. Indexing: We want the documents to be stored in such a way that facilitates a quick search. As discussed earlier, building and maintaining an inverted index is a crucial step in building a search engine. Retrieval: We have discussed retrieval models based on vector space and that probability above. Such a model is used as the base for all major web engines. In addition to the cosine similarity, all web search engines use PageRank for ranking the retrieved documents. This algorithm, discovered by founders of Google, uses the quality of hyperlinks in a webpage for ranking. Mathematically, pagerank assumes that the web is a network graph where documents are linked via hyperlinks. A document receives higher pagerank if there are many 'high-quality' links in it. The PageRank can also be interpreted as the probability that a person randomly clicking on links will arrive at this particular page. For the more mathematically inclined, this translates to computing the principal eigenvector of the transition matrix of random jumps. For more details, refer to this article on American Mathematical Society website. Modern web search engines like Google use more than 200 heuristics on top of the standard retrieval model! Conclusion In this post, we discussed the central ideas of information retrieval and challenges in building a web search engine. It's not an easy problem. Even today, 18 years after the origin of Google, significant research, both at universities and in industry, is being pursued in this direction. About the author Janu Verma is a researcher in the IBM T.J. Watson Research Center, New York. His research interests are in mathematics, machine learning, information visualization, computational health and biology. He has held research positions at Cornell University, Kansas State University, Tata Institute of Fundamental Research, Indian Institute of Science, and Indian Statistical Institute. He has also written papers for IEEE Vis, KDD, International Conference on HealthCare Informatics, Computer Graphics and Applications, Nature Genetics, IEEE Sensors Journals,and so on. His current focus is on the development of visual analytics systems for prediction and understanding. Check out his personal website.
Read more
  • 0
  • 0
  • 10832

article-image-machine-learning-technique-supervised-learning
Packt
09 Nov 2016
7 min read
Save for later

Machine Learning Technique: Supervised Learning

Packt
09 Nov 2016
7 min read
In this article by Andrea Isoni author of the book Machine Learning for the Web, we will the most relevant regression and classification techniques are discussed. All of these algorithms share the same background procedure, and usually the name of the algorithm refers to both a classification and a regression method. The linear regression algorithms, Naive Bayes, decision tree, and support vector machine are going to be discussed in the following sections. To understand how to employ the techniques, a classification and a regression problem will be solved using the mentioned methods. Essentially, a labeled train dataset will be used to train the models, which means to find the values of the parameters, as we discussed in the introduction. As usual, the code is available in the my GitHub folder at https://github.com/ai2010/machine_learning_for_the_web/tree/master/chapter_3/. (For more resources related to this topic, see here.) We will conclude the article with an extra algorithm that may be used for classification, although it is not specifically designed for this purpose (hidden Markov model). We will now begin to explain the general causes of error in the methods when predicting the true labels associated with a dataset. Model error estimation We said that the trained model is used to predict the labels of new data, and the quality of the prediction depends on the ability of the model to generalize, that is, the correct prediction of cases not present in the trained data. This is a well-known problem in literature and related to two concepts: bias and variance of the outputs. The bias is the error due to a wrong assumption in the algorithm. Given a point x(t) with label yt, the model is biased if it is trained with different training sets, and the predicted label ytpred will always be different from yt. The variance error instead refers to the different wrongly predicted labels of the given point x(t). A classic example to explain the concepts is to consider a circle with the true value at the center (true label), as shown in the following figure. The closer the predicted labels are to the center, the more unbiased the model and the lower the variance (top left in the following figure). The other three cases are also shown here: Variance and bias example. A model with low variance and low bias errors will have the predicted labels that is blue dots (as show in the preceding figure) concentrated on the red center (true label). The high bias error occurs when the predictions are far away from the true label, while high variance appears when the predictions are in a wide range of values. We have already seen that labels can be continuous or discrete, corresponding to regression classification problems respectively. Most of the models are suitable for solving both problems, and we are going to use word regression and classification referring to the same model. More formally, given a set of N data points and corresponding labels, a model with a set of parameters with the true parameter values will have the mean square error (MSE), equal to: We will use the MSE as a measure to evaluate the methods discussed in this article. Now we will start describing the generalized linear methods. Generalized linear models The generalized linear model is a group of models that try to find the M parameters that form a linear relationship between the labels yi and the feature vector x(i) that is as follows: Here, are the errors of the model. The algorithm for finding the parameters tries to minimize the total error of the model defined by the cost function J: The minimization of J is achieved using an iterative algorithm called batch gradient descent: Here, a is called learning rate, and it is a trade-off between convergence speed and convergence precision. An alternative algorithm that is called stochastic gradient descent, that is loop for : The qj is updated for each training example i instead of waiting to sum over the entire training set. The last algorithm converges near the minimum of J, typically faster than batch gradient descent, but the final solution may oscillate around the real values of the parameters. The following paragraphs describe the most common model and the corresponding cost function, J. Linear regression Linear regression is the simplest algorithm and is based on the model: The cost function and update rule are: Ridge regression Ridge regression, also known as Tikhonov regularization, adds a term to the cost function J such that: , where l is the regularization parameter. The additional term has the function needed to prefer a certain set of parameters over all the possible solutions penalizing all the parameters qj different from 0. The final set of qj shrank around 0, lowering the variance of the parameters but introducing a bias error. Indicating with the superscript l the parameters from the linear regression, the ridge regression parameters are related by the following formula: This clearly shows that the larger the l value, the more the ridge parameters are shrunk around 0. Lasso regression Lasso regression is an algorithm similar to ridge regression, the only difference being that the regularization term is the sum of the absolute values of the parameters: Logistic regression Despite the name, this algorithm is used for (binary) classification problems, so we define the labels. The model is given the so-called logistic function expressed by: In this case, the cost function is defined as follows: From this, the update rule is formally the same as linear regression (but the model definition,  , is different): Note that the prediction for a point p,  , is a continuous value between 0 and 1. So usually, to estimate the class label, we have a threshold at =0.5 such that: The logistic regression algorithm is applicable to multiple label problems using the techniques one versus all or one versus one. Using the first method, a problem with K classes is solved by training K logistic regression models, each one assuming the labels of the considered class j as +1 and all the rest as 0. The second approach consists of training a model for each pair of labels (  trained models). Probabilistic interpretation of generalized linear models Now that we have seen the generalized linear model, let’s find the parameters qj that satisfy the relationship: In the case of linear regression, we can assume  as normally distributed with mean 0 and variance s2 such that the probability  is  equivalent to: Therefore, the total likelihood of the system can be expressed as follows: In the case of the logistic regression algorithm, we are assuming that the logistic function itself is the probability: Then the likelihood can be expressed by: In both cases, it can be shown that maximizing the likelihood is equivalent to minimizing the cost function, so the gradient descent will be the same. k-nearest neighbours (KNN) This is a very simple classification (or regression) method in which given a set of feature vectors  with corresponding labels yi, a test point x(t) is assigned to the label value with the majority of the label occurrences in the K nearest neighbors  found, using a distance measure such as the following: Euclidean: Manhattan: Minkowski:  (if q=2, this reduces to the Euclidean distance) In the case of regression, the value yt is calculated by replacing the majority of occurrences by the average of the labels .  The simplest average (or the majority of occurrences) has uniform weights, so each point has the same importance regardless of their actual distance from x(t). However, a weighted average with weights equal to the inverse distance from x(t) may be used. Summary In this article, the major classification and regression algorithms, together with the techniques to implement them, were discussed. You should now be able to understand in which situation each method can be used and how to implement it using Python and its libraries (sklearn and pandas). Resources for Article: Further resources on this subject: Supervised Machine Learning [article] Unsupervised Learning [article] Specialized Machine Learning Topics [article]
Read more
  • 0
  • 0
  • 1739
Modal Close icon
Modal Close icon