Testing and Debugging Distributed Applications

Packt
01 Apr 2016
21 min read
In this article by Francesco Pierfederici, author of the book Distributed Computing with Python, we look at why, in the author's words, "distributed systems, both large and small, can be extremely challenging to test and debug, as they are spread over a network, run on computers that can be quite different from each other, and might even be physically located in different continents altogether". Moreover, the computers we use could have different user accounts, different disks with different software packages, different hardware resources, and very uneven performance. Some can even be in a different time zone. Developers of distributed systems need to consider all these pieces of information when trying to foresee failure conditions. Operators have to work around all of these challenges when debugging errors. (For more resources related to this topic, see here.)

The big picture

Testing and debugging monolithic applications is not simple, as every developer knows. However, there are a number of tools that make the task dramatically easier, including the pdb debugger, various profilers (notable mentions include cProfile and line_profiler), linters, static code analysis tools, and a host of test frameworks, a number of which have been included in the standard library of Python 3.3 and higher. The challenge with distributed applications is that most of the tools and packages we can use for single-process applications lose much of their power when dealing with multiple processes, especially when these processes run on different computers. Debugging and profiling distributed applications written in C, C++, and Fortran can be done with tools such as Intel VTune, Allinea MAP, and DDT. Unfortunately, Python developers are left with very few or no options for the time being. Writing small- or medium-sized distributed systems is not terribly hard, as we saw in the pages so far. The main difference between writing monolithic programs and distributed applications is the large number of interdependent components running on remote hardware. This is what makes monitoring and debugging distributed code harder and less convenient. Fortunately, we can still use all the familiar debuggers and code analysis tools on our Python distributed applications. Unfortunately, these tools only go so far, and we will often have to rely on old-fashioned logging and print statements to get the full picture of what went wrong.

Common problems – clocks and time

Time is a handy quantity to work with. For instance, using timestamps is very natural when we want to join different streams of data, sort database records, and in general, reconstruct the timeline for a series of events, which we often observe out of order. In addition, some tools (for example, GNU make) rely solely on file modification time and are easily confused by a clock skew between machines. For these reasons, clock synchronization among all the computers and systems we use is very important. If our computers are in different time zones, we might want to not only synchronize their clocks but also set them to Coordinated Universal Time (UTC) for simplicity. In cases where changing the clocks to UTC is not possible, good advice is to always process time in UTC within our code and convert it to local time only for display purposes. In general, clock synchronization in distributed systems is a fascinating and complex topic, and it is outside the scope of this article.
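To make the "process time in UTC, convert only for display" advice concrete, here is a minimal sketch using only the standard library (Python 3.3+); the print statements are purely illustrative and not from the original article:

from datetime import datetime, timezone

# Record and process timestamps as timezone-aware UTC values...
event_time = datetime.now(timezone.utc)
print('Stored (UTC):     ', event_time.isoformat())

# ...and convert to the local time zone only when displaying them to a user.
print('Displayed (local):', event_time.astimezone().isoformat())

Comparisons and arithmetic on these aware UTC values remain unambiguous regardless of where the code happens to run.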
Most readers, luckily, are likely to be well served by the Network Time Protocol (NTP), which is a perfectly fine synchronization solution for most situations. Most modern operating systems, including Windows, Mac OS X, and Linux, have great support for NTP. Another thing to consider when talking about time is the timing of periodic actions, such as polling loops or cronjobs. Many applications need to spawn processes or perform actions (for example, sending a confirmation e-mail or checking whether new data is available) at regular intervals. A common pattern is to set up timers (either in our code or via the tools provided by the OS) and have all these timers go off at some time, usually at a specific hour and at regular intervals after that. The risk of this approach is that we can overload the system the very moment all these processes wake up and start their work. A surprisingly common example would be starting a significant number of processes that all need to read some configuration or data file from a shared disk. In these cases, everything goes fine until the number of processes becomes so large that the shared disk cannot handle the data transfer, thus slowing our application to a crawl. A common solution is the staggering of these timers in order to spread them out over a longer time interval. In general, since we do not always control all code that we use, it is good practice to start our timers at some random number of minutes past the hour, just to be safe. Another example of this situation would be an image-processing service that periodically polls a set of directories looking for new data. When new images are found, they are copied to a staging area, renamed, scaled, and potentially converted to a common format before being archived for later use. If we're not careful, it would be easy to overload the system if many images were to be uploaded at once. A better approach would be to throttle our application (maybe using a queue-based architecture) so that it would only start an appropriately small number of image processors so as to not flood the system. Common problems – software environments Another common challenge is making sure that the software installed on all the various machines we are ever going to use is consistent and consistently upgraded. Unfortunately, it is frustratingly common to spend hours debugging a distributed application only to discover that for some unknown and seemingly impossible reason, some computers had an old version of the code and/or its dependencies. Sometimes, we might even find the code to have disappeared completely. The reasons for these discrepancies can be many: from a mount point that failed to a bug in our deployment procedures to a simple human mistake. A common approach, especially in the HPC world, is to always create a self-contained environment for our code before launching the application itself. Some projects go as far as preferring static linking of all dependencies to avoid having the runtime pick up the wrong version of a dynamic library. This approach works well if the application runtime is long compared to the time it takes to build its full environment, all of its software dependencies, and the application itself. It is not that practical otherwise. Python, fortunately, has the ability to create self-contained virtual environments. There are two related tools that we can use: pyvenv (available as part of the Python 3.5 standard library) and virtualenv (available in PyPI). 
Additionally, pip, the Python package management system, allows us to specify the exact version of each package we want to install in a requirements file. These tools, when used together, permit reasonable control over the execution environment. However, the devil, as it is often said, is in the details, and different computer nodes might use the exact same Python virtual environment but incompatible versions of some external library. In this respect, container technologies such as Docker (https://www.docker.com) and, in general, version-controlled virtual machines are promising ways out of the software runtime environment maelstrom in those environments where they can be used. In all other cases (HPC clusters come to mind), the best approach will probably be to not rely on the system software and to manage our own environments and the full software stack.

Common problems – permissions and environments

Different computers might run our code under different user accounts, and our application might expect to be able to read a file or write data into a specific directory, only to hit an unexpected permission error. Even in cases where the user accounts used by our code are all the same (down to the same user ID and group ID), their environments may differ from host to host. Therefore, an environment variable we assumed to be defined might not be or, even worse, might be set to an incompatible value. These problems are common when our code runs as a special, unprivileged user such as nobody. Defensive coding, especially when accessing the environment, and making sure to always fall back to sensible defaults when variables are undefined (that is, value = os.environ.get('SOME_VAR', fallback_value) instead of simply value = os.environ['SOME_VAR']) is often necessary. A common approach, when this is possible, is to only run our applications under a specific user account that we control and to specify the full set of environment variables our code needs in the deployment and application startup scripts (which will have to be version controlled as well). Some systems, however, not only execute jobs under extremely limited user accounts, but they also restrict code execution to temporary sandboxes. In many cases, access to the outside network is also blocked. In these situations, one might have no other choice but to set up the full environment locally and copy it to a shared disk partition. Other data can be served from custom-built servers running as ancillary jobs just for this purpose. In general, permission problems and user environment mismatches are very similar to problems with the software environment and should be tackled in concert. Often, developers find themselves wanting to isolate their code from the system as much as possible, creating a small but self-contained environment with all the code and all the environment variables they need.

Common problems – the availability of hardware resources

The hardware resources that our application needs might or might not be available at any given point in time. Moreover, even if some resources were to be available at some point in time, nothing guarantees that they will stay available for much longer. One problem we can face in this respect is network glitches, which are quite common in many environments (especially for mobile apps) and which, for most practical purposes, are indistinguishable from machine or application crashes.
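Returning to the defensive environment-variable access pattern described above, a minimal sketch might look like this (the variable names and defaults are illustrative, not from the original article):

import os

# Read configuration from the environment, falling back to safe defaults
# instead of raising KeyError when a variable is missing.
DATA_DIR = os.environ.get('APP_DATA_DIR', '/tmp/app-data')
LOG_LEVEL = os.environ.get('APP_LOG_LEVEL', 'INFO')

# For values that are truly required, fail early with a clear message
# rather than crashing later in some remote process.
try:
    API_TOKEN = os.environ['APP_API_TOKEN']
except KeyError:
    raise SystemExit('APP_API_TOKEN must be set in the environment')

Failing early with an explicit message is usually preferable to a bare KeyError surfacing much later in a remote process.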
Applications using a distributed computing framework or job scheduler can often rely on the framework itself to handle at least some common failure scenarios. Some job schedulers will even resubmit our jobs in case of errors or sudden machine unavailability. Complex applications, however, might need to implement their own strategies to deal with hardware failures. In some cases, the best strategy is to simply restart the application when the necessary resources are available again. Other times, restarting from scratch would be cost prohibitive. In these cases, a common approach is to implement application checkpointing. What this means is that the application both writes its state to disk periodically and is able to bootstrap itself from a previously saved state. In implementing a checkpointing strategy, you need to balance the convenience of being able to restart an application midway with the performance hit of writing state to disk. Another consideration is the increase in code complexity, especially when many processes or threads are involved in reading and writing state information. A good rule of thumb is that data or results that can be recreated easily and quickly do not warrant application checkpointing. If, on the other hand, some processing requires a significant amount of time and one cannot afford to waste it, then application checkpointing might be in order. For example, climate simulations can easily run for several weeks or months at a time. In these cases, it is important to checkpoint them every hour or so, as restarting from the beginning after a crash would be expensive. On the other hand, a process that takes an uploaded image and creates a thumbnail for, say, a web gallery runs quickly and is not normally worth checkpointing. To be safe, state should always be written and updated atomically (for example, by writing to a temporary file and replacing the original only after the write completes successfully). The last thing we want is to restart from a corrupted state! Very familiar to HPC users as well as to users of AWS spot instances is the situation where a fraction, or the entirety, of the processes of our application are evicted from the machines they are running on. When this happens, a warning is typically sent to our processes (usually, a SIGQUIT signal) and, after a few seconds, they are unceremoniously killed (via a SIGKILL signal). For AWS spot instances, the time of termination is available through a web service in the instance metadata. In either case, our applications are given some time to save their state and quit in an orderly fashion. Python has powerful facilities to catch and handle signals (refer to the signal module). For example, the following simple script shows how we can implement a bare-bones checkpointing strategy in our application:

#!/usr/bin/env python3.5
"""
Simple example showing how to catch signals in Python
"""
import json
import os
import signal
import sys

# Path to the file we use to store state. Note that we assume
# $HOME to be defined, which is far from being an obvious
# assumption!
STATE_FILE = os.path.join(os.environ['HOME'], '.checkpoint.json')


class Checkpointer:
    def __init__(self, state_path=STATE_FILE):
        """
        Read the state file, if present, and initialize from that.
        """
        self.state = {}
        self.state_path = state_path
        if os.path.exists(self.state_path):
            with open(self.state_path) as f:
                self.state.update(json.load(f))
        return

    def save(self):
        # Note: a more robust version would write to a temporary file and
        # atomically replace the original, as discussed above.
        print('Saving state: {}'.format(self.state))
        with open(self.state_path, 'w') as f:
            json.dump(self.state, f)
        return

    def eviction_handler(self, signum, frame):
        """
        This is the function that gets called when a signal is trapped.
        """
        self.save()

        # Of course, using sys.exit is a bit brutal. We can do better.
        print('Quitting')
        sys.exit(0)
        return


if __name__ == '__main__':
    import time

    print('This is process {}'.format(os.getpid()))

    ckp = Checkpointer()
    print('Initial state: {}'.format(ckp.state))

    # Catch SIGQUIT
    signal.signal(signal.SIGQUIT, ckp.eviction_handler)

    # Get a value from the state.
    i = ckp.state.get('i', 0)
    try:
        while True:
            i += 1
            ckp.state['i'] = i
            print('Updated in-memory state: {}'.format(ckp.state))
            time.sleep(1)
    except KeyboardInterrupt:
        ckp.save()

If we run the preceding script in one terminal window and then, from another terminal window, send it a SIGQUIT signal (for example, via kill -s SIGQUIT <process id>), we see the checkpointing in action: the handler saves the in-memory state to the state file before the process quits. A common situation in distributed applications is that of being forced to run code in potentially heterogeneous environments: machines (real or virtual) of differing performance, with different hardware resources (for example, with or without GPUs), and potentially different software environments (as we mentioned already). Even in the presence of a job scheduler that helps us choose the right software and hardware environment, we should always log the full environment as well as the performance of each execution machine. In advanced architectures, these performance metrics can be used to improve the efficiency of job scheduling. PBS Pro, for instance, takes into consideration the historical performance figures of each job being submitted to decide where to execute it next. HTCondor continuously benchmarks each machine and makes those figures available for node selection and ranking. Perhaps the most frustrating cases are those where, either due to the network itself or due to servers being overloaded, network requests take so long that our code hits its internal timeouts. This might lead us to believe that the counterpart service is not available. These bugs, especially when transient, can be quite hard to debug.

Challenges – the development environment

Another common challenge in distributed systems is the setup of a representative development and testing environment, especially for individuals or small teams. Ideally, in fact, the development environment should be identical to the worst-case scenario deployment environment. It should allow developers to test common failure scenarios, such as a disk filling up, varying network latencies, intermittent network connections, hardware and software failures, and so on—all things that are bound to happen in real life, sooner or later. Large teams have the resources to set up development and test clusters, and they almost always have dedicated software quality teams stress testing our code.
Small teams, unfortunately, often find themselves forced to write code on their laptops and use a very simplified (and best-case scenario!) environment made up by two or three virtual machines running on the laptops themselves to emulate the real system. This pragmatic solution works and is definitely better than nothing. However, we should remember that virtual machines running on the same host exhibit unrealistically high-availability and low-network latencies. In addition, nobody will accidentally upgrade them without us knowing or image them with the wrong operating system. The environment is simply too controlled and stable to be realistic. A step closer to a realistic setup would be to create a small development cluster on, say, AWS using the same VM images, with the same software stack and user accounts that we are going to use in production. All things said, there is simply no replacement for the real thing. For cloud-based applications, it is worth our while to at least test our code on a smaller version of the deployment setup. For HPC applications, we should be using either a test cluster, a partition of the operational cluster, or a test queue for development and testing. Ideally, we would develop on an exact clone of the operational system. Cost consideration and ease of development will constantly push us to the multiple-VMs-on-a-laptop solution; it is simple, essentially free, and it works without an Internet connection, which is an important point. We should, however, keep in mind that distributed applications are not impossibly hard to write; they just have more failure modes than their monolithic counterparts do. Some of these failure modes (especially those related to data access patterns) typically require a careful choice of architecture. Correcting architectural choices dictated by false assumptions later on in the development stage can be costly. Convincing managers to give us the hardware resources that we need early on is usually difficult. In the end, this is a delicate balancing act. A useful strategy – logging everything Often times, logging is like taking backups or eating vegetables—we all know we should do it, but most of us forget. In distributed applications, we simply have no other choice—logging is essential. Not only that, logging everything is essential. With many different processes running on potentially ephemeral remote resources at difficult-to-predict times, the only way to understand what happens is to have logging information and have it readily available and in an easily searchable format/system. At the bare minimum, we should log process startup and exit time, exit code and exceptions (if any), all input arguments, all outputs, the full execution environment, the name and IP of the execution host, the current working directory, the user account as well as the full application configuration, and all software versions. The idea is that if something goes wrong, we should be able to use this information to log onto the same machine (if still available), go to the same directory, and reproduce exactly what our code was doing. Of course, being able to exactly reproduce the execution environment might simply not be possible (often times, because it requires administrator privileges). However, we should always aim to be able to recreate a good approximation of that environment. This is where job schedulers really shine; they allow us to choose a specific machine and specify the full job environment, which makes replicating failures easier. 
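To make the preceding advice concrete, here is a minimal sketch of recording this startup context with the standard logging module (the logger name and the exact set of fields are illustrative, not prescribed by the article):

import getpass
import logging
import os
import platform
import socket
import sys

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s %(levelname)s %(name)s: %(message)s')
log = logging.getLogger('myapp')

# Record the execution context at startup so that failures can be
# reproduced on the same host, in the same directory, with the same input.
log.info('host=%s pid=%d user=%s', socket.gethostname(), os.getpid(),
         getpass.getuser())
log.info('cwd=%s argv=%s', os.getcwd(), sys.argv)
log.info('python=%s platform=%s', sys.version.split()[0], platform.platform())
log.info('environment=%s', dict(os.environ))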
Logging software versions (not only the version of the Python interpreter, but also the versions of all the packages used) helps diagnose outdated software stacks on remote machines. The Python package manager, pip, makes getting the list of installed packages easy: import pip; pip.main(['list']). Similarly, import sys; print(sys.executable, sys.version_info) displays the location and version of the interpreter. It is also useful to create a system whereby all our classes and function calls emit logging messages with the same level of detail and at the same points in the object life cycle. Common approaches involve the use of decorators and, maybe a bit too esoteric for some, metaclasses. This is exactly what the autologging Python module (available on PyPI) does for us. Once logging is in place, we face the question of where to store all these log messages, whose volume could be substantial at high verbosity levels in large applications. Simple installations will probably want to write log messages to text files on disk. More complex applications might want to store these messages in a database (which can be done by creating a custom handler for the Python logging module) or in specialized log aggregators such as Sentry (https://getsentry.com). Closely related to logging is the issue of monitoring. Distributed applications can have many moving parts, and it is often essential to know which machines are up, which are busy, as well as which processes or jobs are currently running, waiting, or in an error state. Knowing which processes are taking longer than usual to complete their work is often an important warning sign that something might be wrong. Several monitoring solutions for Python (oftentimes integrated with our logging system) exist. The Celery project, for instance, recommends flower (http://flower.readthedocs.org) as a monitoring and control web application. HPC job schedulers, on the other hand, tend to lack common, general-purpose monitoring solutions that go beyond simple command-line clients. Monitoring comes in handy in discovering potential problems before they become serious. It is, in fact, useful to monitor resources such as available disk space and trigger actions or even simple warning e-mails when they fall below a given threshold. Many centers monitor hardware performance and hard drive SMART data to detect early signs of potential problems. These issues are more likely to be of interest to operations personnel rather than developers, but they are useful to keep in mind. They can also be integrated into our applications to implement strategies for handling performance degradation gracefully.

A useful strategy – simulating components

A good, although possibly expensive in terms of time and effort, test strategy is to simulate some or all of the components of our system. The reasons are multiple; on the one hand, simulating or mocking software components allows us to test our interfaces to them more directly. In this respect, mock testing libraries, such as unittest.mock (part of the Python 3.5 standard library), are truly useful. Another reason to simulate software components is to make them fail or misbehave on demand and see how our application responds. For instance, we could increase the response time of services such as REST APIs or databases to worst-case scenario levels and see what happens. Sometimes, we might exceed timeout values in some network calls, leading our application to incorrectly assume that the server has crashed.
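As a small sketch of this kind of fault injection with unittest.mock (the fetch_status function and the simulated service are made up for illustration, not taken from the original article):

import socket
from unittest import mock

def fetch_status(conn):
    """Toy client code under test: asks a remote service for its status."""
    try:
        return conn.get('/status', timeout=2.0)
    except socket.timeout:
        return 'unreachable'

# Simulate a server that is so overloaded that every request times out.
slow_server = mock.Mock()
slow_server.get.side_effect = socket.timeout('simulated slow response')

assert fetch_status(slow_server) == 'unreachable'
print('client handled the simulated timeout gracefully')

The same pattern can inject delays instead of exceptions (for example, with a side_effect function that sleeps) to exercise the timeout handling paths directly.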
Especially early on in the design and development of a complex distributed application, one can make overly optimistic assumptions about things such as network availability and performance or response time of services such as databases or web servers. For this reason, having the ability to either completely bring a service offline or, more subtly, modify its behavior can tell us a lot about which of the assumptions in our code might be overly optimistic. The Netflix Chaos Monkey (https://github.com/Netflix/SimianArmy) approach of disabling random components of our system to see how our application copes with failures can be quite useful. Summary Writing and running small- or medium-sized distributed applications in Python is not hard. There are many high-quality frameworks that we can leverage among others, for example, Celery, Pyro, various job schedulers, Twisted, MPI bindings, or the multiprocessing module in the standard library. The real difficulty, however, lies in monitoring and debugging our applications, especially because a large fraction of our code runs concurrently on many different, often remote, computers. The most insidious bugs are those that end up producing incorrect results (for example, because of data becoming corrupted along the way) rather than raising an exception, which most frameworks are able to catch and bubble up. The monitoring and debugging tools that we can use with Python code are, sadly, not as sophisticated as the frameworks and libraries we use to develop that same code. The consequence is that large teams end up developing their own, often times, very specialized distributed debugging systems from scratch and small teams mostly rely on log messages and print statements. More work is needed in the area of debuggers for distributed applications in general and for dynamic languages such as Python in particular. Resources for Article: Further resources on this subject: Python Data Structures [article] Python LDAP applications - extra LDAP operations and the LDAP URL library [article] Machine Learning Tasks [article]


Machine Learning Tasks

Packt
01 Apr 2016
16 min read
In this article written by David Julian, author of the book Designing Machine Learning Systems with Python, we first introduce the basic machine learning tasks. Classification is probably the most common task, due in part to the fact that it is relatively easy, well understood, and solves a lot of common problems. Multiclass classification (for instance, handwriting recognition) can sometimes be achieved by chaining binary classification tasks. However, we lose information this way, and we become unable to define a single decision boundary. For this reason, multiclass classification is often treated separately from binary classification. (For more resources related to this topic, see here.) There are cases where we are not interested in discrete classes but rather a real number, for instance, a probability. These types of problems are regression problems. Both classification and regression require a training set of correctly labelled data; they are supervised learning problems. Originating from these basic machine learning tasks are a number of derived tasks. In many applications, this may simply be applying the learning model to make a prediction, perhaps in the hope of establishing a causal relationship. We must remember that explaining and predicting are not the same. A model can make a prediction, but unless we know explicitly how it made the prediction, we cannot begin to form a comprehensible explanation. An explanation requires human knowledge of the domain. We can also use a prediction model to find exceptions from a general pattern. Here, we are interested in the individual cases that deviate from the predictions. This is often called anomaly detection and has wide applications in areas such as detecting bank fraud, noise filtering, and even the search for extraterrestrial life. An important and potentially useful task is subgroup discovery. Our goal here is not, as in clustering, to partition the entire domain but rather to find a subgroup that has a substantially different distribution. In essence, subgroup discovery is trying to find relationships between a dependent target variable and many independent explaining variables. We are not trying to find a complete relationship but rather a group of instances that are different in ways that are important in the domain. For instance, establishing the subgroups smoker = true and family history = true for a target variable of heart disease = true. Finally, we consider control-type tasks. These act to optimize control settings to maximize a payoff under different conditions. This can be achieved in several ways. We can clone expert behavior: the machine learns directly from a human and makes predictions of actions given different conditions. The task is to learn a prediction model for the expert's actions. This is similar to reinforcement learning, where the task is to learn about the relationship between conditions and optimal actions. Clustering, on the other hand, is the task of grouping items without any prior information about the groups; this is an unsupervised learning task. Clustering is basically making a measurement of similarity. Related to clustering is association, which is an unsupervised task to find a certain type of pattern in the data. This task is behind movie recommender systems and the "customers who bought this also bought..." suggestions on the checkout pages of online stores.
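As a small, concrete illustration of the supervised versus unsupervised distinction described above, here is a sketch using scikit-learn and toy data (the library choice and the data are illustrative and not taken from the article):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Supervised learning: classification needs correctly labelled training data.
X_train = np.array([[0.1], [0.4], [0.6], [0.9]])
y_train = np.array([0, 0, 1, 1])
clf = LogisticRegression()
clf.fit(X_train, y_train)
print('predicted class:', clf.predict([[0.8]]))

# Unsupervised learning: clustering groups instances by similarity alone,
# with no labels provided.
X = np.array([[1.0, 1.1], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9]])
km = KMeans(n_clusters=2, n_init=10).fit(X)
print('cluster assignments:', km.labels_)

The classifier needs labelled examples to fit; KMeans only ever sees the feature vectors.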
Data for machine learning When considering raw data for machine learning applications, there are three separate aspects: The volume of the data The velocity of the data The variety of the data Data volume The volume problem can be approached from three different directions: efficiency, scalability, and parallelism. Efficiency is about minimizing the time it takes for an algorithm to process a unit of information. A component of this is the underlying processing power of the hardware. The other component, and one that we have more control over, is ensuring our algorithms are not wasting precious processing cycles on unnecessary tasks. Scalability is really about brute force, and throwing as much hardware at a problem as you can. With Moore's law, which predicts the trend of computer power doubling every two years and reaching its limit, it is clear that scalability is not, by its self, going to be able to keep pace with the ever increasing amounts of data. Simply adding more memory and faster processors is not, in many cases, going to be a cost effective solution. Parallelism is a growing area of machine learning, and it encompasses a number of different approaches from harnessing capabilities of multi core processors, to large scale distributed computing on many different platforms. Probably, the most common method is to simply run the same algorithm on many machines, each with a different set of parameters. Another method is to decompose a learning algorithm into an adaptive sequence of queries, and have these queries processed in parallel. A common implementation of this technique is known as MapReduce, or its open source version, Hadoop. Data velocity The velocity problem is often approached in terms of data producers and data consumers. The data transfer rate between the two is its velocity, and it can be measured in interactive response times. This is the time it takes from a query being made to its response being delivered. Response times are constrained by latencies such as hard disk read and write times, and the time it takes to transmit data across a network. Data is being produced at ever greater rates, and this is largely driven by the rapid expansion of mobile networks and devices. The increasing instrumentation of daily life is revolutionizing the way products and services are delivered. This increasing flow of data has led to the idea of streaming processing. When input data is at a velocity that makes it impossible to store in its entirety, a level of analysis is necessary as the data streams, in essence, deciding what data is useful and should be stored and what data can be thrown away. An extreme example is the Large Hadron Collider at CERN, where the vast majority of data is discarded. A sophisticated algorithm must scan the data as it is being generated, looking at the information needle in the data haystack. Another instance where processing data streams might be important is when an application requires an immediate response. This is becoming increasingly used in applications such as online gaming and stock market trading. It is not just the velocity of incoming data that we are interested in. In many applications, particularly on the web, the velocity of a system's output is also important. Consider applications such as recommender systems, which need to process large amounts of data and present a response in the time it takes for a web page to load. 
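Returning to the earlier point about parallelism, that is, running the same algorithm many times with different parameters, here is a minimal single-machine sketch using only the standard library (the train_and_score function is a hypothetical stand-in for real training code):

from concurrent.futures import ProcessPoolExecutor

def train_and_score(learning_rate):
    """Stand-in for training a model with one parameter setting."""
    # In a real application this would fit a model and return a validation score.
    return learning_rate, 1.0 - abs(0.1 - learning_rate)

if __name__ == '__main__':
    params = [0.01, 0.05, 0.1, 0.5, 1.0]
    # Each parameter setting runs in its own process; on a cluster, the same
    # map-style pattern is applied across many machines instead of many
    # local processes, which is essentially what MapReduce-style frameworks automate.
    with ProcessPoolExecutor() as pool:
        for rate, score in pool.map(train_and_score, params):
            print('learning_rate={:<5} score={:.2f}'.format(rate, score))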
Data variety Collecting data from different sources invariably means dealing with misaligned data structures, and incompatible formats. It also often means dealing with different semantics and having to understand a data system that may have been built on a pretty different set of logical principles. We have to remember that, very often, data is repurposed for an entirely different application than the one it was originally intended for. There is a huge variety of data formats and underlying platforms. Significant time can be spent converting data into one consistent format. Even when this is done, the data itself needs to be aligned such that each record consists of the same number of features and is measured in the same units. Models The goal in machine learning is not to just solve an instance of a problem, but to create a model that will solve unique problems from new data. This is the essence of learning. A learning model must have a mechanism for evaluating its output, and in turn, changing its behavior to a state that is closer to a solution. A model is essentially a hypothesis: a proposed explanation for a phenomenon. The goal is to apply a generalization to the problem. In the case of supervised learning, problem knowledge gained from the training set is applied to the unlabeled test. In the case of an unsupervised learning problem, such as clustering, the system does not learn from a training set. It must learn from the characteristics of the dataset itself, such as degree of similarity. In both cases, the process is iterative. It repeats a well-defined set of tasks, that moves the model closer to a correct hypothesis. There are many models and as many variations on these models as there are unique solutions. We can see that the problems that machine learning systems solve (regression, classification, association, and so on) come up in many different settings. They have been used successfully in almost all branches of science, engineering, mathematics, commerce, and also in the social sciences; they are as diverse as the domains they operate in. This diversity of models gives machine learning systems great problem solving powers. However, it can also be a bit daunting for the designer to decide what is the best model, or models, for a particular problem. To complicate things further, there are often several models that may solve your task, or your task may need several models. The most accurate and efficient pathway through an original problem is something you simply cannot know when you embark upon such a project. There are several modeling approaches. These are really different perspectives that we can use to help us understand the problem landscape. A distinction can be made regarding how a model divides up the instance space. The instance space can be considered all possible instances of your data, regardless of whether each instance actually appears in the data. The data is a subset of the instance space. There are two approaches to dividing up this space: grouping and grading. The key difference between the two is that grouping models divide the instance space into fixed discrete units called segments. Each segment has a finite resolution and cannot distinguish between classes beyond this resolution. Grading, on the other hand, forms a global model over the entire instance space, rather than dividing the space into segments. In theory, the resolution of a grading model is infinite, and it can distinguish between instances no matter how similar they are. 
The distinction between grouping and grading is not absolute, and many models contain elements of both. Geometric models One of the most useful approaches to machine learning modeling is through geometry. Geometric models use the concept of instance space. The most obvious example is when all the features are numerical and can become coordinates in a Cartesian coordinate system. When we only have two or three features, they are easy to visualize. Since many machine learning problems have hundreds or thousands of features, and therefore dimensions, visualizing these spaces is impossible. Importantly, many of the geometric concepts, such as linear transformations, still apply in this hyper space. This can help us better understand our models. For instance, we expect many learning algorithms to be translation invariant, which means that it does not matter where we place the origin in the coordinate system. Also, we can use the geometric concept of Euclidean distance to measure similarity between instances; this gives us a method to cluster alike instances and form a decision boundary between them. Probabilistic models Often, we will want our models to output probabilities rather than just binary true or false. When we take a probabilistic approach, we assume that there is an underlying random process that creates a well-defined, but unknown, probability distribution. Probabilistic models are often expressed in the form of a tree. Tree models are ubiquitous in machine learning, and one of their main advantages is that they can inform us about the underlying structure of a problem. Decision trees are naturally easy to visualize and conceptualize. They allow inspection and do not just give an answer. For example, if we have to predict a category, we can also expose the logical steps that gave rise to a particular result. Also, tree models generally require less data preparation than other models and can handle numerical and categorical data. On the down side, tree models can create overly complex models that do not generalize very well to new data. Another potential problem with tree models is that they can become very sensitive to changes in the input data, and as we will see later, this problem can be mitigated by using them as ensemble learners. Linear models A key concept in machine learning is that of the linear model. Linear models form the foundation of many advanced nonlinear techniques such as support vector machines and neural networks. They can be applied to any predictive task such as classification, regression, or probability estimation. When responding to small changes in the input data, and provided that our data consists of entirely uncorrelated features, linear models tend to be more stable than tree models. Tree models can over-respond to small variation in training data. This is because splits at the root of a tree have consequences that are not recoverable further down a branch, potentially making the rest of the tree significantly different. Linear models, on the other hand, are relatively stable, being less sensitive to initial conditions. However, as you would expect, this has the opposite effect of making it less sensitive to nuanced data. This is described by the terms variance (for over fitting models) and bias (for under fitting models). A linear model is typically low variance and high bias. Linear models are generally best approached from a geometric perspective. 
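Returning to the earlier point about using Euclidean distance to measure similarity between instances, a short sketch (the feature vectors are made up for illustration):

import numpy as np

def euclidean(a, b):
    """Distance between two instances represented as feature vectors."""
    return np.sqrt(np.sum((np.asarray(a) - np.asarray(b)) ** 2))

# Three instances described by two numerical features each.
instance_a = [1.0, 2.0]
instance_b = [1.5, 2.5]
instance_c = [8.0, 9.0]

# Smaller distance means more similar; this is the basis for clustering alike
# instances and for drawing decision boundaries between groups of them.
print(euclidean(instance_a, instance_b))   # close together
print(euclidean(instance_a, instance_c))   # far apart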
We know we can easily plot two dimensions of space in a Cartesian co-ordinate system, and we can use the illusion of perspective to illustrate a third. We have also been taught to think of time as being a fourth dimension, but when we start speaking of n dimensions, a physical analogy breaks down. Intriguingly, we can still use many of the mathematical tools that we intuitively apply to three dimensions of space. While it becomes difficult to visualize these extra dimensions, we can still use the same geometric concepts (such as lines, planes, angles, and distance) to describe them. With geometric models, we describe each instance as having a set of real-valued features, each of which is a dimension in a space. Model ensembles Ensemble techniques can be divided broadly into two types. The Averaging Method: With this method, several estimators are run independently, and their predictions are averaged. This includes the random forests and bagging methods. The Boosting Methods: With this method, weak learners are built sequentially using weighted distributions of the data, based on the error rates. Ensemble methods use multiple models to obtain better performance than any single constituent model. The aim is to not only build diverse and robust models, but also to work within limitations such as processing speed and return times. When working with large datasets and quick response times, this can be a significant developmental bottleneck. Troubleshooting and diagnostics are important aspects of working with all machine learning models, but they are especially important when dealing with models that might take days to run. The types of machine learning ensembles that can be created are as diverse as the models themselves, and the main considerations revolve around three things: how we divide our data, how we select the models, and the methods we use to combine their results. This simplistic statement actually encompasses a very large and diverse space. Neural nets When we approach the problem of trying to mimic the brain, we are faced with a number of difficulties. Considering all the different things the brain does, we might first think that it consists of a number of different algorithms, each specialized to do a particular task, and each hard wired into different parts of the brain. This approach translates to considering the brain as a number of subsystems, each with its own program and task. For example, the auditory cortex for perceiving sound has its own algorithm that, say, does a Fourier transform on an incoming sound wave to detect the pitch. The visual cortex, on the other hand, has its own distinct algorithm for decoding the signals from the optic nerve and translating them into the sensation of sight. There is, however, growing evidence that the brain does not function like this at all. It appears, from biological studies, that brain tissue in different parts of the brain can relearn how to interpret inputs. So, rather than consisting of specialized subsystems that are programmed to perform specific tasks, the brain uses the same algorithm to learn different tasks. This single algorithm approach has many advantages, not least of which is that it is relatively easy to implement. It also means that we can create generalized models and then train them to perform specialized tasks. 
Like in real brains, using a singular algorithm to describe how each neuron communicates with the other neurons around it allows artificial neural networks to be adaptable and able to carry out multiple higher-level tasks. Much of the most important work being done with neural net models, and indeed machine learning in general, is through the use of very complex neural nets with many layers and features. This approach is often called deep architecture or deep learning. Human and animal learning occurs at a rate and depth that no machine can match. Many of the elements of biological learning still remain a mystery. One of the key areas of research, and one of the most useful in application, is that of object recognition. This is something quite fundamental to living systems, and higher animals have evolved to possessing an extraordinary ability to learn complex relationships between objects. Biological brains have many layers; each synaptic event exists in a long chain of synaptic processes. In order to recognize complex objects, such as people's faces or handwritten digits, a fundamental task is to create a hierarchy of representation from the raw input to higher and higher levels of abstraction. The goal is to transform raw data, such as a set of pixel values, into something that we can describe as, say, a person riding bicycle. Resources for Article: Further resources on this subject: Python Data Structures [article] Exception Handling in MySQL for Python [article] Python Data Analysis Utilities [article]


Launching a Spark Cluster

Packt
31 Mar 2016
7 min read
 In this article by Omar Khedher, author of OpenStack Sahara Essentials we will use Sahara to create and launch a Spark cluster. Sahara provides several plugins to provision Hadoop clusters on top of OpenStack. We will be using Spark plugins to provision Apache Spark clusters using Horizon. (For more resources related to this topic, see here.) General settings The following diagram illustrates our Spark cluster topology, which includes: One Spark master node: This runs the Spark Master and the HDFS NameNode Three Spark slave nodes: These run a Spark Slave and an HDFS DataNode each Preparing the Spark image The following link provides several Sahara images available for download for different plugins: http://sahara-files.mirantis.com/images/upstream/liberty. Note that the upstream Sahara image files are destined for the OpenStack Liberty release. From Horizon, click on Compute and select Images, click on Create Image and add the new image, as shown here: We will need to upload the downloaded image to Glance so that it can be registered in the Sahara image registry catalog. Make sure that the new image is active. Click on the Data Processing tab and select Image Registry. Click on Register Image to register the new uploaded Glance image to Sahara, as shown here: Click on Done and the new Spark image is ready to start launching the Spark cluster. Creating the Spark master group template Node group templates in Sahara facilitate the configuration of a set of instances that have same properties, such as RAM and CPU. We will start by creating the first node group template for the Spark master. From the Data Processing tab, select Node Group Templates and click on Create Template. Our first node group template will be based on Apache Spark with Version 1.3.1, as shown here: The next wizard will guide to specifying the name of the template, the instance flavor, the storage location, and which floating IP pool will be assigned to the cluster instance: The next tab in same wizard will guide you to selecting which kind of process the nodes in the cluster will run. In our case, the Spark master node group template will include Spark master and HDFS namenode processes, as shown here: The next tab in the wizard exposes more choices regarding the security groups that will be applied for the template cluster nodes: Auto security group: This will automatically create a set of security groups that will be directly applied to the instances of the node group template Default security group: Any existing security groups in the OpenStack environment configured as default will be applied the instances of the node group template The last tab in the wizard exposes more specific HDFS configuration that depend on the available resources of the cluster, such as disk space, CPU and memory: dfs.datanode.handler.count: How many server threads there are for the datanode dfs.datanode.du.reserved: How much of the available disk space will not be taken into account for HDFS use dfs.namenode.handler.count: How many server threads there are for the namenode dfs.datanode.failed.volumes.tolerated: How many volumes are allowed to fail before a datanode instance stops dfs.datanode.max.xcievers: What is the maximum number of threads to be used in order to transfer data to/from the DataNode instance. 
Name Node Heap Size: How much memory will be assigned to the heap size to handle workload per NameNode instance Data Node Heap Size: How much memory will be assigned to the heap size to handle workload per DataNode instance Creating the Spark slave group template Creating the Spark slave group template will be performed in the same way as the Spark master group template except the assignment of the node processes. The Spark slave nodes will be running Spark slave and HDFS datanode processes, as shown here: Security groups and HDFS parameters can be configured the same as the Spark master node group template. Creating the Spark cluster template Now that we have defined the basic templates for the Spark cluster, we will need to compile both entities into one cluster template. In the Sahara dashboard, select Cluster Templates and click on Create Template. Select Apache Spark as the Plugin name, with version 1.3.1, as follows: Give the cluster template a name and small description. It is also possible to mention which process in the Spark cluster will run in a different compute node for high-availability purposes. This is only valid when you have more than one compute node in the OpenStack environment. The next tab in the same wizard allows you to add the necessary number of Spark instances based on the node group templates created previously. In our case, we will use one master Spark instance and three slave Spark instances, as shown here: The next tab, General Parameters, provides more advanced cluster configuration, including the following: Timeout for disk preparing: The cluster will fail when the duration of formatting and mounting the disk per node exceeds the timeout value. Enable NTP service: This option will enable all the instances of the cluster to synchronized time. An NTP file can be found under /tmp when cluster nodes are active. URL of NTP server: If mentioned, the Spark cluster will use the URL of the NTP server for time synchronization. Heat Wait Condition timeout: Heat will throw an error message to Sahara and the cluster will fail when a node is not able to boot up after a certain amount of time. This will prevent Sahara spawning instances indefinitely. Enable XFS: Allows XFS disk formatting. Decommissioning Timeout: This will throw an error when scaling data nodes in the Spark cluster takes more than the time mentioned. Enable Swift: Allows using Swift object storage to pull and push data during job execution. The Spark Parameters tab allows you to specify the following: Master webui port: Which port will access the Spark master web user interface. Work webui port: Which port will access the Spark slave web user interface. Worker memory: How much memory will be reserved for Spark applications. By default, if all is selected, Spark will use all the available RAM is the instance minus 1 GB. Spark will not run properly when using a flavor having RAM less than 1 GB. Launching the Spark cluster Based on the cluster template, the last step will require you to only push the button Launch Cluster from the Clusters tab in the Sahara dashboard. You will need only to select the plugin name, Apache Spark, with version 1.3.1. Next, you will need to name the new cluster, select the right cluster template created previously, and the base image registered in Sahara. Additionally, if you intend to access the cluster instances via SSH, select an existing SSH keypair. 
It is also possible to select from which network segment you will be able to manage the cluster instances; in our case, an existing private network, Private_Net10, will be used for this purpose. Launch the cluster; it will take a while to finish spawning the four instances forming the Spark cluster. The Spark cluster instances can be listed in the Compute Instances tab, as shown here:

Summary

In this article, we created a Spark cluster using Sahara in OpenStack by means of the Apache Spark plugin. The provisioned cluster includes one Spark master node and three Spark slave nodes. When the cluster status changes to the active state, it is possible to start executing jobs. Resources for Article: Further resources on this subject: Introducing OpenStack Trove [article] OpenStack Performance, Availability [article] Monitoring Physical Network Bandwidth Using OpenStack Ceilometer [article]


Why Mesos?

Packt
31 Mar 2016
8 min read
In this article by Dipa Dubhasi and Akhil Das authors of the book Mastering Mesos, delves into understanding the importance of Mesos. Apache Mesos is an open source, distributed cluster management software that came out of AMPLab, UC Berkeley in 2011. It abstracts CPU, memory, storage, and other computer resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively. It is referred to as a metascheduler (scheduler of schedulers) and a "distributed systems kernel/distributed datacenter OS". It improves resource utilization, simplifies system administration, and supports a wide variety of distributed applications that can be deployed by leveraging its pluggable architecture. It is scalable and efficient and provides a host of features, such as resource isolation and high availability, which, along with a strong and vibrant open source community, makes this one of the most exciting projects. (For more resources related to this topic, see here.) Introduction to the datacenter OS and architecture of Mesos Over the past decade, datacenters have graduated from packing multiple applications into a single server box to having large datacenters that aggregate thousands of servers to serve as a massively distributed computing infrastructure. With the advent of virtualization, microservices, cluster computing, and hyper-scale infrastructure, the need of the hour is the creation of an application-centric enterprise that follows a software-defined datacenter strategy. Currently, server clusters are predominantly managed individually, which can be likened to having multiple operating systems on the PC, one each for processor, disk drive, and so on. With an abstraction model that treats these machines as individual entities being managed in isolation, the ability of the datacenter to effectively build and run distributed applications is greatly reduced. Another way of looking at the situation is comparing running applications in a datacenter to running them on a laptop. One major difference is that while launching a text editor or web browser, we are not required to check which memory modules are free and choose ones that suit our need. Herein lies the significance of a platform that acts like a host operating system and allows multiple users to run multiple applications simultaneously by utilizing a shared set of resources. Datacenters now run varied distributed application workloads, such as Spark, Hadoop, and so on, and need the capability to intelligently match resources and applications. The datacenter ecosystem today has to be equipped to manage and monitor resources and efficiently distribute workloads across a unified pool of resources with the agility and ease to cater to a diverse user base (noninfrastructure teams included). A datacenter OS brings to the table a comprehensive and sustainable approach to resource management and monitoring. This not only reduces the cost of ownership but also allows a flexible handling of resource requirements in a manner that isolated datacenter infrastructure cannot support. The idea behind a datacenter OS is that of an intelligent software that sits above all the hardware in a datacenter and ensures efficient and dynamic resource sharing. Added to this is the capability to constantly monitor resource usage and improve workload and infrastructure management in a seamless way that is not tied to specific application requirements. 
In its absence, we have a scenario with silos in a datacenter that force developers to build software catering to machine-specific characteristics and make the moving and resizing of applications a highly cumbersome procedure. The datacenter OS acts as a software layer that aggregates all servers in a datacenter into one giant supercomputer to deliver the benefits of multitenancy, isolation, and resource control across all microservice applications. Another major advantage is the elimination of human-induced error during the continual assigning and reassigning of virtual resources. From a developer's perspective, this will allow them to easily and safely build distributed applications without restricting them to a bunch of specialized tools, each catering to a specific set of requirements. For instance, consider the case of data science teams who develop analytic applications that are highly resource intensive. An operating system that can simplify how the resources are accessed, shared, and distributed successfully alleviates their concern about reallocating hardware every time the workloads change. Of key importance is the relevance of the datacenter OS to DevOps, primarily a software development approach that emphasizes automation, integration, collaboration, and communication between traditional software developers and other IT professionals. With a datacenter OS that effectively transforms individual servers into a pool of resources, DevOps teams can focus on accelerating development and not continuously worry about infrastructure issues. In a world where distributed computing is becoming the norm, the datacenter OS is a boon. With freedom from manually configuring and maintaining individual machines and applications, system engineers need not configure specific machines for specific applications, as all applications would be capable of running on any available resources from any machine, even if there are other applications already running on them. Using a datacenter OS results in centralized control and smart utilization of resources that eliminate hardware and software silos to ensure greater accessibility and usability, even for noninfrastructure professionals. A prominent example of an organization administering its hyperscale datacenters via a datacenter OS is Google, with its Borg (and next-generation Omega) systems. The merits of the datacenter OS are undeniable, with benefits ranging from the scalability of computing resources and the flexibility to support data sharing across applications to saving team effort, time, and money while launching and managing interoperable cluster applications. It is this vision of transforming the datacenter into a single supercomputer that Apache Mesos seeks to achieve. Born out of a Berkeley AMPLab research paper in 2011, it has since come a long way, with a number of leading companies, such as Apple, Twitter, Netflix, and Airbnb among others, using it in production. Mesosphere is a start-up that is developing a distributed OS product with Mesos at its core.

The architecture of Mesos

Mesos is an open-source platform for sharing clusters of commodity servers between different distributed applications (or frameworks), such as Hadoop, Spark, and Kafka among others. The idea is to act as a centralized cluster manager by pooling together all the physical resources of the cluster and making them available as a single reservoir of highly available resources for all the different frameworks to utilize. For example, if an organization has one 10-node cluster (16 CPUs and 64 GB RAM per node) and another 5-node cluster (4 CPUs and 16 GB RAM per node), then Mesos can be leveraged to pool them into one virtual cluster of 720 GB RAM and 180 CPUs, where multiple distributed applications can be run.
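To make the arithmetic behind this pooling concrete, the following few lines of Python reproduce the aggregate capacity quoted above. This is purely an illustration of the calculation, using the cluster sizes from the example; it is not Mesos code:

# Illustrative only: total capacity of two heterogeneous clusters pooled into one reservoir
clusters = [
    {"nodes": 10, "cpus_per_node": 16, "ram_gb_per_node": 64},
    {"nodes": 5, "cpus_per_node": 4, "ram_gb_per_node": 16},
]

total_cpus = sum(c["nodes"] * c["cpus_per_node"] for c in clusters)
total_ram_gb = sum(c["nodes"] * c["ram_gb_per_node"] for c in clusters)

print(f"Pooled capacity: {total_cpus} CPUs, {total_ram_gb} GB RAM")
# Pooled capacity: 180 CPUs, 720 GB RAM

It is this pooled view, rather than the individual machines, that Mesos presents to the frameworks running on top of it.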
Sharing resources in this fashion greatly improves cluster utilization and eliminates the need for an expensive data replication process per framework. Some of the important features of Mesos are:

Scalability: It can elastically scale to over 50,000 nodes
Resource isolation: This is achieved through Linux/Docker containers
Efficiency: This is achieved through CPU- and memory-aware resource scheduling across multiple frameworks
High availability: This is achieved through Apache ZooKeeper
Interface: A web UI for monitoring the cluster state

Mesos is based on the same principles as the Linux kernel and aims to provide a highly available, scalable, and fault-tolerant base for enabling various frameworks to share cluster resources effectively and in isolation. Distributed applications are varied and continuously evolving, a fact that leads Mesos' design philosophy towards a thin interface that allows an efficient resource allocation between different frameworks and delegates the task of scheduling and job execution to the frameworks themselves. The two advantages of doing so are:

Different frameworks can independently devise methods to address their data locality, fault-tolerance, and other such needs
It simplifies the Mesos codebase and allows it to remain scalable, flexible, robust, and agile

Mesos' architecture hands over the responsibility of scheduling tasks to the respective frameworks by employing a resource offer abstraction that packages a set of resources and makes offers to each framework. The Mesos master node decides the quantity of resources to offer each framework, while each framework decides which resource offers to accept and which tasks to execute on these accepted resources. This method of resource allocation is shown to achieve a good degree of data locality for each framework sharing the same cluster. An alternative architecture would implement a global scheduler that takes framework requirements, organizational priorities, and resource availability as inputs and provides a task schedule breakdown by framework and resource as output, essentially acting as a matchmaker for jobs and resources, with priorities acting as constraints. The challenges with this architecture, such as developing a robust API that could capture all the varied requirements of different frameworks, anticipating new frameworks, and solving a complex scheduling problem for millions of jobs, made the former approach a much more attractive option for the creators.

Summary

In this article, we introduced Mesos and then dived deep into its architecture to understand its importance.

Resources for Article:

Further resources on this subject:
Understanding Mesos Internals [article]
Leveraging Python in the World of Big Data [article]
Self-service Business Intelligence, Creating Value from Data [article]

ALM – Developers and QA

Packt
30 Mar 2016
15 min read
This article by Can Bilgin, the author of Mastering Cross-Platform Development with Xamarin, provides an introduction to Application Lifecycle Management (ALM) and continuous integration methodologies on Xamarin cross-platform applications. As the part of the ALM process that is most relevant for developers, unit test strategies will be discussed and demonstrated, as well as automated UI testing. This article is divided into the following sections: Development pipeline Troubleshooting Unit testing UI testing (For more resources related to this topic, see here.) Development pipeline The development pipeline can be described as the virtual production line that steers a project from a mere bundle of business requirements to the consumers. Stakeholders that are part of this pipeline include, but are not limited to, business proxies, developers, the QA team, the release and configuration team, and finally the consumers themselves. Each stakeholder in this production line assumes different responsibilities, and they should all function in harmony. Hence, having an efficient, healthy, and preferably automated pipeline that is going to provide the communication and transfer of deliverables between units is vital for the success of a project. In the Agile project management framework, the development pipeline is cyclical rather than a linear delivery queue. In the application life cycle, requirements are inserted continuously into a backlog. The backlog leads to a planning and development phase, which is followed by testing and QA. Once the production-ready application is released, consumers can be made part of this cycle using live application telemetry instrumentation. Figure 1: Application life cycle management In Xamarin cross-platform application projects, development teams are blessed with various tools and frameworks that can ease the execution of ALM strategies. From sketching and mock-up tools available for early prototyping and design to source control and project management tools that make up the backbone of ALM, Xamarin projects can utilize various tools to automate and systematically analyze project timeline. The following sections of this article concentrate mainly on the lines of defense that protect the health and stability of a Xamarin cross-platform project in the timeline between the assignment of tasks to developers to the point at which the task or bug is completed/resolved and checked into a source control repository. Troubleshooting and diagnostics SDKs associated with Xamarin target platforms and development IDEs are equipped with comprehensive analytic tools. Utilizing these tools, developers can identify issues causing app freezes, crashes, slow response time, and other resource-related problems (for example, excessive battery usage). Xamarin.iOS applications are analyzed using the XCode Instruments toolset. In this toolset, there are a number of profiling templates, each used to analyze a certain perspective of application execution. Instrument templates can be executed on an application running on the iOS simulator or on an actual device. Figure 2: XCode Instruments Similarly, Android applications can be analyzed using the device monitor provided by the Android SDK. Using Android Monitor, memory profile, CPU/GPU utilization, and network usage can also be analyzed, and application-provided diagnostic information can be gathered. Android Debug Bridge (ADB) is a command-line tool that allows various manual or automated device-related operations. 
For Windows Phone applications, Visual Studio provides a number of analysis tools for profiling CPU usage, energy consumption, memory usage, and XAML UI responsiveness. XAML diagnostic sessions in particular can provide valuable information on problematic sections of view implementation and pinpoint possible visual and performance issues: Figure 3: Visual Studio XAML analyses Finally, Xamarin Profiler, as a maturing application (currently in preview release), can help analyze memory allocations and execution time. Xamarin Profiler can be used with iOS and Android applications. Unit testing The test-driven development (TDD) pattern dictates that the business requirements and the granular use-cases defined by these requirements should be initially reflected on unit test fixtures. This allows a mobile application to grow/evolve within the defined borders of these assertive unit test models. Whether following a TDD strategy or implementing tests to ensure the stability of the development pipeline, unit tests are fundamental components of a development project. Figure 4: Unit test project templates Xamarin Studio and Visual Studio both provide a number of test project templates targeting different areas of a cross-platform project. In Xamarin cross-platform projects, unit tests can be categorized into two groups: platform-agnostic and platform-specific testing. Platform-agnostic unit tests Platform-agnostic components, such as portable class libraries containing shared logic for Xamarin applications, can be tested using the common unit test projects targeting the .NET framework. Visual Studio Test Tools or the NUnit test framework can be used according to the development environment of choice. It is also important to note that shared projects used to create shared logic containers for Xamarin projects cannot be tested with .NET unit test fixtures. For shared projects and the referencing platform-specific projects, platform-specific unit test fixtures should be prepared. When following an MVVM pattern, view models are the focus of unit test fixtures since, as previously explained, view models can be perceived as a finite state machine where the bindable properties are used to create a certain state on which the commands are executed, simulating a specific use-case to be tested. This approach is the most convenient way to test the UI behavior of a Xamarin application without having to implement and configure automated UI tests. While implementing unit tests for such projects, a mocking framework is generally used to replace the platform-dependent sections of the business logic. Loosely coupling these dependent components makes it easier for developers to inject mocked interface implementations and increases the testability of these modules. The most popular mocking frameworks for unit testing are Moq and RhinoMocks. Both Moq and RhinoMocks utilize reflection and, more specifically, the Reflection.Emit namespace, which is used to generate types, methods, events, and other artifacts in the runtime. Aforementioned iOS restrictions on code generation make these libraries inapplicable for platform-specific testing, but they can still be included in unit test fixtures targeting the .NET framework. For platform-specific implementation, the True Fakes library provides compile time code generation and mocking features. 
Depending on the implementation specifics (such as namespaces used, network communication, multithreading, and so on), in some scenarios it is imperative to test the common logic implementation on specific platforms as well. For instance, some multithreading and parallel task implementations give different results on Windows Runtime, Xamarin.Android, and Xamarin.iOS. These variations generally occur because of the underlying platform's mechanism or slight differences between the .NET and Mono implementation logic. In order to ensure the integrity of these components, common unit test fixtures can be added as linked/referenced files to platform-specific test projects and executed on the test harness. Platform-specific unit tests In a Xamarin project, platform-dependent features cannot be unit tested using the conventional unit test runners available in Visual Studio Test Suite and NUnit frameworks. Platform-dependent tests are executed on empty platform-specific projects that serve as a harness for unit tests for that specific platform. Windows Runtime application projects can be tested using the Visual Studio Test Suite. However, for Android and iOS, the NUnit testing framework should be used, since Visual Studio Test Tools are not available for the Xamarin.Android and Xamarin.iOS platforms.                              Figure 5: Test harnesses The unit test runner for Windows Phone (Silverlight) and Windows Phone 8.1 applications uses a test harness integrated with the Visual Studio test explorer. The unit tests can be executed and debugged from within Visual Studio. Xamarin.Android and Xamarin.iOS test project templates use NUnitLite implementation for the respective platforms. In order to run these tests, the test application should be deployed on the simulator (or the testing device) and the application has to be manually executed. It is possible to automate the unit tests on Android and iOS platforms through instrumentation. In each Xamarin target platform, the initial application lifetime event is used to add the necessary unit tests: [Activity(Label = "Xamarin.Master.Fibonacci.Android.Tests", MainLauncher = true, Icon = "@drawable/icon")] public class MainActivity : TestSuiteActivity { protected override void OnCreate(Bundle bundle) { // tests can be inside the main assembly //AddTest(Assembly.GetExecutingAssembly()); // or in any reference assemblies AddTest(typeof(Fibonacci.Android.Tests.TestsSample).Assembly); // Once you called base.OnCreate(), you cannot add more assemblies. base.OnCreate(bundle); } } In the Xamarin.Android implementation, the MainActivity class derives from the TestSuiteActivity, which implements the necessary infrastructure to run the unit tests and the UI elements to visualize the test results. On the Xamarin.iOS platform, the test application uses the default UIApplicationDelegate, and generally, the FinishedLaunching event delegate is used to create the ViewController for the unit test run fixture: public override bool FinishedLaunching(UIApplication application, NSDictionary launchOptions) { // Override point for customization after application launch. 
// If not required for your application you can safely delete this method var window = new UIWindow(UIScreen.MainScreen.Bounds); var touchRunner = new TouchRunner(window); touchRunner.Add(System.Reflection.Assembly.GetExecutingAssembly()); window.RootViewController = new UINavigationController(touchRunner.GetViewController()); window.MakeKeyAndVisible(); return true; } The main shortcoming of executing unit tests this way is the fact that it is not easy to generate a code coverage report and archive the test results. Neither of these testing methods provide the ability to test the UI layer. They are simply used to test platform-dependent implementations. In order to test the interactive layer, platform-specific or cross-platform (Xamarin.Forms) coded UI tests need to be implemented. UI testing In general terms, the code coverage of the unit tests directly correlates with the amount of shared code which amounts to, at the very least, 70-80 percent of the code base in a mundane Xamarin project. One of the main driving factors of architectural patterns was to decrease the amount of logic and code in the view layer so that the testability of the project utilizing conventional unit tests reaches a satisfactory level. Coded UI (or automated UI acceptance) tests are used to test the uppermost layer of the cross-platform solution: the views. Xamarin.UITests and Xamarin Test Cloud The main UI testing framework used for Xamarin projects is the Xamarin.UITests testing framework. This testing component can be used on various platform-specific projects, varying from native mobile applications to Xamarin.Forms implementations, except for the Windows Phone platform and applications. Xamarin.UITests is an implementation based on the Calabash framework, which is an automated UI acceptance testing framework targeting mobile applications. Xamarin.UITests is introduced to the Xamarin.iOS or Xamarin.Android applications using the publicly available NuGet packages. The included framework components are used to provide an entry point to the native applications. The entry point is the Xamarin Test Cloud Agent, which is embedded into the native application during the compilation. The cloud agent is similar to a local server that allows either the Xamarin Test Cloud or the test runner to communicate with the app infrastructure and simulate user interaction with the application. Xamarin Test Cloud is a subscription-based service allowing Xamarin applications to be tested on real mobile devices using UI tests implemented via Xamarin.UITests. Xamarin Test Cloud not only provides a powerful testing infrastructure for Xamarin.iOS and Xamarin.Android applications with an abundant amount of mobile devices but can also be integrated into Continuous Integration workflows. After installing the appropriate NuGet package, the UI tests can be initialized for a specific application on a specific device. In order to initialize the interaction adapter for the application, the app package and the device should be configured. 
On Android, the APK package path and the device serial can be used for the initialization:

IApp app = ConfigureApp.Android.ApkFile("<APK Path>/MyApplication.apk")
    .DeviceSerial("<DeviceID>")
    .StartApp();

For an iOS application, the procedure is similar:

IApp app = ConfigureApp.iOS.AppBundle("<App Bundle Path>/MyApplication.app")
    .DeviceIdentifier("<DeviceID of Simulator>")
    .StartApp();

Once the App handle has been created, each test written using NUnit should first create the pre-conditions for the tests, simulate the interaction, and finally test the outcome. The IApp interface provides a set of methods to select elements on the visual tree and simulate certain interactions, such as text entry and tapping. On top of the main testing functionality, screenshots can be taken to document test steps and possible bugs. Both Visual Studio and Xamarin Studio provide project templates for Xamarin.UITests.

Xamarin Test Recorder

Xamarin Test Recorder is an application that can ease the creation of automated UI tests. It is currently in its preview version and is only available for the Mac OS platform.

Figure 6: Xamarin Test Recorder

Using this application, developers can select the application in need of testing and the device/simulator that is going to run the application. Once the recording session starts, each interaction on the screen is recorded as execution steps on a separate screen, and these steps can be used to generate the preparation or testing steps for the Xamarin.UITests implementation.

Coded UI tests (Windows Phone)

Coded UI tests are used for automated UI testing on the Windows Phone platform. Coded UI tests for Windows Phone and Windows Store applications are not any different from their counterparts for other .NET platforms such as Windows Forms, WPF, or ASP.NET. It is also important to note that only XAML applications support Coded UI tests. Coded UI tests are generated on a simulator and written on an Automation ID premise. The Automation ID property is an automatically generated or manually configured identifier for Windows Phone applications (only in XAML) and the UI controls used in the application. Coded UI tests depend on the UIMap created for each control on a specific screen using the Automation IDs. While creating the UIMap, a crosshair tool can be used to select the application and the controls on the simulator screen to define the interactive elements:

Figure 7: Generating coded UI accessors and tests

Once the UIMap has been created and the designer files have been generated, gestures and the generated XAML accessors can be used to create testing pre-conditions and assertions. For Coded UI tests, multiple scenario-specific input values can be used and tested on a single assertion. Using the DataRow attribute, unit tests can be expanded to test multiple data-driven scenarios. The code snippet below uses multiple input values to test different incorrect input values (note that the method signature declares a parameter for each value supplied by the DataRow attributes):

[DataRow(0, "Zero Value")]
[DataRow(-2, "Negative Value")]
[TestMethod]
public void FibonacciCalculateTest_IncorrectOrdinal(int ordinalInput, string description)
{
    // TODO: Check if bad values are handled correctly
}

Automated tests can run on available simulators and/or a real device. They can also be included in CI build workflows and made part of the automated development pipeline.

Calabash

Calabash is an automated UI acceptance testing framework used to execute Cucumber tests. Cucumber tests provide an assertion strategy similar to coded UI tests, only broader and behavior oriented.
The Cucumber test framework supports tests written in the Gherkin language (a human-readable programming grammar description for behavior definitions). Calabash makes up the necessary infrastructure to execute these tests on various platforms and application runtimes. A simple declaration of the feature and the scenario that is previously tested on Coded UI using the data-driven model would look similar to the excerpt below. Only two of the possible test scenarios are declared in this feature for demonstration; the feature can be extended: Feature: Calculate Single Fibonacci number. Ordinal entry should greater than 0. Scenario: Ordinal is lower than 0. Given I use the native keyboard to enter "-2" into text field Ordinal And I touch the "Calculate" button Then I see the text "Ordinal cannot be a negative number." Scenario: Ordinal is 0. Given I use the native keyboard to enter "0" into text field Ordinal And I touch the "Calculate" button Then I see the text "Cannot calculate the number for the 0th ordinal." Calabash test execution is possible on Xamarin target platforms since the Ruby API exposed by the Calabash framework has a bidirectional communication line with the Xamarin Test Cloud Agent embedded in Xamarin applications with NuGet packages. Calabash/Cucumber tests can be executed on Xamarin Test Cloud on real devices since the communication between the application runtime and Calabash framework is maintained by Xamarin Test Cloud Agent, the same as Xamarin.UI tests. Summary Xamarin projects can benefit from a properly established development pipeline and the use of ALM principles. This type of approach makes it easier for teams to share responsibilities and work out business requirements in an iterative manner. In the ALM timeline, the development phase is the main domain in which most of the concrete implementation takes place. In order for the development team to provide quality code that can survive the ALM cycle, it is highly advised to analyze and test native applications using the available tooling in Xamarin development IDEs. While the common codebase for a target platform in a Xamarin project can be treated and tested as a .NET implementation using the conventional unit tests, platform-specific implementations require more particular handling. Platform-specific parts of the application need to be tested on empty shell applications, called test harnesses, on the respective platform simulators or devices. To test views, available frameworks such as Coded UI tests (for Windows Phone) and Xamarin.UITests (for Xamarin.Android and Xamarin.iOS) can be utilized to increase the test code coverage and create a stable foundation for the delivery pipeline. Most tests and analysis tools discussed in this article can be integrated into automated continuous integration processes. Resources for Article:   Further resources on this subject: A cross-platform solution with Xamarin.Forms and MVVM architecture [article] Working with Xamarin.Android [article] Application Development Workflow [article]

Golang Decorators: Logging & Time Profiling

Nicholas Maccharoli
30 Mar 2016
6 min read
Golang's imperative world

Golang is not, by any means, a functional language; its design remains true to its jingle, which says that it is "C for the 21st Century". One task I tried to do early on in learning the language was to search for the map, filter, and reduce functions in the standard library, but to no avail. Next, I tried rolling my own versions, but I felt as though I hit a bit of a road block when I discovered that there is no support for generics in the language at the time of writing this. There is, however, support for Higher Order Functions or, more simply put, functions that take other functions as arguments and return functions. If you have spent some time in Python, you may have come to love a design pattern called "Decorator". In fact, decorators make life in Python so great that support for applying them is built right into the language with a nifty @ operator! Python frameworks such as Flask extensively use decorators. If you have little or no experience in Python, fear not, for the concept is a design pattern independent of any language.

Decorators

An alternative name for the decorator pattern is "wrapper", which pretty much sums it all up in one word! A decorator's job is only to wrap a function so that additional code can be executed when the original function is called. This is accomplished by writing a function that takes a function as its argument and returns a function of the same type (Higher Order Functions in action!). While this still calls the original function and passes through its return value, it does something extra along the way.

Decorators for logging

We can easily log which specific method is passed with a little help from our decorator friends. Say, we wanted to log which user liked a blog post and what the ID of the post was, all without touching any code in the original likePost function. Here is our original function:

func likePost(userId int, postId int) bool {
    fmt.Printf("Update Complete!\n")
    return true
}

Our decorator might look something similar to this:

type LikeFunc func(int, int) bool

func decoratedLike(f LikeFunc) LikeFunc {
    return func(userId int, postId int) bool {
        fmt.Printf("likePost Log: User %v liked post# %v\n", userId, postId)
        return f(userId, postId)
    }
}

Note the use of the type definition here. I encourage you to use it for the sake of readability when defining functions with long signatures, such as those of decorators, as you need to type the function signature twice. Now, we can apply the decorator and allow the logging to begin:

r := decoratedLike(likePost)
r(1414, 324)
r(5454, 324)
r(4322, 250)

This produces the following output:

likePost Log: User 1414 liked post# 324
Update Complete!
likePost Log: User 5454 liked post# 324
Update Complete!
likePost Log: User 4322 liked post# 250
Update Complete!

Our original likePost function still gets called and runs as expected, but now we get an additional log detailing the user and post IDs that were passed to the function each time it was called. Hopefully, this will help speed up debugging our likePost function if and when we encounter strange behavior!
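As an aside for readers coming from Python (mentioned at the start of this post), the same logging wrapper takes only a few lines there, and the @ syntax applies it for us. The names below (log_like, like_post) are illustrative and are not part of this post's Go code:

import functools

def log_like(func):
    @functools.wraps(func)          # keep the wrapped function's name and docstring
    def wrapper(user_id, post_id):
        print(f"likePost Log: User {user_id} liked post# {post_id}")
        return func(user_id, post_id)
    return wrapper

@log_like                           # Python applies the decorator at definition time
def like_post(user_id, post_id):
    print("Update Complete!")
    return True

like_post(1414, 324)

The behavior mirrors the Go version above: each call is logged first and then forwarded to the original function.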
Decorators for performance!

Say, we run a "Top 10" site and, previously, our main sorting routine to find the top 10 cat photos of this week on the Internet was written with Golang's func Sort(data Interface) function from the sort package of the Golang standard library. Everything is fine until we are informed that Fluffy the cat is infuriated that she is coming in at number six on the list and not number five. The cats at ranks five and six on the list both had 5000 likes each, but Fluffy reached 5000 likes a day earlier than Bozo the cat, who is currently higher ranked. We like to give credit where it's due, so we apologize to Fluffy and go on to use the stable version of the sort, func Stable(data Interface), which preserves the order of elements equal in value during the sort. We can improve our code and tests so that this does not happen again (we promised Fluffy!). The tests pass, everything looks great, and we deploy gracefully... or so we think. Over the course of the day, other developers also deploy their changes, and then, after checking our performance reports, we notice a slowdown somewhere. Is it from our switch to the stable sort? Well, let's use decorators to measure the performance of both sort functions and check whether there is a noticeable dip in performance. Here's our testing function:

type SortFunc func(sort.Interface)

func timedSortFunc(f SortFunc) SortFunc {
    return func(data sort.Interface) {
        defer func(t time.Time) {
            fmt.Printf("--- Time Elapsed: %v ---\n", time.Since(t))
        }(time.Now())
        f(data)
    }
}

In case you are unfamiliar with defer, all it does is call the function it is passed right after its calling function returns. The arguments passed to defer are evaluated right away, so the value we get from time.Now() is really the start time of the function! Let's go ahead and give this test a go:

unStable := timedSortFunc(sort.Sort)
stable := timedSortFunc(sort.Stable)

// 10000 Elements with values ranging
// between 0 and 5000
randomCatList1 := randomCatScoreSlice(10000, 5000)
randomCatList2 := randomCatScoreSlice(10000, 5000)

fmt.Printf("Unstable Sorting Function:\n")
unStable(randomCatList1)
fmt.Printf("Stable Sorting Function:\n")
stable(randomCatList2)

The following output is yielded:

Unstable Sorting Function:
--- Time Elapsed: 282.889µs ---
Stable Sorting Function:
--- Time Elapsed: 93.947µs ---

Wow! Fluffy's complaint not only made our top 10 list more accurate, but now the list sorts about three times as fast with the stable version of sort as well! (However, we still need to be careful; sort.Stable most likely uses way more memory than the standard sort.Sort function.)

Final thoughts

Figuring out when and where to apply the decorator pattern is really up to you and your team. There are no hard rules, and you can completely live without it. However, when it comes to things like extra logging or profiling a pesky area of your code, this technique may prove to be a valuable tool.

Where is the rest of the code?

In order to get this example up and running, there is some setup code that was not shown here in order to keep the post from becoming too bloated. I encourage you to take a look at this code here if you are interested!

About the author

Nick Maccharoli is an iOS/backend developer and open source enthusiast working at a start-up in Tokyo and enjoying the current development scene. You can see what he is up to at @din0sr or github.com/nirma.

Building a Product Recommendation System

Packt
29 Mar 2016
25 min read
In this article by Raghav Bali and Dipanjan Sarkar, the authors of the book R Machine Learning By Example, we will discuss collaborative filtering, a simple yet very effective approach for predicting and recommending items to users. If we look closely, the algorithms work on input data, which is nothing but a matrix representation of the user ratings for different products. Bringing a mathematical perspective into the picture, matrix factorization is a technique to manipulate matrices and identify latent or hidden features from the data represented in the matrix. Building upon the same concept, let us use matrix factorization as the basis for predicting ratings for items which the user has not yet rated. (For more resources related to this topic, see here.)

Matrix factorization

Matrix factorization refers to the identification of two or more matrices such that when these matrices are multiplied we get the original matrix. Matrix factorization, as mentioned earlier, can be used to discover latent features between two different kinds of entities. We will understand and use the concepts of matrix factorization as we go along preparing our recommender engine for our e-commerce platform. As our aim for the current project is to personalize the shopping experience and recommend product ratings for an e-commerce platform, our input data contains user ratings for various products on the website. We process the input data and transform it into a matrix representation for analyzing it using matrix factorization. The input data looks like this:

User ratings matrix

As you can see, the input data is a matrix with each row representing a particular user's ratings for the different items represented in the columns. For the current case, the columns representing items are different mobile phones such as iPhone 4, iPhone 5s, Nexus 5, and so on. Each row contains ratings for each of these mobile phones as given by eight different users. The ratings range from 1 to 5, with 1 being the lowest and 5 being the highest. A rating of 0 represents unrated items or a missing rating. The task of our recommender engine will be to predict the correct rating for the missing ones in the input matrix. We could then use the predicted ratings to recommend items most desired by the user. The premise here is that two users would rate a product similarly if they like similar features of the product or item. Since our current data is related to user ratings for different mobile phones, people might rate the phones based on their hardware configuration, price, OS, and so on. Hence, matrix factorization tries to identify these latent features to predict ratings for a certain user and a certain product. While trying to identify these latent features, we proceed with the basic assumption that the number of such features is less than the total number of items in consideration. This assumption makes sense because, if this were not the case, then each user would have a specific feature associated with him/her (and similarly for each product). This would in turn make recommendations futile, as none of the users would be interested in items rated by the other users (which is not the case usually). Now let us get into the mathematical details of matrix factorization and our recommender engine. Since we are dealing with user ratings for different products, let us assume U to be a matrix representing user preferences and, similarly, a matrix P representing the products for which we have the ratings.
Then the ratings matrix R will be defined as R = U x P^T (we take the transpose of P, written P^T, for matrix multiplication), where |R| = |U| x |P|. Assuming the process helps us identify K latent features, our aim is to find two matrices X and Y such that their product (matrix multiplication) approximates R:

X = |U| x K matrix
Y = |P| x K matrix

Here, X is a user-related matrix which represents the associations between the users and the latent features. Y, on the other hand, is the product-related matrix which represents the associations between the products and the latent features. The task of predicting the rating of a product pj by a user ui is done by calculating the dot product of the vectors corresponding to ui (the row of X for that user) and pj (the row of Y for that product). Now, to find the matrices X and Y, we utilize a technique called gradient descent. Gradient descent, in simple terms, tries to find the local minimum of a function; it is an optimization technique. We use gradient descent in the current context to iteratively minimize the difference between the predicted ratings and the actual ratings. To begin with, we randomly initialize the matrices X and Y and then calculate how different their product is from the actual ratings matrix R. The difference between the predicted and the actual values is what is termed as the error. For our problem, we will consider the squared error, which is calculated as:

eij^2 = (rij - r̂ij)^2 = (rij - ∑(k=1 to K) xikykj)^2

Here, rij is the actual rating by user i for product j and r̂ij = ∑(k=1 to K) xikykj is the predicted value of the same. To minimize the error, we need to find the correct direction or gradient to change our values to. To obtain the gradient for each of the variables x and y, we differentiate them separately as:

∂eij^2/∂xik = -2 eij ykj
∂eij^2/∂ykj = -2 eij xik

Hence, the equations to find xik and ykj can be given as:

x'ik = xik + α (2 eij ykj)
y'kj = ykj + α (2 eij xik)

Here α is the constant to denote the rate of descent or the rate of approaching the minima (also known as the learning rate). The value of α defines the size of the steps we take in either direction to reach the minima. Large values may lead to oscillations as we may overshoot the minima every time. Usual practice is to select very small values for α, of the order of 10^-4. x'ik and y'kj are the updated values of xik and ykj after each iteration of gradient descent. To avoid overfitting, along with controlling extreme or large values in the matrices X and Y, we introduce the concept of regularization. Formally, regularization refers to the process of introducing additional information in order to prevent overfitting. Regularization penalizes models with extreme values. To prevent overfitting in our case, we introduce the regularization constant called β. With the introduction of β, the equations are updated as follows:

x'ik = xik + α (2 eij ykj - β xik)

Also,

y'kj = ykj + α (2 eij xik - β ykj)

As we already have the ratings matrix R and we use it to determine how far our predicted values are from the actual ones, matrix factorization turns into a supervised learning problem. We use some of the rows as our training samples. Let S be our training set, with elements being tuples of the form (ui, pj, rij). Thus, our task is to minimize the error (eij) for every tuple (ui, pj, rij) ϵ training set S. The overall error (say E) can be calculated as:

E = ∑(ui, pj, rij) ϵS eij^2 = ∑(ui, pj, rij) ϵS (rij - ∑(k=1 to K) xikykj)^2

Implementation

Now that we have looked into the mathematics of matrix factorization, let us convert the algorithm into code and prepare a recommender engine for the mobile phone ratings input data set discussed earlier.
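Before walking through the book's R implementation in the next section, here is a compact NumPy sketch of the update rules derived above. It is only an illustration (the tiny ratings matrix and the function name are made up for this sketch, and 0 marks a missing rating); the R code that follows is what the rest of the article builds on:

import numpy as np

def factorize(R, K=2, alpha=0.0002, beta=0.02, epochs=5000):
    rows, cols = R.shape
    X = np.random.rand(rows, K)      # user-feature matrix
    Y = np.random.rand(K, cols)      # feature-product matrix (already transposed)
    for _ in range(epochs):
        for i in range(rows):
            for j in range(cols):
                if R[i, j] > 0:      # update only on observed ratings
                    eij = R[i, j] - X[i, :] @ Y[:, j]
                    X[i, :] += alpha * (2 * eij * Y[:, j] - beta * X[i, :])
                    Y[:, j] += alpha * (2 * eij * X[i, :] - beta * Y[:, j])
    return X, Y

# Made-up example: 3 users x 4 products, 0 = unrated
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5]], dtype=float)
X, Y = factorize(R)
print(np.round(X @ Y, 2))            # dense matrix of predicted ratings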
As shown in the Matrix factorization section, the input dataset is a matrix with each row representing a user's rating for the products mentioned as columns. The ratings range from 1 to 5, with 0 representing the missing values. To transform our algorithm into working code, we need to complete the following tasks:

Load the input data and transform it into a ratings matrix representation
Prepare a matrix factorization based recommendation model
Predict and recommend products to the users
Interpret and evaluate the model

Loading and transforming input data into a matrix representation is simple. As seen earlier, R provides us with easy-to-use utility functions for the same.

# load raw ratings from csv
raw_ratings <- read.csv("<file_name>")

# convert columnar data to sparse ratings matrix
ratings_matrix <- data.matrix(raw_ratings)

Now that we have our data loaded into an R matrix, we proceed to prepare the user-latent features matrix X and the item-latent features matrix Y. We initialize both from uniform distributions using the runif function.

# number of rows in ratings
rows <- nrow(ratings_matrix)

# number of columns in ratings matrix
columns <- ncol(ratings_matrix)

# latent features
K <- 2

# User-Feature Matrix
X <- matrix(runif(rows*K), nrow=rows, byrow=TRUE)

# Item-Feature Matrix
Y <- matrix(runif(columns*K), nrow=columns, byrow=TRUE)

The major component is the matrix factorization function itself. Let us split the task into two: calculation of the gradient and, subsequently, the overall error. The calculation of the gradient involves the ratings matrix R and the two factor matrices X and Y, along with the constants α and β. Since we are dealing with matrix manipulations (specifically, multiplication), we transpose Y before we begin with any further calculations. The following lines of code (these pieces together make up the mf_based_ucf() function that we call later) convert the algorithm discussed previously into R syntax. All variables follow a naming convention similar to the algorithm's for ease of understanding.

for (i in seq(nrow(ratings_matrix))){
    for (j in seq(length(ratings_matrix[i, ]))){
      if (ratings_matrix[i, j] > 0){
        # error
        eij = ratings_matrix[i, j] - as.numeric(X[i, ] %*% Y[, j])
        # gradient calculation
        for (k in seq(K)){
          X[i, k] = X[i, k] + alpha * (2 * eij * Y[k, j] - beta * X[i, k])
          Y[k, j] = Y[k, j] + alpha * (2 * eij * X[i, k] - beta * Y[k, j])
        }
      }
    }
}

The next part of the algorithm is to calculate the overall error; we again use similar variable names for consistency:

# Overall Squared Error Calculation
e = 0
for (i in seq(nrow(ratings_matrix))){
    for (j in seq(length(ratings_matrix[i, ]))){
      if (ratings_matrix[i, j] > 0){
        e = e + (ratings_matrix[i, j] - as.numeric(X[i, ] %*% Y[, j]))^2
        for (k in seq(K)){
          e = e + (beta/2) * (X[i, k]^2 + Y[k, j]^2)
        }
      }
    }
}

As a final piece, we iterate over these calculations multiple times to mitigate the risks of cold start and sparsity. We term the variable controlling multiple starts as epoch. We also terminate the calculations once the overall error drops below a certain threshold. Moreover, as we had initialized X and Y from uniform distributions, the predicted values would be real numbers. We round the final output before returning the predicted matrix. Note that this is a very simplistic implementation and a lot of complexity has been kept out for ease of understanding.
Hence, this may result in the predicted matrix to contain values greater than 5. For the current scenario, it is safe to assume the values above the max scale of 5 as equivalent to 5 (and similarly for values lesser than 0). We encourage the reader to fine tune the code to handle such cases. Setting α to 0.0002, β to 0.02, K (that is, latent features) to 2, and epoch to 1000, let us see a sample run of our code with overall error threshold set to 0.001: # load raw ratings from csv raw_ratings <- read.csv("product_ratings.csv")   # convert columnar data to sparse ratings matrix ratings_matrix <- data.matrix(raw_ratings)     # number of rows in ratings rows <- nrow(ratings_matrix)   # number of columns in ratings matrix columns <- ncol(ratings_matrix)   # latent features K <- 2   # User-Feature Matrix X <- matrix(runif(rows*K), nrow=rows, byrow=TRUE)   # Item-Feature Matrix Y <- matrix(runif(columns*K), nrow=columns, byrow=TRUE)   # iterations epoch <- 10000   # rate of descent alpha <- 0.0002   # regularization constant beta <- 0.02     pred.matrix <- mf_based_ucf(ratings_matrix, X, Y, K, epoch = epoch)   # setting column names colnames(pred.matrix)<- c("iPhone.4","iPhone.5s","Nexus.5","Moto.X","Moto.G","Nexus.6",/ "One.Plus.One") The preceding lines of code utilize the functions explained earlier to prepare the recommendation model. The predicted ratings or the output matrix looks like the following: Predicted ratings matrix Result interpretation Let us do a quick visual inspection to see how good or bad our predictions have been. Consider users 1 and 3 as our training samples. From the input dataset, we can clearly see that user 1 has given high ratings to iPhones while user 3 has done the same for Android based phones. The following side by side comparison shows that our algorithm has predicted values close enough to the actual values: Ratings by user 1 Let us see the ratings of user 3 in the following screenshot: Ratings by user 3 Now that we have our ratings matrix with updated values, we are ready to recommend products to users. It is common sense to show only the products which the user hasn't rated yet. The right set of recommendations will also enable the seller to pitch the products which have high probability of being purchased by the user. The usual practice is to return a list of top N items from the unrated list of products for each user. The user in consideration is usually termed as the active-user. Let us consider user 6 as our active-user. This user has only rated Nexus 6, One Plus One, Nexus 5, and iPhone4 in that order of rating, that is Nexus 6 was highly rated and iPhone4 was rated the least. Getting a list of Top 2 recommended phones for such a customer using our algorithm would result in Moto X and Moto G (very rightly indeed, do you see why?). Thus, we built a recommender engine smart enough to recommend the right mobile phones to an Android Fanboy and saved the world from yet another catastrophe! Data to the rescue! This simple implementation of recommender engine using matrix factorization gave us a flavor of how such a system actually works. Next, let us get into some real world action using recommender engines. Production ready recommender engines In this article so far, we have learnt about recommender engines in detail and even developed one from scratch (using matrix factorization). Through all this, it is clearly evident how widespread is the application of such systems. 
E-commerce websites (or for that fact, any popular technology platform) out there today have tonnes of content to offer. Not only that, but the number of users is also huge. In such a scenario, where thousands of users are browsing/buying stuff simultaneously across the globe, providing recommendations to them is a task in itself. To complicate things even further, a good user experience (response times, for example) can create a big difference between two competitors. These are live examples of production systems handling millions of customers day in and day out. Fun Fact Amazon.com is one of the biggest names in the e-commerce space with 244 million active customers. Imagine the amount of data being processed to provide recommendations to such a huge customer base browsing through millions of products! Source: http://www.amazon.com/b?ie=UTF8&node=8445211011 In order to provide a seamless capability for use in such platforms, we need highly optimized libraries and hardware. For a recommender engine to handle thousands of users simultaneously every second, R has a robust and reliable framework called the recommenderlab. Recommenderlab is a widely used R extension designed to provide a robust foundation for recommender engines. The focus of this library is to provide efficient handling of data, availability of standard algorithms and evaluation capabilities. In this section, we will be using recommenderlab to handle a considerably large data set for recommending items to users. We will also use the evaluation functions from recommenderlab to see how good or bad our recommendation system is. These capabilities will help us build a production ready recommender system similar (or at least closer) to what many online applications such as Amazon or Netflix use. The dataset used in this section contains ratings for 100 items as rated by 5000 users. The data has been anonymised and the product names have been replaced by product IDs. The rating scale used is 0 to 5 with 1 being the worst, 5 being the best, and 0 representing unrated items or missing ratings. To build a recommender engine using recommenderlab for a production ready system, the following steps are to be performed: Extract, transform, and analyze the data. Prepare a recommendation model and generate recommendations. Evaluate the recommendation model. We will look at all these steps in the following subsections. Extract, transform, and analyze As in case of any data intensive (particularly machine learning) application, the first and foremost step is to get the data, understand/explore it, and then transform it into the format required by the algorithm deemed fit for the current application. For our recommender engine using recommenderlab package, we will first load the data from a csv file described in the previous section and then explore it using various R functions. # Load recommenderlab library library("recommenderlab")   # Read dataset from csv file raw_data <- read.csv("product_ratings_data.csv")   # Create rating matrix from data ratings_matrix<- as(raw_data, "realRatingMatrix")   #view transformed data image(ratings_matrix[1:6,1:10]) The preceding section of code loads the recommenderlab package and then uses the standard utility function to read the product_ratings_data.csv file. For exploratory as well as further steps, we need the data to be transformed into user-item ratings matrix format (as described in the Core concepts and definitions section). 
The as(<data>,<type>) utility converts csv into the required ratings matrix format. The csv file contains data in the format shown in the following screenshot. Each row contains a user's rating for a specific product. The column headers are self explanatory. Product ratings data The realRatingMatrix conversion transforms the data into a matrix as shown in the following image. The users are depicted as rows while the columns represent the products. Ratings are represented using a gradient scale where white represents missing/unrated rating while black denotes a rating of 5/best. Ratings matrix representation of our data Now that we have the data in our environment, let us explore some of its characteristics and see if we can decipher some key patterns. First of all, we extract a representative sample from our main data set (refer to the screenshot Product ratings data) and analyse it for: Average rating score for our user population Spread/distribution of item ratings across the user population Number of items rated per user The following lines of code help us explore our data set sample and analyse the points mentioned previously: # Extract a sample from ratings matrix sample_ratings <-sample(ratings_matrix,1000)   # Get the mean product ratings as given by first user rowMeans(sample_ratings[1,])     # Get distribution of item ratings hist(getRatings(sample_ratings), breaks=100,/      xlab = "Product Ratings",main = " Histogram of Product Ratings")   # Get distribution of normalized item ratings hist(getRatings(normalize(sample_ratings)),breaks=100,/             xlab = "Normalized Product Ratings",main = /                 " Histogram of Normalized Product Ratings")   # Number of items rated per user hist(rowCounts(sample_ratings),breaks=50,/      xlab = "Number of Products",main =/      " Histogram of Product Count Distribution") We extract a sample of 1,000 users from our dataset for exploration purposes. The mean of product ratings as given by the first row in our user-rating sample is 2.055. This tells us that this user either hasn't seen/rated many products or he usually rates the products pretty low. To get a better idea of how the users rate products, we generate a histogram of item rating distribution. This distribution peaks around the middle, that is, 3. The histogram is shown next: Histogram for ratings distribution The histogram shows that the ratings are normally distributed around the mean with low counts for products with very high or very low ratings. Finally, we check the spread of the number of products rated by the users. We prepare a histogram which shows this spread: Histogram of number of rated products The preceding histogram shows that there are many users who have rated 70 or more products, as well as there are many users who have rated all the 100 products. The exploration step helps us get an idea of how our data is. We also get an idea about the way the users generally rate the products and how many products are being rated. Model preparation and prediction We have the data in our R environment which has been transformed into the ratings matrix format. In this section, we are interested in preparing a recommender engine based upon user-based collaborative filtering. We will be using similar terminology as described in the previous sections. Recommenderlab provides straight forward utilities to learn and prepare a model for building recommender engines. We prepare our model based upon a sample of just 1,000 users. 
This way, we can use this model to predict the missing ratings for rest of the users in our ratings matrix. The following lines of code utilize the first thousand rows for learning the model: # Create 'User Based collaborative filtering' model ubcf_recommender <- Recommender(ratings_matrix[1:1000],"UBCF") "UBCF" in the preceding code signifies user-based collaborative filtering. Recommenderlab also provides other algorithms, such as IBCF or Item-Based Collaborative Filtering, PCA or Principal Component Analysis, and others as well. After preparing the model, we use it to predict the ratings for our 1,010th and 1,011th users in the system. Recommenderlab also requires us to mention the number of items to be recommended to the users (in the order of preference of course). For the current case, we mention 5 as the number of items to be recommended. # Predict list of product which can be recommended to given users recommendations <- predict(ubcf_recommender,/                   ratings_matrix[1010:1011], n=5)   # show recommendation in form of the list as(recommendations, "list") The preceding lines of code generate two lists, one for each of the users. Each element in these lists is a product for recommendation. The model predicted that for user 1,010, product prod_93 should be recommended as the top most product followed by prod_79, and so on. # output generated by the model [[1]] [1] "prod_93" "prod_79" "prod_80" "prod_83" "prod_89"   [[2]] [1] "prod_80" "prod_85" "prod_87" "prod_75" "prod_79" Recommenderlab is a robust platform which is optimized to handle large datasets. With a few lines of code, we were able to load the data, learn a model, and even recommend products to the users in virtually no time. Compare this with the basic recommender engine we developed using matrix factorization which involved a lot many lines of code (when compared to recommenderlab) apart from the obvious difference in performance. Model evaluation We have successfully prepared a model and used it for predicting and recommending products to the users in our system. But what do we know about the accuracy of our model? To evaluate the prepared model, recommenderlab has handy and easy to use utilities. Since we need to evaluate our model, we need to split it into training and test data sets. Also, recommenderlab requires us to mention the number of items to be used for testing (it uses the rest for computing the error). For the current case, we will use 500 users to prepare an evaluation model. The model will be based upon 90-10 training-testing dataset split with 15 items used for test sets. # Evaluation scheme eval_scheme <- evaluationScheme(ratings_matrix[1:500],/                       method="split",train=0.9,given=15)   # View the evaluation scheme eval_scheme   # Training model training_recommender <- Recommender(getData(eval_scheme,/                        "train"), "UBCF")   # Preditions on the test dataset test_rating <- predict(training_recommender,/                getData(eval_scheme, "known"), type="ratings")   #Error error <- calcPredictionAccuracy(test_rating,/                    getData(eval_scheme, "unknown"))   error We use the evaluation scheme to train our model based upon UBCF algorithm. The prepared model from the training dataset is used to predict ratings for the given items. We finally use the method calcPredictionAccuracy to calculate the error in predicting the ratings between known and unknown components of the test set. 
For our case, we get an output as follows:

The generated output mentions the values for RMSE or root mean squared error, MSE or mean squared error, and MAE or mean absolute error. For RMSE in particular, the predicted ratings deviate from the true values by 1.162 on average (note that the values might deviate slightly across runs due to various factors such as sampling, iterations, and so on). This evaluation will make more sense when the outcomes are compared across different CF algorithms. For evaluating UBCF, we use IBCF as a comparator. The following few lines of code help us prepare an IBCF-based model and test the ratings, which can then be compared using the calcPredictionAccuracy utility:

# Training model using IBCF
training_recommender_2 <- Recommender(getData(eval_scheme, "train"), "IBCF")

# Predictions on the test dataset
test_rating_2 <- predict(training_recommender_2,
                         getData(eval_scheme, "known"),
                         type = "ratings")

error_compare <- rbind(calcPredictionAccuracy(test_rating,
                                              getData(eval_scheme, "unknown")),
                       calcPredictionAccuracy(test_rating_2,
                                              getData(eval_scheme, "unknown")))

rownames(error_compare) <- c("User Based CF", "Item Based CF")

The comparative output shows that UBCF outperforms IBCF with lower values of RMSE, MSE, and MAE. Similarly, we can use the other algorithms available in recommenderlab to test/evaluate our models. We encourage the reader to try out a few more and see which algorithm has the least error in predicted ratings.

Summary

In this article, we continued our pursuit of using machine learning in the field of e-commerce to enhance sales and the overall user experience. We accounted for the human factor and looked into recommendation engines based upon user behavior. We started off by understanding what recommendation systems are and their classification into user-based, content-based, and hybrid recommender systems. We touched upon the problems associated with recommender engines in general. Then we dived deep into the specifics of collaborative filters and discussed the math around prediction and similarity measures. After getting our basics straight, we moved on to building a recommender engine of our own from scratch. We utilized matrix factorization to build a recommender engine step by step using a small dummy dataset. We then moved on to building a production-ready recommender engine using R's popular library called recommenderlab. We used user-based CF as our core algorithm to build a recommendation model upon a bigger dataset containing ratings for 100 products by 5,000 users. We closed our discussion by evaluating our recommendation model using recommenderlab's utility methods.

Resources for Article:

Further resources on this subject:

Machine Learning with R [article]
Introduction to Machine Learning with R [article]
Training and Visualizing a neural network with R [article]

Boosting up the Performance of a Database

Packt
29 Mar 2016
10 min read
 In this article by Altaf Hussain, author of the book Learning PHP 7 High Performance we will see how databases play a key role in dynamic websites. All incoming and outgoing data is stored in databases. So if the database for a PHP application is not well-designed and optimized, then it will affect the application performance tremendously. In this article, we will be looking into the ways to optimize our PHP application database. (For more resources related to this topic, see here.) MySQL MySQL is the most used Relational Database Management System (RDMS) for the web. It is open source and has a free community version. It provides all those features, which can be provided by an enterprise-level database. The default settings provided with the MySQL installation may not be so good for performance, and there are always ways to fine-tune settings to get an increased performance. Also, remember that your database design also plays a role in performance. A poorly designed database will have an effect on overall performance. In this article, we will discuss how to improve the MySQL database performance. We will be modifying the MySQL configuration my.cnf file. This file is located in different places in different OSes. Also, if you are using XAMPP, WAMP, and so on, on Windows, this file will be located in those respective folders. Whenever my.cnf is mentioned, it is assumed that the file is open no matter which OS is used. Query Caching Query Caching is an important performance feature of MySQL. It caches SELECT queries along with the resulting dataset. When an identical SELECT query occurs, MySQL fetches the data from memory; hence, the query is executed faster. Thus, this reduces the load on the database. To check whether query cache is enabled on a MySQL server or not, issue the following command in your MySQL command line: SHOW VARIABLES LIKE 'have_query_cache'; This command will display an output, as follows: This result set shows that query cache is enabled. If query cache is disabled, the value will be NO. To enable query caching, open up the my.cnf file and add the following lines. If these lines are present, just uncomment them if they are commented: query_cache_type = 1 query_cache_size = 128MB query_cache_limit = 1MB Save the my.cnf file and restart the MySQL server. Let's discuss what these three configurations mean. query_cache_size The query_cache_size parameter means how much memory will be allocated. Some will think that the more memory used, the better this is; but this is just a misunderstanding. It all depends on the size of the database, the types of queries, and ratios between read and writes, hardware and database traffic, and so on. A good value for query_cache_size is in between 100 MB and 200 MB. Then, monitor the performance and the other previously mentioned variables on which the query cache depends, and adjust the size. We have used 128 MB for a medium range traffic magento website, and it is working perfectly. Set this value to 0 to disable the query cache. query_cache_limit This defines the maximum size of a query dataset to be cached. If the size of a query dataset is larger than this value, it won't be cached. The value of this configuration can be guessed by finding out the largest select query and the size of its returned dataset. query_cache_type The query_cache_type parameter plays a weird role. 
If query_cache_type is set to 1, then the following may occur: If query_cache_size is 0, then no memory is allocated and query cache is disabled If query_cache_size is greater than 0, then query cache is enabled, memory is allocated, and all queries that do not exceed query_cache_limit and use the SQL_NO_CACHE option will be cached If query_cache_type value is 0, then the following occurs: If query_cache_size is 0, then no memory is allocated and the cache is disabled If query_cache_size is greater than 0, then the memory is allocated, but nothing is cached, that is, the cache is disabled Storage Engines Storage Engines (or Table Types) are a part of core MySQL and are responsible for handling operations on tables. MySQL provides several storage engines, and the two most widely-used are MyISAM and InnoDB. Both storage engines have their own pros and cons, but InnoDB is always prioritized. MySQL started to use InnoDB as its default storage engine starting from version 5.5. MySQL provides some other storage engines, which have their own purposes. During the database design process, which table should use which storage engine can be decided. A complete list of storage engines for MySQL 5.6 can be found at http://dev.mysql.com/doc/refman/5.6/en/storage-engines.html. Storage engine can be set at database level, which will be then used as default storage engine for each newly created table. Note that the storage engine is table-based and different tables can have different storage engines in a single database. What if we have a table already created and we want to change its storage engine? This is easy. Let's say our table name is pkt_users and its storage engine is MyISAM and we want to change it to InnoDB, then we will use the following MySQL command: ALTER TABLE pkt_users ENGINE=INNODB; This will change the storage engine of the table to InnoDB. Now, let's discuss the difference between the two most widely-used storage engines MyISAM and InnoDB. MyISAM A brief list of features that are or are not supported by MyISAM is as follows: MyISAM is designed for speed, which plays best with SELECT statement. If a table is more static, that is, the data in that table is less frequently updated or deleted and mostly the data is only fetched, then MyISAM is best for this table. MyISAM supports table-level locking. If a specific operation needs to be performed on data in a table, then the complete table can be locked. During this lock, no operation can be performed on this table. This can cause performance degradation if the table is more dynamic, that is, the data is frequently changing in the table. MyISAM does not have support for Foreign Keys (FK). MyISAM supports fulltext search. MyISAM does not support transactions. So, there is no support for commit and rollback. If a query on a table is executed, it is executed and there is no coming back. Data compression, Replication, Query Cache, and Data encryption is supported. Cluster database is not supported. InnoDB A brief list of features that are or are not supported by InnoDB is as follows: InnoDB is designed for high reliability and high performance when processing a high volume of data. InnoDB supports row-level locking. It is a good feature and is great for performance. Instead of locking the complete table like MyISAM, it locks only the specific rows for SELECT, DELETE, or UPDATE operations; and during these operations, other data in this table can be manipulated. InnoDB supports Foreign Keys and support forcing Foreign Keys Constraints. 
Transactions are supported. Commits and rollbacks are possible; hence, data can be recovered from a specific transaction. Data Compression, Replication, Query Cache, and Data encryption is supported. InnoDB can be used in a cluster environment, but it does not have full support. However, the InnoDB tables can be converted to an NDB storage engine, which is used in a MySQL cluster by changing the table engine to NDB. In the following sections, we will discuss some more performance features that are related to InnoDB. Values for the following configuration are set in the my.cnf file. InnoDB_buffer_pool_size This setting defines how much memory should be used for InnoDB data and indexes loaded into memory. For a dedicated MySQL server, the recommended value is 50-80% of the installed memory on the sever. If this value is set to a high value, then there will be no memory left for the operating system and other subsystems of MySQL, such as transaction logs. So, let's open our my.cnf file, search for innodb_buffer_pool_size, and set the value in between the recommended value (50-80%) of our RAM. Innoddb_buffer_pool_instances This feature is not that widely-used. This feature enables multiple buffer pool instances to work together to reduce the chances of memory contentions on 64 bits' system and with a large value for innodb_buffer_pool_size. There are different choices on which the value for innodb_buffer_pool_instances should be calculated. One way is to use one instance per GB of innodb_buffer_pool_size. So, if the value of innodb_bufer_pool_size is 16 GB, we will set innodb_buffer_pool_instances to 16. InnoDB_log_file_size Inno_db_log_file_size is the the size of the log file that stores every query information that has been executed. For a dedicated server, a value up to 4 GB is safe, but the time of crash recovery may increase if the log file size is too big. So, in best practices, it should be kept in between 1 GB to 4 GB. Percona server According to Percona website, "Percona server is a free, fully compatible, enhanced, open source drop-in replacement for MySQL that provides superior performance, scalability, and instrumentation." Percona is a fork of MySQL with enhanced features for performance. All the features available in MySQL are available in Percona. Percona uses an enhanced storage engine, which is called XtraDB. According to the Percona website: "Percona XtraDB is an enhanced version of the InnoDB storage engine for MySQL, which has more features, faster performance, and better scalability on modern hardware. Percona XtraDB uses memory more efficiently in high-load environments." As mentioned previously, XtraDB is a fork of InnoDB, so all features available with InnoDB are available in XtraDB. Installation Percona is only available for Linux systems. It is not available for Windows as of now. In this book, we will install the Percona server on Debian 8. The process is the same for both Ubuntu and Debian. To install the Percona server on other Linux flavors, check out the Percona Installation manual at https://www.percona.com/doc/percona-server/5.5/installation.html. As of now, they provide instructions for Debian, Ubuntu, CentOS, and RHEL. They also provide instructions to install the Percona server from sources and Git. Now, let's install Percona server using the following steps: Open your sources list file using the following command in your terminal: sudo nano /etc/apt/sources.list If prompted for a password, enter your Debian password. The file will be opened. 
Now, place the following repository information at the end of the sources.list file: deb http://repo.percona.com/apt jessie main deb-src http://repo.percona.com/apt jessie main Save the file by clicking on CTRL + O and close the file by clicking on CTRL + X. Update your system using the following command in terminal: sudo apt-get update Start the installation by issuing the following command in terminal: sudo apt-get install percona-server-server-5.5 The installation will start. The process is the same as the MySQL server installation. During installation, the root password for the Percona server will be asked. You just need to enter it. When the installation is completed, you are ready to use the Percona server in the same way as you would use MySQL. Configure the Percona server and optimize it as discussed in the previous sections. Summary In this article, we studied the MySQL and Percona servers with Query Caching and other MySQL configuration options for performance. We also compared different storage engines and Percona XtraDB. We saw MySQL Workbench Performance monitoring tools as well. Resources for Article: Further resources on this subject: Building a Web Application with PHP and MariaDB – Introduction to caching [article] PHP Magic Features [article] Understanding PHP basics [article]

Making an App with React and Material Design

Soham Kamani
21 Mar 2016
7 min read
There has been much progression in the hybrid app development space, and also in React.js. Currently, almost all hybrid apps use cordova to build and run web applications on their platform of choice. Although learning React can be a bit of a steep curve, the benefit you get is that you are forced to make your code more modular, and this leads to huge long-term gains. This is great for developing applications for the browser, but when it comes to developing mobile apps, most web apps fall short because they fail to create the "native" experience that so many users know and love. Implementing these features on your own (through playing around with CSS and JavaScript) may work, but it's a huge pain for even something as simple as a material-design-oriented button. Fortunately, there is a library of react components to help us out with getting the look and feel of material design in our web application, which can then be ported to a mobile to get a native look and feel. This post will take you through all the steps required to build a mobile app with react and then port it to your phone using cordova. Prerequisites and dependencies Globally, you will require cordova, which can be installed by executing this line: npm install -g cordova Now that this is done, you should make a new directory for your project and set up a build environment to use es6 and jsx. Currently, webpack is the most popular build system for react, but if that's not according to your taste, there are many more build systems out there. Once you have your project folder set up, install react as well as all the other libraries you would be needing: npm init npm install --save react react-dom material-ui react-tap-event-plugin Making your app Once we're done, the app should look something like this:   If you just want to get your hands dirty, you can find the source files here. Like all web applications, your app will start with an index.html file: <html> <head> <title>My Mobile App</title> </head> <body> <div id="app-node"> </div> <script src="bundle.js" ></script> </body> </html> Yup, that's it. If you are using webpack, your CSS will be included in the bundle.js file itself, so there's no need to put "style" tags either. This is the only HTML you will need for your application. Next, let's take a look at index.js, the entry point to the application code: //index.js import React from 'react'; import ReactDOM from 'react-dom'; import App from './app.jsx'; const node = document.getElementById('app-node'); ReactDOM.render( <App/>, node ); What this does is grab the main App component and attach it to the app-node DOM node. Drilling down further, let's look at the app.jsx file: //app.jsx'use strict';import React from 'react';import AppBar from 'material-ui/lib/app-bar';import MyTabs from './my-tabs.jsx';let App = React.createClass({ render : function(){ return ( <div> <AppBar title="My App" /> <MyTabs /> </div> ); }});module.exports = App; Following react's philosophy of structuring our code, we can roughly break our app down into two parts: The title bar The tabs below The title bar is more straightforward and directly fetched from the material-ui library. All we have to do is supply a "title" property to the AppBar component. 
MyTabs is another component that we have made, put in a different file because of the complexity: 'use strict';import React from 'react';import Tabs from 'material-ui/lib/tabs/tabs';import Tab from 'material-ui/lib/tabs/tab';import Slider from 'material-ui/lib/slider';import Checkbox from 'material-ui/lib/checkbox';import DatePicker from 'material-ui/lib/date-picker/date-picker';import injectTapEventPlugin from 'react-tap-event-plugin';injectTapEventPlugin();const styles = { headline: { fontSize: 24, paddingTop: 16, marginBottom: 12, fontWeight: 400 }};const TabsSimple = React.createClass({ render: () => ( <Tabs> <Tab label="Item One"> <div> <h2 style={styles.headline}>Tab One Template Example</h2> <p> This is the first tab. </p> <p> This is to demonstrate how easy it is to build mobile apps with react </p> <Slider name="slider0" defaultValue={0.5}/> </div> </Tab> <Tab label="Item 2"> <div> <h2 style={styles.headline}>Tab Two Template Example</h2> <p> This is the second tab </p> <Checkbox name="checkboxName1" value="checkboxValue1" label="Installed Cordova"/> <Checkbox name="checkboxName2" value="checkboxValue2" label="Installed React"/> <Checkbox name="checkboxName3" value="checkboxValue3" label="Built the app"/> </div> </Tab> <Tab label="Item 3"> <div> <h2 style={styles.headline}>Tab Three Template Example</h2> <p> Choose a Date:</p> <DatePicker hintText="Select date"/> </div> </Tab> </Tabs> )});module.exports = TabsSimple; This file has quite a lot going on, so let’s break it down step by step: We import all the components that we're going to use in our app. This includes tabs, sliders, checkboxes, and datepickers. injectTapEventPlugin is a plugin that we need in order to get tab switching to work. We decide the style used for our tabs. Next, we make our Tabs react component, which consists of three tabs: The first tab has some text along with a slider. The second tab has a group of checkboxes. The third tab has a pop-up datepicker. Each component has a few keys, which are specific to it (such as the initial value of the slider, the value reference of the checkbox, or the placeholder for the datepicker). There are a lot more properties you can assign, which are specific to each component. Building your App For building on Android, you will first need to install the Android SDK. Now that we have all the code in place, all that is left is building the app. For this, make a new directory, start a new cordova project, and add the Android platform, by running the following on your terminal: mkdir my-cordova-project cd my-cordova-project cordova create . cordova platform add android Once the installation is complete, build the code we just wrote previously. If you are using the same build system as the source code, you will have only two files, that is, index.html and bundle.min.js. Delete all the files that are currently present in the www folder of your cordova project and copy those two files there instead. You can check whether your app is working on your computer by running cordova serve and going to the appropriate address on your browser. If all is well, you can build and deploy your app: cordova build android cordova run android This will build and install the app on your Android device (provided it is in debug mode and connected to your computer). Similarly, you can build and install the same app for iOS or windows (you may need additional tools such as XCode or .NET for iOS or Windows). You can also use any other framework to build your mobile app. 
The angular framework also comes with its own set of material design components. About the Author Soham Kamani is a full-stack web developer and electronics hobbyist.  He is especially interested in JavaScript, Python, and IoT.

Delegate Pattern Limitations in Swift

Anthony Miller
18 Mar 2016
5 min read
If you've ever built anything using UIKit, then you are probably familiar with the delegate pattern. The delegate pattern is used frequently throughout Apple's frameworks and many open source libraries you may come in contact with. But many times, it is treated as a one-size-fits-all solution for problems that it is just not suited for. This post will describe the major shortcomings of the delegate pattern. Note: This article assumes that you have a working knowledge of the delegate pattern. If you would like to learn more about the delegate pattern, see The Swift Programming Language - Delegation. 1. Too Many Lines! Implementation of the delegate pattern can be cumbersome. Most experienced developers will tell you that less code is better code, and the delegate pattern does not really allow for this. To demonstrate, let's try implementing a new view controller that has a delegate using the least amount of lines possible. First, we have to create a view controller and give it a property for its delegate: class MyViewController: UIViewController { var delegate: MyViewControllerDelegate? } Then, we define the delegate protocol. protocol MyViewControllerDelegate { func foo() } Now we have to implement the delegate. Let's make another view controller that presents a MyViewController: class DelegateViewController: UIViewController { func presentMyViewController() { let myViewController = MyViewController() presentViewController(myViewController, animated: false, completion: nil) } } Next, our DelegateViewController needs to conform to the delegate protocol: class DelegateViewController: UIViewController, MyViewControllerDelegate { func presentMyViewController() { let myViewController = MyViewController() presentViewController(myViewController, animated: false, completion: nil) } func foo() { /// Respond to the delegate method. } } Finally, we can make our DelegateViewController the delegate of MyViewController: class DelegateViewController: UIViewController, MyViewControllerDelegate { func presentMyViewController() { let myViewController = MyViewController() myViewController.delegate = self presentViewController(myViewController, animated: false, completion: nil) } func foo() { /// Respond to the delegate method. } } That's a lot of boilerplate code that is repeated every time you want to create a new delegate. This opens you up to a lot of room for errors. In fact, the above code has a pretty big error already that we are going to fix now. 2. No Non-Class Type Delegates Whenever you create a delegate property on an object, you should use the weak keyword. Otherwise, you are likely to create a retain cycle. Retain cycles are one of the most common ways to create memory leaks and can be difficult to track down. Let's fix this by making our delegate weak: class MyViewController: UIViewController { weak var delegate: MyViewControllerDelegate? } This causes another problem though. Now we are getting a build error from Xcode! 'weak' cannot be applied to non-class type 'MyViewControllerDelegate'; consider adding a class bound. This is because you can't make a weak reference to a value type, such as a struct or an enum, so in order to use the weak keyword here, we have to guarantee that our delegate is going to be a class. Let's take Xcode's advice here and add a class bound to our protocol: protocol MyViewControllerDelegate: class { func foo() } Well, now everything builds just fine, but we have another issue. Now your delegate must be an object (sorry structs and enums!). 
You are now creating more constraints on what can conform to your delegate. The whole point of the delegate pattern is to allow an unknown "something" to respond to the delegate events. We should be putting as few constraints as possible on our delegate object, which brings us to the next issue with the delegate pattern. 3. Optional Delegate Methods In pure Swift, protocols don't have optional functions. This means, your delegate must implement every method in the delegate protocol, even if it is irrelevant in your case. For example, you may not always need to be notified when a user taps a cell in a UITableView. There are ways to get around this though. In Swift 2.0+, you can make a protocol extension on your delegate protocol that contains a default implementation for protocol methods that you want to make optional. Let's make a new optional method on our delegate protocol using this method: protocol MyViewControllerDelegate: class { func foo() func optionalFunction() } extension MyViewControllerDelegate { func optionalFunction() { } } This adds even more unnecessary code. It isn't really clear what the intention of this extension is unless you understand what's going on already, and there is no way to explicitly show that this method is optional. Alternatively, if you mark your protocol as @objc, you can use the optional keyword in your function declaration. The problem here is that now your delegate must be an Objective-C object. Just like our last example, this is creating additional constraints on your delegate, and this time they are even more restrictive. 4. There Can Be Only One The delegate pattern only allows for one delegate to respond to events. This may be just fine for some situations, but if you need multiple objects to be notified of an event, the delegate pattern may not work for you. Another common scenario you may come across is when you need different objects to be notified of different delegate events. The delegate pattern can be a very useful tool, which is why it is so widely used, but recognizing the limitations that it creates is important when you are deciding whether it is the right solution for any given problem. About the author Anthony Miller is the lead iOS developer at App-Order in Las Vegas, Nevada, USA. He has written and released numerous apps on the App Store and is an avid open source contributor. When he's not developing, Anthony loves board games, line-dancing, and frequent trips to Disneyland.

Neutron API Basics

Packt
18 Mar 2016
13 min read
In this article by James Denton, the author of the book OpenStack Networking Essentials, you can see that Neutron is a virtual networking service that allows users to define network connectivity and IP addressing for instances and other cloud resources using an application programmable interface (API). The Neutron API is made up of core elements that define basic network architectures and extensions that extend base functionality. Neutron accomplishes this by virtue of its data model that consists of networks, subnets, and ports. These objects help define characteristics of the network in an easily storable format. (For more resources related to this topic, see here.) These core elements are used to build a logical network data model using information that corresponds to layers 1 through 3 of the OSI model, shown in the following screenshot: For more information on the OSI model, check out the Wikipedia article at https://en.wikipedia.org/wiki/OSI_model. Neutron uses plugins and drivers to identify network features and construct the virtual network infrastructure based on information stored in the database. A core plugin, such as the Modular Layer 2 (ML2) plugin included with Neutron, implements the core Neutron API and is responsible for adapting the logical network described by networks, ports, and subnets into something that can be implemented by the L2 agent and IP address management system running on the hosts. The extension API, provided by service plugins, allows users to manage the following resources, among others: Security groups Quotas Routers Firewalls Load balancers Virtual private networks Neutron's extensibility means that new features can be implemented in the form of extensions and plugins that extend the API without requiring major changes. This allows vendors to introduce features and functionality that would otherwise not be available with the base API. The following diagram demonstrates at a high level how the Neutron API server interacts with the various plugins and agents responsible for constructing the virtual and physical network across the cloud: The previous diagram demonstrates the interaction between the Neutron API service, Neutron plugins and drivers, and services such as the L2 and L3 agents. As network actions are performed by users via the API, the Neutron server publishes messages to the message queue that are consumed by agents. L2 agents build and maintain the virtual network infrastructure, while L3 agents are responsible for building and maintaining Neutron routers and associated functionality. The Neutron API specifications can be found on the OpenStack wiki at https://wiki.openstack.org/wiki/Neutron/APIv2-specification. In the next few sections, we will look at some of the core elements of the API and the data models used to represent those elements. Networks A network is the central object of the Neutron v2.0 API data model and describes an isolated L2 segment. In a traditional infrastructure, machines are connected to switch ports that are often grouped together into virtual local area networks (VLANs) identified by unique IDs. Machines in the same network or VLAN can communicate with one another but cannot communicate with other networks in other VLANs without the use of a router. The following diagram demonstrates how networks are isolated from one another in a traditional infrastructure: Neutron network objects have attributes that describe the network type and the physical interface used for traffic. 
The attributes also describe the segmentation ID used to differentiate traffic between other networks connected to virtual switches on the underlying host. The following diagram shows how a Neutron network describes various Layer 1 and Layer 2 attributes: Traffic between instances on different hosts requires underlying connectivity between the hosts. This means that the hosts must reside on the same physical switching infrastructure so that VLAN-tagged traffic can pass between them. Traffic between hosts can also be encapsulated using L2-in-L3 technologies such as GRE or VXLAN. Neutron supports multiple L2 methods of segmenting traffic, including using 802.1q VLANs, VXLANs, GRE, and more, depending on the plugin and configured drivers and agents. Devices in the same network are in the same broadcast domain, even though they may reside on different hosts and attach to different virtual switches. Neutron network attributes are very important in defining how traffic between virtual machine instances should be forwarded between hosts. Network attributes The following table describes base attributes associated with network objects, and more details can be found at the Neutron API specifications wiki referenced earlier in this article: Attribute Type Required Default Notes id uuid-str N/A Auto generated The UUID for the network name string no None The human-readable name for the network admin_state_up boolean no True The administrative state of the network status string N/A Null Indicates whether the network is currently operational subnets list no Empty list The subnets associated with the network shared boolean no False Specifies whether the network can be accessed by any tenant tenant_id uuid-str no N/A The owner of the network Networks are typically associated with tenants or projects and are usable by any user that is a member of the same tenant or project. Networks can also be shared with all other projects or a subnet of projects using Neutron's role-based access control (RBAC) functionality. Neutron RBAC first became available in the Liberty release of OpenStack. For more information on using the RBAC features, check out my blog at the following URL: https://developer.rackspace.com/blog/A-First-Look-at-RBAC-in-the-Liberty-Release-of-Neutron/. Provider attributes One of the earliest extensions to the Neutron API is known as the provider extension. The provider network extension maps virtual networks to physical networks by adding additional network attributes that describe the network type, segmentation ID, and physical interface. The following table shows various provider attributes and their associated values: Attribute Type Required Options Default Notes provider:network_type string yes vlan,flat,local, vxlan,gre Based on the configuration   provider:segmentation_id int optional Depends on the network type Based on the configuration The segmentation ID range varies among L2 technologies provider:physical_network string optional Provider label Based on the configuration This specifies the physical interface used for traffic (flat or VLAN-only) All networks have provider attributes. However, because provider attributes specify particular network configuration settings and mappings, only users with the admin role can specify them when creating networks. Users without the admin role can still create networks, but the Neutron server, not the user, will determine the type of network created and any corresponding interface or segmentation ID. 
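To make the provider attributes more concrete, the following is a minimal sketch of how an administrator might create a VLAN provider network with the neutron command-line client. The network name, the physnet1 physical network label, and the segmentation ID 100 are illustrative values that must match the ML2 and L2 agent configuration of your environment, and the exact flag syntax may vary between client releases:

# Run as a user with the admin role; provider attributes are admin-only
neutron net-create example-vlan-net \
    --provider:network_type vlan \
    --provider:physical_network physnet1 \
    --provider:segmentation_id 100 \
    --shared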
Additional attributes The external-net extension adds an attribute to networks that is used to determine whether or not the network can be used as the external, or gateway, network for a Neutron router. When set to true, the network becomes eligible for use as a floating IP pool when attached to routers. Using the Neutron router-gateway-set command, routers can be attached to external networks. The following table shows the external network attribute and its associated values: Attribute Type Required Default Notes router:external Boolean no false When true, the network is eligible for use as a floating IP pool when attached to a router Subnets In the Neutron data model, a subnet is an IPv4 or IPv6 address block from which IP addresses can be assigned to virtual machine instances and other network resources. Each subnet must have a subnet mask represented by a classless inter-domain routing (CIDR) address and must be associated with a network, as shown here: In the preceding diagram, three isolated VLAN networks each have a corresponding subnet. Instances and other devices cannot be attached to networks without an associated subnet. Instances connected to a network can communicate among one another but are unable to connect to other networks or subnets without the use of a router. The following diagram shows how a Neutron subnet describes various Layer 3 attributes in the OSI model: When creating subnets, users can specify IP allocation pools that limit which addresses in the subnet are available for allocation. Users can also define a custom gateway address, a list of DNS servers, and individual host routes that can be pushed to virtual machine instances using DHCP. The following table describes attributes associated with subnet objects: Attribute Type Required Default Notes id uuid-str n/a Auto Generated The UUID for the subnet network_id uuid-str Yes N/A The UUID of the associated network name string no None The human-readable name for the subnet ip_version int Yes 4 IP version 4 or 6 cidr string Yes N/A The CIDR address representing the IP address range for the subnet gateway_ip string or null no First address in CIDR The default gateway used by devices in the subnet dns_nameservers list(str) no None The DNS name servers used by hosts in the subnet allocation_pools list(dict) no Every address in the CIDR (excluding the gateway) The subranges of the CIDR available for dynamic allocation. tenant_id uuid-str no N/A The owner of the subnet enable_dhcp boolean no True This indicates whether or not DHCP is enabled for the subnet host_routes list(dict) no N/A Additional static routes Ports In the Neutron data model, a port represents a switch port on a logical switch that spans the entire cloud and contains information about the connected device. Virtual machine interfaces (VMIFs) and other network objects, such as router and DHCP server interfaces, are mapped to Neutron ports. The ports define both the MAC address and the IP address to be assigned to the device associated with them. Each port must be associated with a Neutron network. 
The following diagram shows how a port describes various Layer 2 attributes in the OSI model: The following table describes attributes associated with port objects: Attribute Type Required Default Notes id uuid-str n/a Auto generated The UUID for the subnet network_id uuid-str Yes N/A The UUID of the associated network name string no None The human-readable name for the subnet admin_state_up Boolean no True The administrative state of the port status string N/A N/A The current status of the port (for example, ACTIVE, BUILD, or DOWN) mac_address string no Auto generated The MAC address of the port fixed_ips list(dict) no Auto allocated The IP address(es) associated with the port device_id string no None The instance ID or other resource associated with the port device_owner string no None   tenant_id uuid-str no ID of tenant adding resource The owner of the port When Neutron is first installed, no ports exist in the database. As networks and subnets are created, ports may be created for each of the DHCP servers reflected by the logical switch model, seen here: As instances are created, a single port is created for each network interface attached to the instance, as shown here: A port can only be associated with a single network. Therefore, if an instance is connected to multiple networks, it will be associated with multiple ports. As instances and other cloud resources are created, the logical switch may scale to hundreds or thousands of ports over time, as shown in the following diagram: There is no limit to the number of ports that can be created in Neutron. However, quotas exist that limit tenants to a small number of ports that can be created. As the number of Neutron ports scale out, the performance of the Neutron API server and the implementation of networking across the cloud may degrade over time. It's a good idea to keep quotas in place to ensure a high-performing cloud, but the defaults and subsequent quota increases should be kept reasonable. The Neutron workflow In the standard Neutron workflow, networks must be created first, followed by subnets and then ports. The following sections describe the workflows involved with booting and deleting instances. Booting an instance Before an instance can be created, it must be associated with a network that has a corresponding subnet or a precreated port that is associated with a network. The following process documents the steps involved in booting an instance and attaching it to a network: The user creates a network. The user creates a subnet and associates it with the network. The user boots a virtual machine instance and specifies the network. Nova interfaces with Neutron to create a port on the network. Neutron assigns a MAC address and IP address to the newly created port using attributes defined by the subnet. Nova builds the instance's libvirt XML file containing local network bridge and MAC address information and starts the instance. The instance sends a DHCP request during boot, at which point the DHCP server responds with the IP address corresponding to the MAC address of the instance. If multiple network interfaces are attached to an instance, each network interface will be associated with a unique Neutron port and may send out DHCP requests to retrieve their respective network information. 
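The booting workflow described above can be exercised end to end with a few client commands. The following is a rough sketch only; the network name, CIDR, image, flavor, and the UUID placeholders are example values, and the exact syntax depends on the client versions deployed in your cloud:

# 1. The user creates a network and a subnet
neutron net-create demo-net
neutron subnet-create demo-net 192.168.10.0/24 --name demo-subnet

# 2. The user boots an instance attached to that network
nova boot demo-instance --image cirros --flavor m1.tiny \
    --nic net-id=<demo-net-uuid>

# 3. Nova asks Neutron for a port; its MAC and fixed IP are drawn from
#    demo-subnet, and can be inspected once the instance is active
neutron port-list --device-id <instance-uuid>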
How the logical model is implemented Neutron agents are services that run on network and compute nodes and are responsible for taking information described by networks, subnets, and ports and using it to implement the virtual and physical network infrastructure. In the Neutron database, the relationship between networks, subnets, and ports can be seen in the following diagram: This information is then implemented on the compute node by way of virtual network interfaces, virtual switches or bridges, and IP addresses, as shown in the following diagram: In the preceding example, the instance was connected to a network bridge on a compute node that provides connectivity from the instance to the physical network. For now, it's only necessary to know how the data model is implemented into something that is usable. Deleting an instance The following process documents the steps involved in deleting an instance: The user destroys virtual machine instance. Nova interfaces with Neutron to destroy the ports associated with the instances. Nova deletes local instance data. The allocated IP and MAC addresses are returned to the pool. When instances are deleted, Neutron removes all virtual network connections from the respective compute node and removes corresponding port information from the database. Summary In this article, we looked at the basics of the Neutron API and its data model made up of networks, subnets, and ports. These objects were used to describe in a logical way how the virtual network is architected and implemented across the cloud. Resources for Article: Further resources on this subject: Introducing OpenStack Trove[article] Concepts for OpenStack[article] Monitoring OpenStack Networks[article]

Get your Apps Ready for Android N

Packt
18 Mar 2016
9 min read
It seems likely that Android N will get its first proper outing in May, at this year's Google I/O conference, but there's no need to wait until then to start developing for the next major release of the Android platform. Thanks to Google's decision to release preview versions early you can start getting your apps ready for Android N today. In this article by Jessica Thornsby, author of the book Android UI Design, going to look at the major new UI features that you can start experimenting with right now. And since you'll need something to develop your Android N-ready apps in, we're also going to look at Android Studio 2.1, which is currently the recommended development environment for Android N. (For more resources related to this topic, see here.) Multi-window mode Beginning with Android N, the Android operating system will give users the option to display more than one app at a time, in a split-screen environment known as multi-window mode. Multi-window paves the way for some serious multi-app multi-tasking, allowing users to perform tasks such as replying to an email without abandoning the video they were halfway through watching on YouTube, and reading articles in one half of the screen while jotting down notes in Google Keep on the other. When two activities are sharing the screen, users can even drag data from one activity and drop it into another activity directly, for example dragging a restaurant's address from a website and dropping it into Google Maps. Android N users can switch to multi-window mode either by: Making sure one of the apps they want to view in multi-window mode is visible onscreen, then tapping their device's Recent Apps softkey (that's the square softkey). The screen will split in half, with one side displaying the current activity and the other displaying the Recent Apps carousel. The user can then select the secondary app they want to view, and it'll fill the remaining half of the screen. Navigating to the home screen, and then pressing the Recent Apps softkey to open the Recent Apps carousel. The user can then drag one of these apps to the edge of the screen, and it'll open in multi-window mode. The user can then repeat this process for the second activity. If your app targets Android N or higher, the Android operating system assumes that your app supports multi-window mode unless you explicitly state otherwise. To prevent users from displaying your app in multi-window mode, you'll need to add android:resizeableActivity="false" to the <activity> or <application> section of your project's Manifest file. If your app does support multi-window mode, you may want to prevent users from shrinking your app's UI beyond a specified size, using the android:minimalSize attribute. If the user attempts to resize your app so it's smaller than the android:minimalSize value, the system will crop your UI instead of shrinking it. Direct reply notifications Google are adding a few new features to notifications in Android N, including an inline reply action button that allows users to reply to notifications directly from the notification UI.   This is particularly useful for messaging apps, as it means users can reply to messages without even having to launch the messaging application. You may have already encountered direct reply notifications in Google Hangouts. To create a notification that supports direct reply, you need to create an instance of RemoteInput.Builder and then add it to your notification action. 
The following code adds a RemoteInput to a Notification.Action, and creates a Quick Reply key. When the user triggers the action, the notification prompts the user to input their response: private static final String KEY_QUICK_REPLY = "key_quick_reply"; String replyLabel = getResources().getString(R.string.reply_label); RemoteInput remoteInput = new RemoteInput.Builder(KEY_QUICK_REPLY) .setLabel(replyLabel) .build(); To retrieve the user's input from the notification interface, you need to call: getResultsFromIntent(Intent) and pass the notification action's intent as the input parameter: Bundle remoteInput = RemoteInput.getResultsFromIntent(intent); //This method returns a Bundle that contains the text response// if (remoteInput != null) { return remoteInput.getCharSequence(KEY_QUICK_REPLY); //Query the bundle using the result key, which is provided to the RemoteInput.Builder constructor// Bundled notifications Don't you just hate it when you connect to the World Wide Web first thing in the morning, and Gmail bombards you with multiple new message notifications, but doesn't give you anymore information about the individual emails? Not particularly helpful! When you receive a notification that consists of multiple items, the only thing you can really do is launch the app in question and take a closer look at the events that make up this grouped notification. Android N overcomes this drawback, by letting you group multiple notifications from the same app into a single, bundled notification via a new notification style: bundled notifications. A bundled notification consists of a parent notification that displays summary information for that group, plus individual notification items. If the user wants to see more information about one or more individual items, they can unfurl the bundled notification into separate notifications by swiping down with two fingers. The user can then act on each mini-notification individually, for example they might choose to dismiss the first three notifications about spam emails, but open the forth e-mail. To group notifications, you need to call setGroup() for each notification you want to add to the same notification stack, and then assign these notifications the same key. final static String GROUP_KEY_MESSAGES = "group_key_messages"; Notification notif = new NotificationCompat.Builder(mContext) .setContentTitle("New SMS from " + sender1) .setContentText(subject1) .setSmallIcon(R.drawable.new_message) .setGroup(GROUP_KEY_MESSAGES) .build(); Then when you create another notification that belongs to this stack, you just need to assign it the same group key. Notification notif2 = new NotificationCompat.Builder(mContext) .setContentTitle("New SMS from " + sender1) .setContentText(subject2) .setGroup(GROUP_KEY_MESSAGES) .build(); The second Android N developer preview introduced an Android-specific implementation of the Vulkan API. Vulkan is a cross-platform, 3D rendering API for providing high-quality, real-time 3D graphics. For draw-call heavy applications, Vulkan also promises to deliver a significant performance boost, thanks to a threading-friendly design and a reduction of CPU overhead. You can try Vulkan for yourself on devices running Developer Preview 2, or learn more about Vulkan at the official Android docs (https://developer.android.com/ndk/guides/graphics/index.html?utm_campaign=android_launch_npreview2_041316&utm_source=anddev&utm_medium=blog). 
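Before moving on to tooling, one more note on the bundled notifications example above: on devices running releases older than Android N, the system generally shows only the group's summary notification, so it is worth posting one alongside the stacked notifications. The following is a minimal sketch using the support library's NotificationCompat; the SUMMARY_ID constant and the notification text are illustrative values, not part of the original example:

// Summary notification for the same group key; setGroupSummary(true)
// marks it as the collapsed representation of the whole stack
Notification summary = new NotificationCompat.Builder(mContext)
    .setContentTitle("2 new messages")
    .setSmallIcon(R.drawable.new_message)
    .setStyle(new NotificationCompat.InboxStyle()
        .addLine(sender1 + " " + subject1)
        .addLine(sender1 + " " + subject2)
        .setSummaryText("2 new SMS messages"))
    .setGroup(GROUP_KEY_MESSAGES)
    .setGroupSummary(true)
    .build();

// Post the summary with its own notification ID
NotificationManagerCompat.from(mContext).notify(SUMMARY_ID, summary);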
Android N Support in Android Studio 2.1 The two Developer Previews aren't the only important releases for developers who want to get their apps ready for Android N. Google also recently released a stable version of Android Studio 2.1, which is the recommended IDE for developing Android N apps. Crucially, with the release of Android Studio 2.1 the emulator can now run the N Developer Preview Emulator System Images, so you can start testing your apps against Android N. Particularly with features like multi-window mode, it's important to test your apps across multiple screen sizes and configurations, and creating various Android N Android Virtual Devices (AVDs) is the quickest and easiest ways to do this. Android 2.1 also adds the ability to use the new Jack compiler (Java Android Compiler Kit), which compiles Java source code into Android dex bytecode. Jack is particularly important as it opens the door to using Java 8 language features in your Android N projects, without having to resort to additional tools or resources. Although not Android N-specific, Android 2.1 makes some improvements to the Instant Run feature, which should result in faster editing and deploy builds for all your Android projects. Previously, one small change in the Java code would cause all Java sources in the module to be recompiled. Instant Run aims to reduce compilation time by analyzing the changes you've made and determining how it can deploy them in the fastest way possible. This is instead of Android Studio automatically going through the lengthy process of recompiling the code, converting it to dex format, generating an APK and installing it on the connected device or emulator every time you make even a small change to your project. To start using Instant Run, select Android Studio from the toolbar followed by Preferences…. In the window that appears, select Build, Execution, Deployment from the side-menu and select Instant Run. Uncheck the box next to Restart activity on code changes. Instant Run is supported only when you deploy a debug build for Android 4.0 or higher. You'll also need to be using Android Plugin for Gradle version 2.0 or higher. Instant Run isn't currently compatible with the Jack toolchain. To use Instant Run, deploy your app as normal. Then if you make some changes to your project you'll notice that a yellow thunderbolt icon appears within the Run icon, indicating that Android Studio will push updates via Instant Run when you click this button. You can update to the latest version of Android Studio by launching the IDE and then selecting Android Studio from the toolbar, followed by Check for Updates…. Summary In this article, we looked at the major new UI features currently available in the Android N Developer Preview. We also looked at the Android Studio 2.1 features that are particularly useful for developing and testing apps that target the upcoming Android N release. Although we should expect some pretty dramatic changes between these early previews and the final release of Android N, taking the time to explore these features now means you'll be in a better position to update your apps when Android N is finally released. Resources for Article: Further resources on this subject: Drawing and Drawables In Android Canvas [article] Behavior-Driven Development With Selenium Webdriver [article] Development of Iphone Applications [article]

Support Vector Machines as a Classification Engine

Packt
17 Mar 2016
9 min read
In this article by Tomasz Drabas, author of the book Practical Data Analysis Cookbook, we will discuss how Support Vector Machine models can be used as a classification engine.

(For more resources related to this topic, see here.)

Support Vector Machines

Support Vector Machines (SVMs) are a family of extremely powerful models that can be used in classification and regression problems. They aim at finding decision boundaries that separate observations with differing class memberships. While many classifiers exist that can classify linearly separable data (for example, logistic regression), SVMs can handle highly non-linear problems using a kernel trick that implicitly maps the input vectors to higher-dimensional feature spaces. The transformation rearranges the dataset in such a way that it is then linearly separable.

The mechanics of the machine

Given a set of n points of the form (x_1, y_1), ..., (x_n, y_n), where x_i is a z-dimensional input vector and y_i is a class label, the SVM aims at finding the maximum margin hyperplane that separates the data points:

In a two-dimensional dataset with linearly separable data points (as shown in the preceding figure), the maximum margin hyperplane would be a line that maximizes the distance between the two classes. The hyperplane can be expressed in terms of a dot product between the input vectors x and a vector w normal to the hyperplane: w · x = b, where b is the offset from the origin of the coordinate system. To find the hyperplane, we solve the following optimization problem: minimize (1/2)||w||^2 over w and b, subject to y_i (w · x_i - b) >= 1 for every point i. The constraint of our optimization problem effectively states that no point can cross the hyperplane if it does not belong to the class on that side of the hyperplane.

Linear SVM

Building a linear SVM classifier in Python is easy. There are multiple Python packages that can estimate a linear SVM but here, we decided to use MLPY (http://mlpy.sourceforge.net):

import pandas as pd
import numpy as np
import mlpy as ml

First, we load the necessary modules that we will use later, namely pandas (http://pandas.pydata.org), NumPy (http://www.numpy.org), and the aforementioned MLPY. We use pandas to read the data (see the https://github.com/drabastomek/practicalDataAnalysisCookbook repository to download the data):

# the file name of the dataset
r_filename = 'Data/Chapter03/bank_contacts.csv'

# read the data
csv_read = pd.read_csv(r_filename)

The dataset that we use was described in S. Moro, P. Cortez, and P. Rita, A Data-Driven Approach to Predict the Success of Bank Telemarketing, Decision Support Systems, Elsevier, 62:22-31, June 2014, and can be found at http://archive.ics.uci.edu/ml/datasets/Bank+Marketing. It consists of over 41.1k outbound marketing calls of a bank. Our aim is to classify these calls into two buckets: those that resulted in a credit application and those that did not. Once the file is loaded, we split the data into training and testing datasets; we also keep the input and class indicator data separately. To this end, we use the split_data(...) method:
The constraint of this optimization problem effectively states that no point can cross the hyperplane: every observation has to stay on the side of the hyperplane that corresponds to its class.

Linear SVM

Building a linear SVM classifier in Python is easy. There are multiple Python packages that can estimate a linear SVM, but here we decided to use MLPY (http://mlpy.sourceforge.net):

    import pandas as pd
    import numpy as np
    import mlpy as ml

First, we load the necessary modules that we will use later, namely pandas (http://pandas.pydata.org), NumPy (http://www.numpy.org), and the aforementioned MLPY. We use pandas to read the data (see the https://github.com/drabastomek/practicalDataAnalysisCookbook repository to download the dataset):

    # the file name of the dataset
    r_filename = 'Data/Chapter03/bank_contacts.csv'

    # read the data
    csv_read = pd.read_csv(r_filename)

The dataset that we use was described in S. Moro, P. Cortez, and P. Rita, A Data-Driven Approach to Predict the Success of Bank Telemarketing, Decision Support Systems, Elsevier, 62:22-31, June 2014, and can be found at http://archive.ics.uci.edu/ml/datasets/Bank+Marketing. It consists of over 41.1k outbound marketing calls of a bank. Our aim is to classify these calls into two buckets: those that resulted in a credit application and those that did not.

Once the file is loaded, we split the data into training and testing datasets; we also keep the input and class indicator data separately. To this end, we use the split_data(...) method:

    def split_data(data, y, x = 'All', test_size = 0.33):
        '''
            Method to split the data into training and testing
        '''
        import sys

        # dependent variable
        variables = {'y': y}

        # and all the independent
        if x == 'All':
            allColumns = list(data.columns)
            allColumns.remove(y)
            variables['x'] = allColumns
        else:
            if type(x) != list:
                print('The x parameter has to be a list...')
                sys.exit(1)
            else:
                variables['x'] = x

        # create a variable to flag the training sample
        data['train'] = np.random.rand(len(data)) < (1 - test_size)

        # split the data into training and testing
        train_x = data[data.train][variables['x']]
        train_y = data[data.train][variables['y']]
        test_x = data[~data.train][variables['x']]
        test_y = data[~data.train][variables['y']]

        return train_x, train_y, test_x, test_y, variables['x']

We randomly set 1/3 of the dataset aside for testing purposes and use the remaining 2/3 for training the model (in the book's accompanying code, split_data(...) lives in a helper module imported as hlp, hence the prefix in the call):

    # split the data into training and testing
    train_x, train_y, test_x, test_y, labels = hlp.split_data(
        csv_read,
        y = 'credit_application'
    )

Once we have read the data and split it into training and testing datasets, we can estimate the model:

    # create the classifier object
    svm = ml.LibSvm(svm_type='c_svc', kernel_type='linear', C=100.0)

    # fit the data
    svm.learn(train_x, train_y)

The svm_type parameter of the .LibSvm(...) method controls which algorithm is used to estimate the SVM. Here, we use c_svc, a C-Support Vector Classifier. The C parameter specifies how much you want to avoid misclassifying observations: larger values of C shrink the margin of the hyperplane so that more of the training observations are classified correctly. You can also specify nu_svc with a nu parameter that controls how much of your sample (at most) can be misclassified and how many of your observations (at least) become support vectors. Here, we estimate an SVM with a linear kernel, so let's talk about kernels.

Kernels

A kernel function K is effectively a function that computes a dot product between two n-dimensional vectors, K: R^n x R^n -> R. In other words, the kernel function takes two vectors and produces a scalar. The linear kernel does not transform the data into a higher-dimensional space. This is not true for polynomial or Radial Basis Function (RBF) kernels, which transform the input feature space into higher dimensions. In the case of a polynomial kernel of degree d, the obtained feature space has (n+d choose d) dimensions for an n-dimensional input feature space. As you can see, the number of additional dimensions can grow very quickly, and this would pose significant problems in estimating the model if we explicitly transformed the data into the higher-dimensional space. Thankfully, we do not have to, as this is where the kernel trick comes into play: SVMs do not have to work explicitly in higher dimensions, but can implicitly map the data to higher dimensions using pairwise inner products (instead of an explicit transformation) and then use them to find the maximum margin hyperplane. You can find a really good explanation of the kernel trick at http://www.eric-kim.net/eric-kim-net/posts/1/kernel_trick.html.
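As a small illustration (not taken from the book) of the idea that a kernel simply maps two vectors to a single scalar, here is a minimal sketch of the linear, polynomial, and RBF kernels written directly with NumPy:

    import numpy as np

    def linear_kernel(x, y):
        # a plain dot product: no implicit transformation of the feature space
        return np.dot(x, y)

    def polynomial_kernel(x, y, degree=2, coef0=1.0):
        # equivalent to a dot product in a much higher-dimensional space,
        # computed without ever building that space explicitly
        return (np.dot(x, y) + coef0) ** degree

    def rbf_kernel(x, y, sigma=1.0):
        # equals 1 for identical vectors and decays towards 0 with distance
        return np.exp(-np.linalg.norm(x - y) ** 2 / (2 * sigma ** 2))

    x = np.array([1.0, 2.0])
    y = np.array([0.5, -1.0])
    print(linear_kernel(x, y), polynomial_kernel(x, y), rbf_kernel(x, y))

Each function returns a single number, which is all the SVM needs in order to compare a pair of observations.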
Back to our example

The .learn(...) method of the .LibSvm(...) object estimates the model. Once the model is estimated, we can test how well it performs. First, we use the estimated model to predict the classes for the observations in the testing dataset:

    predicted_l = svm.pred(test_x)

Next, we will use some of the scikit-learn methods to print the basic statistics for our model:

    def printModelSummary(actual, predicted):
        '''
            Method to print out model summaries
        '''
        import sklearn.metrics as mt

        print('Overall accuracy of the model is {0:.2f} percent'
            .format((actual == predicted).sum() / len(actual) * 100))
        print('Classification report: \n',
            mt.classification_report(actual, predicted))
        print('Confusion matrix: \n',
            mt.confusion_matrix(actual, predicted))
        print('ROC: ', mt.roc_auc_score(actual, predicted))

First, we calculate the overall accuracy of the model, expressed as the ratio of properly classified observations to the total number of observations in the testing sample. Next, we print the classification report:

The precision is the model's ability to avoid classifying an observation as positive when it is not. It is the ratio of true positives to the overall number of positively classified records. The overall precision score is a weighted average of the individual precision scores, where the weight is the support; the support is the total number of actual observations in each class. The total precision for our model is not too bad: 89 out of 100. However, when we look at the precision for the positive class, the situation is not as good: only 63 out of 100 were properly classified.

Recall can be viewed as the model's capacity to find all the positive samples. It is the ratio of true positives to the sum of true positives and false negatives. The recall for class 0.0 is almost perfect, but for class 1.0 it looks really bad. This might be caused by the fact that our sample is not balanced, but it is more likely that the features we use to classify the data do not really capture the differences between the two groups.

The f1-score is effectively a weighted amalgam of precision and recall: it is the ratio of twice the product of precision and recall to their sum. In one measure, it shows whether the model performs well or not. At the general level, the model does not perform badly, but when we look at its ability to classify the true signal, it fails gravely. It is a perfect example of why judging a model only at the general level can be misleading when dealing with heavily unbalanced samples.

RBF kernel SVM

Given that the linear kernel performed poorly, our dataset might not be linearly separable. Thus, let's try the RBF kernel. The RBF kernel is given as K(x, y) = exp(-||x - y||^2 / (2 * sigma^2)), where ||x - y||^2 is the squared Euclidean distance between the two vectors x and y, and sigma is a free parameter. The value of the RBF kernel equals 1 when x = y and gradually falls towards 0 as the distance approaches infinity. To fit an RBF version of our model, we specify our svm object as follows:

    svm = ml.LibSvm(svm_type='c_svc', kernel_type='rbf',
        gamma=0.1, C=1.0)

The gamma parameter here specifies how far the influence of a single support vector reaches. Visually, you can investigate the relationship between the gamma and C parameters at http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html. The rest of the code for the model estimation follows in a similar fashion as with the linear kernel.
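As a sketch of those remaining steps (reusing train_x, train_y, test_x, test_y, the ml import, and the printModelSummary(...) function defined above), the RBF model can be fitted and evaluated like this:

    # fit the RBF-kernel SVM on the same training data
    # (named svm_rbf here so it does not overwrite the linear model above)
    svm_rbf = ml.LibSvm(svm_type='c_svc', kernel_type='rbf', gamma=0.1, C=1.0)
    svm_rbf.learn(train_x, train_y)

    # predict the classes for the testing dataset and print the summary
    predicted_rbf = svm_rbf.pred(test_x)
    printModelSummary(test_y, predicted_rbf)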
Running the same evaluation on these predictions shows that the results are even worse than with the linear kernel: precision and recall drop across the board. The SVM with the RBF kernel performed worse both for the calls that resulted in a credit card application and for those that did not.

Summary

In this article, we saw that the problem is not with the model; rather, the dataset that we use does not explain the variance sufficiently. This requires going back to the drawing board and selecting other features.

Resources for Article:

Further resources on this subject:
Push your data to the Web [article]
Transferring Data from MS Access 2003 to SQL Server 2008 [article]
Exporting data from MS Access 2003 to MySQL [article]

Building an iPhone App Using Swift: Part 1

Ryan Loomba
17 Mar 2016
6 min read
In this post, I'll be showing you how to create an iPhone app using Apple's new Swift programming language. Swift is a new programming language that Apple released in June at their special WWDC event in San Francisco, CA. You can find more information about Swift on the official page. Apple has released a book on Swift, The Swift Programming Language, which is available on the iBook Store or can be viewed online here. OK, let's get started!

The first thing you need in order to write an iPhone app using Swift is to download a copy of Xcode 6. Currently, the only way to get a copy of Xcode 6 is to sign up for Apple's developer program. The cost to enroll is $99 USD/year, so enroll here. Once enrolled, click on the iOS 8 GM Seed link and scroll down to the link that says Xcode 6 GM Seed.

Once Xcode is installed, go to File -> New -> New Project. We will click on Application within the iOS section and choose a Single View Application:

Click on the play button in the top left of the project to build the project. You should see the iPhone simulator open with a blank white screen. Next, click on the top-left blue Sample Swift App project file and navigate to the General tab. In the Deployment Info section, select portrait for the device orientation. This will force the app to only be viewed in portrait mode.

First View Controller

If we navigate on the left to Main.storyboard, we see a single View Controller with a single View. First, make sure that Use Size Classes is unchecked in the Interface Builder Document section. Let's add a text view to the top of our view. In the bottom-right text box, search for Text View. Drag the Text View and position it at the top of the View. Click on the Attributes inspector on the right toolbar to adjust the font and alignment. If we click the play button to build the project, we should see the same white screen, but now with our Swift Sample App text.

View a web page

Let's add our first feature: a button that will open up a web page. First, embed our controller in a navigation controller so we can easily navigate back and forth between views. Select the view controller in the storyboard, then go to Editor -> Embed in -> Navigation controller. Note that you might need to resize the text view you added in the previous step. Now, let's add a button that will open up a web view. Back in our view, in the bottom right, let's search for a button, drag it somewhere in the view, and label it Web View. The final product should look like this:

If we build the project and click on the button, nothing will happen. We need to create a destination controller that will contain the web view. Go to File -> New and create a new Cocoa Touch Class. Let's name our new controller WebViewController and make it a subclass of UIViewController. Make sure you choose Swift as the language. Click Create to save the controller file.

Back in our storyboard, search for a View Controller in the bottom-right search box and drag it to the storyboard. In the Attributes inspector toolbar on the right side of the screen, let's give this controller the title WebViewController. In the Identity inspector, let's give this view controller a custom class of WebViewController.

Let's wire up our two controllers. Ctrl + click on the Web View button we created earlier and hold. Drag your cursor over to your newly created WebViewController. Upon release, choose push. On our storyboard, let's search for a web view in the lower-right search box and drag it into our newly created WebViewController.
Resize the web view so that it takes up the entire screen, except for the top nav bar area. If we hit the large play button at the top left to build our app, clicking on the Web View link will take us to a blank screen. We'll also have a back button that takes us back to the first screen.

Writing some Swift code

Let's have the web view load up a predetermined website. Time to get our hands dirty writing some Swift! The first thing we need to do is link the web view in our controller to the WebViewController.swift file. In the storyboard, click on the Assistant editor button at the top right of the screen. You should see the storyboard view of WebViewController and WebViewController.swift next to each other. Ctrl + click on the web view in the storyboard and drag it over to the line right before the WebViewController class definition. Name the variable webView.

In the viewDidLoad function, we are going to add some initialization to load up our web page. After super.viewDidLoad(), let's first declare the URL we want to use. This can be any URL; for this example, I'm going to use my own homepage. It will look something like this:

    let requestURL = NSURL(string: "http://ryanloomba.com")

In Swift, the keyword let is used to designate constants, that is, variables that will not change. Next, we will convert this URL into an NSURLRequest object. Finally, we will tell our web view to make this request and pass in the request object:

    import UIKit

    class WebViewController: UIViewController {
        @IBOutlet var webView: UIWebView!

        override func viewDidLoad() {
            super.viewDidLoad()
            let requestURL = NSURL(string: "http://ryanloomba.com")
            let request = NSURLRequest(URL: requestURL)
            webView.loadRequest(request)
            // Do any additional setup after loading the view.
        }

        override func didReceiveMemoryWarning() {
            super.didReceiveMemoryWarning()
            // Dispose of any resources that can be recreated.
        }

        /*
        // MARK: - Navigation

        // In a storyboard-based application, you will often want to do a little preparation before navigation
        override func prepareForSegue(segue: UIStoryboardSegue!, sender: AnyObject!) {
            // Get the new view controller using segue.destinationViewController.
            // Pass the selected object to the new view controller.
        }
        */
    }

Try changing the URL to see different websites. Here's an example of what it should look like:

About the author

Ryan is a software engineer and electronic dance music producer currently residing in San Francisco, CA. Ryan started out as a biomedical engineer but fell in love with web/mobile programming after building his first Android app. You can find him on GitHub @rloomba.


Microservices – Brave New World

Packt
17 Mar 2016
9 min read
In this article by David Gonzalez, author of the book Developing Microservices with Node.js, we will cover the need for microservices, explain the monolithic approach, and study how to build and deploy microservices. (For more resources related to this topic, see here.)

Need for microservices

The world of software development has evolved quickly over the past 40 years. One of the key points of this evolution has been the size of these systems. From the days of MS-DOS, we have taken a hundred-fold leap into our present systems. This growth in size creates a need for better ways of organizing code and software components. Usually, when a company grows due to business needs, which is known as organic growth, the software gets organized into a monolithic architecture, as it is the easiest and quickest way of building software. After a few years (or even months), adding new features becomes harder due to the coupled nature of the resulting software.

Monolithic software

There are a few companies that have already started building their software using microservices, which is the ideal scenario. The problem is that not all companies can plan their software upfront. Instead of planning, these companies build the software based on the organic growth they experience: a few software components that group business flows by affinity. It is not rare to see companies with two big software components: the user-facing website and the internal administration tools. This is usually known as a monolithic software architecture.

Some of these companies face big problems when trying to scale their engineering teams. It is hard to coordinate teams that build, deploy, and maintain a single software component. Clashes on releases and the reintroduction of bugs are common problems that drain a big chunk of energy from the teams. One of the most interesting solutions to this problem (and it has other benefits too) is to split the monolithic software into microservices, so that the teams can specialize in a few smaller, autonomous, and isolated software components that can be versioned, updated, and deployed without interfering with the rest of the company's systems. This enables the engineering team to create isolated and autonomous units of work that are highly specialized in a given task (such as sending e-mails, processing card payments, and so on).

Microservices in the real world

Microservices are small software components that specialize in one task and work together to achieve a higher-level task. Forget about software for a second and think about how a company works. When someone applies for a job in a company, he applies for a given position: software engineer, systems administrator, or office manager. The reason for it can be summarized in one word: specialization. If you are used to working as a software engineer, you will get better with experience and add more value to the company. The fact that you don't know how to deal with a customer won't affect your performance, as it is not your area of expertise and would hardly add any value to your day-to-day work.

A microservice is an autonomous unit of work that can execute one task without interfering with other parts of the system, similar to what a job position is to a company. This has a number of benefits that can be used in favor of the engineering team in order to help scale the systems of a company.
Nowadays, hundreds of systems are built using a microservices-oriented architecture, such as the following:

Netflix: They are one of the most popular streaming services and have built an entire ecosystem of applications that collaborate in order to provide a reliable and scalable streaming system used across the globe.

Spotify: They are one of the leading music streaming services in the world and have built this application using microservices. Every single widget of the application (which is a website exposed as a desktop app using the Chromium Embedded Framework (CEF)) is a different microservice that can be updated individually.

First, there was the monolith

A huge percentage (my estimate is around 90%) of modern enterprise software is built following a monolithic approach: huge software components that run in a single container and have a well-defined development life cycle, which goes completely against the agile principles of deliver early and deliver often (https://en.wikipedia.org/wiki/Release_early,_release_often):

Deliver early: The sooner you fail, the easier it is to recover. If you work for two years on a software component before it is released, there is a huge risk of deviation from the original requirements, which are usually wrong and change every few days.

Deliver often: The software is delivered frequently to all the stakeholders so that they can give their input and see the changes reflected in the software. Errors can be fixed in a few days and improvements are identified easily.

Companies build big software components instead of smaller ones that work together because it is the natural thing to do, as follows: the developer has a new requirement; he builds a new method on an existing class in the service layer; the method is exposed on the API via HTTP, SOAP, or any other protocol. Now, multiply this by the number of developers in your company, and you will obtain something called organic growth. Organic growth is the uncontrolled and unplanned growth of software systems under business pressure, without adequate long-term planning, and it is bad.

How to tackle organic growth?

The first thing needed to tackle organic growth is to make sure that business and IT are aligned in the company. Usually, in big companies, IT is not seen as a core part of the business. Organizations outsource their IT systems, keeping the cost in mind but not the quality, so the partners building these software components are focused on one thing: delivering on time and according to the specification, even if it is incorrect. This produces a less-than-ideal ecosystem for responding to business needs with a working solution for an existing problem. IT is led by people who barely understand how the systems are built and usually overlook the complexity of software development. Fortunately, this is a changing tendency, as IT systems have become the drivers of 99% of the businesses around the world, but we need to be smarter about how we build them.

The first measure to tackle organic growth is to align IT and business stakeholders so that they work together; educating the non-technical stakeholders is the key to success. If we go back to the example from the previous section (few releases with quite big changes), can we do it better? Of course we can: divide the work into manageable software artifacts that model a single, well-defined business activity, and give each one an identity of its own.
It does not need to be a microservice at this stage, but keeping the logic inside a separate, well-defined, easily testable, and decoupled module will give us a huge advantage for future changes in the application.

Building microservices – The fallback strategy

When we design a system, we usually think about the replaceability of the existing components. For example, when using a persistence technology in Java, we tend to lean towards the standards (Java Persistence API (JPA)) so that we can replace the underlying implementation without too much effort. Microservices take the same approach, but they isolate the problem instead of working towards easy replaceability. Also, e-mailing is something that, although it seems simple, always ends up giving problems. Consider that we want to replace Mandrill with a plain SMTP server, such as Gmail. We don't need to do anything special; we just change the implementation and roll out the new version of our microservice, as follows:

    var nodemailer = require('nodemailer');
    var seneca = require("seneca")();

    var transporter = nodemailer.createTransport({
        service: 'Gmail',
        auth: {
            user: 'info@micromerce.com',
            pass: 'verysecurepassword'
        }
    });

    /**
     * Sends an email including the content.
     */
    seneca.add({area: "email", action: "send"}, function(args, done) {
        var mailOptions = {
            from: 'Micromerce Info ✔ <info@micromerce.com>',
            to: args.to,
            subject: args.subject,
            html: args.body
        };
        transporter.sendMail(mailOptions, function(error, info) {
            if (error) {
                return done({code: error}, null);
            }
            done(null, {status: "sent"});
        });
    });

To the outer world, our simplest version of the e-mail sender is now, to all appearances, using SMTP through Gmail to deliver our e-mails. We could even roll out one server with this version and send some traffic to it in order to validate our implementation without affecting all the customers (in other words, contain the failure).

Deploying microservices

Deployment is usually the ugly friend of the software development life cycle party. There is a missing contact point between development and system administration, which DevOps is going to solve in the next few years (or has already solved and no one told me). The cost of fixing software bugs grows steeply across the phases of development, which is why, from continuous integration up to continuous delivery, the process should be automated as much as possible, where as much as possible means 100%. Remember, humans are imperfect: if we rely on humans carrying out a manual, repetitive process to get bug-free software, we are walking the wrong path. Remember that a machine will always be error free (as long as the algorithm that is executed is error free), so why not let a machine control our infrastructure?

Summary

In this article, we saw why microservices are needed in complex software systems, how the monolithic approach arises, and how to build and deploy microservices.

Resources for Article:

Further resources on this subject:
Making a Web Server in Node.js [article]
Node.js Fundamentals and Asynchronous JavaScript [article]
An Introduction to Node.js Design Patterns [article]