How-To Tutorials

article-image-installing-tensorflow-in-windows-ubuntu-and-mac-os

21 Feb 2018

7 min read

Installing TensorFlow in Windows, Ubuntu and Mac OS

21 Feb 2018

[box type="note" align="" class="" width=""]This article is taken from the book Machine Learning with Tensorflow 1.x, written by Quan Hua, Shams Ul Azeem and Saif Ahmed. This book will help tackle common commercial machine learning problems with Google’s TensorFlow 1.x library.[/box] Today, we shall explore the basics of getting started with TensorFlow, its installation and configuration process. The proliferation of large public datasets, inexpensive GPUs, and open-minded developer culture has revolutionized machine learning efforts in recent years. Training data, the lifeblood of machine learning, has become widely available and easily consumable in recent years. Computing power has made the required horsepower available to small businesses and even individuals. The current decade is incredibly exciting for data scientists. Some of the top platforms used in the industry include Caffe, Theano, and Torch. While the underlying platforms are actively developed and openly shared, usage is limited largely to machine learning practitioners due to difficult installations, non-obvious configurations, and difficulty with productionizing solutions. TensorFlow has one of the easiest installations of any platform, bringing machine learning capabilities squarely into the realm of casual tinkerers and novice programmers. Meanwhile, high-performance features, such as—multiGPU support, make the platform exciting for experienced data scientists and industrial use as well. TensorFlow also provides a reimagined process and multiple user-friendly utilities, such as TensorBoard, to manage machine learning efforts. Finally, the platform has significant backing and community support from the world's largest machine learning powerhouse--Google. All this is before even considering the compelling underlying technical advantages, which we'll dive into later. Installing TensorFlow TensorFlow conveniently offers several types of installation and operates on multiple operating systems. The basic installation is CPU-only, while more advanced installations unleash serious horsepower by pushing calculations onto the graphics card, or even to multiple graphics cards. We recommend starting with a basic CPU installation at first. More complex GPU and CUDA installations will be discussed in Appendix, Advanced Installation. Even with just a basic CPU installation, TensorFlow offers multiple options, which are as follows: A basic Python pip installation A segregated Python installation via Virtualenv A fully segregated container-based installation via Docker Ubuntu installation Ubuntu is one of the best Linux distributions for working with Tensorflow. We highly recommend that you use an Ubuntu machine, especially if you want to work with GPU. We will do most of our work on the Ubuntu terminal. We will begin with installing pythonpip and python-dev via the following command: sudo apt-get install python-pip python-dev A successful installation will appear as follows: If you find missing packages, you can correct them via the following command: sudo apt-get update --fix-missing Then, you can continue the python and pip installation. We are now ready to install TensorFlow. The CPU installation is initiated via the following command: sudo pip install tensorflow A successful installation will appear as follows: macOS installation If you use Python, you will probably already have the Python package installer, pip. However, if not, you can easily install it using the easy_install pip command. You'll note that we actually executed sudo easy_install pip—the sudo prefix was required because the installation requires administrative rights. We will make the fair assumption that you already have the basic package installer, easy_install, available; if not, you can install it from https://pypi.python.org/pypi/setuptools. A successful installation will appear as shown in the following screenshot: Next, we will install the six package: sudo easy_install --upgrade six A successful installation will appear as shown in the following screenshot: Surprisingly, those are the only two prerequisites for TensorFlow, and we can now install the core platform. We will use the pip package installer mentioned earlier and install TensorFlow directly from Google's site. The most recent version at the time of writing this book is v1.3, but you should change this to the latest version you wish to use: sudo pip install tensorflow The pip installer will automatically gather all the other required dependencies. You will see each individual download and installation until the software is fully installed. A successful installation will appear as shown in the following screenshot: That's it! If you were able to get to this point, you can start to train and run your first model. Skip to Chapter 2, Your First Classifier, to train your first model. macOS X users wishing to completely segregate their installation can use a VM instead, as described in the Windows installation. Windows installation As we mentioned earlier, TensorFlow with Python 2.7 does not function natively on Windows. In this section, we will guide you through installing TensorFlow with Python 3.5 and set up a VM with Linux if you want to use TensorFlow with Python 2.7. First, we need to install Python 3.5.x or 3.6.x 64-bit from the following links: https://www.python.org/downloads/release/python-352/ https://www.python.org/downloads/release/python-362/ Make sure that you download the 64-bit version of Python where the name of the installation has amd64, such as python-3.6.2-amd64.exe. The Python 3.6.2 installation looks like this: We will select Add Python 3.6 to PATH and click Install Now. The installation process will complete with the following screen: We will click the Disable path length limit and then click Close to finish the Python installation. Now, let's open the Windows PowerShell application under the Windows menu. We will install the CPU-only version of Tensorflow with the following command: pip3 install tensorflow. The result of the installation will look like this: Congratulations, you can now use TensorFlow on Windows with Python 3.5.x or 3.6.x support. In the next section, we will show you how to set up a VM to use TensorFlow with Python 2.7. However, you can skip to the Test installation section of Chapter 2, Your First Classifier, if you don't need Python 2.7. Now, we will show you how to set up a VM with Linux to use TensorFlow with Python 2.7. We recommend the free VirtualBox system available at https://www.virtualbox.org/wiki/Downloads. The latest stable version at the time of writing is v5.0.14, available at the following URL: http:/ / download. virtualbox. org/ virtualbox/ 5. 1. 28/ VirtualBox- 5. 1. 28- 117968- Win. exe A successful installation will allow you to run the Oracle VM VirtualBox Manager dashboard, which looks like this: Testing the installation In this section, we will use TensorFlow to compute a simple math operation. First, open your terminal on Linux/macOS or Windows PowerShell in Windows. Now, we need to run python to use TensorFlow with the following command: python Enter the following program in the Python shell: import tensorflow as tf a = tf.constant(1.0) b = tf.constant(2.0) c = a + b sess = tf.Session() print(sess.run(c)) The result will look like the following screen where 3.0 is printed at the end: We covered TensorFlow installation on the three major operating systems, so that you are up and running with the platform. Windows users faced an extra challenge, as TensorFlow on Windows only supports Python 3.5.x or Python 3.6.x 64-bit version. However, even Windows users should now be up and running. Further get a detailed understanding of implementing Tensorflow with contextual examples in this post. If you liked this article, be sure to check out Machine Learning with Tensorflow 1.x which will help you take up any challenge you may face while implementing TensorFlow 1.x in your machine learning environment.

0
0
13492

article-image-ipv6-unix-domain-sockets-and-network-interfaces

Packt

21 Feb 2018

38 min read

IPv6, Unix Domain Sockets, and Network Interfaces

Packt

21 Feb 2018

38 min read

0
0
22745

Packt

21 Feb 2018

13 min read

Tree Test and Surveys

Packt

21 Feb 2018

13 min read

In this article by Pablo Perea and Pau Giner, authors of the book UX Design for Mobile, we will cover how to use different techniques that can be applied according to the needs of the project and the how to obtain the information that we want. (For more resources related to this topic, see here.) Tree Test This is also called reverse card sorting. This is a method where the participants try to find elements in a given structure. The objective of this method is to discover findability problems and improve the organization and labeling system. The organization structure used should represent a realistic navigation for the application or web page you are evaluating. If you don’t have one by the time the experiment is taking place, try to create a real scenario, as using a fake organization will not lead to really valuable results. There are some platforms available to perform this type of experiment with a computer. One example is https://www.optimalworkshop.com. This can have several advantages: the experiment can be carried out without requiring a physical displacement by the participant, and you can also study the participant steps and not just analyze whether the participant succeeded or not. It can be that the participants found the objectives but had to make many attempts to achieve them. The method steps Create the structure: Type or create the navigation structure with all the different levels you want to evaluate. Create a set of findability tasks: Think about different items that the participant should find or give a location in the given structure. Test with participants: The participant will receive a set of tasks to do. The following are some examples of possible tasks: Find some different products to buy Contact the customer support Get the shipping rates The results: At the end, we should have a success rate for each of the tasks. Tasks such as finding products in a store must be done several times with products located in different sections. This will help us classify our assortment and show us how to organize the first levels of the structure better. How to improve the organization Once we find the main weak points and workarounds we have in our structure, we can create alternative structures to retest and try to find better results. We can repeat this process several times until we get the desired results. The Information Architecture is the science field of organizing and labeling content in a web page to support usability and findability. There's a growing community of Information architecture specialists that supports the Information architecture Institute--https://en.wikipedia.org/wiki/Information_architecture. There are some general lines of work in which we have to invest time in order to improve our Information architecture. Content organization The content can be ordered by following different schemes, in the same way that a supermarket orders products according to different criteria. We should try to find the one that better fits our user needs. We can order the content, dividing it into groups according to nature, goals, audience, chronological entry, and so on. Each of these approaches will lead to different results and each will work better with different kinds of users. In the case of mobile applications, it is common to have certain sections where they mix contents of a different nature, for instance, integrating messages for the user in the contents of the activity view. However, an abuse of these types of techniques can lead to turning the section into a confusing area for the user. Areas naming There are words that have a completely different meaning for one person to another, especially if those people are thinking in different fields when they use our solution. Understanding our user needs, and how the think and speak, will help us provide clear names for sections and subsections. For example, the word pool will represent a different set of products for a person looking for summer products than for a person looking for games. In the case of applications, we will have to find a balance between simplicity and clarity. If space permits, adding a label along with the icon will clarify and reduce the possible ambiguities that may be encountered in recognizing the meaning of these graphic representations. In the case of mobiles, where space is really small, we can find some universal icons, but we must test with users to ensure that they interpret them properly. In the following examples, you can find two different approaches. In the Gmail app, attachment and send are known icons and can work without a label. We find a very different scenario in the Samsung Clock app, where it would be really difficult to differentiate between the Alarm, the Stopwatch, and the Timer without labels: Samsung system and Google Gmail App screenshots (source: Screenshot from Google Gmail App, source: Screenshot from Gmail App) The working memory limit The way the information is displayed to the user can drastically change the ease with which it is understood. When we talk about mobiles, where space is very limited, limiting the number of options and providing a navigation adapted to small spaces can help our user have a more satisfactory experience. As you probably know, the human working memory is not limitless, and it is commonly supposed to be limited to remembering a maximum of seven elements (https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two). Some authors such as Nelson Cowan suggested that the number of elements an adult can remember while performing a task is even lower, and gives the number of reference as four (https://en.wikipedia.org/wiki/Working_memory). This means that your users will understand the information you give them better if you block it into groups according to their limitations. Once we create a new structure, we can evaluate the efficiency of this new structure versus the last version. With small improvements, we will be able to increase the user engagement. Another way to learn about how the user understands the organization of our app or web is by testing a competitor product. This is one of the cheapest ways to have a quick prototype. Evaluate as many versions as you can; in each review, you will find new ideas to organize and show the content of your application or web app better. Surveys Surveys allow us to gather information from lots of participants without too much effort. Sometimes, we need information from a big group of people, and interviewing them one by one will not be affordable. Instead of that, surveys can quickly provide answers from lots of participants and analyze the results with bulk methods. It is not the purpose of this book to deal in depth with the field of questionnaires since there are books devoted entirely to this subject. Nevertheless, we will give some brushstrokes on the subject since they are commonly used to gather information in both web pages and mobile applications. Creating proper questions is a key part of the process that will reduce the noise and help the participants provide useful answers. Some questions will require more effort to analyze, but they will give us answers with deeper level of detail. Questions with pre-established answers are usually easier to automatize, and we can get results in less time. What we want to discover? The first thing to do is to define the objective for which we are making a survey. Working with a clear objective will help the process be focused and will get better results. Plan carefully and determine the information that you really need at that moment. We should avoid surveys with lots of questions that do not have a clear purpose. They will produce poor outcome and result in meaningless exercises for the participants. On the contrary, if we have a general leitmotiv for the questionnaire, it will also help the participants understand how the effort of completing the survey will help the company, and therefore it will give clear value to the time expended. You can plan your survey according to different planning approaches, your questions can be focussed on short and long term goals: Long-term planning: Understanding your users expectations and their view about your product in the long term will help plan the evolution of your application and help create new features that match their needs. For example, imagine that you are designing a music application, and you are unsure about focusing on mass majority music or maybe giving more visibility to amateur groups. Creating a long-term survey can help you understand what your users want to find in your platform and plan changes that match the conclusions extracted from the survey analysis. Short-term planning: This is usually related to operational actions. The objective with these kind of surveys is to gather information for taking actions later with a defined mission. These kind of surveys are useful when we need to choose between two options, that is, whether we are deciding to make a change in our platform or not. For example, it can help to decide what type of information is most important for the user when choosing between one group and another, so we can make that information more visible. We will take better decisions by understanding the main aspects our users will expect to find in our platform. Find the participants Depending on the goal of the survey, we can use a wider range of participants or reduce their number, filtering by their demographics, experience, or the relationship with our brand or products. If the goal is to expand our number of users, it may be interesting to expand the search range to participants outside our current set of users. Looking for new niches and interesting features for our potential users can make new users try out our application. If, on the contrary, our objective is to keep our current users loyal, it can be a great source of improvement to consult them about their preferences and their opinions about the things that work properly in our application. This data, along with the data of use and navigation, will let us see areas for improvement, and we will be able to solve problems of navigation and usability. Determining the questions We can ask different types of questions; depending on the type, we will get more or less detailed answers. If we choose the right type, we can save analysis effort, or we can reduce the number of participants when we require a deep analysis of each response. It is common to include questions at the beginning of the questionnaires in order to classify the results. They are usually called filtering or screening questions, and they will allow us to analyze the answers based on data such as age, gender, or technical skills. These questions have the objective of knowing the person answering the survey. If we know the person solving the questionnaire, we will be able to determine whether the answers given by this user are useful for our goals or not. We can add questions about the experience the participant has with general technology, or with our app, and about the relation with the brand. We can create two kinds of questions based on the type of answers the participant can provide; each of them, therefore, will lead to different results. Open-answer questions The objective of this type of questions is to know more about the participant without guiding the answers. We will try to ask objectively for a subject without providing possible answers. The participant will answer these type of questions with open-ended answers, so it will be easier to know more about how that participant thinks and which aspects are proving more or less satisfactory. While the advantage of this kind of questions is that you will gain a lot of insights and new ideas, the con is the cost of managing big amounts of data. So, these type of questions will be more useful when the number of participants is reduced. Here are some examples of open-answer questions: How often have you been using our customer service? How was your last purchase experience on our platform? Questions with pre-established answers These type of questions facilitate the analysis when the number of participants is high. We will create questions with a clear objective and give different options to respond. Participants will be able to choose one of the options in response. The analysis of these types of questions can be automated and therefore is faster, but it will not give us as detailed information as an open question, in which the participant can expose all his ideas about the matter in the question. The following is an example of a question with pre-established answers: Questions: How many times have you used our application in the last week? Answers: 1) More than five times 2) Two to Five 3) Once 4) None Another great advantage is the facility to answer these types of questions when the participant does not have much time or interest to respond. In environments such as mobile phones, typing long answers can be costly and frustrating. With these types of questions, we can offer answers that the user can select with a single click. This can help increase the number of participants completing the form. It is common to mix both the types of questions. Open-answer questions where the user can respond in more detail can be included as optional questions. The participants willing to share more information can use these fields to introduce more detailed answers. This way, we can make a quicker analysis on the questions with pre-established answers and analyze the questions that require more precise revision later. Humanize the forms When we create a form, we must think about the person who will answer it. Almost no one likes to fill in questionnaires, especially if they are long and complex. To make our participants feel comfortable filling out all the answers on our form, we have to try to treat the process as a human relationship: The first thing we should do is to explain the reason of our form. If our participants understand how their answers will be used in the project, and how they can help us achieve the goal, they will feel more encouraged to answer the questions and take their role seriously. Ask only what is strictly necessary for the purpose you have set for it. We must prevent different departments from introducing questions without a common goal. If the form will answer concerns of different departments, all of them should have the same goal. This way, the form will have more cohesion. The tone used on the form should be friendly and clear. We should not go beyond the limits of indiscretion with our questions, or the participant may feel overwhelmed, especially if the participants of our study are not users of our application or our services, we must treat them as unknown. Being respectful and kind is a key point in getting high participation. Summary In the article we saw how to apply Tree Test and how to conduct surveys to gain information that you want. Resources for Article: Further resources on this subject: Trends UX Design [article] Building Mobile Apps [article] Auditing Mobile Applications [article]

0
0
3495

How-To Tutorials

article-image-some-basic-concepts-theano

Packt

21 Feb 2018

13 min read

Some Basic Concepts of Theano

Packt

21 Feb 2018

13 min read

In this article by Christopher Bourez, the author of the book Deep Learning with Theano, presents Theano as a compute engine, and the basics for symbolic computing with Theano. Symbolic computing consists in building graphs of operations that will be optimized later on for a specific architecture, using the computation libraries available for this architecture. (For more resources related to this topic, see here.) Although this article might sound far from practical application. Theano may be defined as a library for scientific computing; it has been available since 2007 and is particularly suited for deep learning. Two important features are at the core of any deep learning library: tensor operations, and the capability to run the code on CPU or GPU indifferently. These two features enable us to work with massive amount of multi-dimensional data. Moreover, Theano proposes automatic differentiation, a very useful feature to solve a wider range of numeric optimizations than deep learning problems. The content of the article covers the following points: Theano install and loading Tensors and algebra Symbolic programming Need for tensor Usually, input data is represented with multi-dimensional arrays: Images have three dimensions: The number of channels, the width and height of the image Sounds and times series have one dimension: The time length Natural language sequences can be represented by two dimensional arrays: The time length and the alphabet length or the vocabulary length In Theano, multi-dimensional arrays are implemented with an abstraction class, named tensor, with many more transformations available than traditional arrays in a computer language like Python. At each stage of a neural net, computations such as matrix multiplications involve multiple operations on these multi-dimensional arrays. Classical arrays in programming languages do not have enough built-in functionalities to address well and fastly multi-dimensional computations and manipulations. Computations on multi-dimensional arrays have known a long history of optimizations, with tons of libraries and hardwares. One of the most important gains in speed has been permitted by the massive parallel architecture of the Graphical Computation Unit (GPU), with computation ability on a large number of cores, from a few hundreds to a few thousands. Compared to the traditional CPU, for example a quadricore, 12-core or 32-core engine, the gain with GPU can range from a 5x to a 100x times speedup, even if part of the code is still being executed on the CPU (data loading, GPU piloting, result outputing). The main bottleneck with the use of GPU is usually the transfer of data between the memory of the CPU and the memory of the GPU, but still, when well programmed, the use of GPU helps bring a significant increase in speed of an order of magnitude. Getting results in days rather than months, or hours rather than days, is an undeniable benefit for experimentation. Theano engine has been designed to address these two challenges of multi-dimensional array and architecture abstraction from the beginning. There is another undeniable benefit of Theano for scientific computation: the automatic differentiation of functions of multi-dimensional arrays, a well-suited feature for model parameter inference via objective function minimization. Such a feature facilitates the experimentation by releasing the pain to compute derivatives, which might not be so complicated, but prone to many errors. Installing and loading Theano Conda package and environment manager The easiest way to install Theano is to use conda, a cross-platform package and environment manager. If conda is not already installed on your operating system, the fastest way to install conda is to download the miniconda installer from https://conda.io/miniconda.html. For example, for conda under Linux 64 bit and Python 2.7: wget https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh chmod +x Miniconda2-latest-Linux-x86_64.sh bash ./Miniconda2-latest-Linux-x86_64.sh Conda enables to create new environments in which versions of Python (2 or 3) and the installed packages may differ. The conda root environment uses the same version of Python as the version installed on your system on which you installed conda. Install and run Theano on CPU Last, let’s install Theano: conda install theano Run a python session and try the following commands to check your configuration: >>> from theano import theano >>> theano.config.device 'cpu' >>> theano.config.floatX 'float64' >>> print(theano.config) The last command prints all the configuration of Theano. The theano.config object contains keys to many configuration options. To infer the configuration options, Theano looks first at ~/.theanorc file, then at any environment variables available, which override the former options, last at the variable set in the code, that are first in the order of precedence: >>> theano.config.floatX='float32' Some of the properties might be read-only and cannot be changed in the code, but floatX property, that sets the default floating point precision for floats, is among properties that can be changed directly in the code. It is advised to use float32 since GPU have a long history without float64, float64 execution speed on GPU is slower, sometimes much slower (2x to 32x on latest generation Pascal hardware), and that float32 precision is enough in practice. GPU drivers and libraries Theano enables the use of GPU (graphic computation units), the units usually used to compute the graphics to display on the computer screen. To have Theano work on the GPU as well, a GPU backend library is required on your system. CUDA library (for NVIDIA GPU cards only) is the main choice for GPU computations. There exists also the OpenCL standard, which is opensource, but far less developed, and much more experimental and rudimentary on Theano. Most of the scientific computations still occur on NVIDIA cards today. If you have a NVIDIA GPU card, download CUDA from the NVIDIA website at https://developer.nvidia.com/cuda-downloads and install it. The installer will install the lastest version of the gpu drivers first if they are not already installed. It will install the CUDA library in /usr/local/cuda directory. Install the cuDNN library, a library by NVIDIA also, that offers faster implementations of some operations for the GPU To install it, I usually copy /usr/local/cuda directory to a new directory /usr/local/cuda-{CUDA_VERSION}-cudnn-{CUDNN_VERSION} so that I can choose the version of CUDA and cuDNN, depending on the deep learning technology I use, and its compatibility. In your .bashrc profile, add the following line to set $PATH and $LD_LIBRARY_PATH variables: export PATH=/usr/local/cuda-8.0-cudnn-5.1/bin:$PATH export LD_LIBRARY_PATH=/usr/local/cuda-8.0-cudnn-5.1/lib64::/usr/local/cuda-8.0-cudnn-5.1/lib:$LD_LIBRARY_PATH Install and run Theano on GPU N-dimensional GPU arrays have been implemented in Python under 6 different GPU library (Theano/CudaNdarray,PyCUDA/ GPUArray,CUDAMAT/ CUDAMatrix, PYOPENCL/GPUArray, Clyther, Copperhead), are a subset of NumPy.ndarray. Libgpuarray is a backend library to have them in a common interface with the same property. To install libgpuarray with conda: conda install pygpu To run Theano in GPU mode, you need to configure the config.device variable before execution since it is a read-only variable once the code is run. With the environment variable THEANO_FLAGS: THEANO_FLAGS="device=cuda,floatX=float32" python >>> import theano Using cuDNN version 5110 on context None Mapped name None to device cuda: Tesla K80 (0000:83:00.0) >>> theano.config.device 'gpu' >>> theano.config.floatX 'float32' The first return shows that GPU device has been correctly detected, and specifies which GPU it uses. By default, Theano activates CNMeM, a faster CUDA memory allocator, an initial preallocation can be specified with gpuarra.preallocate option. At the end, my launch command will be: THEANO_FLAGS="device=cuda,floatX=float32,gpuarray.preallocate=0.8" python >>> from theano import theano Using cuDNN version 5110 on context None Preallocating 9151/11439 Mb (0.800000) on cuda Mapped name None to device cuda: Tesla K80 (0000:83:00.0) The first line confirms that cuDNN is active, the second confirms memory preallocation. The third line gives the default context name (that is None when the flag device=cuda is set) and the model of the GPU used, while the default context name for the CPU will always be cpu. It is possible to specify a different GPU than the first one, setting the device to cuda0, cuda1,... for multi-GPU computers. It is also possible to run a program on multiple GPU in parallel or in sequence (when the memory of one GPU is not sufficient), in particular when training very deep neural nets. In this case, the context flag contexts=dev0->cuda0;dev1->cuda1;dev2->cuda2;dev3->cuda3 activates multiple GPU instead of one, and designate the context name to each GPU device to be used in the code. For example, on a 4-GPU instance: THEANO_FLAGS="contexts=dev0->cuda0;dev1->cuda1;dev2->cuda2;dev3->cuda3,floatX=float32,gpuarray.preallocate=0.8" python >>> import theano Using cuDNN version 5110 on context None Preallocating 9177/11471 Mb (0.800000) on cuda0 Mapped name dev0 to device cuda0: Tesla K80 (0000:83:00.0) Using cuDNN version 5110 on context dev1 Preallocating 9177/11471 Mb (0.800000) on cuda1 Mapped name dev1 to device cuda1: Tesla K80 (0000:84:00.0) Using cuDNN version 5110 on context dev2 Preallocating 9177/11471 Mb (0.800000) on cuda2 Mapped name dev2 to device cuda2: Tesla K80 (0000:87:00.0) Using cuDNN version 5110 on context dev3 Preallocating 9177/11471 Mb (0.800000) on cuda3 Mapped name dev3 to device cuda3: Tesla K80 (0000:88:00.0) To assign computations to a specific GPU in this multi-GPU setting, the names we choose dev0, dev1, dev2, and dev3 have been mapped to each device (cuda0, cuda1, cuda2, cuda3). This name mapping enables to write codes that are independent of the underlying GPU assignments and libraries (CUDA or other). To keep the current configuration flags active at every Python session or execution without using environment variables, save your configuration in the ~/.theanorc file as: [global] floatX = float32 device = cuda0 [gpuarray] preallocate = 1 Now, you can simply run python command. You are now all set. Tensors In Python, some scientific libraries such as NumPy provide multi-dimensional arrays. Theano doesn't replace Numpy but works in concert with it. In particular, NumPy is used for the initialization of tensors. To perform the computation on CPU and GPU indifferently, variables are symbolic and represented by the tensor class, an abstraction, and writing numerical expressions consists in building a computation graph of Variable nodes and Apply nodes. Depending on the platform on which the computation graph will be compiled, tensors are replaced either: By a TensorType variable, which data has to be on CPU By a GpuArrayType variable, which data has to be on GPU That way, the code can be written indifferently of the platform where it will be executed. Here are a few tensor objects: Object class Number of dimensions Example theano.tensor.scalar 0-dimensional array 1, 2.5 theano.tensor.vector 1-dimensional array [0,3,20] theano.tensor.matrix 2-dimensional array [[2,3][1,5]] theano.tensor.tensor3 3-dimensional array [[[2,3][1,5]],[[1,2],[3,4]]] Playing with these Theano objects in the Python shell gives a better idea: >>> import theano.tensor as T >>> T.scalar() <TensorType(float32, scalar)> >>> T.iscalar() <TensorType(int32, scalar)> >>> T.fscalar() <TensorType(float32, scalar)> >>> T.dscalar() <TensorType(float64, scalar)> With a i, l, f, d letter in front of the object name, you initiate a tensor of a given type, integer32, integer64, floats32 or float64. For real-valued (floating point) data, it is advised to use the direct form T.scalar() instead of the f or d variants since the direct form will use your current configuration for floats: >>> theano.config.floatX = 'float64' >>> T.scalar() <TensorType(float64, scalar)> >>> T.fscalar() <TensorType(float32, scalar)> >>> theano.config.floatX = 'float32' >>> T.scalar() <TensorType(float32, scalar)> Symbolic variables either: Play the role of placeholders, as a starting point to build your graph of numerical operations (such as addition, multiplication): they receive the flow of the incoming data during the evaluation, once the graph has been compiled Represent intermediate or output results Symbolic variables and operations are both part of a computation graph that will be compiled either towards CPU or GPU for fast execution. Let's write a first computation graph consisting in a simple addition: >>> x = T.matrix('x') >>> y = T.matrix('y') >>> z = x + y >>> theano.pp(z) '(x + y)' >>> z.eval({x: [[1, 2], [1, 3]], y: [[1, 0], [3, 4]]}) array([[ 2., 2.], [ 4., 7.]], dtype=float32) At first place, two symbolic variables, or Variable nodes are created, with names x and y, and an addition operation, an Apply node, is applied between both of them, to create a new symbolic variable, z, in the computation graph. The pretty print function pp prints the expression represented by Theano symbolic variables. Eval evaluates the value of the output variable z, when the first two variables x and y are initialized with two numerical 2-dimensional arrays. The following example explicit the difference between the variables x and y, and their names x and y: >>> a = T.matrix() >>> b = T.matrix() >>> theano.pp(a + b) '(<TensorType(float32, matrix)> + <TensorType(float32, matrix)>)' Without names, it is more complicated to trace the nodes in a large graph. When printing the computation graph, names significantly helps diagnose problems, while variables are only used to handle the objects in the graph: >>> x = T.matrix('x') >>> x = x + x >>> theano.pp(x) '(x + x)' Here the original symbolic variable, named x, does not change and stays part of the computation graph. x + x creates a new symbolic variable we assign to the Python variable x. Note also, that with names, the plural form initializes multiple tensors at the same time: >>> x, y, z = T.matrices('x', 'y', 'z') Now, let's have a look at the different functions to display the graph. Summary Thus, this article helps us to give a brief idea on how to download and install Theano on various platforms along with the packages such as NumPy and SciPy. Resources for Article: Further resources on this subject: Introduction to Deep Learning [article] Getting Started with Deep Learning [article] Practical Applications of Deep Learning [article]

0
0
5151

article-image-exchange-management-shell-common-tasks

Packt

21 Feb 2018

11 min read

Exchange Management Shell Common Tasks

Packt

21 Feb 2018

11 min read

0
0
9201

article-image-how-to-use-standard-macro-in-workflows

Sunith Shetty

21 Feb 2018

6 min read

How to use Standard Macro in Workflows

Sunith Shetty

21 Feb 2018

6 min read

[box type="note" align="" class="" width=""]This article is an excerpt from a book written by Renato Baruti titled Learning Alteryx. In this book, you will learn how to perform self-analytics and create interactive dashboards using various tools in Alteryx.[/box] Today we will learn Standard Macro that will provide you with a foundation for building enhanced workflows. The csv file required for this tutorial is available to download here. Standard Macro Before getting into Standard Macro, let's define what a macro is. A macro is a collection of workflow tools that are grouped together into one tool. Using a range of different interface tools, a macro can be developed and used within a workflow. Any workflow can be turned into a macro and a repeatable element of a workflow can commonly be converted into a macro. There are a couple of ways you can turn your workflow into a Standard Macro. The first is to go to the canvas configuration pane and navigate to the Workflow tab. This is where you select what type of workflow you want. If you select Macro you should then have Standard Macro automatically selected. Now, when you save this workflow it will save as a macro. You’ll then be able to add it to another workflow and run the process created within the macro itself. The second method is just to add a Macro Input tool from the Interface tool section onto the canvas; the workflow will then automatically change to a Standard Macro. The following screenshot shows the selection of a Standard Macro, under the Workflow tab: Let's go through an example of creating and deploying a standard macro. Standard Macro Example #1: Create a macro that allows the user to input a number used as a multiplier. Use the multiplier for the DataValueAlt field. The following steps demonstrate this process: Step 1: Select the Macro Input tool from the Interface tool palette and add the tool onto the canvas. The workflow will automatically change to a Standard Macro. Step 2: Select Text Input and Edit Data option within the Macro Input tool configuration. Step 3: Create a field called Number and enter the values: 155, 243, 128, 352, and 357 in each row, as shown in the following image: Step 4: Rename the Input Name Input and set the Anchor Abbreviation as I as shown in the following image: Step 5: Select the Formula tool from the Preparation tool palette. Connect the Formula tool to the Macro Input tool. Step 6: Select the + Add Column option in the Select Column drop down within the Formula tool configuration. Name the field Result. Step 7: Add the following expression to the expression window: [Number]*0.50 Step 8: Select the Macro Output tool from the Interface tool palette and add the tool onto the canvas. Connect the Macro Output tool to the Formula tool. Step 9: Rename the Output Name Output and set the Anchor Abbreviation as O: The Standard Macro has now been created. It can be saved to use as multiplier, to calculate the five numbers added within the Macro Input tool to multiply 0.50. This is great; however, let's take it a step further to make it dynamic and flexible by allowing the user to enter a multiplier. For instance, currently the multiplier is set to 0.50, but what if a user wants to change that to 0.25 or 0.10 to determine the 25% or 10% value of a field. Let's continue building out the Standard Macro to make this possible. Step 1: Select the Text Box tool from the Interface tool palette and drag it onto the canvas. Connect the Text Box tool to the Formula tool on the lightning bolt (the macro indicator). The Action tool will automatically be added to the canvas, as this automatically updates the configuration of a workflow with values provided by interface questions when run as an app or macro. Step 2: Configure the Action tool that will automatically update the expression replaced by a specific field. Select Formula | FormulaFields | FormulaField | @expression - value="[Number]*0.50". Select the Replace a specific string: option and enter 0.50. This is where the automation happens, updating the 0.50 to any number the user enters. You will see how this happens in the following steps: Step 3: In the Enter the text or question to be displayed text box, within the Text Box tool configuration, enter: Please enter a number: Step 4: Save the workflow as Standard Macro.yxmc. The .yxmc file type indicates it's a macro related workflow, as shown in the following image: Step 5: Open a new workflow. Step 6: Select the Input Data tool from the In/Out tool palette and connect to the U.S. Chronic Disease Indicators.csv file. Step 7: Select the Select tool from the Preparation tool palette and drag it onto the canvas. Connect the Select tool to the Input Data tool. Step 8: Change the Data Type for the DataValueAlt field to Double. Step 9: Right-click on the canvas and select Insert | Macro | Standard Macro. Step 10: Connect the Standard Macro to the Select tool. Step 11: There will be Questions to select within the Standard Macro tool configuration. Select DataValueAlt (Double) as the Choose Field option and enter 0.25 in the Please enter a number text box: Step 12: Add a Browse tool to the Standard Macro tool. Step 13: Run the workflow: The goal for creating this Standard Macro was to allow the user to select what they would like the multiplier to be rather than a static number. Let's recap what has been created and deployed using a Standard Macro. First, Standard Macro.yxmc was developed using Interface tools. The Macro Input (I) was used to enter sample text data for the Number field. This Number field is what is used to multiply to what the given multiplier is - in this case, 0.50. This is the static number multiplier. The Formula tool was used to create the expression to conclude that the Number field will be multiplied by 0.50. The Macro Output (O) was used to output the macro so that it can be used in another workflow. The Text Box tool is where the question Please enter a number will be displayed, along with the Action tool that is used to update the specific value replaced. The current multiplier, 0.50, is replaced by 0.25, as identified in step 20, through a dynamic input by which the user can enter the multiplier. Notice that, in the Browse tool output, the Result field has been added, multiplying the values for the DataValueAlt field to the multiplier 0.25. Change the value in the macro to 0.10 and run the workflow. The Result field has been updated to now multiple the values for the DataValueAlt field to the multiplier 0.10. This is a great use case of a Standard Macro and demonstrates how versatile the Interface tools are. We learned about macros and their dynamic use within workflows. We saw how Standard Macro was developed to allow the end user to specify what they want the multiplier to be. This is a great way to implement the interactivity within a workflow. To know more about high-quality interactive dashboards and efficient self-service data analytics, do checkout this book Learning Alteryx.

0
0
6426

Packt

21 Feb 2018

6 min read

Play With Functions

Packt

21 Feb 2018

6 min read

This article by Igor Wojda and Marcin Moskala, authors of the book Android Development with Kotlin, introduces functions in Kotlin, together with different ways of calling functions. (For more resources related to this topic, see here.) Single-expression functions During typical programming, many functions contain only one expression. Here is example of this kind of function: fun square(x: Int): Int { return x * x } Or another one, which can be often found in Android projects. It is pattern used in Activity, to define methods that are just getting text from some view or providing some other data from view to allow presenter to get them: fun getEmail(): String { return emailView.text.toString() } Both functions are defined to return result of single expression. In first example it is result of x * x multiplication, and in second one it is result of expression emailView.text.toString(). This kind of functions are used all around Android projects. Here are some common use-cases: Extracting some small operations Using polymorphism to provide values specific to class Functions that are only creating some object Functions that are passing data between architecture layers (like in preceding example Activity is passing data from view to presenter) Functional programming style functions that base on recurrence Such functions are often used, so Kotlin has notation for this kind of them. When a function returns a single expression, then curly braces and body of the function can be omitted. We specify expression directly using equality character. Functions defined this way are called single-expression functions. Let's update our square function, and define it as a single-expression function: As we can see, single expression function have expression body instead of block body. This notation is shorter, but whole body needs to be just a single expression. In single-expression functions, declaring return type is optional, because it can be inferred by the compiler from type of expression. This is why we can simplify use square function, and define it this way: fun square(x: Int) = x * x There are many places inside Android application where we can utilize single expression functions. Let's consider RecyclerView adapter that is providing layout ID and creating ViewHolder: class AddressAdapter : ItemAdapter<AddressAdapter.ViewHolder>() { override fun getLayoutId() = R.layout.choose_address_view override fun onCreateViewHolder(itemView: View) = ViewHolder(itemView) // Rest of methods } In the following example, we achieve high readability thanks to single expression function. Single-expression functions are also very popular in the functional world. Single expression function notation is also well-pairing with when structure. Here example of their connection, used to get specific data from object according to key (use-case from big Kotlin project): fun valueFromBooking(key: String, booking: Booking?) = when(key) { // 1 "patient.nin" -> booking?.patient?.nin "patient.email" -> booking?.patient?.email "patient.phone" -> booking?.patient?.phone "comment" -> booking?.comment else -> null } We don't need a type, because it is inferred from when expression. Another common Android example is that we can combine when expression with activity method onOptionsItemSelected that handles top bar menu clicks: override fun onOptionsItemSelected(item: MenuItem): Boolean = when { item.itemId == android.R.id.home -> { onBackPressed() true } else -> super.onOptionsItemSelected(item) } As we can see, single expression functions can make our code more concise and improved readability. Single-expression functions are commonly used in Kotlin Android projects and they are really popular for functional programming. As an example. Let's suppose that we need to filter all the odd values from following list: val list = listOf(1, 2, 3, 4, 5) We will use following helper function that returns true if argument is odd otherwise it returns false: fun isOdd(i: Int) = i % 2 == 1 In imperative programming style, we should specify steps of processing, which are connected to execution process (iterate through list, check if value is odd, add value to one list if it's odd). Here is implementation of this functionality, that is typical for imperative style: var oddList = emptyList<Int>() for(i in list) { if(isOdd(i)) { newList += i } } In declarative programming style, the way of thinking about code is different - we should think what is the required result and simply use functions that will give us this result. Kotlin stdlib provides lot of functions supporting declarative programming style. Here is how we could implement the same functionality using one of them, called filter: var oddList = list.filter(::isOdd) filter is function that leaves only elements that are true according to predicate. Here function isOdd is used as an predicate. Different ways of calling a function Sometimes we need to call function and provide only selected arguments. In Java we could create multiple overloads of the same method, but this solution have there are some limitations. First problem is that number of possible method permutations is growing very quickly (2n) making them very difficult to maintain. Second problem is that overloads must be distinguishable from each other, so compiler will know which overload to call, so when method defines few parameters with the same type we can't define all possible overloads. That's why in Java we often need to pass multiple null values to a method: // Java printValue("abc", null, null, "!"); Multiple null parameters provide boilerplate and greatly decrease method readability. In Kotlin there is no such problem, because Kotlin has feature called default arguments and named argument syntax. Default arguments values Default arguments are mostly known from C++, which is one of the oldest languages supporting it. Default argument provides a value for a parameter in case it is not provided during method call. Each function parameter can have default value. It might be any value that is matching specified type including null. This way we can simply define functions that can be called in multiple ways We can use this function the same way as normal function (function without default argument values) by providing values for each parameter (all arguments): printValue("str", true, "","") // Prints: (str) Thanks to default argument values, we can call a function by providing arguments only for parameters without default values: printValue("str") // Prints: (str) We can also provide all parameters without default values, and only some that have a default value: printValue("str", false) // Prints: str Named arguments syntax Sometimes we want only to pass value for last argument. Let's suppose that we define want to define value for suffix, but not for prefix and inBracket (which are defined before suffix). Normally we would have to provide values for all previous parameters including the default parameter values: printValue("str", true, true, "!") // Prints: (str) By using named argument syntax, we can pass specific argument using argument name: printValue("str", suffix = "!") // Prints: (str)! We can also use named argument syntax together with classic call. The only restriction is when we start using named syntax we cannot use classic one for next arguments we are serving: printValue("str", true, "") printValue("str", true, prefix = "") printValue("str", inBracket = true, prefix = "") Summary In this article, we learned about single expression functions as a type of defining functions in application development. We also briefly explained Resources for Article: Further resources on this subject: Getting started with Android Development [article] Android Game Development with Unity3D [article] Kotlin Basics [article]

0
0
30011

How-To Tutorials

Packt

21 Feb 2018

9 min read

API Gateway and its Need

Packt

21 Feb 2018

9 min read

In this article by Umesh R Sharma, author of the book Practical Microservices, we will cover API Gateway and its need with simple and short examples. (For more resources related to this topic, see here.) Dynamic websites show a lot on a single page, and there is a lot of information that needs to be shown on the page. The common success order summary page shows the cart detail and customer address. For this, frontend has to fire a different query to the customer detail service and order detail service. This is a very simple example of having multiple services on a single page. As a single microservice has to deal with only one concern, in result of that to show much information on page, there are many API calls on the same page. So, a website or mobile page can be very chatty in terms of displaying data on the same page. Another problem is that, sometimes, microservice talks on another protocol, then HTTP only, such as thrift call and so on. Outer consumers can't directly deal with microservice in that protocol. As a mobile screen is smaller than a web page, the result of the data required by the mobile or desktop API call is different. A developer would want to give less data to the mobile API or have different versions of the API calls for mobile and desktop. So, you could face a problem such as this: each client is calling different web services and keeping track of their web service and developers have to give backward compatibility because API URLs are embedded in clients like in mobile app. Why do we need the API Gateway? All these preceding problems can be addressed with the API Gateway in place. The API Gateway acts as a proxy between the API consumer and the API servers. To address the first problem in that scenario, there will only be one call, such as /successOrderSummary, to the API Gateway. The API Gateway, on behalf of the consumer, calls the order and user detail, then combines the result and serves to the client. So basically, it acts as a facade or API call, which may internally call many APIs. The API Gateway solves many purposes, some of which are as follows. Authentication API Gateways can take the overhead of authenticating an API call from outside. After that, all the internal calls remove security check. If the request comes from inside the VPC, it can remove the check of security, decrease the network latency a bit, and make the developer focus more on business logic than concerning about security. Different protocol Sometimes, microservice can internally use different protocols to talk to each other; it can be thrift call, TCP, UDP, RMI, SOAP, and so on. For clients, there can be only one rest-based HTTP call. Clients hit the API Gateway with the HTTP protocol and the API Gateway can make the internal call in required protocol and combine the results in the end from all web service. It can respond to the client in required protocol; in most of the cases, that protocol will be HTTP. Load-balancing The API Gateway can work as a load balancer to handle requests in the most efficient manner. It can keep a track of the request load it has sent to different nodes of a particular service. Gateway should be intelligent enough to load balances between different nodes of a particular service. With NGINX Plus coming into the picture, NGINX can be a good candidate for the API Gateway. It has many of the features to address the problem that is usually handled by the API Gateway. Request dispatching (including service discovery) One main feature of the gateway is to make less communication between client and microservcies. So, it initiates the parallel microservices if that is required by the client. From the client side, there will only be one hit. Gateway hits all the required services and waits for the results from all services. After obtaining the response from all the services, it combines the result and sends it back to the client. Reactive microservice designs can help you achieve this. Working with service discovery can give many extra features. It can mention which is the master node of service and which is the slave. Same goes for DB in case any write request can go to the master or read request can go to the slave. This is the basic rule, but users can apply so many rules on the basis of meta information provided by the API Gateway. Gateway can record the basic response time from each node of service instance. For higher priority API calls, it can be routed to the fastest responding node. Again, rules can be defined on the basis of the API Gateway you are using and how it will be implemented. Response transformation Being a first and single point of entry for all API calls, the API Gateway knows which type of client is calling a mobile, web client, or other external consumer; it can make the internal call to the client and give the data to different clients as per needs and configuration. Circuit breaker To handle the partial failure, the API Gateway uses a technique called circuit breaker pattern. A service failure in one service can cause the cascading failure in the flow to all the service calls in stack. The API Gateway can keep an eye on some threshold for any microservice. If any service passes that threshold, it marks that API as open circuit and decides not to make the call for a configured time. Hystrix (by Netflix) served this purpose efficiently. Default value in this is failing of 20 requests in 5 seconds. Developers can also mention the fall back for this open circuit. This fall back can be of dummy service. Once API starts giving results as expected, then gateway marks it as a closed service again. Pros and cons of API Gateway Using the API Gateway itself has its own pros and cons. In the previous section, we have described the advantages of using the API Gateway already. I will still try to make them in points as the pros of the API Gateway. Pros Microservice can focus on business logic Clients can get all the data in a single hit Authentication, logging, and monitoring can be handled by the API Gateway Gives flexibility to use completely independent protocols in which clients and microservice can talk It can give tailor-made results, as per the clients needs It can handle partial failure Addition to the preceding mentioned pros, some of the trade-offs are also to use this pattern. Cons It can cause performance degrade due to lots of happenings on the API Gateway With this, discovery service should be implemented Sometimes, it becomes the single point of failure Managing routing is an overhead of the pattern Adding additional network hope in the call Overall. it increases the complexity of the system Too much logic implementation in this gateway will lead to another dependency problem So, before using the API Gateway, both of the aspects should be considered. Decision of including the API Gateway in the system increases the cost as well. Before putting effort, cost, and management in this pattern, it is recommended to analysis how much you can gain from it. Example of API Gateway In this example, we will try to show only sample product pages that will fetch the data from service product detail to give information about the product. This example can be increased in many aspects. Our focus of this example is to only show how the API Gateway pattern works; so we will try to keep this example simple and small. This example will be using Zuul from Netflix as an API Gateway. Spring also had an implementation of Zuul in it, so we are creating this example with Spring Boot. For a sample API Gateway implementation, we will be using http://start.spring.io/ to generate an initial template of our code. Spring initializer is the project from Spring to help beginners generate basic Spring Boot code. A user has to set a minimum configuration and can hit the Generate Project button. If any user wants to set more specific details regarding the project, then they can see all the configuration settings by clicking on the Switch to the full version button, as shown in the following screenshot: Let's create a controller in the same package of main application class and put the following code in the file: @SpringBootApplication @RestController public class ProductDetailConrtoller { @Resource ProductDetailService pdService; @RequestMapping(value = "/product/{id}") public ProductDetail getAllProduct( @PathParam("id") String id) { return pdService.getProductDetailById(id); } } In the preceding code, there is an assumption of the pdService bean that will interact with Spring data repository for product detail and get the result for the required product ID. Another assumption is that this service is running on port 10000. Just to make sure everything is running, a hit on a URL such as http://localhost:10000/product/1 should give some JSON as response. For the API Gateway, we will create another Spring Boot application with Zuul support. Zuul can be activated by just adding a simple @EnableZuulProxy annotation. The following is a simple code to start the simple Zuul proxy: @SpringBootApplication @EnableZuulProxy public class ApiGatewayExampleInSpring { public static void main(String[] args) { SpringApplication.run(ApiGatewayExampleInSpring.class, args); } } Rest all the things are managed in configuration. In the application.properties file of the API Gateway, the content will be something as follows: zuul.routes.product.path=/product/** zuul.routes.produc.url=http://localhost:10000 ribbon.eureka.enabled=false server.port=8080 With this configuration, we are defining rules such as this: for any request for a URL such as /product/xxx, pass this request to http://localhost:10000. For outer world, it will be like http://localhost:8080/product/1, which will internally be transferred to the 10000 port. If we defined a spring.application.name variable as product in product detail microservice, then we don't need to define the URL path property here (zuul.routes.product.path=/product/** ), as Zuul, by default, will make it a URL/product. The example taken here for an API Gateway is not very intelligent, but this is a very capable API Gateway. Depending on the routes, filter, and caching defined in the Zuul's property, one can make a very powerful API Gateway. Summary In this article, you learned about the API Gateway, its need, and its pros and cons with the code example. Resources for Article: Further resources on this subject: What are Microservices? [article] Microservices and Service Oriented Architecture [article] Breaking into Microservices Architecture [article]

0
0
31663

article-image-react-native-really-native-framework

Packt

21 Feb 2018

11 min read

Is React Native is really Native framework?

Packt

21 Feb 2018

11 min read

This article by Vladimir Novick, author of the book React Native - Building Mobile Apps with JavaScript, introduces the concept of how the the React Native is really a Native framework, it's working, information flow, architecture, and benefits. (For more resources related to this topic, see here.) Introduction So how React Native is different? Well it doesn’t fall under hybrid category because the approach is different. When hybrid apps are trying to make platform specific features reusable between platforms, React Native have platform independent features, but also have lots of specific platform implementations. Meaning that on iOS and on Android code will look different, but somewhere between 70-90 percent of code will be reused. Also React Native does not depend on HTML or CSS. You write in JavaScript, but this JavaScript is compiled to platform specific Native code using React Native bridge. It happens all the time, but it’s optimize to a way, that application will run smoothly in 60fps. So to summarize React Native is not really a Native framework, but It’s much closer to Native code, than hybrid apps. And now let’s dive a bit deeper and understand how JavaScript gets converted into a Native code. How React Native bridge from JavaScript to Native world works? Let’s dive a bit deeper and understand how React Native works under the hood, which will help us understand how JavaScript is compiled to a Native code and how the whole process works. It’s crucial to know how the whole process works, so if you will have performance issues one day, you will understand where it originates. Information flow So we’ve talked about React concepts that power up React Native and one of them is that UI is a function of data. You change the state and React knows what to update. Let’s visualize now how information flows through common React app. Check out the diagram: We have React component, which passes data to three child components Under the hood what is happening is, Virtual DOM tree is created representing these component hierarchy When state of the parent component is updated, React knows how to pass information to the children Since children are basically representation of UI, React figures out how to batch Browser DOM updates and executes them So now let’s remove Browser DOM and think that instead of batching Browser DOM updates, React Native does the same with calls to Native modules. So what about passing information down to Native modules? It can be done in two ways: Shared mutable data Serializable messages exchanged between JavaScript and Native modules React Native is going with the second approach. Instead of mutating data on shareable objects it passes asynchronous serialized batched messages to React Native Bridge. Bridge is the layer that is responsible for glueing together Native and JavaScript environments. Architecture Let’s take a look at the following diagram, which explains how React Native Architecture is structured and walk through the diagram: In diagram, pictured three layers: Native, Bridge and JavaScript. Native layer is pictured at the last in picture, because the layer that is closer to device itself. Bridge is the layer that connects between JavaScript and Native modules and basically is a transport layer that transport asynchronous serialized batched response messages from JavaScript to Native modules. When event is executed on Native layer. It can be touch, timer, network request. Basically any event involving device Native modules, It’s data is collected and is sent to the Bridge as a serialized message. Bridge pass this message to JavaScript layer. JavaScript layer is an event loop. Once Bridge passes Serialized payload to JavaScript, Event is processed and your application logic comes into play. If you update state, triggering your UI to re-render for example, React Native will batch Update UI and send them to the Bridge. Bridge will pass this Serialized batched response to Native layer, which will process all commands, that it can distinguish from serialized batched response and will Update UI accordingly. Threading model Up till now we’ve seen that there are lots of stuff going on under the hood of React Native. It’s important to know that everything is done on three main threads: UI (application main thread) Native modules JavaScript Runtime UI thread is the main Native thread where Native level rendering occurs. It is here, where your platform of choice, iOS or Android, does measures, layouting, drawing. If your application accesses any Native APIs, it’s done on a separate Native modules thread. For example, if you want to access the camera, Geo location, photos, and any other Native API. Panning and gestures in general are also done on this thread. JavaScript Runtime thread is the thread where all your JavaScript code will run. It’s slower than UI thread since it’s based on JavaScript event loop, so if you do complex calculations in your application, that leads to lots of UI changes, these can lead to bad performance. The rule of thumb is that if your UI will change slower than 16.67ms, then UI will appear sluggish. What are benefits of React Native? React Native brings with it lots of advantages for mobile development. We covered some of them briefly before, but let’s go over now in more detail. These advantages are what made React Native so popular and trending nowadays. And most of all it give web developers to start developing Native apps with relatively short learning curve compared to overhead learning Objective-C and Java. Developer experience One of the amazing changes React Native brings to mobile development world is enhancing developer experience. If we check developer experience from the point of view of web developer, it’s awesome. For mobile developer it’s something that every mobile developer have dreamt of. Let’s go over some of the features React Native brings for us out of the box. Chrome DevTools debugging Every web developer is familiar with Chrome Developer tools. These tools give us amazing experience debugging web applications. In mobile development debugging mobile applications can be hard. Also it’s really dependent on your target platform. None of mobile application debugging techniques does not even come near web development experience. In React Native, we already know, that JavaScript event loop is running on a separate thread and it can be connected to Chrome DevTools. By clicking Ctrl/Cmd + D in application simulator, we can attach our JavaScript code to Chrome DevTools and bring web debugging to a mobile world. Let’s take a look at the following screenshot: Here you see a React Native debug tools. By clicking on Debug JS Remotely, a separate Google Chrome window is opened where you can debug your applications by setting breakpoints, profiling CPU and memory usage and much more. Elements tab in Chrome Developer tools won’t be relevant though. For that we have a different option. Let’s take a look at what we will get with Chrome Developer tools Remote debugger. Currently Chrome developer tools are focused on Sources tab. You can notice that JavaScript is written in ECMAScript 2015 syntax. For those of you who are not familiar with React JSX, you see weird XML like syntax. Don’t worry, this syntax will be also covered in the book in the context of React Native. If you put debugger inside your JavaScript code, or a breakpoint in your Chrome development tools, the app will pause on this breakpoint or debugger and you will be able to debug your application while it’s running. Live reload As you can see in React Native debugging menu, the third row says Live Reload. If you enable this option, whenever you change your code and save, the application will be automatically reloaded. This ability to Live reload is something mobile developers only dreamt of. No need to recompile application after each minor code change. Just save and the application will reload itself in simulator. This greatly speed up application development and make it much more fun and easy than conventional mobile development. The workflow for every platform is different while in React Native the experience is the same. Does not matter for which platform you develop. Hot reload Sometimes you develop part of the application which requires several user interactions to get to. Think of, for example logging in, opening menu and choosing some option. When we change our code and save, while live reload is enabled, our application is reloaded and we need to once again do these steps. But it does not have to be like that. React Native gives us amazing experience of hot reloading. If you enable this option in React Native development tools and if you change your React Native component, only the component will be reloaded while you stay on the same screen you were before. This speeds up the development process even more. Component hierarchy inspections I’ve said before, that we cannot use elements panel in Chrome development tools, but how you inspect your component structure in React Native apps? React Native gives us built in option in development tools called Show Inspector. When clicking it, you will get the following window: After inspector is opened, you can select any component on the screen and inspect it. You will get the full hierarchy of your components as well as their styling: In this example I’ve selected Welcome to React Native! text. In the opened pane I can see it’s dimensions, padding margin as well as component hierarchy. As you can see it’s IntroApp/Text/RCTText. RCTText is not a React Native JavaScript component, but a Native text component, connected to React Native bridge. In that way you also can see that component is connected to a Native text component. There are even more dev tools available in React Native, that I will cover later on, but we all can agree, that development experience is outstanding. Web inspired layout techniques Styling for Native mobile apps can be really painful sometimes. Also it’s really different between iOS and Android. React Native brings another solution. As you may’ve seen before the whole concept of React Native is bringing web development experience to mobile app development. That’s also the case for creating layouts. Modern way of creating layout for the web is by using flexbox. React Native decided to adopt this modern technique for web and bring it also to the mobile world with small differences. In addition to layouting, all styling in React Native is very similar to using inline styles in HTML. Let’s take a look at example: const styles = StyleSheet.create({ container: { flex: 1, justifyContent: 'center', alignItems: 'center', backgroundColor: '#F5FCFF', }); As you can see in this example, there are several properties of flexbox used as well as background color. This really reminds CSS properties, however instead of using background-color, justify-content and align-items, CSS properties are named in a camel case manner. In order to apply these styles to text component for example. It’s enough to pass them as following: <Text styles={styles.container}>Welcome to React Native </Text> Styling will be discussed in the book, however as you can see from example before , styling techniques are similar to web. They are not dependant on any platform and the same for both iOS and Android Code reusability across applications In terms of code reuse, if an application is properly architectured (something we will also learn in this book), around 80% to 90% of code can be reused between iOS and Android. This means that in terms of development speed React Native beats mobile development. Sometimes even code used for the web can be reused in React Native environment with small changes. This really brings React Native to top of the list of the best frameworks to develop Native mobile apps. Summary In this article, we learned about the concept of how the React Native is really a Native framework, working, information flow, architecture, and it's benefits briefly. Resources for Article: Further resources on this subject: Building Mobile Apps [article] Web Development with React and Bootstrap [article] Introduction to JavaScript [article]

0
1
64855

article-image-open-and-proprietary-next-generation-networks

Packt

21 Feb 2018

29 min read

Open and Proprietary Next Generation Networks

Packt

21 Feb 2018

29 min read

In this article by Steven Noble, the author of the book Building Modern Networks, we will discuss networking concepts such as hyper-scale networking, software-defined networking, network hardware and software design along with a litany of network design ideas utilized in NGN. (For more resources related to this topic, see here.) The term Next Generation Network (NGN) has been around for over 20 years and refers to the current state of the art network equipment, protocols and features. A big driver in NGN is the constant newer, better, faster forwarding ASICs coming out of companies like Barefoot, Broadcom, Cavium, Nephos (MediaTek) and others. The advent of commodity networking chips has shortened the development time for generic switches, allowing hyper scale networking end users to build equipment upgrades into their network designs. At the time of writing, multiple companies have announced 6.4 Tbps switching chips. In layman terms, a 6.4 Tbps switching chip can handle 64x100GbE of evenly distributed network traffic without losing any packets. To put the number in perspective, the entire internet in 2004 was about 4 Tbps, so all of the internet traffic in 2004 could have crossed this one switching chip without issue. (Internet Traffic 1.3 EB/month http://blogs.cisco.com/sp/the-history-and-future-of-internet-traffic) A hyper-scale network is one that is operated by companies such as Facebook, Google, Twitter and other companies that add hundreds if not thousands of new systems a month to keep up with demand. Examples of next generation networking At the start of the commercial internet age (1994), software routers running on minicomputers such as BBNs PDP-11 based IP routers designed in the 1970's were still in use and hubs were simply dumb hardware devices that broadcast traffic everywhere. At that time, the state of the art in networking was the Cisco 7000 series router, introduced in 1993. The next generation router was the Cisco 7500 (1995), while the Cisco 12000 series (gigabit) routers and the Juniper M40 were only concepts. When we say next generation, we are speaking of the current state of the art and the near future of networking equipment and software. For example, 100 GB Ethernet is the current state of the art, while 400 GB Ethernet is in the pipeline. The definition of a modern network is a network that contains one or more of the following concepts: Software-defined Networking (SDN) Network design concepts Next generation hardware Hyper scale networking Open networking hardware and software Network Function Virtualization (NFV) Highly configurable traffic management Both Open and Closed network hardware vendors have been innovating at a high rate of speed with the help of and due to hyper-scale companies like Google, Facebook and others who have the need for next generation high speed network devices. This provides the network architect with a reasonable pipeline of equipment to be used in designs. Google and Facebook are both companies with hyper scale networks. A hyper scale network is one where the data stored, transferred, and updated on the network grows exponentially. Hyper scale companies deploy new equipment, software, and configurations weekly or even daily to support the needs of their customers. These companies have needs that are outside of the normal networking equipment available, so they must innovate by building their own next generation network devices, designing multi-tiered networks (like a three stage Clos network) and automating the installation and configuration of the next generation networking devices. The need of hyper scalers is well summed up by Google's Amin Vahdat in a 2014 Wired article "We couldn't buy the hardware we needed to build a network of the size and speed we needed to build". Terms and concepts in networking Here you will find the definition of some terms that are important in networking. They have been broken into groups of similar concepts. Routing and switching concepts In network devices and network designs there are many important concepts to understand. Here we begin with the way data is handled. The easiest way to discuss networking is to look at the OSI layer and point out where each device sits. OSI Layer with respect to routers and switches: Layer 1 (Physical): Layer 1 includes cables, hub, and switch ports. This is how all of the devices connect to each other including copper cables (CatX), fiber optics and Direct Attach Cables (DAC) which connect SFP ports without fiber. Layer 2 (Data link Layer): Layer 2 includes the raw data sent over the links and manages the Media Access Control (MAC) addresses for Ethernet Layer 3 (Network layer): Layer 3 includes packets that have more than just layer 2 data, such as IP, IPX (Novell Networks protocol), AFP (Apple's protocol) Routers and switches In a network you will have equipment that switches and/or routes traffic. A switch is a networking device that connects multiple devices such as servers, provides local connectivity and provides an uplink to the core network. A router is a network device that computes paths to remote and local devices, providing connectivity to devices across a network. Both switches and routers can use copper and fiber connections to interconnect. There are a few parts to a networking device, the forwarding chip, the TCAM, and the network processor. Some newer switches have Baseboard Management Controllers (BMCs) which manage the power, fans and other hardware, lessening the burden on the NOS to manage these devices. Currently routers and switches are very similar as there are many Layer 3 forwarding capable switches and some Layer 2 forwarding capable routers. Making a switch Layer 3 capable is less of an issue than making a router Layer 2 forwarding as the switch already is doing Layer 2 and adding Layer 3 is not an issue. A router does not do Layer 2 forwarding in general, so it has to be modified to allow for ports to switch rather than route. Control plane The control plane is where all of the information about how packets should be handled is kept. Routing protocols live in the control plane and are constantly scanning information received to determine the best path for traffic to flow. This data is then packed into a simple table and pushed down to the data plane. Data plane The data plane is where forwarding happens. In a software router, this would be done in the devices CPU, in a hardware router, this would be done using the forwarding chip and associated memories. VLAN/VXLAN A Virtual Local Area Network (VLAN) is a way of creating separate logical networks within a physical network. VLANs are generally used to separate/combine different users, or network elements such as phones, servers, workstations, and so on. You can have up to 4,096 VLANs on a network segment. A Virtual Extensible LAN (VXLAN) was created to all for large, dynamic isolated logical networks for virtualized and multiple tenant networks. You can have up to 16 million VXLANs on a network segment. A VXLAN Tunnel Endpoint (VTEP) is a set of two logical interfaces inbound which encapsulates incoming traffic into VXLANs and outbound which removes the encapsulation of outgoing traffic from VXLAN back to its original state. Network design concepts Network design requires the knowledge of the physical structure of the network so that the proper design choices are made. For example, in data center you would have a local area network, if you have multiple data centers near each other, they would be considered a metro area network. LAN A Local Area Network (LAN), generally considered to be within the same building. These networks can be bridged (switched) or routed. In general LANs are segmented into areas to avoid large broadcast domains. MAN A Metro Area Network (MAN), generally defined as multiple sites in the same geographic area or city, that is, metropolitan area. A MAN generally runs at the same speed as a LAN but is able to cover larger distances. WAN A Wide Area Network (WAN), essentially everything that is not a LAN or MAN is a WAN. WANs generally use fiber optic cables to transmit data from one location to another. WAN circuits can be provided via multiple connections and data encapsulations including MPLS, ATM, and Ethernet. Most large network providers utilize Dense Wavelength Division Multiplexing (DWDM) to put more bits on their fiber networks. DWDM puts multiple colors of light onto the fiber, allowing up to 128 different wavelengths to be sent down a single fiber. DWDM has just entered open networking with the introduction of Facebook's Voyager system. Leaf-Spine design In a Leaf-Spine network design, there are Leaf switches (the switches that connect to the servers) sometimes called Top of Rack (ToR) switches connected to a set of Spine (switches that connect leafs together) sometimes called End of Rack (EoR) switches. Clos network A Clos network is one of the ways to design a multi-stage network. Based on the switching network design by Charles Clos in 1952, a three stage Clos is the smallest version of a Clos network. It has an ingress, a middle, and an egress stage. Some hyper scale networks are using five stage Clos where the middle is replaced with another three stage Clos. In a five stage Clos there is an ingress, a middle ingress, a middle, a middle egress and an egress stage. All stages are connected to their neighbor, so in the example shown, Ingress 1 is connected to all four of the middle stages just as Egress 1 is connected to all four of the middle stages. A Clos network can be built in odd numbers starting with 3, so a 5, 7, and so on stage Clos is possible. For even numbered designs, Benes designs are usable. Benes network A Benes design is a non-blocking Clos design where the middle stage is 2x2 instead of NxN. A Benes network can have even numbers of stages. Here is a four stage Benes network. Network controller concepts Here we will discuss the concepts of network controllers. Every networking device has a controller, whether built in or external to manage the forwarding of the system. Controller A controller is a computer that sits on the network and manages one or more network devices. A controller can be built into a device, like the Cisco Supervisor module or standalone like an OpenFlow controller. The controller is responsible for managing all of the control plane data and deciding what should be sent down to the data plane. Generally, a controller will have a Command-line Interface (CLI) and more recently a web configuration interface. Some controllers will even have an Application Programming Interface (API). OpenFlow controller An OpenFlow controller, as it sounds is a controller that uses the OpenFlow protocol to communicate with network devices. The most common OpenFlow controllers that people hear about are OpenDaylight and ONOS. People who are working with OpenFlow would also know of Floodlight and RYU. Supervisor module A route processor is a computer that sits inside of the chassis of the network device you are managing. Sometimes the route processor is built in to the system, while other times it is a module that can be replaced/upgraded. Many vendor multi-slot systems have multiple route processors for redundancy. An example of a removable route processor is the Cisco 9500 series Supervisor module. There are multiple versions available including revision A, with a 4 core processor and 16 GB of RAM and revision B with a 6 core processor and 24 GB of RAM. Previous systems such as the Cisco Catalyst 7600 had options such as the SUP720 (Supervisor Module 720) of which they offered multiple versions. The standard SUP720 had a limited number of routes that it could support (256k) versus the SUP720 XL which could support up to 1M routes. Juniper Route Engine In Juniper terminology, the controller is called a Route Engine. They are similar to the Cisco Route Processor/Supervisor modules. Unlike Cisco Supervisor modules which utilize special CPUS, Juniper's REs generally use common x86 CPUs. Like Cisco, Juniper multi-slot systems can have redundant processors. Juniper has recently released the information about the NG-REs or Next Generation Route Engines. One example is the new RE-S-X6-64G, a 6-core x86 CPU based routing engine with 64 GB DRAM and 2x 64 GB SSD storage available for the MX240/MX480/MX960. These NG-REs allow for containers and other virtual machines to be run directly. Built in processor When looking at single rack unit (1 RU) or pizza box design switches there are some important design considerations. Most 1 RU switches do not have redundant processors, or field replaceable route processors. In general the field replaceable units (FRUs) that the customer can replace are power supplies and fans. If the failure is outside of the available FRUs, the entire switch must be replaced in the event of a failure. With white box switches this can be a simple process as white box switches can be used in multiple locations of your network including the customer edge, provider edge and core. Sparing (keeping a spare switch) is easy when you have the same hardware in multiple parts of the network. Recently commodity switch fabric chips have come with built-in low power ARM CPUs that can be used to manage the entire system, leading to cheaper and less power hungry designs. Facebook Wedge microserver The Facebook Wedge is different from most white box switches as it has its controller as an add in module, the same board that is used in some of the OCP servers. By separating the controller board from the switch, different boards can be put in place, such as higher memory, faster CPUs, different CPU types, and so on. Routing protocols A routing protocol is a daemon that runs on a controller and communicates with other network devices to exchange route information. For this section we will use common words to demonstrate the way the routing protocol is working, these should not be construed as the actual way that the protocols talk. BGP Border Gateway Protocol (BGP) is a path vector based External Gateway Protocol (EGP) protocol that makes routing decisions based on paths, network policies, or rules (route-maps on Cisco). Though designed as a EGP, BGP can be used as both an interior (iboga) and exterior (eBGP) routing protocol. BGP uses keep alive packets (are you there?) to confirm that neighbors are still accessible. BGP is the protocol that is utilized to route traffic across the internet, exchanging routing information between different Autonomous Systems (AS). An AS is all of the connected networks under the control of a single entity such as Level 3 (AS1) or Sprint (AS1239). When two different ASes interconnect, BGP peering sessions are setup between two or more network devices that have direct connections to each other. In an eBGP scenario, AS1 and AS1239 would setup BGP peering sessions that would allow traffic to route between their AS. In an iBGP scenario, the same AS would peer with other routers with the same AS and transfer the routes that are defined on the system. While iBGP is used internally in most networks, iBGP is used in large corporate networks because other Interior Gateway Protocols (IGPs) may not scale. Examples: iBGP next hop self In this scenario AS1 and AS2 are peered with each other and exchanging one prefix each. AS1 advertises 192.168.1.0/24 and AS2 advertises 192.168.2.0/24. Each network has two routers, one border router, which connects to other ASes and one internal router which gets its routes from the border router. The routes are advertised internally with the next-hop set as the border router. This is a standard scenario when you are not running an IGP inside to distribute the routes for the border router external interfaces. The conversation goes like this: AS1 -> AS2: Hi AS2, I am AS1 AS2 -> AS1: Hi AS1, I am AS2 AS1 -> AS2: I have the following route, 192.168.1.0/24 AS2 - AS1: I have received the route, I have 192.168.2.0/24 AS1 - AS2: I have received the route AS1 -> Internal Router AS1: I have this route, 192.168.2.0/24, you can reach it through me at 10.1.1.1 AS2 -> Internal Router AS2: I have this route, 192.168.1.0/24, you can reach it through me at 10.1.1.1 iBGP next-hop unmodified In the next scenario the border routers are the same, but the internal routers are given a next-hop of the external (Other AS) border router. The last scenario is where you peer with a router server, a system that handles peering, filtering the routes based on what you have specified you send. The routes are then forwarded onto your peers with your IP as the next hop. OSPF Open Shortest Path First (OSPF) is a relatively simple protocol. Different links on the same router are put into the same or different areas. For example, you would use area 1 for the interconnects between campuses but you would use another area, such as area 10 for the campus itself. By separating areas, you can reduce the amount of cross talk that happens between devices. There are two versions of OSPF, v2 and v3. The main difference between v2 and v3 is that v2 is for IPv4 networks and v3 is for IPv6 networks. When there are multiple paths that can be taken, the cost of the links must be taken into account. Below you can see where there are two paths, one has a total cost of 20 (5+5+10), the other 16 (8+8) so the traffic will take the lowest cost link. IS-IS IS-IS is a link-state routing protocol, operating by flooding link state information throughout a network of routers using NETs (Network Entity Title). Each IS-IS router has its own database of the network topology, built by aggregating the flooded network information. IS-IS is used by companies who are looking for Fast convergence, scalability and Rapid flooding of new information. IS-IS uses the concept of levels instead of areas as in OSPF. There are two levels in IS-IS, Level 1 - area and Level2 - backbone. A Level 1 Intermediate System (IS), keeps track of the destinations within its area, while a Level 2 IS keep track of paths to the Level 1 areas. EIGRP Enhanced Interior Gateway Routing Protocol (EIGRP) is Cisco's proprietary routing protocol. It is hardly ever seen in current networks but if you see it in yours, then you need to plan accordingly. Replacing EIGRP with OSPF is suggested so that you can interoperate with non-cisco devices. RIP If Routing Information Protocol (RIP) is being used in your network, it must be replaced during the design. Most newer routing stacks do not support RIP. RIP is one of the original routing protocols, using the number of hops (routed ports) between the device and remote location to determine the optimal path. RIP sends its entire routing database out every 30 seconds. When routing tables were small, many years ago, RIP worked fine. With larger tables, the traffic bursts and resulting re-computing by other routers in the network causes routers to run at almost 100 percent CPU all the time. Cables Here we will review the major types of cables. Copper Copper cables have been around for a very long time, originally network devices were connected together using coax cable (the same cable used for television antennas and cable). These days there are a few standard cables that are used. RJ45 Cables Cat5 - A 100Mb capable cable, used for both 10Mb and 100Mb connections Cat5E - 1GbE capable cable but not suggested for 1GbE networks (Cat6 is better and the price difference is nominal). Cat6 - A 1GbE capable cable, can be used for any speed at or below 1GbE including 100Mb and 10Mb. SFPs SFP - Small Form-factor Pluggable port. Capable of up to 1GbE connections SFP+ - Same size as the SFP, capable of up to 10Gb connections SFP28 - Same size as the SFP, capable of up to 25Gb connections QSFP - Quad Small Form-factor Pluggable - A bit wider than the SFP but capable of multiple GbE connections QSFP+ - Same size as the QSFP - capable of 40GbE as 4x10GbE on the same cable QSFP28 - Same size as the QSFP - capable of 100GbE DAC - A direct attach cable that fits into a SFP or QSFP port Fiber/Hot pluggable Breakout Cables As routers and switches continue to become more dense, where the number of ports on the front of the device can no longer fit in the space, manufacturers have moved to what we call breakout cables. For example, if you have a switch that can handle 3.2Tb/s of traffic, you need to provide 3200Gbp/s of port capacity. The easiest way to do that is to use 32 100Gb ports which will fit on the front of a 1U device. You cannot fit 128 10Gb ports without using either a breakout patch panel (which will then use another few rack units (RUs), or a breakout cable. For a period of time in the 1990's, Cisco used RJ21 connectors to provide up to 96 ethernet ports per slot Network engineers would then create breakout cables to go from the RJ21 to RJ45. These days, we have both DAC (Direct Attach Cable) and Fiber breakout cables. For example, here you can see a 1x4 breakout cable, providing 4 10g or 25G ports from a single 40G or 100G port. If you build a LAN network that only includes switches that provide layer 2 connectivity, any devices you want to connect together need to be in the same IP block. If you have a router in your network, it can route traffic between IP blocks. Part 1: What defines a modern network There is a litany of concepts that define a modern network, from simple principles to full feature sets. In general, a next-generation data center design enables you to move to a widely distributed non-blocking fabric with uniform chipset, bandwidth, and buffering characteristics in a simple architecture. In one example, to support these requirements, you would begin with a true three-tier Clos switching architecture with Top of Rack (ToR), spine, and fabric layers to build a data center network. Each ToR would have access to multiple fabrics and have the ability to select a desired path based on application requirement or network availability. Following the definition of a modern network from the introduction, here we layout the general definition of the parts. Modern network pieces Here we will discuss the concepts that build a Next Generation Network (NGN). Software Defined Networks Software defined networks can be defined in multiple ways. The general definition of a Software defined network is one which can be controlled as a singular unit instead of at a system by system basis. The control-plane which would normally be in the device and using routing protocols is replaced with a controller. Software defined networks can be built using many different technologies including OpenFlow, overlay networks and automation tools. Next generation networking and hyper scale networks As we mention in the introduction, twenty years ago NGN hardware would have been the Cisco GSR (officially introduced in 1997) or the Juniper M40 (officially released in 1998). Large Cisco and Juniper customers would have been working with the companies to help come up with the specifications and determining how to deploy the devices (possibly Alpha or Beta versions) in their networks. Today we can look at the hyper scale networking companies to see what a modern network looks like. A hyper scale network is one where the data stored, transferred and updated on the network grows exponentially. Technology such as 100Gb Ethernet, software defined networking, Open networking equipment and software are being deployed by hyper scale companies. Open networking hardware overview Open Hardware has been around for about 10 years, first in the consumer space and more recently in the enterprise space. Enterprise open networking hardware companies such as Quanta and Accton provide a significant amount of the hardware currently utilized in networks today. Companies such as Google and Facebook have been building their own hardware for many years. Facebook's routers such as the Wedge 100 and Backpack are available publicly for end users to utilize. Some examples of Open Networking hardware are: The Dell S6000-ON - a 32x40G switch with 32 QSFP ports on the front. The Quanta LY8 - a 48x10G + 6x40G switch with 48 SFP+ ports and 6 QSFP ports. The Facebook Wedge 100 - a 32x100G switch with 32 QSFP28 ports on the front. Open networking software overview To use open networking hardware, you need an operating system. The operating system manages the system devices such as fans, power, LEDs and temperature. On top of the operating system you will run a forwarding agent, examples of forwarding agents are Indigo, the open source OpenFlow daemon and Quagga, an open source routing agent. Closed networking hardware overview Cisco and Juniper are the leaders in the Closed Hardware and Software space. Cisco produces switches like the Nexus series (3000, 7000, 9000) with the 9000 programmable by ACI. Juniper provides the MX series (480, 960, 2020) with the 2020 being the highest end forwarding system they sell. Closed networking software overview Cisco has multiple network operating systems including IOS, NX-OS, IOS-XR. All Cisco NOSs are closed source and proprietary to the system that they run on. Cisco has what the industry calls a "industry standard CLI" which is emulated by many other companies. Juniper ships a single NOS, JunOS which can install on multiple different systems. JunOS is a closed source BSD based NOS. The JunOS CLI is significantly different from IOS and is more focused on engineers who program. Network Virtualization Not to be confused with Network Function Virtualization (NFV), Network virtualization is the concept of re-creating the hardware interfaces that exist in a traditional network in software. By creating a software counterpart to the hardware interfaces, you decouple the network forwarding from the hardware. There are a few companies and software projects that allow the end user to enable network virtualization. The first one is NSX which comes from the same team that developed OvS (Open Virtual Switch) Nicira, which was acquired by VMWare in 2012. Another project is Big Switch Networks Big Cloud Fabric, which utilizes a heavily modified version of Indigo, an OpenFlow controller. Network Function Virtualization Network Function Virtualization can be summed up by the statement that: "Due to recent network focused advancements in PC hardware, any service able to be delivered on proprietary, application specific hardware should be able to be done on a virtual machine". Essentially: routers, firewalls, load balancers and other network devices all running virtualized on commodity hardware. Traffic Engineering Traffic engineering is a method of optimizing the performance of a telecommunications network by dynamically analyzing, predicting and regulating the behavior of data transmitted over that network. Part 2: Next generation networking examples In my 25 or so years of networking, I have dealt with a lot of different networking technologies, each iteration (supposedly) better than the last. Starting with Thin Net (10BASE2), moving through ArcNet, 10BASE-T, Token Ring, ATM to the Desktop, FDDI and onwards. Generally, the technology improved for each system until it was swapped out. A good example is the change from a literal ring for token ring to a switching design where devices hung off of a hub (as in 10BASE-T). ATM to the desktop was a novel idea, providing up to 25Mbps to connected devices, but the complexity of configuring and managing it was not worth the gain. Today almost everything is Ethernet as shown by the Facebook Voyager DWDM system, which uses Ethernet over both traditional SFP ports and the DWDM interfaces. Ethernet is simple, well supported and easy to manage. Example 1 - Migration from FDDI to 100Base-T In late 1996, early 1997, the Exodus network used FDDI rings (Fiber Distributed Data Interface) to connect the main routers together at 100Mbps. As the network grew we had to decide between two competing technologies, FDDI switches and Fast Ethernet (100Base-T) both providing 100Mbp/s. FDDI switches from companies like DEC (FDDI Gigaswitch) were used in most of the Internet Exchange Points (IXPs) and worked reasonably well with one minor issue, head of line blocking (HOLB), which also impacted other technologies. Head of line blocking occurs when a packet is destined for an interface that is already full, so a queue is built, if the interface continues to be full, eventually the queue will be dropped. While we were testing the DEC FDDI Gigaswitches, we were also in deep discussions with Cisco about the availability of Fast Ethernet (FE) and working on designs. Because FE was new, there were concerns about how it would perform and how we would be able to build a redundant network design. In the end, we decided to use FE, connect the main routers in a full mesh and use routing protocols to manage fail-over. Example 2 - NGN Failure - LANE (LAN Emulation) During the high growth period at Exodus communications, there was a request to connect a new data center to the original one and allow customers to put servers in both locations using the same address space. To do this, we chose LAN Emulation or LANE which allows a ATM network to be used like a LAN. On paper, LANE looked like a great idea, the ability to extend the LAN so that customers could use the same IP space in two different locations. In reality, it was very different. For hardware, we were using Cisco 5513 switches which provided a combination of Ethernet and ATM ports. There were multiple issues with this design: First, the customer is provided with an ethernet interface, which runs over an ATM optical interface. Any error on the physical connection between switches or the ATM layer would cause errors on the Ethernet layer. Second, monitoring was very hard, when there were network issues, you had to look in multiple locations to determine where the errors were happening. After a few weeks, we did a midnight swap putting Cisco 7500 routers in to replace the 5500 switches and moving customers onto new blocks for the new data center. Part 3: Designing a modern network When designing a new network, some of the following might be important to you: Simple, focused yet non-blocking IP fabric Multistage parallel fabrics based on Clos network concept Simple merchant silicon Distributed control plane with some centralized controls Wide multi-path (ECMP) Uniform chipset, bandwidth, and buffering 1:1 oversubscribed (non-blocking fabric) Minimize the hardware necessary to carry east–west traffic Ability to support a large number of bare metal servers without adding an additional layer Limit fabric to a 5 stage Clos within the data center to minimize lookups and switching latency. Support host attachment at 10G, 25G, 50G and 100G Ethernet Traffic management In a modern network one of the first decisions is whether you will use a centralized controller or not. If you use a centralized controller, you will be able to see and control the entire network from one location. If you do not use a centralized controller, you will need to either manage each system directly or via automation. There is a middle space where you can use some software defined network pieces to manage parts of the network, such as an OpenFlow controller for the WAN or VMware NSX for your virtualized workloads. Once you know what the general management goal is, the next decision is whether to use open, proprietary, or a combination of both open and proprietary networking equipment. Open networking equipment is a concept that has been around less than a decade and started when very large network operators decided that they wanted a better control of the cost and features of the equipment in their networks. Google is a good example. In the following figure, you can see how Facebook used both their own hardware, 6-Pack/Backpack and legacy vendor hardware for their interoperability and performance testing. Google wanted to build a high-speed backbone, but was not looking to pay the prices that the incumbent proprietary vendors such as Cisco and Juniper wanted. Google set a price per port (1G/10G/40G) that they wanted to hit and designed equipment around that. Later companies like Facebook decided to go the same direction and contracted with commodity manufacturers to build network switches that met their needs. Proprietary vendors can offer the same level of performance or better using their massive teams of engineers to design and optimize hardware. This distinction even applies on the software side where companies like VMware and Cisco have created software defined networking tools such as NSX and ACI. With the large amount of networking gear available, designing and building a modern network can appear to be a complex concept. Designing a modern network requires research and a good understanding of networking equipment. While complex, the task is not hard if you follow the guidelines. These are a few of the stages of planning that need to be followed before the modern network design is started: The first step is to understand the scope of the project (single site, multi-site, multi-continent, multi-planet). The second step is to determine if the project is a green field (new) or brown field deployment (how many of the sites already exist and will/will not be upgraded). The third step is to determine if there will be any software defined networking (SDN), next generation networking (NGN) or Open Networking pieces. Finally, it is key that the equipment to be used is assembled and tested to determine if the equipment meets the needs of the network. Summary In this article, we have discussed many different concepts that tie NGN together. The term NGN refers to the latest and near-term networking equipment and designs. We looked at networking concepts such as local, metro and wide area networks, network controllers, routers and switches. Routing protocols such as BGP, IS-IS, OSPF and RIP. Then we discussed many pieces that are used either singularly or together that create a modern network. In the end, we also learned some guidelines that should be followed while designing a network. Resources for Article: Further resources on this subject: Analyzing Social Networks with Facebook [article] Social Networks [article] Point-to-Point Networks [article]

0
0
21370

article-image-getting-started-raspberry-pi

Packt

21 Feb 2018

7 min read

Getting Started on the Raspberry Pi

Packt

21 Feb 2018

7 min read

In this article, by Soham Chetan Kamani, author of the book Full Stack Web Development with Raspberry Pi 3, we will cover the marvel of the Raspberry Pi, however, doesn’t end here. It’s extreme portability means we can now do things which were not previously possible with traditional desktop computers. The GPIO pins give us easy access to interface with external devices. This allows the Pi to act as a bridge between embedded electronics and sensors, and the power that linux gives us. In essence, we can run any code in our favorite programming language (which can run on linux), and interface it directly to outside hardware quickly and easily. Once we couple this with the wireless networking capabilities introduced in the Raspberry Pi 3, we gain the ability to make applications that would not have been feasible to make before this device existed.and Scar de Courcier, authors of Windows Forensics Cookbook The Raspberry Pi has become hugely popular as a portable computer, and for good reason. When it comes to what you can do with this tiny piece of technology, the sky’s the limit. Back in the day, computers used to be the size of entire neighborhood blocks, and only large corporations doing expensive research could afford them. After that we went on to embrace personal computers, which were still a bit expensive, but, for the most part, could be bought by the common man. This brings us to where we are today, where we can buy a fully functioning Linux computer, which is as big as a credit card, for under 30$. It is truly a huge leap in making computers available to anyone and everyone. (For more resources related to this topic, see here.) Web development and portable computing have come a long way. A few years ago we couldn’t dream of making a rich, interactive, and performant application which runs on the browser. Today, not only can we do that, but also do it all in the palm of our hands (quite literally). When we think of developing an application that uses databases, application servers, sockets, and cloud APIs, the picture that normally comes to mind is that of many server racks sitting in a huge room. In this book however, we are going to implement all of that using only the Raspberry Pi. In this article, we will go through the concept of the internet of things, and discuss how web development on the Raspberry Pi can help us get there. Following this, we will also learn how to set up our Raspberry Pi and access it from our computer. We will cover the following topics: The internet of things Our application Setting up Raspberry Pi Remote access The Internet of things (IOT) The web has until today been a network of computers exchanging data. The limitation of this was that it was a closed loop. People could send and receive data from other people via their computers, but rarely much else. The internet of things, in contrast, is a network of devices or sensors that connect the outside world to the internet. Superficially, nothing is different: the internet is still a network of computers. What has changed, is that now, these computers are collecting and uploading data from things instead of people. This now allows anyone who is connected to obtain information that is not collected by a human. The internet of things as a concept has been around for a long time, but it is only now that almost anyone can connect a sensor or device to the cloud, and the IOT revolution was hugely enabled by the advent of portable computing, which was led by the Raspberry Pi. A brief look at our application Throughout this book, we are going to go through different components and aspects of web development and embedded systems. These are all going to be held together by our central goal of making an entire web application capable of sensing and displaying the surrounding temperature and humidity. In order to make a properly functioning system, we have to first build out the individual parts. More difficult still, is making sure all the parts work well together. Keeping this in mind, let's take a look at the different components of our technology stack, and the problems that each of them solve : The sensor interface - Perception The sensor is what connects our otherwise isolated application to the outside world. The sensor will be connected to the GPIO pins of the Raspberry pi. We can interface with the sensor through various different native libraries. This is the starting point of our data. It is where all the data that is used by our application is created. If you think about it, every other component of our technology stack only exists to manage, manipulate, and display the data collected from the sensor. The database - Persistence "Data" is the term we give to raw information, which is information that we cannot easily aggregate or understand. Without a way to store and meaningfully process and retrieve this data, it will always remain "data" and never "information", which is what we actually want. If we just hook up a sensor and display whatever data it reads, we are missing out on a lot of additional information. Let's take the example of temperature: What if we wanted to find out how the temperature was changing over time? What if we wanted to find the maximum and minimum temperatures for a particular day, or a particular week, or even within a custom duration of time? What if we wanted to see temperature variation across locations? There is no way we could do any of this with only the sensor. We also need some sort of persistence and structure to our data, and this is exactly what the database provides for us. If we structure our data correctly, getting the answers to the above questions is just a matter of a simple database query. The user interface - Presentation The user interface is the layer which connects our application to the end user. One of the most challenging aspects of software development is to make information meaningful and understandable to regular users of our application. The UI layer serves exactly this purpose: it takes relevant information and shows it in such a way that it is easily understandable to humans. How do we achieve such a level of understandability with such a large amount of data? We use visual aids: like colors, charts and diagrams (just like how the diagrams in this book make its information easier to understand). An important thing for any developer to understand is that your end user actually doesn't care about any of the the back-end stuff. The only thing that matters to them is a good experience. Of course, all the 0ther components serve to make the users experience better, but it's really the user facing interface that leaves the first impression, and that's why it's so important to do it well. The application server - Middleware This layer consists of the actual server side code we are going to write to get the application running. It is also called "middleware". In addition to being in the exact center of the architecture diagram, this layer also acts as the controller and middle-man for the other layers. The HTML pages that form the UI are served through this layer. All the database queries that we were talking about earlier are made here. The code that runs in this layer is responsible for retrieving the sensor readings from our external pins and storing the data in our database. Summary We are just warming up! In this article we got a brief introduction to the concept of the internet of things. We then went on to look at an overview of what we were going to build throughout the rest of this book, and saw how the Raspberry Pi can help us get there. Resources for Article: Further resources on this subject: Clusters, Parallel Computing, and Raspberry Pi – A Brief Background [article] Setting up your Raspberry Pi [article] Welcome to JavaScript in the full stack [article]

0
0
10346

How-To Tutorials

article-image-vmware-vsphere-storage-datastores-snapshots

Packt

21 Feb 2018

9 min read

VMware vSphere storage, datastores, snapshots

Packt

21 Feb 2018

9 min read

VMware vSphere storage, datastores, snapshotsIn this article, byAbhilash G B, author of the book,VMware vSphere 6.5 CookBook - Third Edition, we will cover the following:Managing VMFS volumes detected as snapshotsCreating NFSv4.1 datastores with Kerberos authenticationEnabling storage I/O control (For more resources related to this topic, see here.) IntroductionStorage is an integral part of any infrastructure. It is used to store the files backing your virtual machines.Â The most common way to refer to a type of storage presented to a VMware environment is based on the protocol used and the connection type.Â NFS are storage solutions that can leverage the existing TCP/IP network infrastructure. Hence, they are referred to as IP-based storage.Â Storage IO Control (SIOC) is one of the mechanisms to use ensure a fair share of storage bandwidth allocation to all Virtual Machines running on shared storage, regardless of the ESXi host the Virtual Machines are running on. Managing VMFS volumes detected as snapshotsSome environments maintain copies of the production LUNs as a backup, by replicating them. These replicas are exact copies of the LUNs that were already presented to the ESXi hosts. If for any reason a replicated LUN is presented to an ESXi host, then the host will not mount the VMFS volume on the LUN. This is a precaution to prevent data corruption.Â ESXi identifies each VMFS volume using its signature denoted by aUniversally Unique IdentifierÂ (UUID). The UUID is generated when the volume is first created or resignatured and is stored in the LVM header of the VMFS volume.Â When an ESXi host scans for new LUN ;devices and VMFS volumes on it, it compares the physical device ID (NAA ID) of the LUN with the device ID (NAA ID) value stored in the VMFS volumes LVM header. If it finds a mismatch, then it flags the volume as a snapshot volume.Volumes detected as snapshots are not mounted by default.Â There are two ways to mount such volumes/datastore:Mount by Keeping the Existing Signature Intact -Â This is used when you are attempting to temporarily mount the snapshot volume on an ESXi that doesn't see the original volume. If you were to attempt mounting the VMFS volume by keeping the existing signature and if the host sees the original volume, then you will not be allowed to mount the volume and will be warned about the presence of another VMFS volume with the same UUID:Mount by generating a new VMFS Signature -Â This has to be used if you are mounting a clone or a snapshot of an existing VMFS datastore to the same host/s.Â The process of assigning a new signature will not only update the LVM header with the newly generated UUID, but all the Physical Device ID (NAA ID) of the snapshot LUN. Here, the VMFS volume/datastore will be renamed by prefixing the wordsnap followed by a random number and the name of the original datastore: Getting readyMake sure that the original datastore and its LUN is no longer seen by the ESXi host the snapshot is being mounted to. How to do it...The following procedure will help mount a VMFS volume from a LUN detected as a snapshot:Log in to the vCenter Server using the vSphere Web Client and use the key combination Ctrl+Alt+2Â to switch to the Host and ClustersÂ view.Right click on the ESXi host the snapshot LUN is mapped to and go to Storage | New Datastore.On the New Datastore wizard, select VMFS as the filesystem type and click Next to continue.On the Name and Device selection screen, select the LUN detected as a snaphsot and click Next to continue:On the Mount Option screen, choose to either mount by assigning a new signature or by keeping the existing signature, and click Next to continue:On the Ready to Complete screen, review the setting and click Finish to initiate the operation. Creating NFSv4.1 datastores with Kerberos authenticationVMware introduced support for NFS 4.1 with vSphere 6.0. The vSphere 6.5 added several enhancements:It now supports AES encryptionSupport for IP version 6Support Kerberos's integrity checking mechanismHere, we will learn how to create NFS 4.1 datastores. Although the procedure is similar to NFSv3, there are a few additional steps that needs to be performed. Getting readyFor Kerberos authentication to work, you need to make sure that the ESXi hosts and the NFS Server is joined to the Active Directory domainCreate a new or select an existing AD user for NFS Kerberos authenticationConfigure the NFS Server/Share to Allow access to the AD user chosen for NFS Kerberos authentication How to do it...The following procedure will help you mount an NFS datasture using the NFSv4.1 client with Kerberos authentication enabled:Log in to the vCenter Server using the vSphere Web Client and use the key combination Ctrl+Alt+2 to switch to the Host and Clusters view, select the desired ESXi host andÂ navigate to it Â Configure | System | AuthenticationÂ Services section and supply the credentials of the Active Directory user that was chosen for NFS Kerberon Authentication:Right-click on the desired ESXi host and go to Storage | New DatastoreÂ to bring-up the Add Storage wizard.On the New Datastore wizard, select the Type as NFS and click Next to continue.On the Select NFS version screen, select NFS 4.1Â and click Next to continue. Keep in mind that it is not recommended to mount an NFS Export using both NFS3 and NFS4.1 client.Â On the Name and Configuration screen, supply a Name for the Datastore, the NFS export's folder path and NFS Server's IP Address or FQDN. You can also choose to mount the share as ready-only if desired:On the Configure Kerberos Authentication screen, check theÂ Enable Kerberos-based authentication box and choose the type of authentication required and click Next to continue:On the Ready to Complete screen review the settings and click Finish to mount the NFS export. Enabling storage I/O controlThe use of disk shares will work just fine as long as the datastore is seen by a single ESXi host. Unfortunately, that is not a common case. Datastores are often shared among multiple ESXi hosts. When datastores are shared, you bring in more than one local host scheduler into the process of balancing the I/O among the virtual machines. However, these lost host schedules cannot talk to each other and their visibility is limited to the ESXi hosts they are running on. This easily contributes to a serious problem called thenoisy neighbor situation.Â The job of SIOC is to enable some form of communication between local host schedulers so that I/O can be balanced between virtual machines running on separate hosts.Â How to do it...The following procedure will help you enable SIOC on a datastore:Connect to the vCenter Server using the Web Client and switch to the StorageÂ view using the key combination Ctrl+Alt+4.Right-click on the desired datastore and go to Configure Storage I/O Control:On the Configure Storage I/O Control window, select the checkbox Enable Storage I/O Control, set a custom congestion threshold (only if needed) and click OK to confirm the settings:Â With the Virtual Machine selected from the inventory, navigate to its Configure | General tab and review its datastore capability settings to ensure that SIOC is enabled:Â How it works...As mentioned earlier, SIOC enables communication between these local host schedulers so that I/O can be balanced between virtual machines running on separate hosts. It does so by maintaining a shared file in the datastore that all hosts can read/write/update. When SIOC is enabled on a datastore, it starts monitoring the device latency on the LUN backing the datastore. If the latency crosses the threshold, it throttles the LUN's queue depth on each of the ESXi hosts in an attempt to distribute a fair share of access to the LUN for all the Virtual Machines issuing the I/O.The local scheduler on each of the ESXi hosts maintains an iostatsÂ file to keep its companion hosts aware of the device I/O statistics observed on the LUN. The file is placed in a directory (naa.xxxxxxxxx) on the same datastore.For example, if there are six virtual machines running on three different ESXi hosts, accessing a shared LUN. Among the six VMs, four of them have a normal share value of 1000 and the remaining two have high (2000) disk share value sets on them. These virtual machines have only a single VMDK attached to them. VM-C on host ESX-02 is issuing a large number of I/O operations. Since that is the only VM accessing the shared LUN from that host, it gets the entire queue's bandwidth. This can induce latency on the I/O operations performed by the other VMs: ESX-01 and ESX-03. If the SIOC detects the latency value to be greater than the dynamic threshold, then it will start throttling the queue depth:Â The throttled DQLEN for a VM is calculated as follows:DQLEN for the VM = (VM's Percent of Shares) of (Queue Depth)Example: 12.5 % of 64 â (12.5 * 64)/100 = 8The throttled DQLEN per host is calculated as follows:DQLEN of the Host = Sum of the DQLEN of the VMs on itExample: VM-A (8) + VM-B(16) = 24The following diagram shows the effect of SIOC throttling the queue depth: SummaryIn this article we learnt, how toÂ mount a VMFS volume from a LUN detected as a snapshot, how to mount an NFS datasture using the NFSv4.1 client with Kerberos authentication enabled, and how toÂ enable SIOC on a datastore. Resources for Article: Further resources on this subject: Essentials of VMware vSphere [article] Working with VMware Infrastructure [article] Network Virtualization and vSphere [article]

0
0
37975

article-image-create-conversational-assistant-chatbot-using-python

Savia Lobo

21 Feb 2018

5 min read

How to create a conversational assistant or chatbot using Python

Savia Lobo

21 Feb 2018

5 min read

[box type="note" align="" class="" width=""]This article is an excerpt taken from a book Natural Language Processing with Python Cookbook written by Krishna Bhavsar, Naresh Kumar, and Pratap Dangeti. This book includes unique recipes to teach various aspects of performing Natural Language Processing with NLTK—the leading Python platform for the task.[/box] Today we will learn to create a conversational assistant or chatbot using Python programming language. Conversational assistants or chatbots are not very new. One of the foremost of this kind is ELIZA, which was created in the early 1960s and is worth exploring. In order to successfully build a conversational engine, it should take care of the following things: 1. Understand the target audience 2. Understand the natural language in which communication happens. 3. Understand the intent of the user 4. Come up with responses that can answer the user and give further clues NLTK has a module, nltk.chat, which simplifies building these engines by providing a generic framework. Let's see the available engines in NLTK: Engines Modules Eliza nltk.chat.eliza Python module Iesha nltk.chat.iesha Python module Rude nltk.chat.rudep ython module Suntsu Suntsu nltk.chat.suntsu module Zen nltk.chat.zen module In order to interact with these engines we can just load these modules in our Python program and invoke the demo() function. This recipe will show us how to use built-in engines and also write our own simple conversational engine using the framework provided by the nltk.chat module. Getting ready You should have Python installed, along with the nltk library. Having an understanding of regular expressions also helps. How to do it... Open atom editor (or your favorite programming editor). Create a new file called Conversational.py. Type the following source code: Save the file. Run the program using the Python interpreter. You will see the following output: How it works... Let's try to understand what we are trying to achieve here. import nltk This instruction imports the nltk library into the current program. def builtinEngines(whichOne): This instruction defines a new function called builtinEngines that takes a string parameter, whichOne: if whichOne == 'eliza': nltk.chat.eliza.demo() elif whichOne == 'iesha': nltk.chat.iesha.demo() elif whichOne == 'rude': nltk.chat.rude.demo() elif whichOne == 'suntsu': nltk.chat.suntsu.demo() elif whichOne == 'zen': nltk.chat.zen.demo() else: print("unknown built-in chat engine {}".format(whichOne)) These if, elif, else instructions are typical branching instructions that decide which chat engine's demo() function is to be invoked depending on the argument that is present in the whichOne variable. When the user passes an unknown engine name, it displays a message to the user that it's not aware of this engine. It's a good practice to handle all known and unknown cases also; it makes our programs more robust in handling unknown situations def myEngine():. This instruction defines a new function called myEngine(); this function does not take any parameters. chatpairs = ( (r"(.*?)Stock price(.*)", ("Today stock price is 100", "I am unable to find out the stock price.")), (r"(.*?)not well(.*)", ("Oh, take care. May be you should visit a doctor", "Did you take some medicine ?")), (r"(.*?)raining(.*)", ("Its monsoon season, what more do you expect ?", "Yes, its good for farmers")), (r"How(.*?)health(.*)", ("I am always healthy.", "I am a program, super healthy!")), (r".*", ("I am good. How are you today ?", "What brings you here ?")) ) This is a single instruction where we are defining a nested tuple data structure and assigning it to chat pairs. Let's pay close attention to the data structure: We are defining a tuple of tuples Each subtuple consists of two elements: The first member is a regular expression (this is the user's question in regex format) The second member of the tuple is another set of tuples (these are the answers) def chat(): print("!"*80) print(" >> my Engine << ") print("Talk to the program using normal english") print("="*80) print("Enter 'quit' when done") chatbot = nltk.chat.util.Chat(chatpairs, nltk.chat.util.reflections) chatbot.converse() We are defining a subfunction called chat()inside the myEngine() function. This is permitted in Python. This chat() function displays some information to the user on the screen and calls the nltk built-in nltk.chat.util.Chat() class with the chatpairs variable. It passes nltk.chat.util.reflections as the second argument. Finally we call the chatbot.converse() function on the object that's created using the chat() class. chat() This instruction calls the chat() function, which shows a prompt on the screen and accepts the user's requests. It shows responses according to the regular expressions that we have built before: if name == ' main ': for engine in ['eliza', 'iesha', 'rude', 'suntsu', 'zen']: print("=== demo of {} ===".format(engine)) builtinEngines(engine) print() myEngine() These instructions will be called when the program is invoked as a standalone program (not using import). They do these two things: Invoke the built-in engines one after another (so that we can experience them) Once all the five built-in engines are excited, they call our myEngine(), where our customer engine comes into play We have learned to create a chatbot of our own using the easiest programming language ‘Python’. To know more about how to efficiently use NLTK and implement text classification, identify parts of speech, tag words, etc check out Natural Language Processing with Python Cookbook.

0
0
50958

article-image-develop-stock-price-predictive-model-using-reinforcement-learning-tensorflow

Aaron Lazar

20 Feb 2018

12 min read

How to develop a stock price predictive model using Reinforcement Learning and TensorFlow

Aaron Lazar

20 Feb 2018

12 min read

[box type="note" align="" class="" width=""]This article is an extract from the book Predictive Analytics with TensorFlow, authored by Md. Rezaul Karim. This book helps you build, tune, and deploy predictive models with TensorFlow.[/box] In this article we’ll show you how to create a predictive model to predict stock prices, using TensorFlow and Reinforcement Learning. An emerging area for applying Reinforcement Learning is the stock market trading, where a trader acts like a reinforcement agent since buying and selling (that is, action) particular stock changes the state of the trader by generating profit or loss, that is, reward. The following figure shows some of the most active stocks on July 15, 2017 (for an example): Now, we want to develop an intelligent agent that will predict stock prices such that a trader will buy at a low price and sell at a high price. However, this type of prediction is not so easy and is dependent on several parameters such as the current number of stocks, recent historical prices, and most importantly, on the available budget to be invested for buying and selling. The states in this situation are a vector containing information about the current budget, current number of stocks, and a recent history of stock prices (the last 200 stock prices). So each state is a 202-dimensional vector. For simplicity, there are only three actions to be performed by a stock market agent: buy, sell, and hold. So, we have the state and action, what else do you need? Policy, right? Yes, we should have a good policy, so based on that an action will be performed in a state. A simple policy can consist of the following rules: Buying (that is, action) a stock at the current stock price (that is, state) decreases the budget while incrementing the current stock count Selling a stock trades it in for money at the current share price Holding does neither, and performing this action simply waits for a particular time period and yields no reward To find the stock prices, we can use the yahoo_finance library in Python. A general warning you might experience is "HTTPError: HTTP Error 400: Bad Request". But keep trying. Now, let's try to get familiar with this module: >>> from yahoo_finance import Share >>> msoft = Share('MSFT') >>> print(msoft.get_open()) 72.24= >>> print(msoft.get_price()) 72.78 >>> print(msoft.get_trade_datetime()) 2017-07-14 20:00:00 UTC+0000 >>> So as of July 14, 2017, the stock price of Microsoft Inc. went higher, from 72.24 to 72.78, which means about a 7.5% increase. However, this small and just one-day data doesn't give us any significant information. But, at least we got to know the present state for this particular stock or instrument. To install yahoo_finance, issue the following command: $ sudo pip3 install yahoo_finance Now it would be worth looking at the historical data. The following function helps us get the historical data for Microsoft Inc: def get_prices(share_symbol, start_date, end_date, cache_filename): try: stock_prices = np.load(cache_filename) except IOError: share = Share(share_symbol) stock_hist = share.get_historical(start_date, end_date) stock_prices = [stock_price['Open'] for stock_price in stock_ hist] np.save(cache_filename, stock_prices) return stock_prices The get_prices() method takes several parameters such as the share symbol of an instrument in the stock market, the opening date, and the end date. You will also like to specify and cache the historical data to avoid repeated downloading. Once you have downloaded the data, it's time to plot the data to get some insights. The following function helps us to plot the price: def plot_prices(prices): plt.title('Opening stock prices') plt.xlabel('day') plt.ylabel('price ($)') plt.plot(prices) plt.savefig('prices.png') Now we can call these two functions by specifying a real argument as follows: if __name__ == '__main__': prices = get_prices('MSFT', '2000-07-01', '2017-07-01', 'historical_stock_prices.npy') plot_prices(prices) Here I have chosen a wide range for the historical data of 17 years to get a better insight. Now, let's take a look at the output of this data: The goal is to learn a policy that gains the maximum net worth from trading in the stock market. So what will a trading agent be achieving in the end? Figure 8 gives you some clue: Well, figure 8 shows that if the agent buys a certain instrument with price $20 and sells at a peak price say at $180, it will be able to make $160 reward, that is, profit. So, implementing such an intelligent agent using RL algorithms is a cool idea? From the previous example, we have seen that for a successful RL agent, we need two operations well defined, which are as follows: How to select an action How to improve the utility Q-function To be more specific, given a state, the decision policy will calculate the next action to take. On the other hand, improve Q-function from a new experience of taking an action. Also, most reinforcement learning algorithms boil down to just three main steps: infer, perform, and learn. During the first step, the algorithm selects the best action (a) given a state (s) using the knowledge it has so far. Next, it performs the action to find out the reward (r) as well as the next state (s'). Then, it improves its understanding of the world using the newly acquired knowledge (s, r, a, s') as shown in the following figure: Now, let's start implementing the decision policy based on which action will be taken for buying, selling, or holding a stock item. Again, we will do it an incremental way. At first, we will create a random decision policy and evaluate the agent's performance. But before that, let's create an abstract class so that we can implement it accordingly: class DecisionPolicy: def select_action(self, current_state, step): pass def update_q(self, state, action, reward, next_state): pass The next task that can be performed is to inherit from this superclass to implement a random decision policy: class RandomDecisionPolicy(DecisionPolicy): def __init__(self, actions): self.actions = actions def select_action(self, current_state, step): action = self.actions[random.randint(0, len(self.actions) - 1)] return action The previous class did nothing except defi ning a function named select_action (), which will randomly pick an action without even looking at the state. Now, if you would like to use this policy, you can run it on the real-world stock price data. This function takes care of exploration and exploitation at each interval of time, as shown in the following figure that form states S1, S2, and S3. The policy suggests an action to be taken, which we may either choose to exploit or otherwise randomly explore another action. As we get rewards for performing an action, we can update the policy function over time: Fantastic, so we have the policy and now it's time to utilize this policy to make decisions and return the performance. Now, imagine a real scenario—suppose you're trading on Forex or ForTrade platform, then you can recall that you also need to compute the portfolio and the current profit or loss, that is, reward. Typically, these can be calculated as follows: portfolio = budget + number of stocks * share value reward = new_portfolio - current_portfolio At first, we can initialize values that depend on computing the net worth of a portfolio, where the state is a hist+2 dimensional vector. In our case, it would be 202 dimensional. Then we define the range of tuning the range up to: Length of the prices selected by the user query – (history + 1), since we start from 0, we subtract 1 instead. Then, we should calculate the updated value of the portfolio and from the portfolio, we can calculate the value of the reward, that is, profit. Also, we have already defined our random policy, so we can then select an action from the current policy. Then, we repeatedly update the portfolio values based on the action in each iteration and the new portfolio value after taking the action can be calculated. Then, we need to compute the reward from taking an action at a state. Nevertheless, we also need to update the policy after experiencing a new action. Finally, we compute the final portfolio worth: def run_simulation(policy, initial_budget, initial_num_stocks, prices, hist, debug=False): budget = initial_budget num_stocks = initial_num_stocks share_value = 0 transitions = list() for i in range(len(prices) - hist - 1): if i % 100 == 0: print('progress {:.2f}%'.format(float(100*i) / (len(prices) - hist - 1))) current_state = np.asmatrix(np.hstack((prices[i:i+hist], budget, num_stocks))) current_portfolio = budget + num_stocks * share_value action = policy.select_action(current_state, i) share_value = float(prices[i + hist + 1]) if action == 'Buy' and budget >= share_value: budget -= share_value num_stocks += 1 elif action == 'Sell' and num_stocks > 0: budget += share_value num_stocks -= 1 else: action = 'Hold' new_portfolio = budget + num_stocks * share_value reward = new_portfolio - current_portfolio next_state = np.asmatrix(np.hstack((prices[i+1:i+hist+1], budget, num_stocks))) transitions.append((current_state, action, reward, next_ state)) policy.update_q(current_state, action, reward, next_state) portfolio = budget + num_stocks * share_value if debug: print('${}t{} shares'.format(budget, num_stocks)) return portfolio The previous simulation predicts a somewhat good result; however, it produces random results too often. Thus, to obtain a more robust measurement of success, let's run the simulation a couple of times and average the results. Doing so may take a while to complete, say 100 times, but the results will be more reliable: def run_simulations(policy, budget, num_stocks, prices, hist): num_tries = 100 final_portfolios = list() for i in range(num_tries): final_portfolio = run_simulation(policy, budget, num_stocks, prices, hist) final_portfolios.append(final_portfolio) avg, std = np.mean(final_portfolios), np.std(final_portfolios) return avg, std The previous function computes the average portfolio and the standard deviation by iterating the previous simulation function 100 times. Now, it's time to evaluate the previous agent. As already stated, there will be three possible actions to be taken by the stock trading agent such as buy, sell, and hold. We have a state vector of 202 dimension and budget only $1000. Then, the evaluation goes as follows: actions = ['Buy', 'Sell', 'Hold'] hist = 200 policy = RandomDecisionPolicy(actions) budget = 1000.0 num_stocks = 0 avg,std=run_simulations(policy,budget,num_stocks,prices, hist) print(avg, std) >>> 1512.87102405 682.427384814 The first one is the mean and the second one is the standard deviation of the final portfolio. So, our stock prediction agent predicts that as a trader you/we could make a profit about $513. Not bad. However, the problem is that since we have utilized a random decision policy, the result is not so reliable. To be more specific, the second execution will definitely produce a different result: >>> 1518.12039077 603.15350649 Therefore, we should develop a more robust decision policy. Here comes the use of neural network-based QLearning for decision policy. Next, we will see a new hyperparameter epsilon to keep the solution from getting stuck when applying the same action over and over. The lesser its value, the more often it will randomly explore new actions: Next, I am going to write a class containing their functions: Constructor: This helps to set the hyperparameters from the Q-function. It also helps to set the number of hidden nodes in the neural networks. Once we have these two, it helps to define the input and output tensors. It then defines the structure of the neural network. Further, it defines the operations to compute the utility. Then, it uses an optimizer to update model parameters to minimize the loss and sets up the session and initializes variables. select_action: This function exploits the best option with probability 1-epsilon. update_q: This updates the Q-function by updating its model parameters. Refer to the following code: class QLearningDecisionPolicy(DecisionPolicy): def __init__(self, actions, input_dim): self.epsilon = 0.9 self.gamma = 0.001 self.actions = actions output_dim = len(actions) h1_dim = 200 self.x = tf.placeholder(tf.float32, [None, input_dim]) self.y = tf.placeholder(tf.float32, [output_dim]) W1 = tf.Variable(tf.random_normal([input_dim, h1_dim])) b1 = tf.Variable(tf.constant(0.1, shape=[h1_dim])) h1 = tf.nn.relu(tf.matmul(self.x, W1) + b1) W2 = tf.Variable(tf.random_normal([h1_dim, output_dim])) b2 = tf.Variable(tf.constant(0.1, shape=[output_dim])) self.q = tf.nn.relu(tf.matmul(h1, W2) + b2) loss = tf.square(self.y - self.q) self.train_op = tf.train.GradientDescentOptimizer(0.01). minimize(loss) self.sess = tf.Session() self.sess.run(tf.initialize_all_variables()) def select_action(self, current_state, step): threshold = min(self.epsilon, step / 1000.) if random.random() < threshold: # Exploit best option with probability epsilon action_q_vals = self.sess.run(self.q, feed_dict={self.x: current_state}) action_idx = np.argmax(action_q_vals) action = self.actions[action_idx] else: # Random option with probability 1 - epsilon action = self.actions[random.randint(0, len(self.actions) - 1)] return action def update_q(self, state, action, reward, next_state): action_q_vals = self.sess.run(self.q, feed_dict={self.x: state}) next_action_q_vals = self.sess.run(self.q, feed_dict={self.x: next_state}) next_action_idx = np.argmax(next_action_q_vals) action_q_vals[0, next_action_idx] = reward + self.gamma * next_action_q_vals[0, next_action_idx] action_q_vals = np.squeeze(np.asarray(action_q_vals)) self.sess.run(self.train_op, feed_dict={self.x: state, self.y: action_q_vals}) There you go! We have a stock price predictive model running and we’ve built it using Reinforcement Learning and TensorFlow. If you found this tutorial interesting and would like to learn more, head over to grab this book, Predictive Analytics with TensorFlow, by Md. Rezaul Karim.

0
1
19813

Packt

20 Feb 2018

10 min read

K Nearest Neighbors

Packt

20 Feb 2018

10 min read

In this article by Gavin Hackeling, author of book Mastering Machine Learning with scikit-learn - Second Edition, we will start with K Nearest Neighbors (KNN) which is a simple model for regression and classification tasks. It is so simple that its name describes most of its learning algorithm. The titular neighbors are representations of training instances in a metric space. A metric space is a feature space in which the distances between all members of a set are defined. (For more resources related to this topic, see here.) For classification tasks, a set of tuples of feature vectors and class labels comprise the training set. KNN is a capable of binary, multi-class, and multi-label classification. We will focus on binary classification in this article. The simplest KNN classifiers use the mode of the KNN labels to classify test instances, but other strategies can be used. k is often set to an odd number to prevent ties. In regression tasks, the features vectors are each associated with a response variable that takes a real-valued scalar instead of a label. The prediction is the mean or weighted mean of the k nearest neighbors’ response variables. Lazy learning and non-parametric models KNN is a lazy learner. Also known as instance-based learners, lazy learners simply store the training data set with little or no processing. In contrast to eager learners, such as simple linear regression, KNN does not estimate the parameters of a model that generalizes the training data during a training phase. Lazy learning has advantages and disadvantages. Training an eager learner is often computationally costly, but prediction with the resulting model is often inexpensive. For simple linear regression, prediction consists only of multiplying the learned coefficient by the feature, and adding the learned intercept parameter. A lazy learner can predict almost immediately, but making predictions can be costly. In the simplest implementation of KNN, prediction requires calculating the distances between a test instance and all of the training instances. In contrast to most of the other models we will discuss, KNN is a non-parametric model. A parametric model uses a fixed number of parameters, or coefficients, to define the model that summarizes the data. The number of parameters is independent of the number of training instances. Non-parametric may seem to be a misnomer, as it does not mean that the model has no parameters; rather, non-parametric means that the number of parameters of the model is not fixed, and may grow with the number of training instances. Non-parametric models can be useful when training data is abundant and you have little prior knowledge about the relationship between the response and explanatory variables. KNN makes only one assumption: instances that are near each other are likely to have similar values of the response variable. The flexibility provided by non-parametric models is not always desirable; a model that makes assumptions about the relationship can be useful if training data is scarce or if you already know about the relationship. Classification with KNN The goal of classification tasks is to use one or more features to predict the value of a discrete response variable. Let’s work through a toy classification problem. Assume that you must use a person’s height and weight to predict his or her sex. This problem is called binary classification because the response variable can take one of two labels. The following table records nine training instances. height weight label 158 cm 64 kg male 170 cm 66 kg male 183 cm 84 kg male 191 cm 80 kg male 155 cm 49 kg female 163 cm 59 kg female 180 cm 67 kg female 158 cm 54 kg female 178 cm 77 kg female We are now using features from two explanatory variables to predict the value of the response variable. KNN is not limited to two features; the algorithm can use an arbitrary number of features, but more than three features cannot be visualized. Let’s visualize the data by creating a scatter plot with matplotlib. # In[1]: import numpy as np import matplotlib.pyplot as plt X_train = np.array([ [158, 64], [170, 86], [183, 84], [191, 80], [155, 49], [163, 59], [180, 67], [158, 54], [170, 67] ]) y_train = ['male', 'male', 'male', 'male', 'female', 'female', 'female', 'female', 'female'] plt.figure() plt.title('Human Heights and Weights by Sex') plt.xlabel('Height in cm') plt.ylabel('Weight in kg') for i, x in enumerate(X_train): # Use 'x' markers for instances that are male and diamond markers for instances that are female plt.scatter(x[0], x[1], c='k', marker='x' if y_train[i] == 'male' else 'D') plt.grid(True) plt.show() From the plot we can see that men, denoted by the x markers, tend to be taller and weigh more than women. This observation is probably consistent with your experience. Now let’s use KNN to predict whether a person with a given height and weight is a man or a woman. Let’s assume that we want to predict the sex of a person who is 155 cm tall and who weighs 70 kg. First, we must define our distance measure. In this case we will use Euclidean distance, the straight line distance between points in a Euclidean space. Euclidean distance in a two-dimensional space is given by the following: Next we must calculate the distances between the query instance and all of the training instances. height weight label Distance from test instance 158 cm 64 kg male 170 cm 66 kg male 183 cm 84 kg male 191 cm 80 kg male 155 cm 49 kg female 163 cm 59 kg female 180 cm 67 kg female 158 cm 54 kg female 178 cm 77 kg female We will set k to 3, and select the three nearest training instances. The following script calculates the distances between the test instance and the training instances, and identifies the most common sex of the nearest neighbors. # In[2]: x = np.array([[155, 70]]) distances = np.sqrt(np.sum((X_train - x)**2, axis=1)) distances # Out[2]: array([ 6.70820393, 21.9317122 , 31.30495168, 37.36308338, 21. , 13.60147051, 25.17935662, 16.2788206 , 15.29705854]) # In[3]: nearest_neighbor_indices = distances.argsort()[:3] nearest_neighbor_genders = np.take(y_train, nearest_neighbor_indices) nearest_neighbor_genders # Out[3]: array(['male', 'female', 'female'], dtype='|S6') # In[4]: from collections import Counter b = Counter(np.take(y_train, distances.argsort()[:3])) b.most_common(1)[0][0] # Out[4]: 'female' The following plots the query instance, indicated by the circle, and its three nearest neighbors, indicated by the enlarged markers: Two of the neighbors are female, and one is male. We therefore predict that the test instance is female. Now let’s implement a KNN classifier using scikit-learn. # In[5]: from sklearn.preprocessing import LabelBinarizer from sklearn.neighbors import KNeighborsClassifier lb = LabelBinarizer() y_train_binarized = lb.fit_transform(y_train) y_train_binarized # Out[5]: array([[1], [1], [1], [1], [0], [0], [0], [0], [0]]) # In[6]: K = 3 clf = KNeighborsClassifier(n_neighbors=K) clf.fit(X_train, y_train_binarized.reshape(-1)) prediction_binarized = clf.predict(np.array([155, 70]).reshape(1, -1))[0] predicted_label = lb.inverse_transform(prediction_binarized) predicted_label # Out[6]: array(['female'], dtype='|S6') Our labels are strings; we first use LabelBinarizer to convert them to integers. LabelBinarizer implements the transformer interface, which consists of the methods fit, transform, and fit_transform. fit prepares the transformer; in this case, it creates a mapping from label strings to integers. transform applies the mapping to input labels. fit_transform is a convenience method that calls fit and transform. A transformer should be fit only on the training set. Independently fitting and transforming the training and testing sets could result in inconsistent mappings from labels to integers; in this case, male might be mapped to 1 in the training set and 0 in the testing set. Fitting on the entire dataset should also be avoided, as for some transformers it will leak information about the testing set in to the model. This advantage won't be available in production, so performance measures on the test set may be optimistic. We wil discuss this pitfall more when we extract features from text. Next, we initialize a KNeighborsClassifier. Even through KNN is a lazy learner, it still implements the estimator interface. We call fit and predict just as we did with our simple linear regression object. Finally, we can use our fit LabelBinarizer to reverse the transformation and return a string label. Now let’s use our classifier to make predictions for a test set, and evaluate the performance of our classifier. height weight label 168 cm 65 kg male 170 cm 61 kg male 160 cm 52 kg female 169 cm 67 kg female # In[7]: X_test = np.array([ [168, 65], [180, 96], [160, 52], [169, 67] ]) y_test = ['male', 'male', 'female', 'female'] y_test_binarized = lb.transform(y_test) print('Binarized labels: %s' % y_test_binarized.T[0]) predictions_binarized = clf.predict(X_test) print('Binarized predictions: %s' % predictions_binarized) print('Predicted labels: %s' % lb.inverse_transform(predictions_binarized)) # Out[7]: Binarized labels: [1 1 0 0] Binarized predictions: [0 1 0 0] Predicted labels: ['female' 'male' 'female' 'female'] By comparing our test labels to our classifier's predictions, we find that it incorrectly predicted that one of the male test instances was female. There are two types of errors in binary classification tasks: false positives and false negatives. There are many performance measures for classifiers; some measures may be more appropriate than others depending on the consequences of the types of errors in your application. We will assess our classifier using several common performance measures, including accuracy, precision, and recall. Accuracy is the proportion of test instances that were classified correctly. Our model classified one of the four instances incorrectly, so the accuracy is 75%. # In[8]: from sklearn.metrics import accuracy_score print('Accuracy: %s' % accuracy_score(y_test_binarized, predictions_binarized)) # Out[8]: Accuracy: 0.75 Precision is the proportion of test instances that were predicted to be positive that are truly positive. In this example the positive class is male. The assignment of male and “female” to the positive and negative classes is arbitrary, and could be reversed. Our classifier predicted that one of the test instances is the positive class. This instance is truly the positive class, so the classifier’s precision is 100%. # In[9]: from sklearn.metrics import precision_score print('Precision: %s' % precision_score(y_test_binarized, predictions_binarized)) # Out[9]: Precision: 1.0 Recall is the proportion of truly positive test instances that were predicted to be positive. Our classifier predicted that one of the two truly positive test instances is positive. Its recall is therefore 50%. # In[10]: from sklearn.metrics import recall_score print('Recall: %s' % recall_score(y_test_binarized, predictions_binarized)) # Out[10]: Recall: 0.5 Sometimes it is useful to summarize precision and recall with a single statistic, called the F1-score or F1-measure. The F1-score is the harmonic mean of precision and recall. # In[11]: from sklearn.metrics import f1_score print('F1 score: %s' % f1_score(y_test_binarized, predictions_binarized)) # Out[11]: F1 score: 0.666666666667 Note that the arithmetic mean of the precision and recall scores is the upper bound of the F1 score. The F1 score penalizes classifiers more as the difference between their precision and recall scores increases. Finally, the Matthews correlation coefficient is an alternative to the F1 score for measuring the performance of binary classifiers. A perfect classifier’s MCC is 1. A trivial classifier that predicts randomly will score 0, and a perfectly wrong classifier will score -1. MCC is useful even when the proportions of the classes in the test set is severely imbalanced. # In[12]: from sklearn.metrics import matthews_corrcoef print('Matthews correlation coefficient: %s' % matthews_corrcoef(y_test_binarized, predictions_binarized)) # Out[12]: Matthews correlation coefficient: 0.57735026919 scikit-learn also provides a convenience function, classification_report, that reports the precision, recall and F1 score. # In[13]: from sklearn.metrics import classification_report print(classification_report(y_test_binarized, predictions_binarized, target_names=['male'], labels=[1])) # Out[13]: precision recall f1-score support male 1.00 0.50 0.67 2 avg / total 1.00 0.50 0.67 2 Summary In this article we learned about K Nearest Neighbors in which we saw that KNN is lazy learner as well as non-parametric model. We also saw about the classification of KNN. Resources for Article: Further resources on this subject: Introduction to Scikit-Learn [article] Machine Learning in IPython with scikit-learn [article] Machine Learning Models [article]

0
0
11248

Installing TensorFlow in Windows, Ubuntu and Mac OS

IPv6, Unix Domain Sockets, and Network Interfaces

Tree Test and Surveys

Some Basic Concepts of Theano

Exchange Management Shell Common Tasks

How to use Standard Macro in Workflows

Play With Functions

API Gateway and its Need

Is React Native is really Native framework?

Open and Proprietary Next Generation Networks

Trending Topics

Getting Started on the Raspberry Pi

VMware vSphere storage, datastores, snapshots

How to create a conversational assistant or chatbot using Python

How to develop a stock price predictive model using Reinforcement Learning and TensorFlow

K Nearest Neighbors

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access